Serverless Scaling: Deploying Strands + MCP on AWS
Written by Om-Shree-0709
- 1. Introduction
- 2. Deployment Options Overview
- 3. Native AWS Lambda (Stateless MCP)
- 4. Lambda + Web Adapter (Containerized MCP)
- 5. AWS Fargate (Containerized MCP)
- 6. Choosing the Right Model
- 7. Key Considerations
- 8. Next Steps
- References
In this article, we explore how to deploy a Strands agent connected to an MCP server using serverless AWS services. We cover three deployment models (native Lambda, Lambda with the Web Adapter, and Fargate) and compare their benefits, limitations, and recommended scenarios.
1. Introduction
The Strands Agents SDK provides a model-driven agent loop, while the Model Context Protocol (MCP) lets the agent discover and invoke tools dynamically. Deploying both on AWS serverless platforms lets you build scalable, maintainable agents without managing servers [1].
2. Deployment Options Overview
Option | Benefits | Limitations |
---|---|---|
AWS Lambda (Native) | Fast startup, easy CI/CD, unified observability | Max 15-minute execution, no streaming support [2] |
Lambda with Web Adapter | Preserve web frameworks, serverless pay-per-use | Slower cold start (1–3 s), added complexity [3] |
AWS Fargate (ECS/EKS) | Long-running containers, streaming support | Higher cost, container lifecycle management [4] |
3. Native AWS Lambda (Stateless MCP)
Approach: Package your MCP server as a Lambda function using FastMCP with HTTP transport [3].
How to Deploy:
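As a minimal sketch of what a stateless handler does per invocation, the following uses only the standard library rather than FastMCP, so it should be read as an illustration of the request flow and not as the FastMCP API. It assumes an API Gateway style proxy event whose `body` is a JSON-RPC 2.0 string, handles only `tools/list` and `tools/call`, and skips the MCP `initialize` handshake. The `add` tool and the `TOOLS` registry are hypothetical names.

```python
import json

# Hypothetical tool; a real server would register tools through FastMCP.
def add(a: float, b: float) -> float:
    return a + b

# Hypothetical in-module registry. Nothing here survives between invocations
# except what is defined at import time, which is what "stateless" means.
TOOLS = {
    "add": {
        "fn": add,
        "description": "Add two numbers.",
        "inputSchema": {
            "type": "object",
            "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
            "required": ["a", "b"],
        },
    }
}

def _result(request_id, result):
    return {"jsonrpc": "2.0", "id": request_id, "result": result}

def _error(request_id, code, message):
    return {"jsonrpc": "2.0", "id": request_id,
            "error": {"code": code, "message": message}}

def dispatch(request: dict) -> dict:
    """Route one JSON-RPC request to the matching tool operation."""
    request_id = request.get("id")
    method = request.get("method")
    if method == "tools/list":
        tools = [
            {"name": name, "description": t["description"],
             "inputSchema": t["inputSchema"]}
            for name, t in TOOLS.items()
        ]
        return _result(request_id, {"tools": tools})
    if method == "tools/call":
        params = request.get("params") or {}
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _error(request_id, -32602, "Unknown tool")
        try:
            value = tool["fn"](**(params.get("arguments") or {}))
        except TypeError:
            return _error(request_id, -32602, "Invalid arguments")
        return _result(request_id, {"content": [{"type": "text", "text": str(value)}]})
    return _error(request_id, -32601, "Method not found")

def handler(event, context):
    """Lambda entry point: parse the body, dispatch, return a proxy response."""
    try:
        request = json.loads(event.get("body") or "")
    except json.JSONDecodeError:
        response = _error(None, -32700, "Parse error")
    else:
        if isinstance(request, dict):
            response = dispatch(request)
        else:
            response = _error(None, -32600, "Invalid request")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(response),
    }
```

Because every request carries everything the handler needs, any Lambda execution environment can serve any call, which is what lets this model scale out without session affinity.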
Optionally, expose it via API Gateway:
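One way to sketch the API Gateway wiring is an AWS SAM template, built here as a Python dictionary and printed as JSON. The logical IDs (`McpFunction`, `McpPost`), the `/mcp` path, the `server.handler` entry point, the `src/` code location, and the runtime, timeout, and memory values are all illustrative assumptions, not settings taken from the article.

```python
import json

def build_sam_template(handler: str = "server.handler", code_uri: str = "src/") -> dict:
    """Return a SAM template that puts an HTTP API route in front of one function.

    All names and sizing values below are placeholders for illustration.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Transform": "AWS::Serverless-2016-10-31",
        "Resources": {
            "McpFunction": {
                "Type": "AWS::Serverless::Function",
                "Properties": {
                    "Handler": handler,       # module.function of the Lambda entry point
                    "Runtime": "python3.12",  # assumed runtime
                    "CodeUri": code_uri,      # assumed source directory
                    "Timeout": 30,            # seconds; illustrative
                    "MemorySize": 512,        # MB; illustrative
                    "Events": {
                        # An HttpApi event makes SAM create the API and route.
                        "McpPost": {
                            "Type": "HttpApi",
                            "Properties": {"Path": "/mcp", "Method": "POST"},
                        }
                    },
                },
            }
        },
    }

def render(template: dict) -> str:
    """Serialize the template as JSON, which CloudFormation accepts."""
    return json.dumps(template, indent=2)
```

Writing `render(build_sam_template())` to a template file gives a starting point that can then be adjusted for authentication, stages, and IAM policies.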
Benefits:
- Fast cold starts
- Simplified deployment for stateless tools
- Integrated with AWS native monitoring
Limitations:
- No streaming support
- 15-minute execution timeout
- No persistent state between invocations
4. Lambda + Web Adapter (Containerized MCP)
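Approach: Package the MCP server within a web framework (FastAPI, Flask, or Express) and run it in a container image that includes the Lambda Web Adapter. The adapter forwards each Lambda invocation to the framework as an ordinary HTTP request, so the app behaves as it would on a regular web server.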
Dockerfile:
app.py Example:
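The sketch below stands in for a framework app using only Python's built-in `http.server`, to show the shape the adapter expects: a plain HTTP server listening on a local port. It is not FastAPI or Flask code. The `/mcp` route, the `make_server` helper, and the choice to answer only the MCP `ping` method are assumptions for illustration; the default of port 8080 and the `GET /` health route reflect the adapter's commonly documented defaults and should be checked against its README.

```python
import json
import os
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class McpHandler(BaseHTTPRequestHandler):
    def _send_json(self, status: int, payload: dict) -> None:
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Health route the adapter can poll before forwarding traffic.
        if self.path == "/":
            self._send_json(200, {"status": "ok"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        if self.path != "/mcp":  # hypothetical MCP route
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            request = json.loads(self.rfile.read(length))
        except json.JSONDecodeError:
            self._send_json(400, {"error": "invalid JSON"})
            return
        if not isinstance(request, dict):
            self._send_json(400, {"error": "invalid request"})
            return
        if request.get("method") == "ping":
            # MCP ping is answered with an empty result object.
            reply = {"jsonrpc": "2.0", "id": request.get("id"), "result": {}}
        else:
            reply = {"jsonrpc": "2.0", "id": request.get("id"),
                     "error": {"code": -32601, "message": "Method not found"}}
        self._send_json(200, reply)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet; real apps should log requests

def make_server(port=None, host: str = "127.0.0.1") -> ThreadingHTTPServer:
    """Build the server; the port falls back to $PORT, then to 8080."""
    if port is None:
        port = int(os.environ.get("PORT", "8080"))
    return ThreadingHTTPServer((host, port), McpHandler)
```

Calling `make_server().serve_forever()` as the container's start command runs the app; swapping in FastAPI or Flask changes the routing code but not the contract with the adapter.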
Deploy via AWS CDK Example:
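The deployment can also be expressed without CDK constructs. The sketch below swaps CDK for a plain AWS SAM template built as a Python dictionary, so it runs without `aws-cdk-lib` installed; a CDK stack would declare equivalent resources. The logical IDs, the catch-all route, the sizing values, and the `PORT` environment variable are assumptions, and `image_uri` is a placeholder you would supply.

```python
import json

def build_container_lambda_template(image_uri: str) -> dict:
    """SAM template for an image-packaged Lambda behind an HTTP API.

    `image_uri` is a placeholder for the ECR image that bundles the web app
    and the Lambda Web Adapter; all other values are illustrative.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Transform": "AWS::Serverless-2016-10-31",
        "Resources": {
            "McpWebFunction": {
                "Type": "AWS::Serverless::Function",
                "Properties": {
                    "PackageType": "Image",  # container image instead of a zip
                    "ImageUri": image_uri,
                    "Timeout": 30,           # seconds; illustrative
                    "MemorySize": 1024,      # MB; illustrative
                    "Environment": {
                        # Assumed: the app reads PORT and the adapter forwards to it.
                        "Variables": {"PORT": "8080"}
                    },
                    "Events": {
                        # Send every path and method to the web framework's router.
                        "AnyRoute": {
                            "Type": "HttpApi",
                            "Properties": {"Path": "/{proxy+}", "Method": "ANY"},
                        }
                    },
                },
            }
        },
    }

def render(template: dict) -> str:
    return json.dumps(template, indent=2)
```

Routing everything to one function keeps path handling inside the framework, which is the main reason to use the adapter in the first place.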
Benefits:
- Allows existing web frameworks
- Flexible HTTP routing via API Gateway
- Serverless, pay-per-use
Limitations:
- Added container and adapter complexity
- Cold start delays (1–3 seconds)
- Still no native streaming support
5. AWS Fargate (Containerized MCP)
Approach: Fully containerize the MCP server and deploy it on AWS Fargate via ECS or EKS. This suits agents that require persistent sessions and streaming [2].
Dockerfile:
mcp_server.py Example:
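To illustrate why a long-running container helps here, the sketch below streams incremental results to a client over server-sent events, the mechanism MCP's HTTP-based transports rely on for streaming. It uses only the standard library rather than an MCP SDK, and the `/events` path, the three-step payload, and the `make_streaming_server` helper are hypothetical.

```python
import json
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class StreamingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/events":  # hypothetical streaming route
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/event-stream")
        self.send_header("Cache-Control", "no-cache")
        self.send_header("Connection", "close")
        self.end_headers()
        # Emit one event per step and flush so the client sees progress
        # while the work is still running.
        for step in range(1, 4):
            payload = json.dumps({"step": step, "status": "working"})
            self.wfile.write(f"data: {payload}\n\n".encode("utf-8"))
            self.wfile.flush()
            time.sleep(0.05)  # stand-in for real tool work

    def log_message(self, format, *args):
        pass  # keep the sketch quiet

def make_streaming_server(port: int = 8000, host: str = "127.0.0.1") -> ThreadingHTTPServer:
    """Build a threaded server so several streams can stay open at once."""
    return ThreadingHTTPServer((host, port), StreamingHandler)
```

In a container the server would bind to `0.0.0.0` and be started with `make_streaming_server(host="0.0.0.0").serve_forever()`; the process stays up between requests, so open connections and in-memory session state persist.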
CDK Deployment Example:
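As a rough picture of the resources such a deployment declares, the sketch below builds a plain CloudFormation template as a Python dictionary instead of using CDK constructs, so it runs without `aws-cdk-lib` installed. The logical IDs, container name, port 8000, and CPU and memory sizes are illustrative assumptions; the image, execution role, subnets, and security group are template parameters you would supply, and a load balancer is left out for brevity.

```python
import json

def build_fargate_template(container_port: int = 8000) -> dict:
    """CloudFormation template for one Fargate service running the MCP container."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Parameters": {
            "ImageUri": {"Type": "String"},
            "ExecutionRoleArn": {"Type": "String"},
            "SubnetIds": {"Type": "List<AWS::EC2::Subnet::Id>"},
            "SecurityGroupId": {"Type": "AWS::EC2::SecurityGroup::Id"},
        },
        "Resources": {
            "McpCluster": {"Type": "AWS::ECS::Cluster"},
            "McpTaskDefinition": {
                "Type": "AWS::ECS::TaskDefinition",
                "Properties": {
                    "Family": "mcp-server",
                    "RequiresCompatibilities": ["FARGATE"],
                    "NetworkMode": "awsvpc",  # required for Fargate tasks
                    "Cpu": "256",             # 0.25 vCPU; illustrative
                    "Memory": "512",          # MiB; illustrative
                    "ExecutionRoleArn": {"Ref": "ExecutionRoleArn"},
                    "ContainerDefinitions": [
                        {
                            "Name": "mcp",
                            "Image": {"Ref": "ImageUri"},
                            "Essential": True,
                            "PortMappings": [
                                {"ContainerPort": container_port, "Protocol": "tcp"}
                            ],
                        }
                    ],
                },
            },
            "McpService": {
                "Type": "AWS::ECS::Service",
                "Properties": {
                    "Cluster": {"Ref": "McpCluster"},
                    "LaunchType": "FARGATE",
                    "DesiredCount": 1,  # one always-on task; scale as needed
                    "TaskDefinition": {"Ref": "McpTaskDefinition"},
                    "NetworkConfiguration": {
                        "AwsvpcConfiguration": {
                            "Subnets": {"Ref": "SubnetIds"},
                            "SecurityGroups": [{"Ref": "SecurityGroupId"}],
                            "AssignPublicIp": "DISABLED",
                        }
                    },
                },
            },
        },
    }

def render(template: dict) -> str:
    return json.dumps(template, indent=2)
```

Compared with the Lambda options, the extra cluster, task definition, and networking resources are the "container orchestration setup" cost that buys long-lived processes and streaming.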
Benefits:
- Full streaming and persistent workloads supported
- Scalability with ECS or EKS
- Suitable for production-grade deployments
Limitations:
- More costly than Lambda for low-usage patterns
- Slightly longer deploy cycles
- Requires container orchestration setup
6. Choosing the Right Model
- Use native Lambda for testing, short-lived tasks, and low traffic.
- Add the Web Adapter when integrating with existing web apps or frameworks.
- Choose Fargate for streaming, persistent workloads, or higher performance needs [3][4].
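The guidance above can be condensed into a small decision helper. This is a simplified sketch of the article's rules of thumb, not an official sizing tool; the `Workload` fields and the returned labels are hypothetical names, and the only hard number used is Lambda's 15-minute execution limit mentioned earlier.

```python
from dataclasses import dataclass

LAMBDA_MAX_RUNTIME_MINUTES = 15  # Lambda's execution limit cited in this article

@dataclass(frozen=True)
class Workload:
    """Hypothetical description of what the MCP server must support."""
    needs_streaming: bool = False
    needs_persistent_sessions: bool = False
    max_runtime_minutes: float = 1.0
    uses_web_framework: bool = False

def choose_deployment(workload: Workload) -> str:
    """Return 'fargate', 'lambda-web-adapter', or 'lambda-native'."""
    # Anything Lambda cannot do in this article's comparison goes to Fargate.
    if (
        workload.needs_streaming
        or workload.needs_persistent_sessions
        or workload.max_runtime_minutes > LAMBDA_MAX_RUNTIME_MINUTES
    ):
        return "fargate"
    # An existing FastAPI, Flask, or Express app is the case for the adapter.
    if workload.uses_web_framework:
        return "lambda-web-adapter"
    # Short, stateless tool calls fit native Lambda.
    return "lambda-native"
```

For example, `choose_deployment(Workload(uses_web_framework=True))` selects the adapter, while setting `needs_streaming=True` on the same workload moves it to Fargate.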
7. Key Considerations
- Security & Observability: Lambda and Fargate integrate with X-Ray, CloudWatch, IAM, and OpenTelemetry [2][3].
- Cost & Scaling: Lambda is cost-effective for bursty workloads; Fargate favors steady or stream-heavy usage [4].
- Developer Experience: Native Lambda offers the fastest development loop; Fargate supports production parity and long-lived workflows [3].
8. Next Steps
- Start with a proof-of-concept using native Lambda + FastMCP.
- Expand to include frameworks via Web Adapter for structured web API support.
- Move to a containerized MCP + agent deployment on Fargate via Strands’ sample projects [1].
References
- [1] AWS, “Open Protocols for Agent Interoperability Part 3: Strands Agents & MCP”
- [2] Heeki Park, “Building an MCP server as an API developer”
- [3] Ran Isenberg, “Serverless MCP on AWS: Lambda vs. Fargate for Agentic AI Workloads”
- [4] Vivek V, “Implementing Nova Act MCP Server on ECS Fargate”