Scaling AI Agents on AWS: Deploying Strands SDK with MCP using Lambda and Fargate

In this article, we'll explore how to deploy a Strands agent connected to an MCP server using serverless AWS services. We'll cover three deployment models (native Lambda, Lambda with Web Adapter, and Fargate) and compare their benefits, limitations, and recommended scenarios.

1. Introduction

The Strands Agents SDK provides a model-driven agent loop, while the Model Context Protocol (MCP) lets an agent discover and invoke tools at runtime. Deploying both on AWS serverless platforms lets you build scalable, maintainable agents without managing servers¹.
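As a rough illustration of how the two pieces fit together, the sketch below connects a Strands agent to a remote MCP server over streamable HTTP. It assumes the `strands-agents` and `mcp` packages are installed and that model credentials are already configured; the function name `ask_agent` and any URL passed to it are hypothetical placeholders, not part of either SDK.

```python
def ask_agent(mcp_url: str, prompt: str):
    """Run one prompt through a Strands agent that uses tools from an MCP server."""
    # Imported lazily so this module still loads where the SDKs are not installed.
    from mcp.client.streamable_http import streamablehttp_client
    from strands import Agent
    from strands.tools.mcp import MCPClient

    # MCPClient takes a factory that opens the transport to the server.
    client = MCPClient(lambda: streamablehttp_client(mcp_url))

    # The agent has to run while the MCP session is open.
    with client:
        tools = client.list_tools_sync()  # discover the tools the server exposes
        agent = Agent(tools=tools)        # default model; assumes credentials are set up
        return agent(prompt)
```

Calling `ask_agent` with the URL of any of the deployments described later (for example the `/mcp` path behind API Gateway or a load balancer) would let the agent list and call the `echo` tool.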

2. Deployment Options Overview

Option	Benefits	Limitations
AWS Lambda (Native)	Fast startup, easy CI/CD, unified observability	Max 15-minute execution, no streaming support ²
Lambda with Web Adapter	Preserve web frameworks, serverless pay-per-use	Slower cold start (1–3 s), added complexity ³
AWS Fargate (ECS/EKS)	Long-running containers, streaming support	Higher cost, container lifecycle management ⁴

3. Native AWS Lambda (Stateless MCP)

Approach: Package your MCP server as a Lambda function using FastMCP with HTTP transport³.

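    # lambda_mcp.py
    from mcp.server.fastmcp import FastMCP

    # stateless_http=True: no session state is kept between requests
    mcp = FastMCP("lambda-mcp", stateless_http=True)


    @mcp.tool()
    def echo(message: str) -> str:
        """Return the message unchanged."""
        return message


    def lambda_handler(event, context):
        # NOTE: handle_lambda_event does not appear in the FastMCP API of the
        # official MCP Python SDK. Treat it as a placeholder for an adapter that
        # turns the Lambda event into an HTTP request for mcp.streamable_http_app().
        return mcp.handle_lambda_event(event, context)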

How to Deploy:

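    # Bundle the mcp dependency with the handler (build on Linux or in a
    # Lambda-compatible container so compiled wheels match the runtime)
    pip install mcp --target package/
    (cd package && zip -r ../function.zip .)
    zip function.zip lambda_mcp.py

    # The mcp package requires Python 3.10 or later
    aws lambda create-function \
      --function-name lambdaMcp \
      --runtime python3.12 \
      --handler lambda_mcp.lambda_handler \
      --zip-file fileb://function.zip \
      --role <LAMBDA_IAM_ROLE_ARN> \
      --timeout 900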

Optionally, expose it via API Gateway:

    aws apigateway create-rest-api --name mcpAPI
    # Configure /mcp POST integration with the Lambda function
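To make the request path concrete, here is a simplified sketch of what a stateless handler behind the `/mcp` POST integration has to do: read a JSON-RPC 2.0 request from the API Gateway proxy event, dispatch it, and return the reply. It uses only the standard library and is not how FastMCP is implemented; it skips the initialize handshake, tool input schemas, and most error handling. The names `TOOLS`, `handle_event`, and `_reply` are hypothetical.

```python
import json


def echo(message: str) -> str:
    return message


# Registry of callable tools, keyed by tool name.
TOOLS = {"echo": echo}


def _reply(request_id, result=None, error=None) -> dict:
    """Wrap a JSON-RPC result or error in an API Gateway proxy response."""
    payload = {"jsonrpc": "2.0", "id": request_id}
    if error is not None:
        payload["error"] = error
    else:
        payload["result"] = result
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handle_event(event: dict) -> dict:
    """Answer one MCP-style JSON-RPC request; nothing is remembered afterwards."""
    request = json.loads(event.get("body") or "{}")
    request_id = request.get("id")
    method = request.get("method")
    params = request.get("params") or {}

    if method == "tools/list":
        return _reply(request_id, {"tools": [{"name": name} for name in TOOLS]})

    if method == "tools/call":
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            return _reply(request_id, error={"code": -32602, "message": "Unknown tool"})
        text = tool(**params.get("arguments", {}))
        return _reply(request_id, {"content": [{"type": "text", "text": text}], "isError": False})

    return _reply(request_id, error={"code": -32601, "message": "Method not found"})
```

Because every request carries everything the handler needs, any Lambda execution environment can serve it, which is what makes this model a good fit for stateless tools.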

Benefits:

Fast cold starts
Simplified deployment for stateless tools
Integrated with AWS native monitoring

Limitations:

No response streaming from the managed Python runtime
15-minute execution timeout
No persistent state between invocations
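One way to live with the execution timeout is to check the remaining time before each unit of work and return partial results rather than being cut off mid-step. The sketch below relies on `context.get_remaining_time_in_millis()`, which the Lambda Python runtime provides on the context object; the helper name `run_steps` and the five-second safety margin are assumptions chosen for illustration.

```python
SAFETY_MARGIN_MS = 5_000  # assumed buffer for serializing and returning the response


def run_steps(steps, context, margin_ms: int = SAFETY_MARGIN_MS) -> dict:
    """Run callables in order, stopping early when the invocation is about to time out."""
    results = []
    for step in steps:
        if context.get_remaining_time_in_millis() < margin_ms:
            # Not enough time left: hand back partial work instead of being killed mid-step.
            return {"complete": False, "results": results}
        results.append(step())
    return {"complete": True, "results": results}
```

A caller that receives `"complete": False` can re-invoke the function with the remaining steps, since nothing is kept in memory between invocations.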

4. Lambda + Web Adapter (Containerized MCP)

Approach: Package MCP within a web framework (FastAPI, Flask, or Express) inside a Lambda Web Adapter container. This enables web-like behavior within Lambda.

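Dockerfile:

    FROM public.ecr.aws/docker/library/python:3.12-slim
    # Lambda Web Adapter runs as an extension and forwards each invocation
    # to the web server as a plain HTTP request (port 8080 by default)
    COPY --from=public.ecr.aws/awsguru/aws-lambda-adapter:0.8.4 /lambda-adapter /opt/extensions/lambda-adapter
    WORKDIR /var/task
    COPY app.py requirements.txt ./
    RUN pip install --no-cache-dir -r requirements.txt
    CMD ["python", "app.py"]

app.py Example:

    import uvicorn
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("web-mcp", stateless_http=True)

    # ASGI (Starlette) app that serves the MCP endpoint at /mcp
    app = mcp.streamable_http_app()

    if __name__ == "__main__":
        # Listen on the port the adapter forwards to
        uvicorn.run(app, host="0.0.0.0", port=8080)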

Deploy via AWS CDK Example:

    from aws_cdk import (
        Stack,
        aws_apigateway as apigw,
        aws_lambda as _lambda,
    )
    from constructs import Construct


    class WebAdapterStack(Stack):
        def __init__(self, scope: Construct, id: str, **kwargs):
            super().__init__(scope, id, **kwargs)

            # Build the image from the directory that contains the Dockerfile
            fn = _lambda.DockerImageFunction(
                self, "WebMCPFn",
                code=_lambda.DockerImageCode.from_image_asset("path/to/dockerfile"),
            )

            # Proxy all API Gateway routes to the function
            apigw.LambdaRestApi(self, "ApiGateway", handler=fn)

Benefits:

Allows existing web frameworks
Flexible HTTP routing via API Gateway
Serverless, pay-per-use

Limitations:

Added container and adapter complexity
Cold start delays (1–3 seconds)
Still no native streaming support
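The adapter layer earns its complexity by translating each Lambda event into an ordinary HTTP request for the web framework. The sketch below illustrates that translation for an API Gateway REST proxy event using only the standard library. It is a conceptual illustration, not the Lambda Web Adapter's actual implementation, and the function name `event_to_http_request` is hypothetical.

```python
import base64
from urllib.parse import urlencode


def event_to_http_request(event: dict) -> dict:
    """Translate an API Gateway REST proxy event into plain HTTP request parts."""
    raw_body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        # Binary payloads arrive base64-encoded inside the JSON event.
        body = base64.b64decode(raw_body)
    else:
        body = raw_body.encode("utf-8")

    query = urlencode(event.get("queryStringParameters") or {})
    path = event.get("path", "/")
    return {
        "method": event.get("httpMethod", "GET"),
        "target": f"{path}?{query}" if query else path,
        "headers": dict(event.get("headers") or {}),
        "body": body,
    }
```

The web framework only ever sees the resulting method, target, headers, and body, which is why existing FastAPI, Flask, or Express code can run largely unchanged.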

5. AWS Fargate (Containerized MCP)

Approach: Fully containerize the MCP server and deploy on AWS Fargate via ECS or EKS. Suitable for agents requiring persistent sessions and streaming².

Dockerfile:

    # The mcp package requires Python 3.10 or later
    FROM python:3.12-slim
    WORKDIR /app
    COPY requirements.txt ./
    RUN pip install --no-cache-dir -r requirements.txt
    COPY mcp_server.py ./
    EXPOSE 8080
    CMD ["python", "mcp_server.py"]

mcp_server.py Example:

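    from mcp.server.fastmcp import FastMCP

    # Bind to 0.0.0.0 so the load balancer can reach the server inside the container
    mcp = FastMCP("fargate-mcp", stateless_http=True, host="0.0.0.0", port=8080)


    @mcp.tool()
    def echo(message: str) -> str:
        """Return the message unchanged."""
        return message


    if __name__ == "__main__":
        mcp.run(transport="streamable-http")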

CDK Deployment Example:

    from aws_cdk import (
        Stack,
        aws_ecr_assets as assets,
        aws_ecs as ecs,
        aws_ecs_patterns as patterns,
    )
    from constructs import Construct


    class FargateStack(Stack):
        def __init__(self, scope: Construct, id: str, **kwargs):
            super().__init__(scope, id, **kwargs)

            docker_image = assets.DockerImageAsset(
                self, "McpImage", directory="path/to/dockerfile"
            )

            patterns.ApplicationLoadBalancedFargateService(
                self, "FargateMCPService",
                task_image_options=patterns.ApplicationLoadBalancedTaskImageOptions(
                    image=ecs.ContainerImage.from_docker_image_asset(docker_image),
                    container_port=8080,  # must match the port mcp_server.py listens on
                ),
                desired_count=2,
                public_load_balancer=True,
            )

Benefits:

Full streaming and persistent workloads supported
Scalability with ECS or EKS
Suitable for production-grade deployments

Limitations:

More costly than Lambda for low-usage patterns
Slightly longer deploy cycles
Requires container orchestration setup
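A long-running Fargate task keeps its process memory across requests, which is what makes persistent sessions possible. The sketch below shows the idea with a minimal in-memory store keyed by the session ID a client would send in the `Mcp-Session-Id` header. The class `SessionStore` and its methods are hypothetical and stand in for whatever session handling the MCP server provides.

```python
import uuid


class SessionStore:
    """In-memory sessions keyed by the value sent in the Mcp-Session-Id header."""

    def __init__(self):
        self._sessions = {}

    def open(self) -> str:
        """Create a session and return its ID."""
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = {"calls": 0}
        return session_id

    def record_call(self, session_id: str) -> int:
        """Count a tool call; raises KeyError if this process has never seen the session."""
        state = self._sessions[session_id]
        state["calls"] += 1
        return state["calls"]
```

Note that the state lives in one process only. With more than one task behind the load balancer, a session would need sticky routing or a shared store to be found again, and a fresh Lambda execution environment would start with an empty store every time.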

6. Choosing the Right Model

Use Native Lambda for testing, short-lived tasks, low traffic.
Add Web Adapter when integrating with web apps or frameworks.
Choose Fargate for streaming, persistent workloads, or higher performance needs⁴³.
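The guidance above can be written down as a small decision helper. This is only a sketch of the article's rules of thumb; the `Workload` fields and the returned labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

LAMBDA_MAX_MINUTES = 15  # Lambda's execution limit


@dataclass
class Workload:
    needs_streaming: bool = False
    needs_persistent_sessions: bool = False
    max_task_minutes: float = 1.0
    uses_web_framework: bool = False


def choose_deployment(w: Workload) -> str:
    """Map workload traits to one of the three deployment models."""
    if (
        w.needs_streaming
        or w.needs_persistent_sessions
        or w.max_task_minutes > LAMBDA_MAX_MINUTES
    ):
        return "fargate"
    if w.uses_web_framework:
        return "lambda-web-adapter"
    return "lambda-native"
```

The order of the checks matters: requirements that Lambda cannot meet are tested first, so a streaming workload that also uses a web framework still lands on Fargate.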

7. Key Considerations

Security & Observability: Lambda and Fargate integrate with X-Ray, CloudWatch, IAM, and OpenTelemetry²³.
Cost & Scaling: Lambda is cost-effective for burst workloads; Fargate favors steady or stream-heavy usage⁴.
Developer Experience: Native Lambda offers the fastest dev loop; Fargate supports production parity and long-lived workflows³.
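The cost trade-off is easier to reason about with rough numbers. The sketch below compares a monthly Lambda bill (per request plus per GB-second) with a Fargate bill (per vCPU-hour plus per GB-hour for always-on tasks). The default rates are illustrative placeholders, not authoritative prices; check current AWS pricing for your Region. The free tier, load balancer, and data transfer are ignored.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730


@dataclass(frozen=True)
class Prices:
    # Illustrative placeholder rates in USD; verify against current AWS pricing.
    lambda_per_gb_second: float = 0.0000166667
    lambda_per_million_requests: float = 0.20
    fargate_per_vcpu_hour: float = 0.04048
    fargate_per_gb_hour: float = 0.004445


def lambda_monthly_cost(requests: int, avg_seconds: float, memory_gb: float,
                        p: Prices = Prices()) -> float:
    """Compute charge (GB-seconds) plus request charge."""
    compute = requests * avg_seconds * memory_gb * p.lambda_per_gb_second
    return compute + requests / 1_000_000 * p.lambda_per_million_requests


def fargate_monthly_cost(tasks: int, vcpu: float, memory_gb: float,
                         p: Prices = Prices()) -> float:
    """Always-on tasks billed per vCPU-hour and GB-hour."""
    hourly = vcpu * p.fargate_per_vcpu_hour + memory_gb * p.fargate_per_gb_hour
    return tasks * hourly * HOURS_PER_MONTH
```

Under these assumed rates, Lambda's cost scales with traffic while Fargate's is fixed by task count and size, so the break-even point depends mainly on request volume and duration.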

8. Next Steps

Start with a proof-of-concept using native Lambda + FastMCP.
Expand to include frameworks via Web Adapter for structured web API support.
Move to a containerized MCP + agent deployment on Fargate via Strands’ sample projects¹.

References