Serverless Scaling: Deploying Strands + MCP on AWS

1. Introduction
2. Deployment Options Overview
3. Native AWS Lambda (Stateless MCP)
4. Lambda + Web Adapter (Containerized MCP)
5. AWS Fargate (Containerized MCP)
6. Choosing the Right Model
7. Key Considerations
8. Next Steps
References

In this article, we explore how to deploy a Strands agent connected to an MCP server using serverless AWS services. We cover three deployment models (native Lambda, Lambda with the Web Adapter, and Fargate) and compare their benefits, limitations, and recommended scenarios.

                    1. Introduction

The Strands Agents SDK provides a model-driven agent loop, while the Model Context Protocol (MCP) lets the agent discover and invoke tools dynamically. Deploying both on AWS serverless platforms lets you build scalable, maintainable agents without managing servers [1].
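To make the pairing described above concrete, here is a minimal sketch of a Strands agent that loads its tools from a remote MCP server over streamable HTTP. It assumes the `strands-agents` and `mcp` packages are installed and that model credentials (for example, Amazon Bedrock access) are already configured. The `ask` helper and the example URL are hypothetical and do not come from the article.

```python
def ask(mcp_url: str, prompt: str) -> str:
    """Answer one prompt using tools discovered from the MCP server at mcp_url."""
    # Imported lazily so this module loads even where the SDKs are not installed.
    from mcp.client.streamable_http import streamablehttp_client
    from strands import Agent
    from strands.tools.mcp.mcp_client import MCPClient

    mcp_client = MCPClient(lambda: streamablehttp_client(mcp_url))

    # The agent has to run while the MCP connection is open.
    with mcp_client:
        tools = mcp_client.list_tools_sync()
        agent = Agent(tools=tools)
        return str(agent(prompt))


# Example call (hypothetical endpoint, replace with your deployed MCP URL):
# print(ask("https://example.com/mcp", "Echo the word hello."))
```

The same client code should work unchanged against any of the three deployment models below, since each one exposes the MCP server over HTTP.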

                    2. Deployment Options Overview

Option | Benefits | Limitations
AWS Lambda (Native) | Fast startup, easy CI/CD, unified observability | Max 15-minute execution, no streaming support [2]
Lambda with Web Adapter | Preserves web frameworks, serverless pay-per-use | Slower cold start (1–3 s), added complexity [3]
AWS Fargate (ECS/EKS) | Long-running containers, streaming support | Higher cost, container lifecycle management [4]

                    3. Native AWS Lambda (Stateless MCP)

Approach: Package your MCP server as a Lambda function using FastMCP with HTTP transport [3].

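# lambda_mcp.py
from mcp.server.fastmcp import FastMCP

# stateless_http=True: no session state is kept between requests
mcp = FastMCP("lambda-mcp", stateless_http=True)


@mcp.tool()
def echo(message: str) -> str:
    """Return the message unchanged."""
    return message


def lambda_handler(event, context):
    # NOTE: handle_lambda_event is kept from the original listing, but it is not
    # a documented FastMCP method. Treat this line as pseudocode for an adapter
    # that forwards the Lambda event to the MCP HTTP app.
    return mcp.handle_lambda_event(event, context)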

                    How to Deploy:

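# Bundle the MCP SDK with the handler; the SDK needs Python 3.10+, so use a newer runtime.
# Build on Linux x86_64 so compiled wheels match the Lambda environment.
pip install --target package mcp
cp lambda_mcp.py package/
(cd package && zip -r ../function.zip .)

aws lambda create-function \
  --function-name lambdaMcp \
  --runtime python3.12 \
  --handler lambda_mcp.lambda_handler \
  --zip-file fileb://function.zip \
  --role <LAMBDA_IAM_ROLE_ARN> \
  --timeout 900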

                    Optionally, expose it via API Gateway:

aws apigateway create-rest-api --name mcpAPI
# Then configure a POST /mcp resource with a Lambda integration for the function
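Once the endpoint is exposed, a client talks to it by POSTing JSON-RPC 2.0 messages. The sketch below builds (but does not send) a `tools/call` request for the `echo` tool using only the standard library. The invoke URL and the `build_tool_call` helper are hypothetical, and whether a server accepts `tools/call` without a prior `initialize` exchange depends on it running in stateless mode.

```python
import json
import urllib.request


def build_tool_call(url: str, tool: str, arguments: dict, request_id: int = 1) -> urllib.request.Request:
    """Build an MCP tools/call request for a streamable HTTP endpoint."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients advertise that they accept JSON and SSE.
            "Accept": "application/json, text/event-stream",
        },
    )


# Hypothetical invoke URL; substitute the one API Gateway assigns to your stage.
request = build_tool_call(
    "https://abc123.execute-api.us-east-1.amazonaws.com/prod/mcp",
    tool="echo",
    arguments={"message": "hello"},
)
# Send it with urllib.request.urlopen(request) once the endpoint is deployed.
```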

                    Benefits:

                    • Fast cold starts
                    • Simplified deployment for stateless tools
                    • Integrated with AWS native monitoring

                    Limitations:

                    • No streaming support
                    • 15-minute execution timeout
                    • No persistent state between invocations
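One way to live with the execution timeout is to check how much time is left before starting each unit of work and to hand back whatever is unfinished. The sketch below relies on `get_remaining_time_in_millis()`, which the Lambda context object provides; `process_in_batches`, the safety margin, and `FakeContext` are hypothetical names used here for illustration and local testing.

```python
def process_in_batches(items, handle, context, safety_margin_ms=10_000):
    """Process items until the Lambda deadline is near; return (done, remaining)."""
    done = []
    for index, item in enumerate(items):
        # Stop early if less than the safety margin is left in this invocation.
        if context.get_remaining_time_in_millis() < safety_margin_ms:
            return done, list(items[index:])
        done.append(handle(item))
    return done, []


class FakeContext:
    """Stand-in for the Lambda context; each check consumes a fixed time slice."""

    def __init__(self, budget_ms, cost_per_call_ms):
        self.budget_ms = budget_ms
        self.cost_per_call_ms = cost_per_call_ms

    def get_remaining_time_in_millis(self):
        remaining = self.budget_ms
        self.budget_ms -= self.cost_per_call_ms
        return remaining


done, remaining = process_in_batches(
    [1, 2, 3, 4], lambda x: x * 2, FakeContext(budget_ms=20_000, cost_per_call_ms=5_000)
)
```

Because nothing persists between invocations, the caller is responsible for resubmitting the `remaining` items in a later request.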

                    4. Lambda + Web Adapter (Containerized MCP)

                    Approach: Package MCP within a web framework (FastAPI, Flask, or Express) inside a Lambda Web Adapter container. This enables web-like behavior within Lambda.

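Dockerfile:

# The MCP Python SDK needs Python 3.10+
FROM public.ecr.aws/docker/library/python:3.12-slim
# The Web Adapter runs as a Lambda extension; the tag is an example, pin a current release
COPY --from=public.ecr.aws/awsguru/aws-lambda-adapter:0.8.4 /lambda-adapter /opt/extensions/lambda-adapter
WORKDIR /var/task
COPY app.py requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
# Start the web server directly; the adapter forwards events to port 8080 by default
CMD ["python", "app.py"]

app.py Example:

# Uses the official MCP SDK; with the Web Adapter no Lambda handler function is needed
import uvicorn
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("web-mcp", stateless_http=True)


@mcp.tool()
def echo(message: str) -> str:
    return message


# ASGI app that serves MCP at /mcp
app = mcp.streamable_http_app()

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8080)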

                    Deploy via AWS CDK Example:

from aws_cdk import Stack, aws_apigateway as apigw, aws_lambda as _lambda
from constructs import Construct


class WebAdapterStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        # Build the image from the directory that contains the Dockerfile
        fn = _lambda.DockerImageFunction(
            self,
            "WebMCPFn",
            code=_lambda.DockerImageCode.from_image_asset("path/to/dockerfile"),
        )

        # Proxy all API Gateway requests to the function
        apigw.LambdaRestApi(self, "ApiGateway", handler=fn)

                    Benefits:

                    • Allows existing web frameworks
                    • Flexible HTTP routing via API Gateway
                    • Serverless, pay-per-use

                    Limitations:

                    • Added container and adapter complexity
                    • Cold start delays (1–3 seconds)
                    • Still no native streaming support

                    5. AWS Fargate (Containerized MCP)

Approach: Fully containerize the MCP server and deploy it on AWS Fargate via ECS or EKS. This suits agents that require persistent sessions and streaming [2].

                    Dockerfile:

# The MCP Python SDK needs Python 3.10+
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt
COPY mcp_server.py ./
EXPOSE 8080
CMD ["python", "mcp_server.py"]

                    mcp_server.py Example:

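from mcp.server.fastmcp import FastMCP

# Bind to 0.0.0.0 so the load balancer can reach the container;
# FastMCP listens on 127.0.0.1 by default.
mcp = FastMCP("fargate-mcp", stateless_http=True, host="0.0.0.0", port=8080)


@mcp.tool()
def echo(message: str) -> str:
    return message


if __name__ == "__main__":
    mcp.run(transport="streamable-http")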

                    CDK Deployment Example:

from aws_cdk import (
    Stack,
    aws_ecr_assets as assets,
    aws_ecs as ecs,
    aws_ecs_patterns as patterns,
)
from constructs import Construct


class FargateStack(Stack):
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)

        docker_image = assets.DockerImageAsset(
            self, "McpImage", directory="path/to/dockerfile"
        )

        # Note: the load balancer health check defaults to GET /; adjust it for your server.
        patterns.ApplicationLoadBalancedFargateService(
            self,
            "FargateMCPService",
            task_image_options=patterns.ApplicationLoadBalancedTaskImageOptions(
                image=ecs.ContainerImage.from_docker_image_asset(docker_image),
                # Must match the port the MCP server listens on (the default is 80)
                container_port=8080,
            ),
            desired_count=2,
            public_load_balancer=True,
        )

                    Benefits:

                    • Full streaming and persistent workloads supported
                    • Scalability with ECS or EKS
                    • Suitable for production-grade deployments

                    Limitations:

                    • More costly than Lambda for low-usage patterns
                    • Slightly longer deploy cycles
                    • Requires container orchestration setup

                    6. Choosing the Right Model

• Use Native Lambda for testing, short-lived tasks, and low traffic.
• Add the Web Adapter when integrating with web apps or frameworks.
• Choose Fargate for streaming, persistent workloads, or higher performance needs [3][4].
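The rules of thumb above can be written down as a small decision function. This is only a sketch of the article's guidance: the `Workload` fields, the `recommend` function, and the returned labels are hypothetical names, and a real choice would weigh cost and team constraints as well.

```python
from dataclasses import dataclass

LAMBDA_MAX_MINUTES = 15  # Lambda's execution limit noted earlier in the article


@dataclass
class Workload:
    needs_streaming: bool = False
    needs_persistent_sessions: bool = False
    max_task_minutes: float = 1.0
    uses_web_framework: bool = False


def recommend(workload: Workload) -> str:
    """Map the rules of thumb to one of the three deployment models."""
    if (
        workload.needs_streaming
        or workload.needs_persistent_sessions
        or workload.max_task_minutes > LAMBDA_MAX_MINUTES
    ):
        return "fargate"
    if workload.uses_web_framework:
        return "lambda-web-adapter"
    return "lambda-native"
```

For example, `recommend(Workload(uses_web_framework=True))` selects the Web Adapter model, while any workload that needs streaming is routed to Fargate regardless of its other properties.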

                    7. Key Considerations

• Security & Observability: Lambda and Fargate integrate with X-Ray, CloudWatch, IAM, and OpenTelemetry [2][3].
• Cost & Scaling: Lambda is cost-effective for bursty workloads; Fargate favors steady or stream-heavy usage [4].
• Developer Experience: Native Lambda offers the fastest dev loop; Fargate supports production parity and long-lived workflows [3].
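The cost trade-off can be estimated with simple arithmetic: Lambda charges per request and per GB-second of execution, while Fargate charges per vCPU-hour and GB-hour for as long as tasks run. The sketch below computes both and the request volume at which they break even. The prices are illustrative assumptions (roughly the us-east-1 x86 on-demand rates, not figures from the article), and the model ignores load balancer, API Gateway, and data transfer charges, so check current pricing before relying on it.

```python
# Illustrative prices (assumptions, not from the article); verify against current AWS pricing.
LAMBDA_PRICE_PER_GB_SECOND = 0.0000166667
LAMBDA_PRICE_PER_MILLION_REQUESTS = 0.20
FARGATE_PRICE_PER_VCPU_HOUR = 0.04048
FARGATE_PRICE_PER_GB_HOUR = 0.004445


def lambda_monthly_cost(requests, avg_seconds, memory_gb,
                        price_per_gb_second=LAMBDA_PRICE_PER_GB_SECOND,
                        price_per_million_requests=LAMBDA_PRICE_PER_MILLION_REQUESTS):
    """Compute charge plus request charge for one month of Lambda invocations."""
    compute = requests * avg_seconds * memory_gb * price_per_gb_second
    return compute + requests / 1_000_000 * price_per_million_requests


def fargate_monthly_cost(tasks, vcpu, memory_gb, hours=730,
                         price_per_vcpu_hour=FARGATE_PRICE_PER_VCPU_HOUR,
                         price_per_gb_hour=FARGATE_PRICE_PER_GB_HOUR):
    """Cost of running a fixed number of always-on Fargate tasks for one month."""
    return tasks * hours * (vcpu * price_per_vcpu_hour + memory_gb * price_per_gb_hour)


def break_even_requests(fargate_cost, avg_seconds, memory_gb,
                        price_per_gb_second=LAMBDA_PRICE_PER_GB_SECOND,
                        price_per_million_requests=LAMBDA_PRICE_PER_MILLION_REQUESTS):
    """Monthly request count at which Lambda costs as much as the Fargate service."""
    per_request = (avg_seconds * memory_gb * price_per_gb_second
                   + price_per_million_requests / 1_000_000)
    return fargate_cost / per_request


# Example: two 0.5 vCPU / 1 GB tasks versus 1 GB Lambda calls averaging 0.5 s each.
fargate = fargate_monthly_cost(tasks=2, vcpu=0.5, memory_gb=1.0)
threshold = break_even_requests(fargate, avg_seconds=0.5, memory_gb=1.0)
```

Below `threshold` requests per month Lambda is the cheaper option under these assumptions; above it, the always-on Fargate service wins.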

                    8. Next Steps

1. Start with a proof of concept using native Lambda + FastMCP.
2. Add the Web Adapter when you need a web framework for structured web API support.
3. Move to a containerized MCP + agent deployment on Fargate, using the Strands sample projects [1].

                    References

1. AWS, “Open Protocols for Agent Interoperability Part 3: Strands Agents & MCP”
2. Heeki Park, “Building an MCP server as an API developer”
3. Ran Isenberg, “Serverless MCP on AWS: Lambda vs. Fargate for Agentic AI Workloads”
4. Vivek V, “Implementing Nova Act MCP Server on ECS Fargate”

                    Written by Om-Shree-0709 (@Om-Shree-0709)