🎓 MCP Stateless HTTP Streamable Server - Educational Reference
A Production-Ready Model Context Protocol Server Teaching Stateless Architecture and Scalability Best Practices
Learn by building a world-class MCP server designed for infinite scalability, security, and maintainability.
🎯 Project Goal & Core Concepts
This repository is a deeply educational reference implementation that demonstrates how to build a production-quality MCP server using a truly stateless architecture. This design is the gold standard for modern, cloud-native services.
Through a fully-functional calculator server, this project will teach you:
- 🏗️ Clean Architecture & Design: Master the "fresh instance per request" pattern for infinite scaling and learn to structure your code with a clean separation of concerns (`types.ts` for data contracts, `server.ts` for logic).
- ⚙️ Protocol & Transport Mastery: Correctly implement the `StreamableHTTPServerTransport` in its stateless mode, delegating all low-level protocol validation to the SDK.
- 🔒 Production-Grade Security: Implement non-negotiable security layers, including rate limiting, request size validation, DNS rebinding protection, and strict CORS policies.
- ⚡ Resilient Error Handling: Implement a "fail-fast" and "no-leaks" error policy using specific, protocol-compliant `McpError` types for predictable and secure failure modes.
- 📈 Production Observability: Build a server that is transparent and monitorable from day one with structured logging, health check endpoints, and Prometheus-compatible metrics.
🤔 When to Use This Architecture
A stateless architecture is the optimal choice for environments where scalability, resilience, and operational simplicity are paramount.
- Serverless Platforms: Perfect for deployment to AWS Lambda, Vercel, Google Cloud Functions, or any "Function-as-a-Service" platform.
- Auto-Scaling Environments: Ideal for container orchestrators like Kubernetes, where a Horizontal Pod Autoscaler can add or remove server replicas based on traffic, with no need for session affinity ("sticky sessions").
- High-Traffic APIs: When you need to serve a large number of independent requests and cannot be constrained by the memory or state of a single server.
- Simplified Operations: Eliminates the need for a shared state store (like Redis), reducing infrastructure complexity and maintenance overhead.
🚀 Quick Start
Prerequisites
- Node.js ≥ 20.0.0
- npm or yarn
- Docker (for containerized deployment)
Installation & Running
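A minimal sketch of the install-and-run flow, assuming the conventional npm scripts (`build`, `start`); check the repository's package.json for the exact script names:

```bash
npm install        # install dependencies
npm run build      # compile the TypeScript sources (script name assumed)
npm start          # start the server on PORT (defaults to 1071, see Configuration)
```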
Essential Commands
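Day-to-day commands likely follow common Node.js conventions; the script names below (`dev`, `test`, `lint`) are assumptions, not confirmed by this repository:

```bash
npm run dev                  # development mode with live reload (assumed script)
npm test                     # run the test suite (assumed script)
npm run lint                 # lint the codebase (assumed script)
docker compose up --build    # build and run the container defined in docker-compose.yml
```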
📐 Architecture Overview
Key Principles
This server's architecture is defined by a commitment to modern best practices for building scalable and maintainable services.
- Stateless by Design: The server shares absolutely no state between requests. Every request is handled in complete isolation.
- Ephemeral Instances & Explicit Cleanup: The core of this pattern is creating a new
McpServer
andTransport
for every request. These instances are explicitly destroyed when the request completes to prevent memory leaks. - Clean Code Architecture: The codebase is intentionally split into
types.ts
(for data contracts, schemas, and constants) andserver.ts
(for runtime logic), promoting maintainability and a clear separation of concerns. - Resilient Error Handling: The server uses a "fail-fast" and "no-leaks" error policy, throwing specific
McpError
types for predictable failures and wrapping all unexpected errors in a generic, safe response. - Production Observability: The server exposes
/health
and/metrics
endpoints from the start, making it transparent and easy to monitor in production environments.
Architectural Diagrams
Logical Request Flow
This diagram shows how a single request is processed in our stateless model.
Code Structure
This diagram shows how the source code is organized for maximum clarity and maintainability.
🔧 Core Implementation Patterns
This section highlights the most important, production-ready patterns demonstrated in this repository.
Pattern 1: The "Per-Request Instance" Lifecycle
The Principle: To guarantee statelessness and prevent memory leaks, we follow a strict create-use-destroy lifecycle for server and transport objects within the scope of a single HTTP request handler.
The Implementation:
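A minimal sketch of the lifecycle, based on the public `@modelcontextprotocol/sdk` API; the server name, version, and handler wiring are illustrative, and the actual implementation lives in `src/server.ts`:

```typescript
import { Request, Response } from "express";
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";

export async function handleMCPRequest(req: Request, res: Response): Promise<void> {
  // 1. CREATE: a brand-new server and transport, scoped to this request only.
  const server = new McpServer({ name: "calculator", version: "1.0.0" });
  const transport = new StreamableHTTPServerTransport({
    sessionIdGenerator: undefined, // undefined => stateless mode, no session IDs
  });

  // 3. DESTROY: explicit cleanup when the HTTP connection closes prevents leaks.
  res.on("close", () => {
    transport.close();
    server.close();
  });

  // 2. USE: hand the raw request to the SDK, which performs all protocol validation.
  await server.connect(transport);
  await transport.handleRequest(req, res, req.body);
}
```

In `server.ts`, an Express route such as `app.post('/mcp', handleMCPRequest)` would invoke this handler; the route path is an assumption.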
Pattern 2: Resilient & Secure Error Handling
The Principle: The server follows a "fail-fast" and "no-leaks" error policy. Predictable errors are reported with specific, protocol-compliant codes, while unexpected errors are caught and sanitized to prevent leaking internal details.
The Implementation:
- Specific, Actionable Errors: Predictable user errors, like division by zero, throw a specific `McpError`. This allows the client application to understand the failure and prompt the user for a correction.
- The "Safety Net" for Unexpected Errors: The main `handleMCPRequest` function is wrapped in a `try...catch` block that acts as a safety net. It catches any unhandled exception, logs it internally, and returns a generic, safe error to the client (both layers are sketched below).
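A sketch of both layers, using the SDK's real `McpError` and `ErrorCode` types; the `divide` helper and the logging call are illustrative assumptions:

```typescript
import { Request, Response } from "express";
import { McpError, ErrorCode } from "@modelcontextprotocol/sdk/types.js";

// Specific, actionable error: the client knows exactly what to fix.
export function divide(a: number, b: number): number {
  if (b === 0) {
    throw new McpError(ErrorCode.InvalidParams, "Division by zero is not allowed");
  }
  return a / b;
}

// The "safety net": no unexpected error ever leaks internal details.
export async function handleMCPRequest(req: Request, res: Response): Promise<void> {
  try {
    // ...create the per-request server/transport and delegate to the SDK (see Pattern 1)...
  } catch (error) {
    console.error("Unhandled error while processing MCP request:", error);
    if (!res.headersSent) {
      res.status(500).json({
        jsonrpc: "2.0",
        error: { code: ErrorCode.InternalError, message: "Internal server error" },
        id: null,
      });
    }
  }
}
```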
Pattern 3: Strict Separation of Concerns (`types.ts` vs. `server.ts`)
The Principle: A clean architecture separates data contracts (the "what") from implementation logic (the "how"). This makes the code easier to maintain, test, and reason about.
The Implementation:
- `src/types.ts`: This file contains only data definitions. It has no runtime logic. It defines all Zod schemas for input validation, shared constants, and TypeScript interfaces. It is the stable foundation of the application.
- `src/server.ts`: This file contains all runtime logic. It imports the data contracts from `types.ts` and uses them to implement the server's behavior, including the Express app, middleware, tool handlers, and startup sequence (illustrated below).
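An illustrative sketch of the split (both files shown in one block for brevity; the schema and function names are assumptions, not the repository's actual exports):

```typescript
// ---- src/types.ts: data contracts only, no runtime logic ----
import { z } from "zod";
import { McpError, ErrorCode } from "@modelcontextprotocol/sdk/types.js";

export const CalculateInputSchema = z.object({
  a: z.number().describe("First operand"),
  b: z.number().describe("Second operand"),
  operation: z.enum(["add", "subtract", "multiply", "divide"]),
});
export type CalculateInput = z.infer<typeof CalculateInputSchema>;

// ---- src/server.ts: runtime logic that consumes the contracts ----
export function calculate(rawInput: unknown): number {
  // Validation happens at the boundary; the schema is the single source of truth.
  const { a, b, operation } = CalculateInputSchema.parse(rawInput);
  switch (operation) {
    case "add": return a + b;
    case "subtract": return a - b;
    case "multiply": return a * b;
    case "divide":
      if (b === 0) throw new McpError(ErrorCode.InvalidParams, "Division by zero");
      return a / b;
  }
}
```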
Pattern 4: Production-Ready Observability
The Principle: A production service must be transparent. This server includes built-in endpoints for health checks and metrics, allowing it to be easily integrated into modern monitoring and orchestration systems.
The Implementation:
- `/health`: A simple endpoint that returns a `200 OK` status with basic uptime and memory information. Perfect for load balancers and container readiness probes.
- `/metrics`: Exposes key performance indicators (KPIs) like request duration and tool execution times in a Prometheus-compatible format, ready to be scraped by Prometheus and visualized in tools like Grafana (see the sketch below).
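A sketch of both endpoints with Express; the metric name and the exact fields reported are illustrative, not the repository's actual output:

```typescript
import express from "express";

const app = express();

// Count every request so the counter below has something to report.
let requestsTotal = 0;
app.use((_req, _res, next) => {
  requestsTotal++;
  next();
});

// /health: liveness/readiness probe for load balancers and orchestrators.
app.get("/health", (_req, res) => {
  res.status(200).json({
    status: "ok",
    uptimeSeconds: process.uptime(),
    memory: process.memoryUsage(),
  });
});

// /metrics: Prometheus text exposition format, ready for scraping.
app.get("/metrics", (_req, res) => {
  res.type("text/plain").send(
    [
      "# HELP mcp_requests_total Total HTTP requests handled",
      "# TYPE mcp_requests_total counter",
      `mcp_requests_total ${requestsTotal}`,
      "",
    ].join("\n"),
  );
});
```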
🧪 Testing & Validation
Health & Metrics
Verify the server's operational status.
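Assuming the server is running locally on the default port (`1071`):

```bash
# Health check: expect HTTP 200 with uptime and memory info
curl -s http://localhost:1071/health

# Metrics in Prometheus text format
curl -s http://localhost:1071/metrics
```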
Manual Request
Send a direct `curl` request to test a tool's functionality.
Testing a Success Case
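A hedged example; the endpoint path (`/mcp`), tool name (`calculate`), and argument shape are assumptions and should be adjusted to match the repository:

```bash
curl -s -X POST http://localhost:1071/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
      "name": "calculate",
      "arguments": { "a": 6, "b": 7, "operation": "multiply" }
    }
  }'
```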
Testing an Error Case
The command below intentionally triggers an `InvalidParams` error to demonstrate the server's resilient error handling.
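Under the same assumptions as the success case, a division-by-zero request should come back as a JSON-RPC error rather than a crash:

```bash
curl -s -X POST http://localhost:1071/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{
    "jsonrpc": "2.0",
    "id": 2,
    "method": "tools/call",
    "params": {
      "name": "calculate",
      "arguments": { "a": 1, "b": 0, "operation": "divide" }
    }
  }'
```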
Interactive Testing with MCP Inspector
Use the official inspector for a rich, interactive testing experience.
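For example (the `/mcp` path is an assumption; adjust to the server's actual endpoint):

```bash
# Opens the MCP Inspector UI in your browser
npx @modelcontextprotocol/inspector

# Then connect with the "Streamable HTTP" transport to: http://localhost:1071/mcp
```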
🏭 Deployment & Configuration
Configuration
The server is configured using environment variables, making it perfect for containerized deployments.
| Variable | Description | Default |
| --- | --- | --- |
| `PORT` | The port for the HTTP server to listen on. | `1071` |
| `LOG_LEVEL` | Logging verbosity (`debug`, `info`, `warn`, `error`). | `info` |
| `CORS_ORIGIN` | Allowed origin for CORS. Must be restricted in production. | `*` |
| `RATE_LIMIT_MAX` | Max requests per window per IP. | `1000` |
| `RATE_LIMIT_WINDOW` | Rate limit window in milliseconds. | `900000` (15 min) |
| `NODE_ENV` | Sets the environment. Use `production` for Express optimizations. | `development` |
| `SAMPLE_TOOL_NAME` | (Educational) Demonstrates dynamic tool registration via environment variables. When set, adds a simple echo tool with the specified name that takes a `value` parameter and returns `test string print: {value}`. This pattern shows how MCP servers can be configured at runtime. | None |
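For example, a hardened production run might look like the following; the image tag is illustrative and the values are examples only:

```bash
docker run -p 8080:8080 \
  -e PORT=8080 \
  -e NODE_ENV=production \
  -e LOG_LEVEL=warn \
  -e CORS_ORIGIN=https://app.example.com \
  -e RATE_LIMIT_MAX=500 \
  mcp-stateless-http
```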
Deployment
This server is designed from the ground up for modern, scalable deployment platforms. The included multi-stage `Dockerfile` and `docker-compose.yml` provide a secure and efficient container.
- Serverless: The `handleMCPRequest` function can be exported directly as a serverless function handler for platforms like Vercel or AWS Lambda (see the sketch after this list).
- Kubernetes: The Docker image is ready to be deployed with a Horizontal Pod Autoscaler (HPA), allowing the cluster to automatically scale replicas up and down based on CPU or request load.
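A rough sketch of the serverless wrapper idea for Vercel; the import path, export name, and type casts are assumptions and will need adapting to the actual module layout:

```typescript
// api/mcp.ts — hypothetical Vercel function wrapping the shared handler
import type { VercelRequest, VercelResponse } from "@vercel/node";
import { handleMCPRequest } from "../src/server.js";

export default async function handler(req: VercelRequest, res: VercelResponse) {
  // Each invocation builds fresh McpServer/Transport instances (Pattern 1),
  // so nothing leaks between serverless executions.
  // Cast because the shared handler is typed against Express req/res.
  return handleMCPRequest(req as any, res as any);
}
```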