Skip to main content
Glama

example-mcp-server-streamable-http

by yigitkonur

STDIO | Stateful HTTP | Stateless HTTP | SSE


🎓 MCP Stateful HTTP Streamable Server - Educational Reference

A Production-Ready Model Context Protocol Server Teaching Hybrid Storage, Distributed Systems, and Resilient Error Handling

Learn by building a world-class, horizontally-scalable MCP server that is robust by design.

🎯 Project Goal & Core Concepts

This repository is a masterclass in building distributed systems with the Model Context Protocol. It is a comprehensive reference implementation that demonstrates how to build a robust, scalable, and fault-tolerant MCP server using a stateful, hybrid-storage architecture.

This project is designed to teach five core concepts:

  1. 🏗️ Clean Architecture: Master a clean separation of concerns by organizing code into a types.ts for data contracts and a server.ts for application logic.
  2. ⚙️ Hybrid Storage (Strategy Pattern): Implement a system that runs with zero dependencies locally (in-memory) and seamlessly transitions to a distributed architecture using Redis for production.
  3. 🔒 Scalability & Zero-Downtime: Build a system that scales horizontally and supports zero-downtime deployments by externalizing state and eliminating the need for "sticky sessions".
  4. ⚡ Advanced State Management: Learn critical patterns for distributed systems, including storage abstraction (ISessionStore), race condition prevention, and just-in-time instance reconstruction.
  5. 🛡️ Resilience & Predictability: Implement a robust error handling strategy using specific, typed errors and a global error boundary to build a server that fails gracefully and predictably.

🤔 When to Use This Architecture

This stateful, distributed architecture is the ideal choice for complex, high-availability applications:

  • Enterprise Applications: Systems that require persistent user sessions and must remain available during deployments or node failures.
  • Collaborative Tools: Scenarios where multiple users or agents interact with a shared context that must be centrally managed.
  • Multi-Turn Conversational Agents: Complex chatbots or agents that need to remember the entire history of an interaction to provide coherent responses.
  • Any system where losing session state or failing unpredictably is unacceptable.

🚀 Quick Start

This server is designed to work in two modes: a simple local mode and a scalable production mode.

1. Zero-Configuration Local Development

Run the server instantly on your machine with zero external dependencies.

# Clone the repository git clone https://github.com/yigitkonur/example-mcp-server-streamable-http cd example-mcp-server-streamable-http # Install dependencies npm install # Start the server in development mode (uses in-memory storage) npm run dev # The server starts on port 1453 with the message: # ✅ Using In-Memory for single-node state management.

2. Production Mode with Docker & Redis

Test the full distributed architecture using the provided Docker Compose setup.

# Make sure Docker is running on your machine # This single command starts the server and a Redis instance docker-compose up --build # The server starts on port 1453 and connects to the Redis container: # ✅ Using Redis for distributed state management. # INFO: Redis Client Connected

📐 Architecture Overview

Code & File Structure

This project follows a clean architecture with a deliberate separation of concerns.

src/ ├── types.ts # Data Contracts: Interfaces, Zod Schemas, Custom Errors └── server.ts # Application Logic: Storage Impls, Server Factory, HTTP Endpoints

Key Architectural Principles

  1. Storage Abstraction (Strategy Pattern): The core application logic is decoupled from the storage mechanism (in-memory vs. Redis) via an ISessionStore interface defined in types.ts.
  2. Stateless Nodes, Stateful System: Individual server nodes hold only a temporary cache of session objects. The authoritative state lives in a central store (Redis), allowing the system as a whole to be stateful and resilient.
  3. Just-in-Time Reconstruction: Any server node can handle a request for any session ID. If a session is not in a node's local cache, it is reconstructed on-the-fly from the central store. This eliminates the need for sticky sessions.
  4. Predictable Error Handling: The server uses a multi-layered error strategy. It throws specific, typed errors for known failure modes (like an invalid session) and uses a global Express error handler as a safety net to catch all unexpected issues, ensuring the client always receives a secure, protocol-compliant error response.

Architectural Diagrams

Single-Node Mode (Local Development)
┌─────────────────────────────────────────┐ │ Express Server │ │ ┌─────────────────────────────────────┐ │ │ │ Global Error Handler (Safety Net) │ │ │ └─────────────────────────────────────┘ │ │ Rate Limiting | CORS | Health Checks │ ├─────────────────────────────────────────┤ │ In-Memory Session Store │ │ (Ephemeral Map<id, Data>) │ ├─────────────────────────────────────────┤ │ Per-Session MCP Server Instances │ │ (Cached in an in-memory Map) │ └─────────────────────────────────────────┘
Distributed Mode (Production)
Load Balancer (No Sticky Sessions) | +--------------+--------------+ | | | ┌─────────┐ ┌─────────┐ ┌─────────┐ │ Server A│ │ Server B│ │ Server C│ └─────────┘ └─────────┘ └─────────┘ | | | +--------------+--------------+ | ┌────────────────────────────┐ │ Redis Cluster │ │ (Authoritative Session & │ │ Event Store) │ └────────────────────────────┘

🔧 Core Implementation Patterns

This section highlights the most important code patterns that define this architecture.

Pattern 1: Storage Abstraction (ISessionStore)

The Principle: Code to an interface, not a concrete implementation. This decouples our application logic from the storage technology.

The Implementation (src/types.ts):

// The contract that any storage backend must adhere to. export interface ISessionStore { get(sessionId: string): Promise<SessionData | null>; set(sessionId: string, data: SessionData): Promise<void>; // ... and other methods } // Application logic in server.ts only ever interacts with this interface.

Pattern 2: Just-in-Time Instance Reconstruction

The Principle: To achieve horizontal scalability without sticky sessions, any server node must be able to handle a request for any active session.

The Implementation (src/server.ts):

// DRY Implementation: Single helper function eliminates code duplication async function getOrCreateInstances( sessionId: string, ): Promise<{ transport: StreamableHTTPServerTransport; server: McpServer }> { // 1. Check high-performance local cache first let instances = sessionInstances.get(sessionId); if (instances) return instances; // 2. Verify session exists in authoritative persistent store const sessionData = await sessionStore.get(sessionId); if (!sessionData) { throw new SessionNotFoundError('Session does not exist or has expired.', { sessionId }); } // 3. Reconstruct instances from persistent state console.log(`Reconstructing instances for session ${sessionId} on this node`); // ... reconstruction logic ... return instances; } // Used consistently across POST, GET, and DELETE endpoints const instances = await getOrCreateInstances(sessionId);

Pattern 3: Critical Initialization Order

The Principle: To prevent race conditions in a distributed system, the session record must be saved to the persistent store before the McpServer instance is created.

The Implementation (src/server.ts):

// 1. A new session request arrives. Generate a session ID. const newSessionId = randomUUID(); // 2. Create the initial session data object. const sessionData = createNewSessionData(); // 3. CRITICAL: Persist the session data to Redis/memory FIRST. await sessionStore.set(newSessionId, sessionData); // 4. NOW it's safe to create the McpServer instance, which may need to read this data. const server = await createMCPServer(newSessionId);

Pattern 4: Resilient & Predictable Error Handling

The Principle: A robust server fails predictably. We use specific error types for known issues and a global safety net for everything else.

The Implementation:

1. Define Custom, Specific Errors (src/types.ts): We create a hierarchy of error classes to represent distinct failure modes.

// A base class for all our application's errors. export class CalculatorServerError extends McpError { /* ... */ } // A specific error for when a session is not found. export class SessionNotFoundError extends CalculatorServerError { /* ... */ } // A specific error for when a database/Redis operation fails. export class StorageOperationFailedError extends CalculatorServerError { /* ... */ }

2. Throw Specific Errors in Logic (src/server.ts): Instead of returning generic errors, our code throws these specific types.

// Inside an HTTP handler... const sessionData = await sessionStore.get(sessionId); if (!sessionData) { // This is a known, predictable failure. Throw the specific error. throw new SessionNotFoundError('Session not found or expired', { sessionId }); }

3. Complete Error Boundary Coverage (src/server.ts): Every endpoint throws specific errors instead of direct HTTP responses, ensuring 100% coverage by the global handler. This prevents any error from bypassing our safety net.

// All endpoints throw errors instead of sending responses directly if (!sessionId) { throw new McpError(ErrorCode.InvalidRequest, 'Mcp-Session-Id header is required'); } // Global Express middleware catches ALL errors app.use((err: Error, req: Request, res: Response, next: express.NextFunction) => { // 1. Log the full, detailed error for our internal records. console.error('[GLOBAL ERROR HANDLER] Unhandled error caught:', err); // 2. Handle specific error types with proper codes and context let code = ErrorCode.InternalError; let message = 'An internal server error occurred.'; let data: unknown = undefined; if (err instanceof CalculatorServerError) { code = err.code; message = err.message; data = err.context; // Include contextual information for debugging } else if (err instanceof McpError) { code = err.code; message = err.message; data = err.data; } // 3. Always send protocol-compliant JSON-RPC error responses res.status(500).json({ jsonrpc: '2.0', id: rpcId, error: { code, message, data } }); });

📊 Features Implemented

This server implements a comprehensive set of capabilities to demonstrate a production-grade system.

FeatureDescriptionKey Pattern Demonstrated
Hybrid StorageSwitches between in-memory and Redis via USE_REDIS env var.Strategy Pattern and environment-based configuration.
Persistent HistoryCalculation history is saved as part of the session data.Stateful Tool Use: Tools modify session state which is then persisted.
Gold-Standard Error HandlingComplete error boundary coverage with typed errors and comprehensive TSDoc documentation.Multi-Layered Defense: Custom error hierarchy + global safety net + contextual error data.
DRY Code ArchitectureSingle getOrCreateInstances helper eliminates reconstruction logic duplication.Maintainability: Critical patterns abstracted into reusable, well-documented functions.
Health Checks/health endpoint reports server status, including Redis connectivity.Observability: Providing critical system status for monitoring.
Prometheus Metrics/metrics endpoint exposes mcp_active_sessions and more.Monitoring: Exposing key performance indicators for a metrics scraper.
Complete DocumentationEvery tool, resource, and prompt handler documents exact failure modes with @throws annotations.Predictable APIs: Clear contracts for all failure scenarios.

🧪 Testing & Validation

Health & Metrics

Verify the server's operational status and view its metrics. The /health endpoint is aware of the storage mode.

# Check basic health (works in both modes) curl http://localhost:1453/health # In Redis mode, a healthy response will include: # "storageMode": "redis", "redis": "ready" # Check Prometheus-style metrics curl http://localhost:1453/metrics

Manual Request (with curl)

Use curl to test the full session lifecycle.

# 1. Initialize a session and capture the Mcp-Session-Id header SESSION_ID=$(curl -i -X POST http://localhost:1453/mcp \ -H "Content-Type: application/json" \ -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"1.0.0"}}' \ | grep -i Mcp-Session-Id | awk '{print $2}' | tr -d '\r') echo "Acquired Session ID: $SESSION_ID" # 2. Use the session ID to call a tool curl -X POST http://localhost:1453/mcp \ -H "Content-Type: application/json" \ -H "Mcp-Session-Id: $SESSION_ID" \ -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"calculate","arguments":{"a":100,"b":50,"op":"add"}}}'

Interactive Testing with MCP Inspector

Use the official inspector to interactively test the stateful server.

# The inspector will handle the session ID automatically. npx @modelcontextprotocol/inspector --cli http://localhost:1453/mcp

🏭 Deployment & Configuration

Configuration

The server is configured using environment variables.

VariableDescriptionDefault
PORTThe port for the HTTP server to listen on.1453
USE_REDISSet to true to enable Redis for distributed state.false
REDIS_URLThe connection string for the Redis instance.redis://localhost:6379
LOG_LEVELLogging verbosity (debug, info, warn, error).info
CORS_ORIGINAllowed origin for CORS. Use a specific domain in production.*
SAMPLE_TOOL_NAME(Educational) Demonstrates dynamic tool registration via environment variables. When set, adds a simple echo tool with the specified name that takes a value parameter and returns test string print: {value}. This pattern shows how MCP servers can be configured at runtime.None

Production Deployment

This server is designed for high-availability, horizontally-scaled deployments.

  • Containerization: The multi-stage Dockerfile creates a lean, secure production image. The docker-compose.yml file is ready for multi-replica scaling (docker-compose up --scale mcp-server=4).
  • Load Balancing: Deploy behind any standard load balancer. Sticky sessions are not required due to the "Just-in-Time Reconstruction" architecture.
  • Zero-Downtime Updates: Because session state is externalized to Redis, you can perform rolling deployments of new server versions without interrupting or losing active user sessions.

Related MCP Servers

View all related MCP servers

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/yigitkonur/example-mcp-server-streamable-http'

If you have feedback or need assistance with the MCP directory API, please join our Discord server