FareedKhan-dev

production-grade-mcp-agentic-system

🏛️ Production-Grade MCP Server + Agentic System

A reference implementation of an MCP server designed to actually ship

Multi-tenant · Authenticated · Observable · Rate-limited · Cached · Circuit-broken · Governed

Python 3.11+ · MCP 2026 · License: MIT · Docker


📖 Full Step-by-Step Blog Walkthrough

This repository is the companion codebase for a long-form blog post that walks through every single component end to end, with every line of code explained in context. Start there if you want to understand the "why" behind the architecture before reading the code.

🔗 Building a Production-Grade MCP Server Architecture with Agentic System →


🎯 What This Is

Most MCP tutorials end with a @tool decorator that returns "hello world". That is fine for a demo. It is not what ships.

This repository is a reference implementation of an MCP server designed to run in production: multi-tenant, authenticated, observable, rate-limited, cached, circuit-broken, and governed. It exposes a company's heterogeneous data layer (Postgres, Elasticsearch, S3, vector DB) to AI agents as a single, secure tool surface, and ships with a four-agent support copilot (Planner → Retriever → Synthesizer → Critic) that uses it end to end.

The codebase is deliberately organised around twelve components that keep showing up on the 3 AM pager when teams skip them. Each one lives in its own module and can be read, replaced, or extended independently.



🏗️ Architecture Overview

The complete production-grade system: MCP server dispatch pipeline on the right, four-agent orchestrator on the left, data plane on top, observability on the bottom, identity and governance as crosscutting concerns.


🧩 The 12 Components

| # | Component | Lives in | What it gives you |
|---|---|---|---|
| 1 | 🚪 Transport & Session Layer | server.py | stdio for local, Streamable HTTP for remote, horizontal-scale-friendly sessions |
| 2 | 🔐 Authentication Server | auth/oauth.py | OAuth 2.1 + PKCE, short-lived JWTs, JWKS validation |
| 3 | ⚖️ Authorization & Policy Engine | auth/policy.py | Tool-level RBAC, tenant-scoped ABAC, deny-by-default |
| 4 | 📚 Tool Registry & Discovery | tools/registry.py | Dynamic toolsets, .well-known capability metadata |
| 5 | ✅ Input Validation Layer | validation/schemas.py | Pydantic schemas, enum constraints, agent-adversarial input as default threat model |
| 6 | 🔧 Tool Execution Engine | tools/base.py | Three-level hierarchy (atomic / composed / workflow) |
| 7 | 🔄 Circuit Breaker & Retry | reliability/ | Closed → open → half-open, Adaptive Timeout Budget Allocation |
| 8 | 🚦 Rate Limiting & Quotas | ratelimit/limiter.py | Redis token-bucket (Lua-atomic), per-tenant and per-tool |
| 9 | ⚡ Caching Layer | cache/manager.py | Two-tier (L1 in-process, L2 Redis), stampede prevention |
| 10 | 🧱 Structured Error Framework | errors/framework.py | Machine-readable errors with retryable and hint fields |
| 11 | 🔭 Observability Stack | observability/ | OpenTelemetry traces, Prometheus metrics, audit logs |
| 12 | 🛡️ Governance & Multi-Tenancy | governance/ | Tenant isolation, approval gates, outbound HTTP allowlisting |
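
To make component 8 concrete, here is a minimal in-process sketch of token-bucket arithmetic keyed per tenant and per tool. It is not the repository's implementation, which runs the same kind of check atomically in Redis via Lua. The names `TokenBucket`, `check`, and `buckets`, and the capacity and refill values, are assumptions made for illustration.

```python
from __future__ import annotations

import time
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """In-process token bucket (hypothetical stand-in for the Redis/Lua version)."""

    capacity: float      # maximum burst size, in tokens
    refill_rate: float   # tokens added per second
    tokens: float = field(init=False)
    updated_at: float | None = field(default=None, init=False)

    def __post_init__(self) -> None:
        # A new bucket starts full so the first burst is allowed.
        self.tokens = self.capacity

    def allow(self, cost: float = 1.0, now: float | None = None) -> bool:
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        now = time.monotonic() if now is None else now
        if self.updated_at is not None:
            elapsed = max(0.0, now - self.updated_at)
            self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# One bucket per (tenant, tool) pair; the limits here are illustrative only.
buckets: dict[tuple[str, str], TokenBucket] = {}


def check(tenant: str, tool: str, now: float | None = None) -> bool:
    bucket = buckets.setdefault((tenant, tool), TokenBucket(capacity=2, refill_rate=1.0))
    return bucket.allow(now=now)
```

Passing `now` explicitly keeps the sketch deterministic in tests. In a multi-process deployment the read-refill-spend sequence has to be atomic across workers, which is the reason the table lists a Lua script in Redis.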


📖 Diving Deeper, Section by Section

Each diagram below links back to the corresponding section in the blog, where every line of code is walked through in detail.

📦 Data Persistence Layer

Postgres + Row-Level Security · Tenant isolation at the DB layer

🚪 Transport & Session Layer

Dual transport · Stateless session · Middleware chain

🔐 Authentication, Policy & Governance

OAuth 2.1 · YAML policies · Human-in-the-loop approvals
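
The policy properties this README names (default-deny, deny-beats-allow, glob matching) can be sketched in a few lines. This is an illustration under assumptions, not the engine in auth/policy.py: the `Rule` shape, the role name, and the tool names are hypothetical, and the real schema of config/policy.yaml is not shown in this README.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Rule:
    effect: str  # "allow" or "deny"
    role: str    # role the rule applies to (hypothetical field)
    tool: str    # glob pattern matched against the tool name


def is_allowed(rules: list[Rule], role: str, tool: str) -> bool:
    """Default-deny evaluation where any matching deny overrides every allow."""
    matched = [r for r in rules if r.role == role and fnmatchcase(tool, r.tool)]
    if any(r.effect == "deny" for r in matched):
        return False  # deny beats allow
    return any(r.effect == "allow" for r in matched)  # no match means deny


# Illustrative rules only.
RULES = [
    Rule("allow", "support_agent", "orders.*"),
    Rule("deny", "support_agent", "orders.refund"),
]
```

With these rules, `is_allowed(RULES, "support_agent", "orders.get")` is true, `orders.refund` is refused by the explicit deny, and any tool or role without a matching rule is refused by default.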

🔧 Tool Execution Engine

Three-level hierarchy · Atomic · Composed · Workflow

🔄 Reliability Layer

Circuit breakers · Retry with jitter · ATBA budget allocator
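
As a rough illustration of the closed → open → half-open cycle, the sketch below implements a three-state breaker with an injectable clock. It is a simplified stand-in rather than the code in reliability/circuit_breaker.py; the class name, the thresholds, and the `clock` parameter are assumptions.

```python
import time
from enum import Enum
from typing import Callable


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half_open"


class CircuitBreaker:
    """Minimal three-state breaker (hypothetical sketch, one instance per tool)."""

    def __init__(
        self,
        failure_threshold: int = 3,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = 0.0
        self.state = State.CLOSED

    def allow_request(self) -> bool:
        if self.state is State.OPEN:
            # After the cool-down, move to half-open and let probe calls through.
            if self._clock() - self._opened_at >= self.reset_timeout:
                self.state = State.HALF_OPEN
                return True
            return False
        return True

    def record_success(self) -> None:
        # Any success closes the breaker and clears the failure count.
        self._failures = 0
        self.state = State.CLOSED

    def record_failure(self) -> None:
        self._failures += 1
        # A failed probe in half-open reopens immediately; otherwise wait for the threshold.
        if self.state is State.HALF_OPEN or self._failures >= self.failure_threshold:
            self.state = State.OPEN
            self._opened_at = self._clock()
```

A caller would check `allow_request()` before invoking a backend and report the outcome with `record_success()` or `record_failure()`. This sketch counts every failure; the test list later in this README suggests the real breaker distinguishes retryable from deterministic errors.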

⚡ Rate Limiting & Caching

Redis token bucket · Two-tier cache · Stampede lock
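
The two-tier lookup with a stampede lock might look roughly like the sketch below. It assumes a plain dict in place of Redis for the L2 tier and omits TTLs and invalidation; `TwoTierCache` and `get_or_load` are hypothetical names, not the API of cache/manager.py.

```python
import asyncio
from collections import defaultdict
from typing import Any, Awaitable, Callable


class TwoTierCache:
    """L1 in-process dict plus an L2 stand-in, with one lock per key."""

    def __init__(self) -> None:
        self.l1: dict[str, Any] = {}  # in-process tier
        self.l2: dict[str, Any] = {}  # stands in for Redis (assumption)
        self._locks: defaultdict[str, asyncio.Lock] = defaultdict(asyncio.Lock)

    async def get_or_load(self, key: str, loader: Callable[[], Awaitable[Any]]) -> Any:
        if key in self.l1:
            return self.l1[key]
        # Only one coroutine per key may run the loader; the rest wait here.
        async with self._locks[key]:
            # Re-check after acquiring the lock: an earlier holder may have filled L1.
            if key in self.l1:
                return self.l1[key]
            if key in self.l2:
                value = self.l2[key]
            else:
                value = await loader()
                self.l2[key] = value
            self.l1[key] = value
            return value
```

When several coroutines ask for the same cold key at once, the loader runs once and the others read the value it stored. Across processes the same idea needs a distributed lock, which this in-process sketch does not attempt.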

🔭 Observability Stack

OpenTelemetry · Prometheus · Audit logs · One trace ID

🤖 Multi-Agentic Architecture

Four-agent design · Planner · Retriever · Synthesizer · Critic

🎼 The Orchestrator Flow

End-to-end agent orchestration with one bounded revise loop
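
One way to picture "one bounded revise loop" is the control flow below, where each agent is passed in as a plain callable. This is a sketch under assumptions: the function `run_pipeline`, the `Verdict` type, and the callable signatures are invented for illustration and do not mirror agents/orchestrator.py.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    approved: bool
    feedback: str = ""


def run_pipeline(
    question: str,
    plan: Callable[[str], list[str]],
    retrieve: Callable[[list[str]], list[str]],
    synthesize: Callable[[str, list[str], str], str],
    critique: Callable[[str, list[str]], Verdict],
    max_revisions: int = 1,
) -> str:
    """Planner -> Retriever -> Synthesizer -> Critic with a capped revise loop."""
    steps = plan(question)
    evidence = retrieve(steps)
    draft = synthesize(question, evidence, "")
    for _ in range(max_revisions):
        verdict = critique(draft, evidence)
        if verdict.approved:
            break
        # Feed the critic's feedback into a single rewrite; the cap prevents endless loops.
        draft = synthesize(question, evidence, verdict.feedback)
    # Note: the final revised draft is returned without another critic pass.
    return draft
```

Capping revisions bounds both latency and token spend, at the cost of occasionally returning a draft the critic did not explicitly approve.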


🚀 Quick Start

Prerequisites

  • Docker & Docker Compose

  • Python 3.11+ (only for running the CLI locally)

  • An Anthropic API key (for the agent layer)

1. Clone and Configure

git clone https://github.com/FareedKhan-dev/production-grade-mcp-agentic-system.git
cd production-grade-mcp-agentic-system
cp .env.example .env

Edit .env and set at minimum:

  • ANTHROPIC_API_KEY: for the agent layer

  • ATLAS_AUTH_JWKS_URL: your OAuth 2.1 provider's JWKS endpoint (or leave default for dev)

2. Bring Up the Stack

docker compose up -d

That brings up the full local environment:

| Service | URL | What it is |
|---|---|---|
| 🏛️ MCP Server | http://localhost:8080/mcp | Streamable HTTP endpoint |
| 🔍 Discovery | http://localhost:8080/.well-known/mcp-server | Unauthenticated capability metadata |
| 📊 Metrics | http://localhost:8080/metrics | Prometheus scrape target |
| ❤️ Health | http://localhost:8080/healthz | Liveness probe |
| 🔭 Jaeger | http://localhost:16686 | Distributed tracing UI |
| 📈 Grafana | http://localhost:3000 | Metrics dashboards (admin / admin) |
| 🗄️ MinIO Console | http://localhost:9001 | S3-compatible storage UI |

3. Run the Support Copilot CLI

pip install -e .

export ATLAS_MCP_URL=http://localhost:8080
export ATLAS_MCP_TOKEN=dev-token
export ATLAS_TENANT=acme
export ANTHROPIC_API_KEY=sk-ant-...

atlas-copilot "Why was the refund on order o_9002 for CUST-1001 delayed?"

You will see the four agents run end-to-end, the final draft printed with [S1][S2] citations, and a full trace summary including token counts, tool calls, and the run_id that ties back to Jaeger.

4. Connect from Claude Desktop / Cursor

Add this to your MCP host config:

{
  "mcpServers": {
    "production-mcp": {
      "type": "http",
      "url": "http://localhost:8080/mcp",
      "headers": {
        "Authorization": "Bearer ${ATLAS_MCP_TOKEN}",
        "X-Tenant-Id": "acme"
      }
    }
  }
}
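
For a client that is not an MCP host, the same endpoint and headers can be used from plain Python. The sketch below only builds a JSON-RPC `tools/list` request with the standard library and does not send it. The helper name `build_tools_list_request` is hypothetical, and a real session would normally begin with an `initialize` handshake; whether this server requires one before `tools/list` is not stated in this README.

```python
import json
import urllib.request


def build_tools_list_request(
    url: str, token: str, tenant: str, request_id: int = 1
) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 tools/list call for the MCP endpoint."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP clients advertise both JSON and SSE responses.
            "Accept": "application/json, text/event-stream",
            "Authorization": f"Bearer {token}",
            "X-Tenant-Id": tenant,
        },
    )


# Values taken from the Quick Start above; dev-token is the local dev placeholder.
req = build_tools_list_request("http://localhost:8080/mcp", "dev-token", "acme")
```

Sending it is a matter of `urllib.request.urlopen(req)` once the stack is up; the response may arrive as JSON or as an event stream depending on the server.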

📂 Repository Structure

.
├── 📄 README.md
├── 🐳 docker-compose.yml          # Full local stack: app + data + observability
├── 🐳 Dockerfile                  # Two-stage build, non-root runtime
├── 📜 LICENSE
├── 📦 pyproject.toml              # Dependencies, dev tools, CLI entry points
├── ⚙️  .env.example                # Every setting documented by component
│
├── 🔧 config/                     # Runtime configuration (hot-reloadable)
│   ├── http_allowlist.yaml       # Per-tenant outbound HTTP allowlist
│   └── policy.yaml               # YAML-driven authorization policies
│
├── 🚢 deploy/                     # Deployment sidecar configs
│   ├── otel/config.yaml          # OpenTelemetry Collector pipeline
│   ├── prometheus/prometheus.yml # Prometheus scrape targets
│   └── sql/init.sql              # Schema + RLS policies + seed data
│
├── 📚 docs/                       # Deep-dive documentation
│   ├── AGENT_SYSTEM.md           # Multi-agent orchestrator internals
│   ├── ARCHITECTURE.md           # The 12 components in detail
│   └── DEPLOYMENT.md             # K8s, Cloudflare Workers, bare-metal
│
├── 🧠 src/atlas_mcp/              # Main application source
│   ├── config.py                 # Centralized typed settings
│   ├── server.py                 # ⚡ Component 1: Transport & dispatch
│   │
│   ├── 🤖 agents/                 # Four-agent support copilot
│   │   ├── planner.py            # Emits retrieval plan JSON
│   │   ├── retriever.py          # Bounded tool-calling loop
│   │   ├── synthesizer.py        # Drafts reply with citations
│   │   ├── critic.py             # Approves or sends one revise
│   │   ├── orchestrator.py       # Wires the four agents together
│   │   ├── mcp_client.py         # Thin JSON-RPC MCP client
│   │   ├── memory.py             # STM (Redis) + LTM (vector)
│   │   └── cli.py                # atlas-copilot CLI entry point
│   │
│   ├── 🔐 auth/                   # Components 2 + 3
│   │   ├── oauth.py              # JWT + JWKS validation
│   │   ├── middleware.py         # Bearer token extraction
│   │   └── policy.py             # YAML-driven policy engine
│   │
│   ├── 🛡️  governance/             # Component 12
│   │   ├── tenant.py             # Tenant pinning middleware
│   │   └── approval.py           # Human-in-the-loop gate
│   │
│   ├── 🔧 tools/                  # Components 4 + 6
│   │   ├── registry.py           # In-memory tool index + discovery
│   │   ├── base.py               # Tool abstract base + metadata
│   │   ├── atomic/               # Level 1: one backend each
│   │   ├── composed/             # Level 2: deterministic chains
│   │   └── workflow/             # Level 3: multi-step procedures
│   │
│   ├── 🔄 reliability/            # Component 7
│   │   ├── circuit_breaker.py    # 3-state machine per tool
│   │   ├── retry.py              # Exponential backoff + jitter
│   │   └── atba.py               # Adaptive Timeout Budget Allocation
│   │
│   ├── 🚦 ratelimit/              # Component 8
│   │   └── limiter.py            # Redis token bucket (Lua-atomic)
│   │
│   ├── ⚡ cache/                   # Component 9
│   │   └── manager.py            # L1 + L2 cache with stampede lock
│   │
│   ├── 🧱 errors/                 # Component 10
│   │   └── framework.py          # Structured Error Recovery (SERF)
│   │
│   ├── 🔭 observability/          # Component 11
│   │   ├── tracing.py            # OpenTelemetry spans
│   │   ├── metrics.py            # Prometheus instruments
│   │   └── audit.py              # Structured JSONL audit log
│   │
│   └── ✅ validation/             # Component 5
│       └── schemas.py            # Tool call envelope
│
└── 🧪 tests/                      # Narrow tests, load-bearing properties
    ├── test_circuit_breaker.py   # State machine transitions
    ├── test_errors.py            # SERF wire format + retry semantics
    └── test_policy.py            # Deny-beats-allow + default-deny

🎨 Tech Stack

| Layer | Technology |
|---|---|
| Language | Python 3.11+ |
| Web framework | Starlette + Uvicorn |
| MCP SDK | mcp>=1.2.0 |
| Auth | PyJWT + Authlib (OAuth 2.1 resource server) |
| Validation | Pydantic v2 + Pydantic Settings |
| Database | asyncpg (PostgreSQL 16 with RLS) |
| Search | Elasticsearch 8 (async client) |
| Vector DB | Qdrant |
| Object storage | aioboto3 (MinIO / S3) |
| Cache + queues | Redis 7 (redis[hiredis]) |
| Reliability | tenacity (retries) + custom breaker + custom ATBA |
| Tracing | OpenTelemetry SDK + OTLP exporter |
| Metrics | prometheus_client |
| Logging | structlog (JSON) |
| LLM | Anthropic Messages API (Claude) |


🧪 Testing

The test suite is deliberately narrow, covering the three load-bearing safety properties:

pip install -e ".[dev]"
pytest -v

  • test_circuit_breaker.py: state machine transitions, retryable vs deterministic error classification

  • test_errors.py: SERF wire format, retry semantics, MCP-level error data

  • test_policy.py: default-deny, deny-beats-allow, glob matching, PII condition blocking
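
To give a feel for what a machine-readable error with retryable and hint fields lets a caller do, here is a minimal sketch. Only `retryable` and `hint` come from this README; the `ToolError` class, the `code` and `message` fields, the example values, and the `should_retry` helper are assumptions and do not reproduce the SERF wire format in errors/framework.py.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ToolError:
    """Hypothetical structured error; field set is illustrative."""

    code: str
    message: str
    retryable: bool
    hint: str = ""

    def to_wire(self) -> str:
        # Stable key order keeps the serialized form easy to assert on in tests.
        return json.dumps(asdict(self), sort_keys=True)


def should_retry(err: ToolError, attempt: int, max_attempts: int = 3) -> bool:
    """Retry only errors flagged retryable, and only within the attempt budget."""
    return err.retryable and attempt < max_attempts


timeout = ToolError("UPSTREAM_TIMEOUT", "search backend timed out", True, "retry with a narrower query")
bad_input = ToolError("INVALID_ARGUMENT", "unknown order id format", False, "order ids look like o_9002")
```

An agent can branch on `retryable` instead of parsing prose, and can pass `hint` back to the model when it rewrites a failed tool call.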


🛣️ Production Deployment

For running this in an actual production environment (managed Postgres, real OAuth provider, SIEM integration, Kubernetes), see docs/DEPLOYMENT.md.

Key swaps between local dev and production:

| Local (docker-compose) | Production |
|---|---|
| Dev JWT issuer | WorkOS AuthKit / Auth0 / Keycloak |
| MinIO | AWS S3 / GCS / Azure Blob |
| Local Postgres | AWS RDS / Cloud SQL / Supabase |
| Redis container | Upstash / ElastiCache / MemoryDB |
| Local OTel collector | Datadog / Honeycomb / Grafana Cloud |
| File-based audit log | Splunk / Chronicle / SIEM of choice |


📚 Documentation


📜 License

MIT. See LICENSE.


⭐ If this helped you, please consider starring the repo

Built with ☕ and a lot of 3 AM debugging

📖 Read the full blog walkthrough · 🐛 Report an issue · 💬 Start a discussion
