# Architecture Guide

## Overview

AI MCP Gateway is a multi-model orchestrator that routes AI requests through an N-layer architecture, optimizing for cost while maintaining quality. It supports multiple interfaces (HTTP API, MCP, CLI) and provides a comprehensive admin dashboard.

## System Components

### 1. Entry Points

```
┌──────────────────────────────────────────────────────────────────┐
│                           Entry Points                           │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────────────────┐   │
│  │  HTTP API   │  │ MCP Server  │  │           CLI           │   │
│  │    :3000    │  │   (stdio)   │  │   (no server needed)    │   │
│  └─────────────┘  └─────────────┘  └─────────────────────────┘   │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
```

- **HTTP API** (`src/api/server.ts`): Express server for RESTful endpoints
- **MCP Server** (`src/mcp/server.ts`): Model Context Protocol for Claude Desktop
- **CLI** (`src/cli/index.ts`): Command-line interface for management tasks

### 2. Router (N-Layer)

```
┌──────────────────────────────────────────────────────────────────┐
│                          N-Layer Router                          │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Request → Complexity Detection → Layer Selection → Model Pick   │
│                                                                  │
│  L0 (Free)     → Priority 0, 1, 2... (sorted ASC)                │
│  L1 (Cheap)    → Priority 0, 1, 2... (sorted ASC)                │
│  L2 (Standard) → Priority 0, 1, 2... (sorted ASC)                │
│  L3 (Premium)  → Priority 0, 1, 2... (sorted ASC)                │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
```

**Key files:**

- `src/routing/router.ts`: Main routing logic
- `src/routing/cost.ts`: Cost tracking and budget enforcement
- `src/db/model-config.ts`: Model configuration service (DB-backed)

**How it works:**

1. Request arrives with optional layer/model hints
2. Router detects complexity (simple → L0, complex → higher)
3. Selects appropriate layer based on complexity and config
4. Picks first available model by priority (lowest number first)
5. Falls back to next layer if all models fail

A sketch of this selection loop appears after the providers section below.

### 3. LLM Providers

```
┌──────────────────────────────────────────────────────────────────┐
│                           LLM Providers                          │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐            │
│  │  OpenRouter  │  │    OpenAI    │  │  Anthropic   │            │
│  │   (free +)   │  │   (GPT-4)    │  │   (Claude)   │            │
│  └──────────────┘  └──────────────┘  └──────────────┘            │
│                                                                  │
│  ┌──────────────┐                                                │
│  │  OSS Local   │  (Ollama, LM Studio, etc.)                     │
│  └──────────────┘                                                │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
```

**Key files:**

- `src/tools/llm/client.ts`: Unified LLM client
- `src/tools/llm/openrouter.ts`: OpenRouter provider
- `src/tools/llm/openai.ts`: OpenAI provider
- `src/tools/llm/anthropic.ts`: Anthropic provider
- `src/tools/llm/oss-local.ts`: Local model provider
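The providers above all sit behind the unified client in `src/tools/llm/client.ts`. As a rough sketch of that arrangement (the interface, class, and method names below are illustrative assumptions, not the project's actual exports), each provider module can be modelled as implementing one shared completion interface that the rest of the gateway calls through:

```typescript
// Illustrative provider abstraction; the real definitions live in
// src/tools/llm/*.ts and may differ. All names here are assumptions.

export interface CompletionRequest {
  model: string;
  prompt: string;
  maxTokens?: number;
}

export interface CompletionResult {
  text: string;
  inputTokens: number;
  outputTokens: number;
}

export interface LlmProvider {
  name: 'openrouter' | 'openai' | 'anthropic' | 'oss-local';
  complete(req: CompletionRequest): Promise<CompletionResult>;
}

// The unified client only dispatches; provider-specific details
// (auth headers, endpoints, streaming) stay inside each provider module.
export class LlmClient {
  constructor(private readonly providers: Map<string, LlmProvider>) {}

  async complete(providerName: string, req: CompletionRequest): Promise<CompletionResult> {
    const provider = this.providers.get(providerName);
    if (!provider) {
      throw new Error(`Unknown LLM provider: ${providerName}`);
    }
    return provider.complete(req);
  }
}
```

Keeping provider quirks behind one interface is what lets the router treat OpenRouter, OpenAI, Anthropic, and local models interchangeably.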
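Building on a client like that, here is a minimal sketch of the layer/priority selection loop described in section 2. The helper signatures are assumptions for illustration; the real logic lives in `src/routing/router.ts` and also consults cost tracking in `src/routing/cost.ts`.

```typescript
// Minimal sketch of N-layer routing with priority-ordered fallback.
// Helper signatures (listEnabledModels, callModel) are assumptions,
// not the actual API of src/routing/router.ts.

type Layer = 'L0' | 'L1' | 'L2' | 'L3';

interface ModelConfig {
  id: string;
  layer: Layer;
  priority: number; // 0 is tried first within a layer
}

const LAYER_ORDER: Layer[] = ['L0', 'L1', 'L2', 'L3'];

export async function routeRequest(
  prompt: string,
  startLayer: Layer, // chosen by complexity detection
  listEnabledModels: (layer: Layer) => Promise<ModelConfig[]>, // sorted by priority ASC
  callModel: (model: ModelConfig, prompt: string) => Promise<string>,
): Promise<string> {
  // Start at the layer picked for this request's complexity, then escalate.
  const layers = LAYER_ORDER.slice(LAYER_ORDER.indexOf(startLayer));
  for (const layer of layers) {
    for (const model of await listEnabledModels(layer)) {
      try {
        return await callModel(model, prompt); // first success wins
      } catch {
        // This model failed or is unavailable; try the next priority.
      }
    }
    // Every model in this layer failed; fall back to the next layer.
  }
  throw new Error('No enabled model in any layer could serve the request');
}
```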
### 4. Data Layer

```
┌──────────────────────────────────────────────────────────────────┐
│                            Data Layer                            │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────────────┐  ┌──────────────────────────────┐      │
│  │        Redis         │  │          PostgreSQL          │      │
│  │  - Session cache     │  │  - Model configurations      │      │
│  │  - Rate limiting     │  │  - Analytics & metrics       │      │
│  │  - Temp storage      │  │  - Provider configs          │      │
│  └──────────────────────┘  │  - Alert rules               │      │
│                            │  - Terminal connections      │      │
│                            └──────────────────────────────┘      │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
```

**Key files:**

- `src/db/postgres.ts`: PostgreSQL connection pool
- `src/db/model-config.ts`: Model CRUD operations
- `src/cache/redis.ts`: Redis caching layer

### 5. Admin Dashboard

```
┌──────────────────────────────────────────────────────────────────┐
│                      Admin Dashboard (React)                      │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│  Pages:                                                          │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐            │
│  │   Overview   │  │    Models    │  │  Providers   │            │
│  │  (metrics)   │  │    (CRUD)    │  │   (config)   │            │
│  └──────────────┘  └──────────────┘  └──────────────┘            │
│                                                                  │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────┐            │
│  │    Alerts    │  │    Tokens    │  │ Web Terminal │            │
│  │   (rules)    │  │  (API keys)  │  │ (SSH/shell)  │            │
│  └──────────────┘  └──────────────┘  └──────────────┘            │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
```

**Location:** `admin-dashboard/src/`

## Request Flow

```
1. Client Request
   │
   ▼
2. HTTP API / MCP Server
   │
   ▼
3. Router.route(request, context)
   │
   ├─── Detect complexity
   │
   ├─── Select layer (L0-L3)
   │
   ├─── Pick model (by priority)
   │
   └─── Call LLM provider
   │
   ▼
4. LLM Response
   │
   ▼
5. Track cost & metrics
   │
   ▼
6. Return to client
```

## Model Priority System

Models within each layer are sorted by `priority` (ascending):

```sql
SELECT * FROM model_configs
WHERE layer = 'L0' AND enabled = true
ORDER BY priority ASC, relative_cost ASC;
```

- **Priority 0**: First choice (selected before Priority 1)
- **Priority 1**: Second choice
- **Priority 2**: Third choice
- etc.

This allows fine-grained control over which model is used first within a layer.

## Configuration Flow

```
1. Environment Variables (.env)
   │
   ▼
2. Provider Manager (src/config/provider-manager.ts)
   │
   ▼
3. Model Config Service (src/db/model-config.ts)
   │
   ▼
4. Router uses DB-backed config
```
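Tying the configuration flow together with the data layer: model configurations live in PostgreSQL, and reads can be served from Redis so the router does not query the database on every request. The sketch below uses the public `pg` and `redis` npm packages; the function name, cache key scheme, and TTL are illustrative assumptions rather than the actual code in `src/db/model-config.ts`.

```typescript
// Illustrative cache-aside read of model configs (assumed helper, not the
// real implementation in src/db/model-config.ts).
import { Pool } from 'pg';
import { createClient } from 'redis';

const pool = new Pool({ connectionString: process.env.DATABASE_URL });
const redis = createClient({ url: process.env.REDIS_URL });
// Assumes `await redis.connect()` has been called during startup.

export async function getLayerModels(layer: string): Promise<unknown[]> {
  const cacheKey = `model_configs:${layer}`; // hypothetical key scheme

  // 1. Try Redis first so hot paths skip the database entirely.
  const cached = await redis.get(cacheKey);
  if (cached) return JSON.parse(cached);

  // 2. Fall back to PostgreSQL (same ordering as the SQL example above).
  const { rows } = await pool.query(
    `SELECT * FROM model_configs
     WHERE layer = $1 AND enabled = true
     ORDER BY priority ASC, relative_cost ASC`,
    [layer],
  );

  // 3. Cache briefly; writes to model_configs should invalidate or expire it.
  await redis.set(cacheKey, JSON.stringify(rows), { EX: 60 });
  return rows;
}
```

A short TTL plus invalidation on writes keeps the router's view of the configuration close to the database without paying a query per request.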
## Docker Architecture

```
┌──────────────────────────────────────────────────────────────────┐
│                        Docker Compose Stack                       │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│  ┌──────────────────┐                                            │
│  │  ai-mcp-gateway  │ ◄─── Port 3000 (API)                       │
│  │    (Node.js)     │                                            │
│  └────────┬─────────┘                                            │
│           │                                                      │
│  ┌────────┼──────────────────────────────────────┐               │
│  │        │                                      │               │
│  │  ┌─────▼─────┐        ┌───────────────────┐   │               │
│  │  │ postgres  │        │       redis       │   │               │
│  │  │   :5432   │        │       :6379       │   │               │
│  │  └───────────┘        └───────────────────┘   │               │
│  │                                               │               │
│  └───────────────────────────────────────────────┘               │
│                                                                  │
│  ┌──────────────────┐                                            │
│  │ ai-mcp-dashboard │ ◄─── Port 5173 (Dashboard)                 │
│  │     (Nginx)      │                                            │
│  └──────────────────┘                                            │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘
```

## Security Considerations

1. **API Keys**: Stored in environment variables, never in code
2. **Database**: Credentials via `DATABASE_URL`
3. **Admin Auth**: JWT-based authentication for dashboard
4. **Terminal**: Session-based with proper cleanup

## Performance Optimizations

1. **Redis Caching**: Model configs are cached to reduce DB queries
2. **Connection Pooling**: PostgreSQL access goes through a connection pool
3. **Lazy Loading**: CLI commands don't load server modules
4. **Dynamic Imports**: Server modules are loaded only when needed
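The last two points amount to the CLI deferring imports of server code until a command actually needs it, roughly as in this sketch (the command and module path are hypothetical, not the repository's real layout):

```typescript
// Illustrative only: deferring heavy server imports until a CLI command runs.
export async function runServeCommand(port = 3000): Promise<void> {
  // The Express/MCP stack is loaded only when `serve` is invoked, so
  // lightweight commands (e.g. listing models) start without pulling it in.
  const { startHttpServer } = await import('../api/server.js'); // hypothetical export
  await startHttpServer({ port });
}
```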
