
AI MCP Gateway

Cost-Optimized Multi-Model Orchestrator with Stateless Architecture

An intelligent Model Context Protocol (MCP) server and HTTP API that orchestrates multiple AI models (free and paid) with dynamic N-layer routing, cross-checking, cost optimization, and stateless context management via Redis + PostgreSQL.



✨ Features

Core Features

  • 🎯 Smart Routing: Dynamic N-layer routing based on task complexity and quality requirements

  • 💰 Cost Optimization: Prioritizes free/cheap models, escalates only when necessary

  • ✅ Cross-Checking: Multiple models review each other's work for higher quality

  • 🔧 Code Agent: Specialized AI agent for coding tasks with TODO-driven workflow

  • 🧪 Test Integration: Built-in Vitest and Playwright test runners

  • 📊 Metrics & Logging: Track costs, tokens, and performance

  • 🔄 Self-Improvement: Documents patterns, bugs, and routing heuristics

  • 🛠️ Extensible: Easy to add new models, providers, and tools

NEW: Stateless Architecture

  • 🗄️ Redis Cache Layer: Hot storage for LLM responses, context summaries, routing hints

  • 💾 PostgreSQL Database: Cold storage for conversations, messages, LLM calls, analytics

  • 🌐 HTTP API Mode: Stateless REST API with /v1/route, /v1/code-agent, /v1/chat endpoints

  • 📦 Context Management: Two-tier context with hot (Redis) + cold (DB) layers

  • 🔗 Handoff Packages: Optimized inter-layer communication for model escalation

  • 📝 TODO Tracking: Persistent GitHub Copilot-style TODO lists with Redis/DB storage




🚀 Quick Start

Prerequisites

  • Node.js >= 20.0.0

  • npm or pnpm (recommended)

  • API keys for desired providers (OpenRouter, Anthropic, OpenAI)

  • Optional: Redis (for caching)

  • Optional: PostgreSQL (for persistence)

Installation

```bash
# Clone the repository
git clone https://github.com/yourusername/ai-mcp-gateway.git
cd ai-mcp-gateway

# Install dependencies
npm install

# Copy environment template
cp .env.example .env

# Edit .env and add your API keys and database settings
nano .env
```

Build

```bash
# Build the project
npm run build

# Or run in development mode
npm run dev
```

πŸ—οΈ Architecture

Stateless Design

The AI MCP Gateway is designed as a stateless application with external state management:

```
┌──────────────────────────────────────────────┐
│          AI MCP Gateway (Stateless)          │
│                                              │
│   ┌──────────────┐        ┌──────────────┐   │
│   │  MCP Server  │        │   HTTP API   │   │
│   │   (stdio)    │        │    (REST)    │   │
│   └──────┬───────┘        └──────┬───────┘   │
│          └───────────┬───────────┘           │
│                      │                       │
│           ┌──────────▼──────────┐            │
│           │   Routing Engine    │            │
│           │   Context Manager   │            │
│           └──────────┬──────────┘            │
└──────────────────────┼───────────────────────┘
                       │
           ┌───────────┼───────────┐
           │           │           │
      ┌────▼────┐ ┌────▼────┐ ┌────▼────┐
      │  Redis  │ │   DB    │ │  LLMs   │
      │  (Hot)  │ │ (Cold)  │ │         │
      └─────────┘ └─────────┘ └─────────┘
```

Two-Tier Context Management

  1. Hot Layer (Redis)

    • Context summaries (conv:summary:{conversationId})

    • Recent messages cache (conv:messages:{conversationId})

    • LLM response cache (llm:cache:{model}:{hash})

    • TODO lists (todo:list:{conversationId})

    • TTL: 30-60 minutes

  2. Cold Layer (PostgreSQL)

    • Full conversation history

    • All messages with metadata

    • Context summaries (versioned)

    • LLM call logs (tokens, cost, duration)

    • Routing rules and analytics

    • Persistent storage
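
The hot-then-cold lookup can be sketched as a read-through cache. In this sketch the `Store` interface and `mapStore` helper are stand-ins (Map-backed) for illustration; the real gateway talks to Redis and PostgreSQL clients:

```typescript
// Sketch of the two-tier lookup: try the hot layer (Redis), fall back to the
// cold layer (PostgreSQL), and re-warm the hot layer on a miss.
// Store is an illustrative stand-in, not the gateway's actual interface.
interface Store {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
}

const mapStore = (m: Map<string, string>): Store => ({
  get: async (key) => m.get(key) ?? null,
  set: async (key, value) => { m.set(key, value); },
});

async function getContextSummary(
  hot: Store,
  cold: Store,
  conversationId: string,
): Promise<string | null> {
  const key = `conv:summary:${conversationId}`;
  const cached = await hot.get(key);                  // 1. hot layer (Redis)
  if (cached !== null) return cached;
  const summary = await cold.get(key);                // 2. cold layer (DB)
  if (summary !== null) await hot.set(key, summary);  // 3. re-warm the hot layer
  return summary;
}
```

The same pattern applies to the other key families (`conv:messages:*`, `llm:cache:*`, `todo:list:*`), with the TTL applied on the hot-layer write.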


🔄 Dual Mode Operation

The gateway supports two modes:

1. MCP Mode (stdio)

Standard Model Context Protocol server for desktop clients.

```bash
npm run start:mcp
# or
npm start
```

Configure in Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):

```json
{
  "mcpServers": {
    "ai-mcp-gateway": {
      "command": "node",
      "args": ["/path/to/ai-mcp-gateway/dist/index.js"]
    }
  }
}
```

2. HTTP API Mode

Stateless REST API for web services and integrations.

```bash
npm run start:api
# or
MODE=api npm start
```

API runs on http://localhost:3000 (configurable via API_PORT).


🌐 HTTP API Usage

Endpoints

POST /v1/route

Intelligent model selection and routing.

```bash
curl -X POST http://localhost:3000/v1/route \
  -H "Content-Type: application/json" \
  -d '{
    "conversationId": "conv-123",
    "message": "Explain async/await in JavaScript",
    "userId": "user-1",
    "qualityLevel": "normal"
  }'
```

Response:

```json
{
  "result": {
    "response": "Async/await is...",
    "model": "anthropic/claude-sonnet-4",
    "provider": "anthropic"
  },
  "routing": {
    "summary": "L0 -> primary model",
    "fromCache": false
  },
  "context": {
    "conversationId": "conv-123"
  },
  "performance": {
    "durationMs": 1234,
    "tokens": { "input": 50, "output": 200 },
    "cost": 0.002
  }
}
```
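
The same call can be made programmatically from Node. This is only a sketch mirroring the curl example above; `route` and `buildRouteBody` are illustrative names, not part of an official SDK:

```typescript
// Minimal sketch of a /v1/route client. Field names mirror the curl example.
interface RouteRequest {
  conversationId: string;
  message: string;
  userId?: string;
  qualityLevel?: "normal" | "high";
}

// Builds and validates the JSON body; qualityLevel defaults to "normal".
function buildRouteBody(req: RouteRequest): string {
  if (!req.conversationId || !req.message) {
    throw new Error("conversationId and message are required");
  }
  return JSON.stringify({ qualityLevel: "normal", ...req });
}

async function route(baseUrl: string, req: RouteRequest): Promise<unknown> {
  const res = await fetch(`${baseUrl}/v1/route`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: buildRouteBody(req),
  });
  if (!res.ok) throw new Error(`route failed: HTTP ${res.status}`);
  return res.json();
}
```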

POST /v1/code-agent

Specialized coding assistant.

```bash
curl -X POST http://localhost:3000/v1/code-agent \
  -H "Content-Type: application/json" \
  -d '{
    "conversationId": "conv-123",
    "task": "Create a React component for user profile",
    "files": ["src/components/UserProfile.tsx"]
  }'
```

POST /v1/chat

General chat endpoint with context.

```bash
curl -X POST http://localhost:3000/v1/chat \
  -H "Content-Type: application/json" \
  -d '{
    "conversationId": "conv-123",
    "message": "What did we discuss earlier?"
  }'
```

GET /v1/context/:conversationId

Retrieve conversation context.

```bash
curl http://localhost:3000/v1/context/conv-123
```

GET /health

Health check endpoint.

```bash
curl http://localhost:3000/health
```

Response:

```json
{
  "status": "ok",
  "redis": true,
  "database": true,
  "timestamp": "2025-11-22T06:42:00.000Z"
}
```
Start the Server

```bash
# Run the built server
pnpm start

# Or use the binary directly
node dist/index.js
```

πŸ—οΈ Architecture

High-Level Overview

```
┌──────────────────────────────────────────────┐
│                  MCP Client                  │
│        (Claude Desktop, VS Code, etc.)       │
└──────────────────────┬───────────────────────┘
                       │ MCP Protocol
┌──────────────────────▼───────────────────────┐
│            AI MCP Gateway Server             │
│                                              │
│  ┌────────────────────────────────────────┐  │
│  │            Tools Registry              │  │
│  │  • code_agent       • run_vitest       │  │
│  │  • run_playwright   • fs_read/write    │  │
│  │  • git_diff         • git_status       │  │
│  └───────────────────┬────────────────────┘  │
│                      │                       │
│  ┌───────────────────▼────────────────────┐  │
│  │            Routing Engine              │  │
│  │  • Task classification                 │  │
│  │  • Layer selection (L0→L1→L2→L3)       │  │
│  │  • Cross-check orchestration           │  │
│  │  • Auto-escalation                     │  │
│  └───────────────────┬────────────────────┘  │
│                      │                       │
│  ┌───────────────────▼────────────────────┐  │
│  │             LLM Clients                │  │
│  │  • OpenRouter       • Anthropic        │  │
│  │  • OpenAI           • OSS Local        │  │
│  └───────────────────┬────────────────────┘  │
└──────────────────────┼───────────────────────┘
                       │
          ┌────────────┼────────────────────┐
          │            │                    │
  ┌───────▼───────┐ ┌──▼────────────┐ ┌─────▼─────────┐
  │ Free Models   │ │ Paid Models   │ │ Local Models  │
  │ (Layer L0)    │ │ (Layer L1-L3) │ │ (Layer L0)    │
  └───────────────┘ └───────────────┘ └───────────────┘
```

Key Components

1. MCP Server (src/mcp/)

  • Handles MCP protocol communication

  • Registers and dispatches tools

  • Manages request/response lifecycle

2. Routing Engine (src/routing/)

  • Classifies tasks by type, complexity, quality

  • Selects optimal model layer

  • Orchestrates cross-checking between models

  • Auto-escalates when needed

3. LLM Clients (src/tools/llm/)

  • Unified interface for multiple providers

  • Handles API calls, token counting, cost calculation

  • Supports: OpenRouter, Anthropic, OpenAI, local models

4. Tools (src/tools/)

  • Code Agent: Main AI coding assistant

  • Testing: Vitest and Playwright runners

  • File System: Read/write/list operations

  • Git: Diff and status operations

5. Logging & Metrics (src/logging/)

  • Winston-based structured logging

  • Cost tracking and alerts

  • Performance metrics
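
Cost tracking reduces to simple arithmetic over the per-1k-token prices configured in src/config/models.ts (`pricePer1kInputTokens` / `pricePer1kOutputTokens`). A hedged sketch, not the gateway's actual implementation:

```typescript
// Sketch of per-call cost estimation from per-1k-token prices.
interface Pricing {
  pricePer1kInputTokens: number;
  pricePer1kOutputTokens: number;
}

function estimateCost(p: Pricing, inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1000) * p.pricePer1kInputTokens +
    (outputTokens / 1000) * p.pricePer1kOutputTokens
  );
}

// A COST_ALERT_THRESHOLD check (see Configuration) is then just:
function overThreshold(totalCost: number, threshold: number): boolean {
  return totalCost >= threshold;
}
```

For example, 50 input and 200 output tokens at $0.001/$0.002 per 1k tokens costs $0.00045.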


πŸ› οΈ Available MCP Tools

The gateway exposes 14 MCP tools for various operations:

Code & Development Tools

| Tool | Description | Key Parameters |
|------|-------------|----------------|
| `code_agent` | AI coding assistant with TODO tracking | `task`, `context`, `quality` |

Testing Tools

| Tool | Description | Key Parameters |
|------|-------------|----------------|
| `run_vitest` | Execute Vitest unit/integration tests | `testPath`, `watch` |
| `run_playwright` | Execute Playwright E2E tests | `testPath` |

File System Tools

| Tool | Description | Key Parameters |
|------|-------------|----------------|
| `fs_read` | Read file contents | `path`, `encoding` |
| `fs_write` | Write file contents | `path`, `content` |
| `fs_list` | List directory contents | `path`, `recursive` |

Git Tools

| Tool | Description | Key Parameters |
|------|-------------|----------------|
| `git_diff` | Show git diff | `staged` |
| `git_status` | Show git status | - |

NEW: Cache Tools (Redis)

| Tool | Description | Key Parameters |
|------|-------------|----------------|
| `redis_get` | Get value from Redis cache | `key` |
| `redis_set` | Set value in Redis cache | `key`, `value`, `ttl` |
| `redis_del` | Delete key from Redis cache | `key` |

NEW: Database Tools (PostgreSQL)

| Tool | Description | Key Parameters |
|------|-------------|----------------|
| `db_query` | Execute SQL query | `sql`, `params` |
| `db_insert` | Insert row into table | `table`, `data` |
| `db_update` | Update rows in table | `table`, `where`, `data` |

Tool Usage Examples

Using Redis cache:

```json
{
  "tool": "redis_set",
  "arguments": {
    "key": "user:profile:123",
    "value": { "name": "John", "role": "admin" },
    "ttl": 3600
  }
}
```

Querying database:

```json
{
  "tool": "db_query",
  "arguments": {
    "sql": "SELECT * FROM conversations WHERE user_id = $1 LIMIT 10",
    "params": ["user-123"]
  }
}
```

📦 Context Management

How Context Works

  1. Conversation Initialization

    • Client sends conversationId with each request

    • Gateway checks Redis for existing context summary

    • Falls back to DB if Redis miss

    • Creates a new conversation if none exists

  2. Context Storage

    • Summary: Compressed project context (stack, architecture, decisions)

    • Messages: Recent messages (last 50 in Redis, all in DB)

    • TODO Lists: Persistent task tracking

    • Metadata: User, project, timestamps

  3. Context Compression

    • When context grows large (>50 messages):

      • System generates new summary

      • Keeps only recent 5-10 messages in detail

      • Older messages summarized into context

    • Reduces token usage while maintaining relevance

  4. Context Handoff

    • When escalating between layers:

      • Creates handoff package with:

        • Context summary

        • Current task

        • Previous attempts

        • Known issues

        • Request to higher layer

      • Optimized for minimal tokens
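
A handoff package along these lines might look like the following. The field names and the `buildHandoff` helper are assumptions derived from the list above, not the gateway's actual types:

```typescript
// Hypothetical shape of a handoff package built from the fields listed above.
interface HandoffPackage {
  contextSummary: string;
  currentTask: string;
  previousAttempts: string[]; // truncated to keep the token count low
  knownIssues: string[];
  request: string;            // what the higher layer is asked to do
  targetLayer: "L1" | "L2" | "L3";
}

function buildHandoff(
  contextSummary: string,
  currentTask: string,
  previousAttempts: string[],
  knownIssues: string[],
  targetLayer: HandoffPackage["targetLayer"],
): HandoffPackage {
  return {
    contextSummary,
    currentTask,
    previousAttempts: previousAttempts.slice(-3), // keep only the most recent attempts
    knownIssues,
    request: `Escalated to ${targetLayer}: resolve the current task given the summary and known issues.`,
    targetLayer,
  };
}
```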

Database Schema

```sql
-- Conversations
CREATE TABLE conversations (
    id TEXT PRIMARY KEY,
    user_id TEXT,
    project_id TEXT,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW(),
    metadata JSONB DEFAULT '{}'::jsonb
);

-- Messages
CREATE TABLE messages (
    id SERIAL PRIMARY KEY,
    conversation_id TEXT REFERENCES conversations(id),
    role TEXT NOT NULL,
    content TEXT NOT NULL,
    metadata JSONB DEFAULT '{}'::jsonb,
    created_at TIMESTAMP DEFAULT NOW()
);

-- Context summaries
CREATE TABLE context_summaries (
    id SERIAL PRIMARY KEY,
    conversation_id TEXT REFERENCES conversations(id),
    summary TEXT NOT NULL,
    version INTEGER DEFAULT 1,
    created_at TIMESTAMP DEFAULT NOW()
);

-- LLM call logs
CREATE TABLE llm_calls (
    id SERIAL PRIMARY KEY,
    conversation_id TEXT REFERENCES conversations(id),
    model_id TEXT NOT NULL,
    layer TEXT NOT NULL,
    input_tokens INTEGER DEFAULT 0,
    output_tokens INTEGER DEFAULT 0,
    estimated_cost DECIMAL(10, 6) DEFAULT 0,
    duration_ms INTEGER,
    success BOOLEAN DEFAULT true,
    created_at TIMESTAMP DEFAULT NOW()
);

-- TODO lists
CREATE TABLE todo_lists (
    id SERIAL PRIMARY KEY,
    conversation_id TEXT REFERENCES conversations(id),
    todo_data JSONB NOT NULL,
    created_at TIMESTAMP DEFAULT NOW(),
    updated_at TIMESTAMP DEFAULT NOW()
);
```

βš™οΈ Configuration

Environment Variables

Create a .env file (use .env.example as template):

```bash
# MCP Server
MCP_SERVER_NAME=ai-mcp-gateway
MCP_SERVER_VERSION=0.1.0

# API Keys
OPENROUTER_API_KEY=sk-or-v1-...
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...

# OSS/Local Models (optional)
OSS_MODEL_ENDPOINT=http://localhost:11434
OSS_MODEL_ENABLED=false

# Redis
REDIS_HOST=localhost
REDIS_PORT=6379
REDIS_PASSWORD=
REDIS_DB=0

# PostgreSQL
DATABASE_URL=postgresql://user:pass@localhost:5432/ai_mcp_gateway
DB_HOST=localhost
DB_PORT=5432
DB_NAME=ai_mcp_gateway
DB_USER=postgres
DB_PASSWORD=
DB_SSL=false

# HTTP API
API_PORT=3000
API_HOST=0.0.0.0
API_CORS_ORIGIN=*

# Logging
LOG_LEVEL=info
LOG_FILE=logs/ai-mcp-gateway.log

# Routing Configuration
DEFAULT_LAYER=L0
ENABLE_CROSS_CHECK=true
ENABLE_AUTO_ESCALATE=true
MAX_ESCALATION_LAYER=L2

# Cost Tracking
ENABLE_COST_TRACKING=true
COST_ALERT_THRESHOLD=1.00

# Mode
MODE=mcp # or 'api' for HTTP server
```
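
The routing-related variables can be parsed with plain defaults. This is only a sketch of one way to read them; the gateway's actual src/config/env.ts may differ:

```typescript
// Sketch of reading the routing-related variables with defaults.
// Variable names come from the .env template; the parsing style is an assumption.
type Env = Record<string, string | undefined>;

function readRoutingConfig(env: Env) {
  return {
    defaultLayer: env.DEFAULT_LAYER ?? "L0",
    enableCrossCheck: (env.ENABLE_CROSS_CHECK ?? "true") === "true",
    enableAutoEscalate: (env.ENABLE_AUTO_ESCALATE ?? "true") === "true",
    maxEscalationLayer: env.MAX_ESCALATION_LAYER ?? "L2",
    costAlertThreshold: Number(env.COST_ALERT_THRESHOLD ?? "1.00"),
  };
}
```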

Model Configuration

Edit src/config/models.ts to:

  • Add/remove models

  • Adjust layer assignments

  • Update pricing

  • Enable/disable models

Example:

```typescript
{
  id: 'my-custom-model',
  provider: 'openrouter',
  apiModelName: 'provider/model-name',
  layer: 'L1',
  relativeCost: 5,
  pricePer1kInputTokens: 0.001,
  pricePer1kOutputTokens: 0.002,
  capabilities: {
    code: true,
    general: true,
    reasoning: true,
  },
  contextWindow: 100000,
  enabled: true,
}
```

📖 Usage

Using the Code Agent

The Code Agent is the primary tool for coding tasks:

```jsonc
// Example MCP client call
{
  "tool": "code_agent",
  "arguments": {
    "task": "Create a TypeScript function to validate email addresses",
    "context": {
      "language": "typescript",
      "requirements": [
        "Use regex pattern",
        "Handle edge cases",
        "Include unit tests"
      ]
    },
    "quality": "high"
  }
}
```

Response includes:

  • Generated code

  • Routing summary (which models were used)

  • Token usage and cost

  • Quality assessment

Running Tests

```jsonc
// Run Vitest tests
{
  "tool": "run_vitest",
  "arguments": { "testPath": "tests/unit/mytest.test.ts" }
}

// Run Playwright E2E tests
{
  "tool": "run_playwright",
  "arguments": { "testPath": "tests/e2e/login.spec.ts" }
}
```

File Operations

```jsonc
// Read file
{
  "tool": "fs_read",
  "arguments": { "path": "/path/to/file.ts" }
}

// Write file
{
  "tool": "fs_write",
  "arguments": { "path": "/path/to/output.ts", "content": "console.log('Hello');" }
}

// List directory
{
  "tool": "fs_list",
  "arguments": { "path": "/path/to/directory" }
}
```

Git Operations

```jsonc
// Get diff
{
  "tool": "git_diff",
  "arguments": { "staged": false }
}

// Get status
{
  "tool": "git_status",
  "arguments": {}
}
```

πŸ› οΈ Available Tools

Tool Name

Description

Input

code_agent

AI coding assistant with multi-model routing

task, context, quality

run_vitest

Run Vitest unit/integration tests

testPath (optional)

run_playwright

Run Playwright E2E tests

testPath (optional)

fs_read

Read file contents

path

fs_write

Write file contents

path, content

fs_list

List directory contents

path

git_diff

Get git diff

path (optional), staged (bool)

git_status

Get git status

none


🎚️ Model Layers

Layer L0 - Free/Cheapest

  • Models: Mistral 7B Free, Qwen 2 7B Free, OSS Local

  • Cost: $0

  • Use for: Simple tasks, drafts, code review

  • Capabilities: Basic code, general knowledge

Layer L1 - Low Cost

  • Models: Gemini Flash 1.5, GPT-4o Mini

  • Cost: ~$0.08-0.75 per 1M tokens

  • Use for: Standard coding tasks, refactoring

  • Capabilities: Code, reasoning, vision

Layer L2 - Mid-tier

  • Models: Claude 3 Haiku, GPT-4o

  • Cost: ~$1.38-12.5 per 1M tokens

  • Use for: Complex tasks, high-quality requirements

  • Capabilities: Advanced code, reasoning, vision

Layer L3 - Premium

  • Models: Claude 3.5 Sonnet, OpenAI o1

  • Cost: ~$18-60 per 1M tokens

  • Use for: Critical tasks, architecture design

  • Capabilities: SOTA performance, deep reasoning
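
The escalation order these layers imply (L0 → L1 → L2 → L3, capped by MAX_ESCALATION_LAYER) can be sketched as follows; `nextLayer` is an illustrative helper, not the gateway's actual API:

```typescript
// Sketch of the escalation ladder implied by the layer descriptions above.
const LAYERS = ["L0", "L1", "L2", "L3"] as const;
type Layer = (typeof LAYERS)[number];

// Returns the next layer to escalate to, or null when the cap is reached.
function nextLayer(current: Layer, maxLayer: Layer): Layer | null {
  const i = LAYERS.indexOf(current);
  const max = LAYERS.indexOf(maxLayer);
  return i < max ? LAYERS[i + 1] : null;
}
```

With the default MAX_ESCALATION_LAYER=L2, a task that fails at L0 can escalate to L1 and then L2, but never reaches the premium L3 models.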


💻 Development

Project Structure

```
ai-mcp-gateway/
├── src/
│   ├── index.ts                      # Entry point
│   ├── config/                       # Configuration
│   │   ├── env.ts
│   │   └── models.ts
│   ├── mcp/                          # MCP server
│   │   ├── server.ts
│   │   └── types.ts
│   ├── routing/                      # Routing engine
│   │   ├── router.ts
│   │   └── cost.ts
│   ├── tools/                        # MCP tools
│   │   ├── codeAgent/
│   │   ├── llm/
│   │   ├── testing/
│   │   ├── fs/
│   │   └── git/
│   └── logging/                      # Logging & metrics
│       ├── logger.ts
│       └── metrics.ts
├── tests/                            # Tests
│   ├── unit/
│   ├── integration/
│   └── regression/
├── docs/                             # Documentation
│   ├── ai-orchestrator-notes.md
│   ├── ai-routing-heuristics.md
│   └── ai-common-bugs-and-fixes.md
├── playwright/                       # E2E tests
├── package.json
├── tsconfig.json
├── vitest.config.ts
└── playwright.config.ts
```

Scripts

```bash
# Development
pnpm dev          # Watch mode with auto-rebuild
pnpm build        # Build for production
pnpm start        # Run built server

# Testing
pnpm test         # Run all Vitest tests
pnpm test:watch   # Run tests in watch mode
pnpm test:ui      # Run tests with UI
pnpm test:e2e     # Run Playwright E2E tests

# Code Quality
pnpm type-check   # TypeScript type checking
pnpm lint         # ESLint
pnpm format       # Prettier
```

🧪 Testing

Unit Tests

```bash
# Run all unit tests
pnpm test

# Run specific test file
pnpm vitest tests/unit/routing.test.ts

# Watch mode
pnpm test:watch
```

Integration Tests

Integration tests verify interactions between components:

```bash
pnpm vitest tests/integration/
```

Regression Tests

Regression tests prevent previously fixed bugs from recurring:

```bash
pnpm vitest tests/regression/
```

E2E Tests

End-to-end tests using Playwright:

```bash
pnpm test:e2e
```

🔄 Self-Improvement

The gateway includes a self-improvement system:

1. Bug Tracking (docs/ai-common-bugs-and-fixes.md)

  • Documents encountered bugs

  • Includes root causes and fixes

  • Links to regression tests

2. Pattern Learning (docs/ai-orchestrator-notes.md)

  • Tracks successful patterns

  • Records optimization opportunities

  • Documents lessons learned

3. Routing Refinement (docs/ai-routing-heuristics.md)

  • Defines routing rules

  • Documents when to escalate

  • Model capability matrix

Adding to Self-Improvement Docs

When you discover a bug or pattern:

  1. Document it in the appropriate file

  2. Create a regression test in tests/regression/

  3. Update routing heuristics if needed

  4. Run tests to verify the fix


🤝 Contributing

Contributions are welcome! Please:

  1. Fork the repository

  2. Create a feature branch

  3. Make your changes with tests

  4. Update documentation

  5. Submit a pull request

Adding a New Model

  1. Update src/config/models.ts:

    ```typescript
    {
      id: 'new-model-id',
      provider: 'provider-name',
      // ... config
    }
    ```
  2. Add provider client if needed in src/tools/llm/

  3. Update docs/ai-routing-heuristics.md

Adding a New Tool

  1. Create tool in src/tools/yourtool/index.ts:

    ```typescript
    export const yourTool = {
      name: 'your_tool',
      description: '...',
      inputSchema: { ... },
      handler: async (args) => { ... },
    };
    ```
  2. Register in src/mcp/server.ts

  3. Add tests in tests/unit/


📄 License

MIT License - see LICENSE file for details


πŸ™ Acknowledgments


πŸ“ž Support


πŸ—ΊοΈ Roadmap

  • Token usage analytics dashboard

  • Caching layer for repeated queries

  • More LLM providers (Google AI, Cohere, etc.)

  • Streaming response support

  • Web UI for configuration and monitoring

  • Batch processing optimizations

  • Advanced prompt templates

  • A/B testing framework


Made with ❤️ for efficient AI orchestration
