MCP Gateway

Cut 83-89% of your Claude Code context window overhead from MCP tool schemas.

Every MCP server you register dumps its full JSON tool schema into your context window — every conversation, whether you use those tools or not. If you have 5 servers with 30+ tools each, that's thousands of tokens burned before you type a single character.

MCP Gateway replaces N tool schemas with 3-4 dispatch tools that proxy requests to your existing MCP servers. The underlying servers stay exactly the same. You just stop paying the token tax.

The Problem

Before (real numbers from a multi-account setup):
  Google Workspace × 3 accounts  = 142 tools × 3 = 426 tool schemas
  Telegram                       = 92 tool schemas
  Linear × 3 accounts            = 42 tools × 3  = 126 tool schemas
  ─────────────────────────────────────────────────
  Total                          = 644 tool schemas = ~57,000 tokens

After:
  Google Workspace gateway       = 3 tool schemas
  Services gateway (TG + Linear) = 4 tool schemas
  ─────────────────────────────────────────────────
  Total                          = 7 tool schemas   = ~6,200 tokens

Savings: 89% fewer tokens, every single conversation.

How It Works

Instead of registering each MCP server directly in your Claude Code config, you register a single gateway server that proxies tool calls to the underlying servers on demand.

Claude sees 3-4 generic tools (gw, gw_discover, gw_batch or tg, linear, etc.) instead of hundreds of specialized ones. When Claude needs to call a specific tool, it uses the dispatch tool with the tool name as a parameter. The gateway forwards the call to the right backend.
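To make the dispatch idea concrete, here is a minimal, dependency-free sketch. It is not the code in cli_gateway.py: the BACKENDS table and the fake search_gmail_messages handler are hypothetical stand-ins for real upstream MCP servers, and gw_discover is simplified to a keyword filter over tool names.

```python
from typing import Any, Callable, Dict, List

# Hypothetical stand-in for an upstream MCP tool. A real gateway would
# forward the call to a subprocess or an MCP client session instead.
def search_gmail_messages(query: str) -> dict:
    return {"tool": "search_gmail_messages", "query": query}

# One table of backends per account. Claude never sees these schemas.
BACKENDS: Dict[str, Dict[str, Callable[..., Any]]] = {
    "work": {"search_gmail_messages": search_gmail_messages},
}

def gw(account: str, tool: str, params: dict) -> Any:
    """Single dispatch tool: the target tool name is just a parameter."""
    try:
        handler = BACKENDS[account][tool]
    except KeyError:
        raise ValueError(f"unknown account/tool: {account}/{tool}") from None
    return handler(**params)

def gw_discover(keyword: str) -> List[str]:
    """Return the names of reachable tools whose name contains keyword."""
    names = {name for tools in BACKENDS.values() for name in tools}
    return sorted(n for n in names if keyword in n)
```

Only gw and gw_discover would be exposed to the model, so the schema cost stays constant no matter how many entries BACKENDS holds.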

Two patterns are included, depending on what your upstream MCP server supports:

Pattern 1: CLI Dispatch (cli_gateway.py)

For MCP servers that support a --cli mode (subprocess per call). Each tool invocation spawns a short-lived process. Good for servers like google-workspace-mcp that have built-in CLI modes.

Features:

  • Multi-account routing (one gateway, N credential sets)

  • Auto-injection of per-account parameters (e.g., user_google_email)

  • Tool-to-service mapping for faster cold starts (only loads the needed module)

  • Discovery with caching

  • Batch execution (parallel tool calls in one request)
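The subprocess-per-call and batch features can be sketched as follows. This is an illustration under stated assumptions, not the contents of cli_gateway.py: build_command is hypothetical and launches a tiny Python echo script instead of a real upstream server, whose actual CLI flags are defined by that server.

```python
import json
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from typing import List

def build_command(tool: str, params: dict) -> List[str]:
    """Hypothetical command layout. This stand-in child process simply
    echoes the request back as JSON on stdout."""
    script = (
        "import json, sys; "
        "print(json.dumps({'tool': sys.argv[1], 'params': json.loads(sys.argv[2])}))"
    )
    return [sys.executable, "-c", script, tool, json.dumps(params)]

def call_tool(tool: str, params: dict, timeout: float = 30.0) -> dict:
    """Spawn one short-lived process per tool call and parse its JSON output."""
    proc = subprocess.run(
        build_command(tool, params),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(proc.stdout)

def call_batch(calls: List[dict]) -> List[dict]:
    """Run several tool calls in parallel, returning results in input order."""
    workers = max(1, min(8, len(calls)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(call_tool, c["tool"], c.get("params", {})) for c in calls
        ]
        return [f.result() for f in futures]
```

Threads are enough here because each call spends its time waiting on a child process. The cap of 8 workers is an arbitrary choice for the sketch.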

Pattern 2: Persistent MCP Client (persistent_gateway.py)

For MCP servers without CLI mode. Maintains persistent subprocess connections to upstream MCP servers, avoiding cold-start latency on every call. Uses the MCP SDK's ClientSession with AsyncExitStack for lifecycle management.

Features:

  • Lazy connection (connects on first use, not at startup)

  • Auto-reconnect on connection failure

  • Multi-service routing (multiple MCP servers behind one gateway)

  • Multi-account support per service

  • Tool discovery with caching
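Lazy connection, auto-reconnect, and cached discovery fit together roughly as in the sketch below. It assumes a hypothetical connect factory that returns a session object with async call_tool and list_tools methods. In the real gateway that session would be the MCP SDK's ClientSession, whose list_tools returns a richer result object than the plain list of names used here. The sketch also does no locking, so concurrent first calls could each open a connection.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Optional

class LazyClient:
    """Connects on first use, retries once after a connection error,
    and caches the discovered tool list."""

    def __init__(self, connect: Callable[[], Awaitable[Any]]):
        self._connect = connect              # hypothetical session factory
        self._session: Optional[Any] = None  # nothing is opened at startup
        self._tools: Optional[List[str]] = None

    async def _ensure(self) -> Any:
        if self._session is None:
            self._session = await self._connect()
        return self._session

    async def call(self, tool: str, params: dict) -> Any:
        for attempt in range(2):
            session = await self._ensure()
            try:
                return await session.call_tool(tool, params)
            except ConnectionError:
                # Drop the dead session and its cache; the next loop
                # iteration reconnects. Give up after the second failure.
                self._session = None
                self._tools = None
                if attempt == 1:
                    raise

    async def discover(self) -> List[str]:
        if self._tools is None:
            session = await self._ensure()
            self._tools = await session.list_tools()
        return self._tools
```

Invalidating the tool cache on reconnect is a design choice for the sketch: a restarted upstream server may expose a different tool set.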

Quick Start

1. Install

# Clone
git clone https://github.com/block-town/mcp-gateway.git
cd mcp-gateway

# Install dependencies
pip install fastmcp pyyaml python-dotenv
# or
uv pip install fastmcp pyyaml python-dotenv

2. Configure

Copy the example configs and fill in your details:

cp config.example.yaml config.yaml
# Edit config.yaml with your accounts, paths, and credentials

3. Choose Your Pattern

CLI Dispatch (for servers with --cli mode):

cp cli_gateway.py server.py
# Edit server.py — update the tool descriptions with your account names and common tools

Persistent Client (for servers without CLI mode):

cp persistent_gateway.py server.py
# Edit server.py — update service names and tool descriptions

4. Register in Claude Code

Add to your ~/.claude/settings.json (global) or project .mcp.json:

{
  "mcpServers": {
    "gateway": {
      "command": "python3",
      "args": ["/path/to/mcp-gateway/server.py"]
    }
  }
}

Then remove the original MCP server entries that the gateway now proxies.

5. Use

# CLI gateway (instead of 142 tools polluting your context):
gw_discover("gmail")     # → see available Gmail tools + params
gw("work", "search_gmail_messages", {"query": "is:unread"})
gw_batch("work", [
  {"tool": "search_gmail_messages", "params": {"query": "is:unread"}},
  {"tool": "get_events", "params": {"time_min": "2025-01-01"}}
])

# Persistent gateway:
tg("send_message", {"chat_id": "123", "text": "hello"})
linear("work", "linear_getIssues", {"teamId": "TEAM-1"})

Configuration

CLI Gateway (config.example.yaml — accounts mode)

# Path to the upstream MCP server
upstream_dir: "/path/to/google-workspace-mcp"

# Command to run the upstream server in CLI mode
runner: "uv"

accounts:
  personal:
    client_id: "your-oauth-client-id"
    client_secret: "your-oauth-client-secret"
    credentials_dir: "/path/to/credentials/personal"
    email: "you@gmail.com"
  work:
    client_id: "your-work-oauth-client-id"
    client_secret: "your-work-oauth-client-secret"
    credentials_dir: "/path/to/credentials/work"
    email: "you@company.com"
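As an illustration of how the email field can drive the auto-injection of user_google_email mentioned under Pattern 1, here is a minimal sketch. It assumes the accounts block has already been parsed into a dict (ACCOUNTS below mirrors only the email entries), and the helper name inject_account_params is hypothetical.

```python
from typing import Dict

# Mirrors the email entries of the accounts block above, already parsed.
ACCOUNTS: Dict[str, dict] = {
    "personal": {"email": "you@gmail.com"},
    "work": {"email": "you@company.com"},
}

def inject_account_params(account: str, params: dict) -> dict:
    """Return a copy of params with user_google_email filled in from config.

    A value passed explicitly by the caller is left untouched.
    """
    if account not in ACCOUNTS:
        raise KeyError(f"unknown account {account!r}; known: {sorted(ACCOUNTS)}")
    merged = dict(params)  # never mutate the caller's dict
    merged.setdefault("user_google_email", ACCOUNTS[account]["email"])
    return merged
```

With this in place, a call such as gw("work", "search_gmail_messages", {"query": "is:unread"}) does not need to repeat the address in every request.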

Persistent Gateway (config.example.yaml — services mode)

services:
  telegram:
    command: "uv"
    args: ["--directory", "/path/to/telegram-mcp", "run", "main.py"]
    env_file: "/path/to/telegram-mcp/.env"

  linear:
    command: "npx"
    args: ["-y", "@tacticlaunch/mcp-linear"]
    accounts:
      work:
        LINEAR_API_TOKEN: "lin_api_xxxxx"
      personal:
        LINEAR_API_TOKEN: "lin_api_yyyyy"
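One plausible way to turn a services entry into a launch specification is sketched below. It assumes each account's keys are passed to the subprocess as environment variables, which the LINEAR_API_TOKEN naming suggests but the config itself does not state. The launch_spec helper is hypothetical, and env_file handling is left out.

```python
import os
from typing import Dict, List, Optional, Tuple

# Mirrors the services block above, already parsed (env_file omitted).
SERVICES: Dict[str, dict] = {
    "telegram": {
        "command": "uv",
        "args": ["--directory", "/path/to/telegram-mcp", "run", "main.py"],
    },
    "linear": {
        "command": "npx",
        "args": ["-y", "@tacticlaunch/mcp-linear"],
        "accounts": {
            "work": {"LINEAR_API_TOKEN": "lin_api_xxxxx"},
            "personal": {"LINEAR_API_TOKEN": "lin_api_yyyyy"},
        },
    },
}

def launch_spec(
    service: str, account: Optional[str] = None
) -> Tuple[str, List[str], Dict[str, str]]:
    """Resolve (command, args, env) for one service and optional account."""
    cfg = SERVICES[service]
    env = dict(os.environ)  # start from the gateway's own environment
    accounts = cfg.get("accounts", {})
    if accounts:
        if account not in accounts:
            raise KeyError(f"{service}: unknown account {account!r}")
        env.update(accounts[account])  # assumption: account keys are env vars
    return cfg["command"], list(cfg["args"]), env
```

Because each account yields a distinct environment, the gateway would need one upstream process per (service, account) pair.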

Architecture

┌─────────────────────────────────────────────────────────────┐
│  Claude Code                                                │
│                                                             │
│  Context window sees: 3-4 tool schemas (~6K tokens)         │
│  Instead of:          644 tool schemas (~57K tokens)        │
│                                                             │
│  gw("work", "search_gmail_messages", {"query": "..."})      │
│  tg("send_message", {"chat_id": "...", "text": "..."})      │
└──────────────────┬──────────────────────────────────────────┘
                   │
         ┌─────────▼─────────┐
         │   MCP Gateway     │
         │   (FastMCP)       │
         │                   │
         │   3-4 tools that  │
         │   dispatch to     │
         │   upstream MCP    │
         │   servers         │
         └───┬─────┬─────┬──┘
             │     │     │
    ┌────────▼┐ ┌──▼───┐ ┌▼────────┐
    │ Google  │ │ Tele-│ │ Linear  │
    │ MCP     │ │ gram │ │ MCP     │
    │ (CLI)   │ │ MCP  │ │ (×N     │
    │         │ │      │ │ accts)  │
    └─────────┘ └──────┘ └─────────┘

When to Use This

Good fit:

  • You have 3+ MCP servers registered and context window pressure is real

  • Multi-account setups (same server, different credentials)

  • Servers with 30+ tools where you use maybe 5-10 regularly

Not worth it:

  • Single MCP server with <10 tools

  • You rarely hit context limits

  • The MCP server is already tiny

Token Math

Each MCP tool schema is roughly 200-800 tokens of JSON (tool name, description, parameter schema with types/descriptions/required fields). To measure your own overhead:

  1. Count your registered tools: look at your settings.json and .mcp.json files

  2. Estimate ~400 tokens per tool (conservative average)

  3. Multiply by every conversation you start

A gateway tool schema is ~800-900 tokens (larger description with embedded cheat-sheet), but you only have 3-4 of them instead of hundreds.
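The arithmetic can be checked with a few lines of Python. The helper names are made up for this sketch, the 400-token default is the rule of thumb from step 2, and the before/after totals are the figures quoted in The Problem section. Actual schema sizes vary widely, so measure your own setup where you can.

```python
def estimate_overhead(tool_count: int, tokens_per_tool: int = 400) -> int:
    """Rough per-conversation token cost of registered tool schemas."""
    return tool_count * tokens_per_tool

def savings_percent(before_tokens: int, after_tokens: int) -> float:
    """Percentage of schema tokens removed by switching to the gateway."""
    return 100.0 * (before_tokens - after_tokens) / before_tokens

# Tool counts from the multi-account example: Google x3, Telegram, Linear x3
total_tools = 142 * 3 + 92 + 42 * 3
print(total_tools)                             # 644

# Token totals quoted for that setup, before and after the gateway
print(round(savings_percent(57_000, 6_200)))   # 89
```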

Adapting to Your Stack

The two gateway files are templates. Fork and modify:

  1. Change the tool names: gw/tg/linear are just conventions. Name them whatever makes sense.

  2. Update the docstrings — The tool descriptions are the cheat-sheet Claude sees. List your most common tools and their params there.

  3. Add more services — The persistent gateway pattern works with any MCP server. Add a new service block in config, build a new client, expose new dispatch tools.

  4. Strip what you don't need — If you only have one account, remove multi-account routing. If you don't need batch, remove gw_batch.

License

MIT
