Skip to main content
Glama

MCP Gateway

CI Crates.io Rust License: MIT

Give your AI access to every tool it needs -- without burning your context window or building MCP servers.

demo

MCP Gateway sits between your AI client and your tools. Instead of loading hundreds of tool definitions into every request, the AI gets 4 meta-tools and discovers the right one on demand -- like searching an app store instead of installing every app.

Why

The context window is the bottleneck. Every MCP tool you connect costs ~150 tokens of context overhead. Connect 20 servers with 100+ tools and you've burned 15,000 tokens before the conversation starts -- on tool definitions the AI probably won't use this turn.

Worse: context limits force you to choose which tools to connect. You leave tools out because they don't fit -- and your AI makes worse decisions because it can't reach the right data.

MCP Gateway removes that tradeoff entirely.

Without Gateway

With Gateway

Tools in context

Every definition, every request

4 meta-tools (~400 tokens)

Token overhead

~15,000 tokens (100 tools)

~400 tokens -- 97% savings

Cost at scale

~$0.22/request (Opus input)

~$0.006/request -- $219 saved per 1K

Practical tool limit

20-50 tools (context pressure)

Unlimited -- discovered on demand

Connect a new REST API

Build an MCP server (days)

Drop a YAML file or import an OpenAPI spec (minutes)

Changing MCP config

Restart AI session, lose context

Restart gateway (~8ms), session stays alive

When one tool breaks

Cascading failures

Circuit breakers isolate it

Why not...

Alternative

What it does

Why MCP Gateway is different

Direct MCP connections

Each server connected individually

Every tool definition loaded every request. 100 tools = 15K tokens burned. Gateway: 4 tools, always.

Claude's ToolSearch

Built-in deferred tool loading

Only works with tools already configured. Gateway adds unlimited backends + REST APIs without MCP servers.

Archestra

Cloud-hosted MCP registry

Requires cloud account, sends data to third party. Gateway is local-only, zero external dependencies.

Kong / Portkey

General API gateways

Not MCP-aware. No meta-tool discovery, no tool search, no capability YAML system.

Building fewer MCP servers

Reduce tool count manually

You lose capabilities. Gateway lets you keep everything and pay the token cost of 4.

Key Benefits

1. Unlimited Tools, Minimal Tokens

The gateway exposes 4 meta-tools. Your AI searches for what it needs (gateway_search_tools), then invokes it (gateway_invoke). Tool definitions load on demand, not upfront. Connect 500 tools and pay the token cost of 4.

This isn't just a cost optimization -- it's a capability unlock. Without the gateway, you pick which 20-30 tools fit in context. With it, the AI has access to everything and dynamically selects the best tool for each task.

Token math (Claude Opus @ $15/M input tokens):

  • Without: 100 tools × 150 tokens × 1,000 requests = 15M tokens = $225

  • With: 4 meta-tools × 100 tokens × 1,000 requests = 0.4M tokens = $6

2. Any REST API → MCP Tool -- No Code

Most APIs will never get a dedicated MCP server. The gateway turns any REST API into a tool your AI can use:

From an OpenAPI spec (automatic):

mcp-gateway cap import stripe-openapi.yaml --output capabilities/ --prefix stripe
# Generates one capability YAML per endpoint, ready to use

From a YAML definition (~30 seconds):

name: stock_quote
description: Get real-time stock price

providers:
  primary:
    service: rest
    config:
      base_url: https://finnhub.io
      path: /api/v1/quote
      method: GET
      params:
        symbol: "{symbol}"
        token: "{env.FINNHUB_API_KEY}"

Drop the file in a capability directory -- hot-reloaded in ~500ms, no restart needed.

The gateway ships with 42 starter capabilities -- 25 work instantly with zero configuration:

Category

Zero-Config Tools

Source

Knowledge

Weather, Wikipedia, country info, public holidays, timezones, number facts

Open-Meteo, Wikipedia, RestCountries, Nager.Date

Developer

npm packages, PyPI packages, GitHub search, QR codes, UUIDs

npm, PyPI, GitHub API, GoQR

News & Social

Hacker News, Reddit

HN API, Reddit

Finance

SEC EDGAR filings (10-K, 10-Q, 8-K, insider trades)

US Gov (free)

Geo

Geocoding, reverse geocoding

OpenStreetMap Nominatim

Reference

Book search, air quality

Open Library, OpenAQ

Plus 17 more with free-tier API keys (Brave Search, stock quotes, movies, IP geolocation, recipes, package tracking).

3. Change Your MCP Stack Without Losing Your AI Session

With Claude Code and similar tools, modifying your MCP configuration means restarting the entire AI session -- losing your conversation history, working context, and debugging flow mid-thought.

The gateway is a stable endpoint. Your AI connects once to localhost:39400. Behind it:

  • REST API capabilities (YAML files) are hot-reloaded automatically -- add, modify, or remove a file and it's live in ~500ms. Zero downtime.

  • MCP server backends (stdio/HTTP/SSE) require a gateway restart -- but the gateway starts in ~8ms. Your AI session stays connected.

Experiment with new tools, troubleshoot broken ones, swap configurations -- without ever losing context.

4. Production Resilience

One flaky MCP server shouldn't take down your entire toolchain.

Failsafe

What It Does

Circuit Breaker

Isolates failing backends after 5 errors, auto-recovers after 30s

Retry with Backoff

3 attempts with exponential backoff for transient failures

Rate Limiting

Per-backend throttling prevents quota exhaustion

Health Checks

Continuous monitoring with automatic recovery detection

Graceful Shutdown

Clean connection teardown, no orphaned processes

Concurrency Limits

Prevent backend overload under burst traffic

Architecture

┌─────────────────────────────────────────────────────────────────┐
│                     MCP Gateway (:39400)                         │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │  Meta-MCP Mode: 4 Tools → Access 100+ Tools Dynamically  │    │
│  │  • gateway_list_servers    • gateway_search_tools        │    │
│  │  • gateway_list_tools      • gateway_invoke              │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                   │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │  Failsafes: Circuit Breaker │ Retry │ Rate Limit        │    │
│  └─────────────────────────────────────────────────────────┘    │
│                              │                                   │
│         ┌────────────────────┼────────────────────┐             │
│         ▼                    ▼                    ▼             │
│  ┌─────────────┐    ┌─────────────┐    ┌─────────────┐         │
│  │   Tavily    │    │  Context7   │    │   Pieces    │         │
│  │  (stdio)    │    │   (http)    │    │   (sse)     │         │
│  └─────────────┘    └─────────────┘    └─────────────┘         │
└─────────────────────────────────────────────────────────────────┘

How It Works

The gateway exposes 4 meta-tools that replace hundreds of individual tool definitions:

Meta-Tool

Purpose

gateway_list_servers

List available backends

gateway_list_tools

List tools from a specific backend

gateway_search_tools

Search tools by keyword across all backends

gateway_invoke

Invoke any tool on any backend

Example: You have 12 MCP servers exposing 180+ tools. Without a gateway, every request carries all 180 definitions (~27,000 tokens). With the gateway, the AI discovers and invokes tools on demand:

Step 1: Search for relevant tools

{
  "method": "tools/call",
  "params": {
    "name": "gateway_search_tools",
    "arguments": { "query": "search" }
  }
}

Response:

{
  "query": "search",
  "matches": [
    { "server": "tavily",  "tool": "tavily_search",   "description": "Web search via Tavily API" },
    { "server": "brave",   "tool": "brave_web_search", "description": "Search the web with Brave" },
    { "server": "github",  "tool": "search_code",      "description": "Search code across repositories" }
  ],
  "total": 3
}

Step 2: Invoke the tool you need

{
  "method": "tools/call",
  "params": {
    "name": "gateway_invoke",
    "arguments": {
      "server": "tavily",
      "tool": "tavily_search",
      "arguments": { "query": "MCP protocol specification" }
    }
  }
}

The gateway routes the call to the Tavily backend, applies circuit breaker/retry logic, and returns the result. The AI never loaded all 180 tool schemas -- it discovered and used exactly the one it needed.

Features

Authentication

Protect your gateway with bearer tokens and/or API keys:

auth:
  enabled: true

  # Simple bearer token (good for single-user/dev)
  bearer_token: "auto"  # auto-generates, or use env:VAR_NAME, or literal

  # API keys for multi-client access
  api_keys:
    - key: "env:CLIENT_A_KEY"
      name: "Client A"
      rate_limit: 100        # requests per minute (0 = unlimited)
      backends: ["tavily"]   # restrict to specific backends (empty = all)
    - key: "my-literal-key"
      name: "Client B"
      backends: []           # all backends

  # Paths that bypass auth (always includes /health)
  public_paths: ["/health", "/metrics"]

Token Options:

  • "auto" - Auto-generate random token (logged at startup)

  • "env:VAR_NAME" - Read from environment variable

  • "literal-value" - Use literal string

Usage:

curl -H "Authorization: Bearer YOUR_TOKEN" http://localhost:39400/mcp

Per-Client Tool Scopes

Control which tools each API key can access using allowed_tools (allowlist) and denied_tools (blocklist):

auth:
  enabled: true
  api_keys:
    # Frontend app: Only search and read operations
    - key: "env:FRONTEND_KEY"
      name: "Frontend App"
      allowed_tools:
        - "search_*"        # All search tools
        - "read_*"          # All read tools
        - "tavily:*"        # All tools on tavily server
      # When allowed_tools is set, ONLY these tools are accessible

    # Bot: No filesystem or execution tools
    - key: "env:BOT_KEY"
      name: "Bot"
      denied_tools:
        - "filesystem_*"    # Block all filesystem tools
        - "exec_*"          # Block all execution tools
      # When denied_tools is set, these are blocked ON TOP of global policy

    # Data reader: Complex restrictions
    - key: "env:READER_KEY"
      name: "Data Reader"
      allowed_tools:
        - "database_*"
        - "filesystem_*"
      denied_tools:
        - "database_write"  # But no writes
        - "database_delete" # Or deletes

Pattern Matching:

  • "exact_name" - Exact match only

  • "prefix_*" - Glob pattern (matches anything starting with prefix)

  • "server:tool" - Qualified name (server-specific)

  • "server:*" - All tools on specific server

Precedence Rules:

  1. Global tool policy denies (always enforced)

  2. Per-client allowed_tools (if set, ONLY these tools allowed)

  3. Per-client denied_tools (if set, these tools blocked)

  4. Global policy default action

Security Best Practices:

  • Use allowlists for untrusted clients (frontend apps, bots)

  • Use denylists for additional restrictions on trusted clients

  • Always keep global policy enabled (security.tool_policy.enabled: true)

  • Use qualified names (server:tool) for fine-grained control

See examples/per-client-tool-scopes.yaml for complete examples.

Protocol Support

  • MCP Version: 2025-11-25 (latest)

  • Transports: stdio, Streamable HTTP, SSE

  • JSON-RPC 2.0: Full compliance

Supported Backends

MCP Gateway supports all three MCP transport types. Any MCP-compliant server works -- here are common examples:

Transport

Backend

Config Example

stdio

@anthropic/mcp-server-tavily

command: "npx -y @anthropic/mcp-server-tavily"

stdio

@modelcontextprotocol/server-filesystem

command: "npx -y @modelcontextprotocol/server-filesystem /path"

stdio

@modelcontextprotocol/server-github

command: "npx -y @modelcontextprotocol/server-github"

stdio

@modelcontextprotocol/server-postgres

command: "npx -y @modelcontextprotocol/server-postgres"

stdio

@modelcontextprotocol/server-brave-search

command: "npx -y @modelcontextprotocol/server-brave-search"

HTTP

Any Streamable HTTP server

http_url: "http://localhost:8080/mcp"

SSE

Pieces, LangChain, etc.

http_url: "http://localhost:39300/sse"

stdio backends are spawned as child processes. HTTP and SSE backends connect to already-running servers. Set env: for API keys, headers: for auth tokens, and cwd: for working directories -- see the Configuration Reference below.

Quick Start

Installation

Homebrew (macOS/Linux):

brew tap MikkoParkkola/tap
brew install mcp-gateway

Cargo:

cargo install mcp-gateway

Docker:

docker run -v /path/to/servers.yaml:/config.yaml \
  ghcr.io/mikkoparkkola/mcp-gateway:latest \
  --config /config.yaml

Binary (from GitHub Releases):

# macOS ARM64
curl -L https://github.com/MikkoParkkola/mcp-gateway/releases/latest/download/mcp-gateway-darwin-arm64 -o mcp-gateway
chmod +x mcp-gateway

Usage

# Start with configuration file
mcp-gateway --config servers.yaml

# Override port
mcp-gateway --config servers.yaml --port 8080

# Debug logging
mcp-gateway --config servers.yaml --log-level debug

Configuration

Create servers.yaml:

server:
  port: 39400

meta_mcp:
  enabled: true

failsafe:
  circuit_breaker:
    enabled: true
    failure_threshold: 5
  retry:
    enabled: true
    max_attempts: 3

backends:
  tavily:
    command: "npx -y @anthropic/mcp-server-tavily"
    description: "Web search"
    env:
      TAVILY_API_KEY: "${TAVILY_API_KEY}"

  context7:
    http_url: "http://localhost:8080/mcp"
    description: "Documentation lookup"

Client Configuration

Point your MCP client to the gateway:

{
  "mcpServers": {
    "gateway": {
      "type": "http",
      "url": "http://localhost:39400/mcp"
    }
  }
}

API Reference

Endpoints

Endpoint

Method

Description

/health

GET

Health check with backend status

/mcp

POST

Meta-MCP mode (dynamic discovery)

/mcp/{backend}

POST

Direct backend access

Health Check Response

{
  "status": "healthy",
  "version": "2.0.0",
  "backends": {
    "tavily": {
      "name": "tavily",
      "running": true,
      "transport": "stdio",
      "tools_cached": 3,
      "circuit_state": "Closed",
      "request_count": 42
    }
  }
}

Configuration Reference

Server

Field

Type

Default

Description

host

string

127.0.0.1

Bind address

port

u16

39400

Listen port

request_timeout

duration

30s

Request timeout

shutdown_timeout

duration

30s

Graceful shutdown timeout

Meta-MCP

Field

Type

Default

Description

enabled

bool

true

Enable Meta-MCP mode

cache_tools

bool

true

Cache tool lists

cache_ttl

duration

5m

Cache TTL

Failsafe

Field

Type

Default

Description

circuit_breaker.enabled

bool

true

Enable circuit breaker

circuit_breaker.failure_threshold

u32

5

Failures before open

circuit_breaker.success_threshold

u32

3

Successes to close

circuit_breaker.reset_timeout

duration

30s

Half-open delay

retry.enabled

bool

true

Enable retries

retry.max_attempts

u32

3

Max retry attempts

retry.initial_backoff

duration

100ms

Initial backoff

retry.max_backoff

duration

10s

Max backoff

rate_limit.enabled

bool

true

Enable rate limiting

rate_limit.requests_per_second

u32

100

RPS per backend

Backend

Field

Type

Required

Description

command

string

*

Stdio command

http_url

string

*

HTTP/SSE URL

description

string

Human description

enabled

bool

Default: true

timeout

duration

Request timeout

idle_timeout

duration

Hibernation delay

env

map

Environment variables

headers

map

HTTP headers

cwd

string

Working directory

*One of command or http_url required

Environment Variables

Variable

Description

MCP_GATEWAY_CONFIG

Config file path

MCP_GATEWAY_PORT

Override port

MCP_GATEWAY_HOST

Override host

MCP_GATEWAY_LOG_LEVEL

Log level

MCP_GATEWAY_LOG_FORMAT

text or json

Metrics

With --features metrics:

curl http://localhost:39400/metrics

Exposes Prometheus metrics for:

  • Request count/latency per backend

  • Circuit breaker state changes

  • Rate limiter rejections

  • Active connections

Performance

MCP Gateway is a local Rust proxy -- it adds minimal overhead between your client and backends.

Metric

Value

Notes

Startup time

~8ms

Measured with hyperfine (benchmarks)

Binary size

7.1 MB

Release build with LTO, stripped

Gateway overhead

<2ms per request

Local routing + JSON-RPC parsing (does not include backend latency)

Memory

Low

Async I/O via tokio; no per-request allocations for routing

The gateway overhead is the time spent inside the proxy itself (request parsing, backend lookup, failsafe checks, response forwarding). Actual end-to-end latency depends on the backend -- a stdio subprocess adds ~10-50ms for process I/O, while an HTTP backend adds only network round-trip time.

For detailed benchmarks, see docs/BENCHMARKS.md.

Troubleshooting

Backend won't connect

stdio backend fails to start:

# Test the command directly to verify it works
npx -y @anthropic/mcp-server-tavily

# Check the gateway logs for the actual error
mcp-gateway --config servers.yaml --log-level debug

Common causes: missing npx/node, missing API key environment variable, or incorrect command path.

HTTP/SSE backend unreachable:

  • Verify the backend server is running and listening on the configured URL.

  • Check that http_url includes the full path (e.g., http://localhost:8080/mcp, not just http://localhost:8080).

  • If the backend requires auth, set headers: in the backend config.

Circuit breaker is open

When a backend fails 5 times consecutively, the circuit breaker opens and rejects requests for 30 seconds. Check the health endpoint to see circuit state:

curl http://localhost:39400/health | jq '.backends'

To adjust thresholds:

failsafe:
  circuit_breaker:
    failure_threshold: 10   # more tolerance before opening
    reset_timeout: "15s"    # shorter recovery window

Debugging requests

Enable debug logging to see every request routed through the gateway:

mcp-gateway --config servers.yaml --log-level debug

This shows: backend selection, tool invocations, circuit breaker state changes, retry attempts, and rate limiter decisions.

  • Verify the backend is running: check gateway_list_servers output.

  • Tool lists are cached (default 5 minutes). Restart the gateway or wait for cache expiry after adding new backends.

  • Confirm the backend responds to tools/list -- some servers require initialization first.

Building

git clone https://github.com/MikkoParkkola/mcp-gateway
cd mcp-gateway
cargo build --release

Contributing

Contributions are welcome. The short version:

  1. Fork and branch -- git checkout -b feature/your-feature

  2. Test -- cargo test (all tests must pass)

  3. Lint -- cargo fmt && cargo clippy -- -D warnings

  4. PR -- open a pull request against main with a clear description

  5. Changelog -- add an entry to CHANGELOG.md for user-facing changes

Look for issues labeled good first issue or help wanted to get started. For larger changes, open an issue first to discuss the approach.

Full details: CONTRIBUTING.md

License

MIT License - see LICENSE for details.

Credits

Created by Mikko Parkkola

Implements Model Context Protocol version 2025-11-25.

Changelog | Releases

-
security - not tested
A
license - permissive license
-
quality - not tested

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/MikkoParkkola/mcp-gateway'

If you have feedback or need assistance with the MCP directory API, please join our Discord server