Provides optional Prometheus metrics support for monitoring the gateway's performance, health status, and backend operations.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type
@followed by the MCP server name and your instructions, e.g., "@MCP Gatewaylist all the connected servers and their tools"
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
MCP Gateway
Give your AI access to every tool it needs -- without burning your context window or building MCP servers.

MCP Gateway sits between your AI client and your tools. Instead of loading hundreds of tool definitions into every request, the AI gets 4 meta-tools and discovers the right one on demand -- like searching an app store instead of installing every app.
Why
The context window is the bottleneck. Every MCP tool you connect costs ~150 tokens of context overhead. Connect 20 servers with 100+ tools and you've burned 15,000 tokens before the conversation starts -- on tool definitions the AI probably won't use this turn.
Worse: context limits force you to choose which tools to connect. You leave tools out because they don't fit -- and your AI makes worse decisions because it can't reach the right data.
MCP Gateway removes that tradeoff entirely.
Without Gateway | With Gateway | |
Tools in context | Every definition, every request | 4 meta-tools (~400 tokens) |
Token overhead | ~15,000 tokens (100 tools) | ~400 tokens -- 97% savings |
Cost at scale | ~$0.22/request (Opus input) | ~$0.006/request -- $219 saved per 1K |
Practical tool limit | 20-50 tools (context pressure) | Unlimited -- discovered on demand |
Connect a new REST API | Build an MCP server (days) | Drop a YAML file or import an OpenAPI spec (minutes) |
Changing MCP config | Restart AI session, lose context | Restart gateway (~8ms), session stays alive |
When one tool breaks | Cascading failures | Circuit breakers isolate it |
Why not...
Alternative | What it does | Why MCP Gateway is different |
Direct MCP connections | Each server connected individually | Every tool definition loaded every request. 100 tools = 15K tokens burned. Gateway: 4 tools, always. |
Claude's ToolSearch | Built-in deferred tool loading | Only works with tools already configured. Gateway adds unlimited backends + REST APIs without MCP servers. |
Archestra | Cloud-hosted MCP registry | Requires cloud account, sends data to third party. Gateway is local-only, zero external dependencies. |
Kong / Portkey | General API gateways | Not MCP-aware. No meta-tool discovery, no tool search, no capability YAML system. |
Building fewer MCP servers | Reduce tool count manually | You lose capabilities. Gateway lets you keep everything and pay the token cost of 4. |
Key Benefits
1. Unlimited Tools, Minimal Tokens
The gateway exposes 4 meta-tools. Your AI searches for what it needs (gateway_search_tools), then invokes it (gateway_invoke). Tool definitions load on demand, not upfront. Connect 500 tools and pay the token cost of 4.
This isn't just a cost optimization -- it's a capability unlock. Without the gateway, you pick which 20-30 tools fit in context. With it, the AI has access to everything and dynamically selects the best tool for each task.
Token math (Claude Opus @ $15/M input tokens):
Without: 100 tools × 150 tokens × 1,000 requests = 15M tokens = $225
With: 4 meta-tools × 100 tokens × 1,000 requests = 0.4M tokens = $6
2. Any REST API → MCP Tool -- No Code
Most APIs will never get a dedicated MCP server. The gateway turns any REST API into a tool your AI can use:
From an OpenAPI spec (automatic):
mcp-gateway cap import stripe-openapi.yaml --output capabilities/ --prefix stripe
# Generates one capability YAML per endpoint, ready to useFrom a YAML definition (~30 seconds):
name: stock_quote
description: Get real-time stock price
providers:
primary:
service: rest
config:
base_url: https://finnhub.io
path: /api/v1/quote
method: GET
params:
symbol: "{symbol}"
token: "{env.FINNHUB_API_KEY}"Drop the file in a capability directory -- hot-reloaded in ~500ms, no restart needed.
The gateway ships with 42 starter capabilities -- 25 work instantly with zero configuration:
Category | Zero-Config Tools | Source |
Knowledge | Weather, Wikipedia, country info, public holidays, timezones, number facts | Open-Meteo, Wikipedia, RestCountries, Nager.Date |
Developer | npm packages, PyPI packages, GitHub search, QR codes, UUIDs | npm, PyPI, GitHub API, GoQR |
News & Social | Hacker News, Reddit | HN API, Reddit |
Finance | SEC EDGAR filings (10-K, 10-Q, 8-K, insider trades) | US Gov (free) |
Geo | Geocoding, reverse geocoding | OpenStreetMap Nominatim |
Reference | Book search, air quality | Open Library, OpenAQ |
Plus 17 more with free-tier API keys (Brave Search, stock quotes, movies, IP geolocation, recipes, package tracking).
3. Change Your MCP Stack Without Losing Your AI Session
With Claude Code and similar tools, modifying your MCP configuration means restarting the entire AI session -- losing your conversation history, working context, and debugging flow mid-thought.
The gateway is a stable endpoint. Your AI connects once to localhost:39400. Behind it:
REST API capabilities (YAML files) are hot-reloaded automatically -- add, modify, or remove a file and it's live in ~500ms. Zero downtime.
MCP server backends (stdio/HTTP/SSE) require a gateway restart -- but the gateway starts in ~8ms. Your AI session stays connected.
Experiment with new tools, troubleshoot broken ones, swap configurations -- without ever losing context.
4. Production Resilience
One flaky MCP server shouldn't take down your entire toolchain.
Failsafe | What It Does |
Circuit Breaker | Isolates failing backends after 5 errors, auto-recovers after 30s |
Retry with Backoff | 3 attempts with exponential backoff for transient failures |
Rate Limiting | Per-backend throttling prevents quota exhaustion |
Health Checks | Continuous monitoring with automatic recovery detection |
Graceful Shutdown | Clean connection teardown, no orphaned processes |
Concurrency Limits | Prevent backend overload under burst traffic |
Architecture
┌─────────────────────────────────────────────────────────────────┐
│ MCP Gateway (:39400) │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Meta-MCP Mode: 4 Tools → Access 100+ Tools Dynamically │ │
│ │ • gateway_list_servers • gateway_search_tools │ │
│ │ • gateway_list_tools • gateway_invoke │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Failsafes: Circuit Breaker │ Retry │ Rate Limit │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌────────────────────┼────────────────────┐ │
│ ▼ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Tavily │ │ Context7 │ │ Pieces │ │
│ │ (stdio) │ │ (http) │ │ (sse) │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────────────────────────────────────────────┘How It Works
The gateway exposes 4 meta-tools that replace hundreds of individual tool definitions:
Meta-Tool | Purpose |
| List available backends |
| List tools from a specific backend |
| Search tools by keyword across all backends |
| Invoke any tool on any backend |
Example: You have 12 MCP servers exposing 180+ tools. Without a gateway, every request carries all 180 definitions (~27,000 tokens). With the gateway, the AI discovers and invokes tools on demand:
Step 1: Search for relevant tools
{
"method": "tools/call",
"params": {
"name": "gateway_search_tools",
"arguments": { "query": "search" }
}
}Response:
{
"query": "search",
"matches": [
{ "server": "tavily", "tool": "tavily_search", "description": "Web search via Tavily API" },
{ "server": "brave", "tool": "brave_web_search", "description": "Search the web with Brave" },
{ "server": "github", "tool": "search_code", "description": "Search code across repositories" }
],
"total": 3
}Step 2: Invoke the tool you need
{
"method": "tools/call",
"params": {
"name": "gateway_invoke",
"arguments": {
"server": "tavily",
"tool": "tavily_search",
"arguments": { "query": "MCP protocol specification" }
}
}
}The gateway routes the call to the Tavily backend, applies circuit breaker/retry logic, and returns the result. The AI never loaded all 180 tool schemas -- it discovered and used exactly the one it needed.
Features
Authentication
Protect your gateway with bearer tokens and/or API keys:
auth:
enabled: true
# Simple bearer token (good for single-user/dev)
bearer_token: "auto" # auto-generates, or use env:VAR_NAME, or literal
# API keys for multi-client access
api_keys:
- key: "env:CLIENT_A_KEY"
name: "Client A"
rate_limit: 100 # requests per minute (0 = unlimited)
backends: ["tavily"] # restrict to specific backends (empty = all)
- key: "my-literal-key"
name: "Client B"
backends: [] # all backends
# Paths that bypass auth (always includes /health)
public_paths: ["/health", "/metrics"]Token Options:
"auto"- Auto-generate random token (logged at startup)"env:VAR_NAME"- Read from environment variable"literal-value"- Use literal string
Usage:
curl -H "Authorization: Bearer YOUR_TOKEN" http://localhost:39400/mcpPer-Client Tool Scopes
Control which tools each API key can access using allowed_tools (allowlist) and denied_tools (blocklist):
auth:
enabled: true
api_keys:
# Frontend app: Only search and read operations
- key: "env:FRONTEND_KEY"
name: "Frontend App"
allowed_tools:
- "search_*" # All search tools
- "read_*" # All read tools
- "tavily:*" # All tools on tavily server
# When allowed_tools is set, ONLY these tools are accessible
# Bot: No filesystem or execution tools
- key: "env:BOT_KEY"
name: "Bot"
denied_tools:
- "filesystem_*" # Block all filesystem tools
- "exec_*" # Block all execution tools
# When denied_tools is set, these are blocked ON TOP of global policy
# Data reader: Complex restrictions
- key: "env:READER_KEY"
name: "Data Reader"
allowed_tools:
- "database_*"
- "filesystem_*"
denied_tools:
- "database_write" # But no writes
- "database_delete" # Or deletesPattern Matching:
"exact_name"- Exact match only"prefix_*"- Glob pattern (matches anything starting with prefix)"server:tool"- Qualified name (server-specific)"server:*"- All tools on specific server
Precedence Rules:
Global tool policy denies (always enforced)
Per-client
allowed_tools(if set, ONLY these tools allowed)Per-client
denied_tools(if set, these tools blocked)Global policy default action
Security Best Practices:
Use allowlists for untrusted clients (frontend apps, bots)
Use denylists for additional restrictions on trusted clients
Always keep global policy enabled (
security.tool_policy.enabled: true)Use qualified names (
server:tool) for fine-grained control
See examples/per-client-tool-scopes.yaml for complete examples.
Protocol Support
MCP Version: 2025-11-25 (latest)
Transports: stdio, Streamable HTTP, SSE
JSON-RPC 2.0: Full compliance
Supported Backends
MCP Gateway supports all three MCP transport types. Any MCP-compliant server works -- here are common examples:
Transport | Backend | Config Example |
stdio |
|
|
stdio |
|
|
stdio |
|
|
stdio |
|
|
stdio |
|
|
HTTP | Any Streamable HTTP server |
|
SSE | Pieces, LangChain, etc. |
|
stdio backends are spawned as child processes. HTTP and SSE backends connect to already-running servers. Set env: for API keys, headers: for auth tokens, and cwd: for working directories -- see the Configuration Reference below.
Quick Start
Installation
Homebrew (macOS/Linux):
brew tap MikkoParkkola/tap
brew install mcp-gatewayCargo:
cargo install mcp-gatewayDocker:
docker run -v /path/to/servers.yaml:/config.yaml \
ghcr.io/mikkoparkkola/mcp-gateway:latest \
--config /config.yamlBinary (from GitHub Releases):
# macOS ARM64
curl -L https://github.com/MikkoParkkola/mcp-gateway/releases/latest/download/mcp-gateway-darwin-arm64 -o mcp-gateway
chmod +x mcp-gatewayUsage
# Start with configuration file
mcp-gateway --config servers.yaml
# Override port
mcp-gateway --config servers.yaml --port 8080
# Debug logging
mcp-gateway --config servers.yaml --log-level debugConfiguration
Create servers.yaml:
server:
port: 39400
meta_mcp:
enabled: true
failsafe:
circuit_breaker:
enabled: true
failure_threshold: 5
retry:
enabled: true
max_attempts: 3
backends:
tavily:
command: "npx -y @anthropic/mcp-server-tavily"
description: "Web search"
env:
TAVILY_API_KEY: "${TAVILY_API_KEY}"
context7:
http_url: "http://localhost:8080/mcp"
description: "Documentation lookup"Client Configuration
Point your MCP client to the gateway:
{
"mcpServers": {
"gateway": {
"type": "http",
"url": "http://localhost:39400/mcp"
}
}
}API Reference
Endpoints
Endpoint | Method | Description |
| GET | Health check with backend status |
| POST | Meta-MCP mode (dynamic discovery) |
| POST | Direct backend access |
Health Check Response
{
"status": "healthy",
"version": "2.0.0",
"backends": {
"tavily": {
"name": "tavily",
"running": true,
"transport": "stdio",
"tools_cached": 3,
"circuit_state": "Closed",
"request_count": 42
}
}
}Configuration Reference
Server
Field | Type | Default | Description |
| string |
| Bind address |
| u16 |
| Listen port |
| duration |
| Request timeout |
| duration |
| Graceful shutdown timeout |
Meta-MCP
Field | Type | Default | Description |
| bool |
| Enable Meta-MCP mode |
| bool |
| Cache tool lists |
| duration |
| Cache TTL |
Failsafe
Field | Type | Default | Description |
| bool |
| Enable circuit breaker |
| u32 |
| Failures before open |
| u32 |
| Successes to close |
| duration |
| Half-open delay |
| bool |
| Enable retries |
| u32 |
| Max retry attempts |
| duration |
| Initial backoff |
| duration |
| Max backoff |
| bool |
| Enable rate limiting |
| u32 |
| RPS per backend |
Backend
Field | Type | Required | Description |
| string | * | Stdio command |
| string | * | HTTP/SSE URL |
| string | Human description | |
| bool | Default: true | |
| duration | Request timeout | |
| duration | Hibernation delay | |
| map | Environment variables | |
| map | HTTP headers | |
| string | Working directory |
*One of command or http_url required
Environment Variables
Variable | Description |
| Config file path |
| Override port |
| Override host |
| Log level |
|
|
Metrics
With --features metrics:
curl http://localhost:39400/metricsExposes Prometheus metrics for:
Request count/latency per backend
Circuit breaker state changes
Rate limiter rejections
Active connections
Performance
MCP Gateway is a local Rust proxy -- it adds minimal overhead between your client and backends.
Metric | Value | Notes |
Startup time | ~8ms | Measured with |
Binary size | 7.1 MB | Release build with LTO, stripped |
Gateway overhead | <2ms per request | Local routing + JSON-RPC parsing (does not include backend latency) |
Memory | Low | Async I/O via tokio; no per-request allocations for routing |
The gateway overhead is the time spent inside the proxy itself (request parsing, backend lookup, failsafe checks, response forwarding). Actual end-to-end latency depends on the backend -- a stdio subprocess adds ~10-50ms for process I/O, while an HTTP backend adds only network round-trip time.
For detailed benchmarks, see docs/BENCHMARKS.md.
Troubleshooting
Backend won't connect
stdio backend fails to start:
# Test the command directly to verify it works
npx -y @anthropic/mcp-server-tavily
# Check the gateway logs for the actual error
mcp-gateway --config servers.yaml --log-level debugCommon causes: missing npx/node, missing API key environment variable, or incorrect command path.
HTTP/SSE backend unreachable:
Verify the backend server is running and listening on the configured URL.
Check that
http_urlincludes the full path (e.g.,http://localhost:8080/mcp, not justhttp://localhost:8080).If the backend requires auth, set
headers:in the backend config.
Circuit breaker is open
When a backend fails 5 times consecutively, the circuit breaker opens and rejects requests for 30 seconds. Check the health endpoint to see circuit state:
curl http://localhost:39400/health | jq '.backends'To adjust thresholds:
failsafe:
circuit_breaker:
failure_threshold: 10 # more tolerance before opening
reset_timeout: "15s" # shorter recovery windowDebugging requests
Enable debug logging to see every request routed through the gateway:
mcp-gateway --config servers.yaml --log-level debugThis shows: backend selection, tool invocations, circuit breaker state changes, retry attempts, and rate limiter decisions.
Tools not appearing in search
Verify the backend is running: check
gateway_list_serversoutput.Tool lists are cached (default 5 minutes). Restart the gateway or wait for cache expiry after adding new backends.
Confirm the backend responds to
tools/list-- some servers require initialization first.
Building
git clone https://github.com/MikkoParkkola/mcp-gateway
cd mcp-gateway
cargo build --releaseContributing
Contributions are welcome. The short version:
Fork and branch --
git checkout -b feature/your-featureTest --
cargo test(all tests must pass)Lint --
cargo fmt && cargo clippy -- -D warningsPR -- open a pull request against
mainwith a clear descriptionChangelog -- add an entry to CHANGELOG.md for user-facing changes
Look for issues labeled good first issue or help wanted to get started. For larger changes, open an issue first to discuss the approach.
Full details: CONTRIBUTING.md
License
MIT License - see LICENSE for details.
Credits
Created by Mikko Parkkola
Implements Model Context Protocol version 2025-11-25.