# MCP Router
Intelligent Model Context Protocol Router for Cursor IDE
Automatically selects the optimal LLM model for each task based on query analysis, complexity, and your preferred strategy.
## System Architecture

```
┌──────────────────────────────────────────────────────────────────────────┐
│                                CURSOR IDE                                │
│                                                                          │
│  User Query: "Refactor this authentication system across multiple files" │
│                                     │                                    │
│                                     ▼                                    │
│  MCP Router Server                                                       │
│                                                                          │
│    Query Analyzer     ──▶ Model Scorer        ──▶ Routing Decision       │
│    • Task Type            • Quality Score         • Selected Model       │
│    • Complexity           • Cost Score            • Confidence           │
│    • Requirements         • Speed Score           • Reasoning            │
│    • Token Estimate       • Strategy Weight       • Alternatives         │
│                                     │                                    │
│                                     ▼                                    │
│  Model Registry (17 Models)                                              │
│                                                                          │
│    FLAGSHIP         REASONING        NATIVE/FAST      BUDGET/LEGACY      │
│    • GPT-5.2        • o3             • Composer 1     • GPT-4o-mini      │
│    • Claude 4.5     • o3-mini        • Gemini 3       • Claude Haiku     │
│      Opus           • Claude 3.7       Pro/Flash      • DeepSeek V3      │
│    • Claude 4.5       Sonnet                          • DeepSeek R1      │
│      Sonnet                                                              │
│                                     │                                    │
│                                     ▼                                    │
│  Cursor Executes Query                                                   │
│  (Using its own API keys for selected model)                             │
└──────────────────────────────────────────────────────────────────────────┘
```

### Data Flow
```
┌───────────┐    ┌───────────┐    ┌───────────┐    ┌───────────┐
│   Query   │───▶│  Analyze  │───▶│   Score   │───▶│ Recommend │
└───────────┘    └───────────┘    └───────────┘    └───────────┘
                       │                │                │
                       ▼                ▼                ▼
                ┌─────────────┐  ┌─────────────┐  ┌─────────────┐
                │ Task Type:  │  │ Apply       │  │ Model:      │
                │ • reasoning │  │ Strategy:   │  │ Claude 4.5  │
                │ • code_gen  │  │ • balanced  │  │ Sonnet      │
                │ • edit      │  │ • quality   │  │             │
                │ Complexity: │  │ • speed     │  │ Confidence: │
                │ • medium    │  │ • cost      │  │ 88.45%      │
                └─────────────┘  └─────────────┘  └─────────────┘
```

## Features
| Feature | Description |
|---------|-------------|
| Intelligent Routing | Automatically selects the best model based on query analysis |
| Context-Aware Routing | Uses chat history and conversation context for smarter model selection |
| 4 Routing Strategies | `balanced`, `quality`, `speed`, `cost` |
| Query Analysis | Detects task type, complexity, and special requirements |
| Chat History Analysis | Analyzes conversation patterns, topics, files, languages, and complexity |
| Cost Estimation | Estimates costs before execution |
| 17 Models | Latest 2025 models from OpenAI, Anthropic, Google, Cursor, DeepSeek |
| Cursor Native | Zero API keys needed; Cursor handles execution |
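The Query Analysis feature classifies each request before any model is scored. A minimal sketch of how such a classifier could work — the keyword lists, thresholds, and the 4-characters-per-token estimate below are illustrative assumptions, not the project's actual heuristics:

```python
# Illustrative query analyzer; keyword lists and thresholds are assumptions,
# not the real logic in src/router.py.
TASK_KEYWORDS = {
    "code_edit": ("refactor", "rename", "fix", "edit"),
    "code_generation": ("write", "implement", "create"),
    "reasoning": ("explain", "why", "analyze"),
}

def analyze_query(query: str) -> dict:
    q = query.lower()
    # First matching keyword group wins; fall back to "general"
    task_type = next(
        (task for task, words in TASK_KEYWORDS.items()
         if any(w in q for w in words)),
        "general",
    )
    est_tokens = max(1, len(query) // 4)  # rough 4-chars-per-token estimate
    complexity = "high" if est_tokens > 200 else "medium" if est_tokens > 50 else "low"
    return {"task_type": task_type, "complexity": complexity, "est_tokens": est_tokens}

print(analyze_query("Refactor this authentication system across multiple files"))
```

The signals this returns are exactly what the Model Scorer consumes in the diagram above.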
## Supported Models (2025)

### Tier 1: Flagship Models (Complex Architecture & Refactoring)

| Model | Provider | Context | Cost (in/out) | Quality |
|-------|----------|---------|---------------|---------|
| GPT-5.2 | OpenAI | 256K | $5.00/$15.00 | 0.99/0.98 |
| Claude 4.5 Opus | Anthropic | 200K | $25.00/$75.00 | 0.99/0.99 |
| Claude 4.5 Sonnet | Anthropic | 200K | $5.00/$25.00 | 0.97/0.98 |

### Tier 2: Reasoning Models (Chain of Thought)

| Model | Provider | Context | Cost (in/out) | Quality |
|-------|----------|---------|---------------|---------|
| o3 | OpenAI | 200K | $10.00/$40.00 | 0.99/0.95 |
| o3-mini (High) | OpenAI | 128K | $1.50/$6.00 | 0.95/0.92 |
| Claude 3.7 Sonnet | Anthropic | 200K | $4.00/$20.00 | 0.96/0.96 |

### Tier 3: Native & Fast Models

| Model | Provider | Context | Cost (in/out) | Quality |
|-------|----------|---------|---------------|---------|
| Composer 1 | Cursor | 128K | $0.10/$0.30 | 0.88/0.92 |
| Gemini 3 Pro | Google | 2M | $2.00/$8.00 | 0.96/0.94 |
| Gemini 3 Flash | Google | 1M | $0.10/$0.40 | 0.88/0.90 |

### Tier 4: Budget/Legacy Models

| Model | Provider | Context | Quality |
|-------|----------|---------|---------|
| GPT-4o / GPT-4o-mini | OpenAI | 128K | 0.95/0.85 |
| Claude 3.5 Sonnet/Haiku | Anthropic | 200K | 0.96/0.88 |
| Gemini 2.0 Pro/Flash | Google | 2M/1M | 0.94/0.85 |
| DeepSeek V3 | DeepSeek | 128K | 0.92/0.94 |
| DeepSeek R1 | DeepSeek | 128K | 0.96/0.92 |
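The Cost Estimation feature follows directly from these pricing tables. A sketch of the pre-execution estimate — the tables omit the pricing unit, so per-million-token pricing (the usual provider convention) is an assumption here, and the `PRICES` map is a hand-copied illustrative subset:

```python
# Hypothetical pricing map copied from the tables above; per-1M-token
# pricing is an assumption, since the tables omit the unit.
PRICES = {
    "gpt-5.2": (5.00, 15.00),
    "claude-4.5-sonnet": (5.00, 25.00),
    "gemini-3-flash": (0.10, 0.40),
}

def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated dollar cost, computed before the query is ever executed."""
    price_in, price_out = PRICES[model_id]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000

# A 12K-token prompt with a 3K-token answer on Claude 4.5 Sonnet:
print(f"${estimate_cost('claude-4.5-sonnet', 12_000, 3_000):.4f}")
```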
## Quick Start

### 1. Install
```bash
git clone https://github.com/AI-Castle-Labs/mcp-router.git
cd mcp-router
pip install -r requirements.txt
pip install mcp  # MCP SDK for Cursor integration
```

### 2. Configure Cursor
Add to `~/.cursor/mcp.json`:
```json
{
  "version": "1.0",
  "mcpServers": {
    "mcp-router": {
      "command": "python3",
      "args": ["/path/to/mcp-router/src/mcp_server.py"],
      "env": {}
    }
  }
}
```

**Note:** No API keys needed! Cursor handles all API calls with its own keys.
### 3. Restart Cursor
The MCP router will appear in your agent tools. Use it with:
```
@mcp-router get_model_recommendation "your task description"
@mcp-router analyze_query "your query"
@mcp-router list_models
```
## CLI Usage
```bash
# Route a query (shows which model would be selected)
python main.py route "Explain how neural networks work"

# Route with strategy
python main.py route "Refactor this codebase" --strategy quality

# List all registered models
python main.py list

# Show routing statistics
python main.py stats
```

### Example Output
```
============================================================
Routing Decision
============================================================
Query: Refactor this complex authentication system...
Selected Model: Claude 4.5 Sonnet
Model ID: claude-4.5-sonnet
Provider: anthropic
Confidence: 88.45%
Reasoning: Model is optimized for code_edit tasks; Selected for highest quality
Alternatives:
  - Composer 1 (composer-1)
  - Claude 3.5 Haiku (claude-3-5-haiku-20241022)
  - GPT-4o-mini (gpt-4o-mini)
```

## Routing Strategies
| Strategy | Description | Best For |
|----------|-------------|----------|
| `balanced` | Optimizes for cost, speed, and quality equally | General use |
| `quality` | Prioritizes highest capability models | Complex tasks, refactoring |
| `speed` | Prioritizes fastest response time | Quick edits, simple tasks |
| `cost` | Prioritizes cheapest models | Budget-conscious usage |
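One way to read these strategies is as weight profiles over the scorer's quality/speed/cost signals. A sketch under that assumption — the actual weights live inside `src/router.py` and are not documented here, so these numbers are illustrative only:

```python
# Illustrative strategy weight profiles; the router's real weights may differ.
STRATEGY_WEIGHTS = {
    "balanced": {"quality": 1 / 3, "speed": 1 / 3, "cost": 1 / 3},
    "quality":  {"quality": 0.8, "speed": 0.1, "cost": 0.1},
    "speed":    {"quality": 0.1, "speed": 0.8, "cost": 0.1},
    "cost":     {"quality": 0.1, "speed": 0.1, "cost": 0.8},
}

def pick(models: list, strategy: str) -> dict:
    w = STRATEGY_WEIGHTS[strategy]
    # "cost" here is a normalized price (higher = pricier), so cheaper
    # models score higher via 1 - cost.
    return max(models, key=lambda m: w["quality"] * m["quality"]
                                     + w["speed"] * m["speed"]
                                     + w["cost"] * (1.0 - m["cost"]))

models = [
    {"name": "Claude 4.5 Sonnet", "quality": 0.97, "speed": 0.70, "cost": 0.60},
    {"name": "GPT-4o-mini",       "quality": 0.85, "speed": 0.90, "cost": 0.10},
]
print(pick(models, "quality")["name"])  # the flagship wins on capability
print(pick(models, "cost")["name"])     # the budget model wins on price
```

Changing only the strategy flips the recommendation between the two models, which is the whole point of the strategy flag.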
## Python API
```python
from src.router import MCPRouter

# Initialize router (loads 17 default models)
router = MCPRouter()

# Route a query
decision = router.route(
    "Analyze this codebase architecture",
    strategy="quality"
)

print(f"Selected: {decision.selected_model.name}")
print(f"Model ID: {decision.selected_model.model_id}")
print(f"Confidence: {decision.confidence:.1%}")
print(f"Reasoning: {decision.reasoning}")

# Get alternatives
for alt in decision.alternatives[:3]:
    print(f"  Alternative: {alt.name}")
```

## Project Structure
```
mcp-router/
├── src/
│   ├── router.py              # Core routing logic + 17 model definitions
│   ├── mcp_server.py          # MCP server for Cursor integration
│   ├── client.py              # API client for model execution
│   └── cursor_wrapper.py      # Cursor-specific utilities
├── config/
│   └── cursor_mcp_config.json # Template for Cursor config
├── scripts/
│   └── setup_cursor.sh        # Automated setup script
├── docs/
│   ├── cursor_integration.md
│   ├── QUICKSTART_CURSOR.md
│   └── AGENT_SETTINGS.md
├── main.py                    # CLI entry point
├── requirements.txt
└── README.md
```

## Adding Custom Models
```python
from src.router import MCPRouter, ModelCapabilities, TaskType

router = MCPRouter()

router.register_model(ModelCapabilities(
    name="My Custom Model",
    provider="custom",
    model_id="custom-model-v1",
    supports_reasoning=True,
    supports_code=True,
    supports_streaming=True,
    max_tokens=8192,
    context_window=32000,
    cost_per_1k_tokens_input=1.0,
    cost_per_1k_tokens_output=2.0,
    avg_latency_ms=600,
    reasoning_quality=0.85,
    code_quality=0.90,
    speed_score=0.80,
    preferred_tasks=[TaskType.CODE_GENERATION],
    api_key_env_var="CUSTOM_API_KEY"
))
```

## Cursor Commands
Create `.cursor/commands/route.md`:
```markdown
---
description: "Get model recommendation from MCP router for the current task"
---

Use the MCP router to determine the best model for the task at hand.

1. Analyze the current context
2. Call `@mcp-router get_model_recommendation` with task description
3. Present the recommendation with confidence and alternatives
4. Suggest switching models if needed
```

## MCP Tools Available
| Tool | Description |
|------|-------------|
| | Route a query and get model recommendation (supports `chat_history`) |
| `get_model_recommendation` | Get recommendation without execution (supports `chat_history`) |
| | Analyze chat history text to extract routing signals |
| `list_models` | List all 17 registered models |
| | Get usage statistics |
| `analyze_query` | Analyze query characteristics |
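Under the hood, Cursor invokes these with standard MCP `tools/call` requests over JSON-RPC. A sketch of such a request for the `analyze_query` tool — the argument name `query` is an assumption based on the CLI examples above:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "analyze_query",
    "arguments": { "query": "Refactor this authentication system" }
  }
}
```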
### Context-Aware Routing with Chat History
The router can now analyze chat history to make smarter routing decisions:
```json
{
  "query": "Fix the authentication bug we discussed",
  "strategy": "quality",
  "chat_history": [
    {
      "role": "user",
      "content": "I'm working on auth.py and users can't log in",
      "timestamp": 1704067200
    },
    {
      "role": "assistant",
      "content": "Let me check the authentication flow...",
      "timestamp": 1704067205
    }
  ]
}
```

The router analyzes chat history to detect:
- **Context depth:** Shallow/medium/deep based on token count
- **Dominant task type:** Code generation, editing, debugging, etc.
- **Programming languages:** Detects Python, JavaScript, Rust, etc.
- **Files mentioned:** Tracks files being worked on
- **Error patterns:** Identifies debugging sessions
- **Topics:** Authentication, database, API, testing, etc.
- **Complexity:** Based on files, languages, and conversation depth
These signals influence model selection:
- Deep context → models with larger context windows
- Debugging sessions → high-reasoning models
- Multi-file tasks → code-focused models
- Multiple languages → polyglot-capable models
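A sketch of how such signal extraction could look — the regex, thresholds, and keyword list here are illustrative assumptions, not the analyzer's real implementation:

```python
import re

# Illustrative chat-history analyzer; patterns and thresholds are assumptions.
def analyze_history(messages: list) -> dict:
    text = " ".join(m["content"] for m in messages)
    est_tokens = len(text) // 4  # rough 4-chars-per-token estimate
    depth = "deep" if est_tokens > 2000 else "medium" if est_tokens > 500 else "shallow"
    # Files being worked on (a few common extensions for illustration)
    files = sorted(set(re.findall(r"[\w./-]+\.(?:py|js|ts|rs|go)\b", text)))
    # Error language suggests a debugging session
    debugging = any(w in text.lower() for w in ("error", "bug", "can't", "traceback"))
    return {"depth": depth, "files": files, "debugging": debugging}

history = [
    {"role": "user", "content": "I'm working on auth.py and users can't log in"},
    {"role": "assistant", "content": "Let me check the authentication flow..."},
]
print(analyze_history(history))
```

On the example conversation above, this flags `auth.py` as the working file and the "can't log in" wording as a debugging signal — exactly the kind of evidence that would steer the router toward a high-reasoning model.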
## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request
## License
MIT License - see LICENSE for details.