Routes queries to Google's Gemini models (Gemini 3 Pro, Gemini 3 Flash, Gemini 2.0 Pro, Gemini 2.0 Flash) with support for large context windows up to 2M tokens.
Routes queries to OpenAI's models (GPT-5.2, o3, o3-mini, GPT-4o, GPT-4o-mini) with intelligent selection based on task type, reasoning requirements, and routing strategy.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@MCP Router optimize model selection for refactoring this authentication system".
That's it! The server will respond to your query, and you can continue using it as needed.
MCP Router
Intelligent Model Context Protocol Router for Cursor IDE
Automatically selects the optimal LLM model for each task based on query analysis, complexity, and your preferred strategy.
System Architecture
┌─────────────────────────────────────────────────────────────────────────────┐
│                                 CURSOR IDE                                  │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                              User Query                               │  │
│  │      "Refactor this authentication system across multiple files"      │  │
│  └───────────────────────────────────┬───────────────────────────────────┘  │
│                                      │                                      │
│                                      ▼                                      │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                           MCP Router Server                           │  │
│  │  ┌──────────────────┐   ┌──────────────────┐   ┌──────────────────┐   │  │
│  │  │  Query Analyzer  │──▶│   Model Scorer   │──▶│ Routing Decision │   │  │
│  │  │                  │   │                  │   │                  │   │  │
│  │  │ • Task Type      │   │ • Quality Score  │   │ • Selected Model │   │  │
│  │  │ • Complexity     │   │ • Cost Score     │   │ • Confidence     │   │  │
│  │  │ • Requirements   │   │ • Speed Score    │   │ • Reasoning      │   │  │
│  │  │ • Token Estimate │   │ • Strategy Weight│   │ • Alternatives   │   │  │
│  │  └──────────────────┘   └──────────────────┘   └──────────────────┘   │  │
│  └───────────────────────────────────┬───────────────────────────────────┘  │
│                                      │                                      │
│                                      ▼                                      │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                      Model Registry (17 Models)                       │  │
│  │                                                                       │  │
│  │  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────────┐  │  │
│  │  │  FLAGSHIP   │ │  REASONING  │ │ NATIVE/FAST │ │  BUDGET/LEGACY  │  │  │
│  │  │             │ │             │ │             │ │                 │  │  │
│  │  │ • GPT-5.2   │ │ • o3        │ │ • Composer 1│ │ • GPT-4o-mini   │  │  │
│  │  │ • Claude 4.5│ │ • o3-mini   │ │ • Gemini 3  │ │ • Claude Haiku  │  │  │
│  │  │   Opus      │ │ • Claude 3.7│ │   Pro/Flash │ │ • DeepSeek V3   │  │  │
│  │  │ • Claude 4.5│ │   Sonnet    │ │             │ │ • DeepSeek R1   │  │  │
│  │  │   Sonnet    │ │             │ │             │ │                 │  │  │
│  │  └─────────────┘ └─────────────┘ └─────────────┘ └─────────────────┘  │  │
│  └───────────────────────────────────┬───────────────────────────────────┘  │
│                                      │                                      │
│                                      ▼                                      │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │                         Cursor Executes Query                         │  │
│  │              (Using its own API keys for selected model)              │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘

Data Flow
┌─────────────┐     ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│    Query    │────▶│   Analyze   │────▶│    Score    │────▶│  Recommend  │
└─────────────┘     └──────┬──────┘     └──────┬──────┘     └──────┬──────┘
                           │                   │                   │
                           ▼                   ▼                   ▼
                    ┌─────────────┐     ┌─────────────┐     ┌─────────────┐
                    │ Task Type:  │     │ Apply       │     │ Model:      │
                    │ • reasoning │     │ Strategy:   │     │ Claude 4.5  │
                    │ • code_gen  │     │ • balanced  │     │ Sonnet      │
                    │ • edit      │     │ • quality   │     │             │
                    │ Complexity: │     │ • speed     │     │ Confidence: │
                    │ • medium    │     │ • cost      │     │ 88.45%      │
                    └─────────────┘     └─────────────┘     └─────────────┘

Features
Feature | Description |
Intelligent Routing | Automatically selects the best model based on query analysis |
Context-Aware Routing | Uses chat history and conversation context for smarter model selection |
4 Routing Strategies | balanced, quality, speed, cost |
Query Analysis | Detects task type, complexity, and special requirements |
Chat History Analysis | Analyzes conversation patterns, topics, files, languages, and complexity |
Cost Estimation | Estimates costs before execution |
17 Models | Latest 2025 models from OpenAI, Anthropic, Google, Cursor, DeepSeek |
Cursor Native | Zero API keys needed - Cursor handles execution |
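The query-analysis feature listed above (task type, complexity, token estimate) can be pictured with a small keyword-based sketch. This is not the project's actual analyzer: the `analyze_query` function, the `QueryAnalysis` fields, the keyword lists, and the thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical keyword lists; the real analyzer's rules are not shown in this README.
TASK_KEYWORDS = {
    "code_edit": ("refactor", "fix", "rename", "edit"),
    "code_gen": ("implement", "write", "generate", "create"),
    "reasoning": ("explain", "why", "analyze", "compare"),
}


@dataclass
class QueryAnalysis:
    task_type: str
    complexity: str
    estimated_tokens: int


def analyze_query(query: str) -> QueryAnalysis:
    """Classify a query by keyword match and estimate its size."""
    text = query.lower()
    # The first task type whose keywords appear in the query wins; default is "general".
    task_type = next(
        (task for task, words in TASK_KEYWORDS.items() if any(w in text for w in words)),
        "general",
    )
    # Rough token estimate: about 4 characters per token (a common rule of thumb).
    estimated_tokens = max(1, len(query) // 4)
    # Assumed heuristic: multi-file or system-wide wording raises complexity.
    if any(w in text for w in ("multiple files", "architecture", "system")):
        complexity = "high"
    elif estimated_tokens > 50:
        complexity = "medium"
    else:
        complexity = "low"
    return QueryAnalysis(task_type, complexity, estimated_tokens)


if __name__ == "__main__":
    print(analyze_query("Refactor this authentication system across multiple files"))
```

A real analyzer would likely combine several signals rather than stop at the first keyword match; the point here is only the shape of the output that the scoring step consumes.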
Supported Models (2025)
Tier 1: Flagship Models (Complex Architecture & Refactoring)
Model | Provider | Context | Cost (in/out) | Quality |
GPT-5.2 | OpenAI | 256K | $5.00/$15.00 | 0.99/0.98 |
Claude 4.5 Opus | Anthropic | 200K | $25.00/$75.00 | 0.99/0.99 |
Claude 4.5 Sonnet | Anthropic | 200K | $5.00/$25.00 | 0.97/0.98 |
Tier 2: Reasoning Models (Chain of Thought)
Model | Provider | Context | Cost (in/out) | Quality |
o3 | OpenAI | 200K | $10.00/$40.00 | 0.99/0.95 |
o3-mini (High) | OpenAI | 128K | $1.50/$6.00 | 0.95/0.92 |
Claude 3.7 Sonnet | Anthropic | 200K | $4.00/$20.00 | 0.96/0.96 |
Tier 3: Native & Fast Models
Model | Provider | Context | Cost (in/out) | Quality |
Composer 1 | Cursor | 128K | $0.10/$0.30 | 0.88/0.92 |
Gemini 3 Pro | Google | 2M | $2.00/$8.00 | 0.96/0.94 |
Gemini 3 Flash | Google | 1M | $0.10/$0.40 | 0.88/0.90 |
Tier 4: Budget/Legacy Models
Model | Provider | Context | Quality |
GPT-4o / GPT-4o-mini | OpenAI | 128K | 0.95/0.85 |
Claude 3.5 Sonnet/Haiku | Anthropic | 200K | 0.96/0.88 |
Gemini 2.0 Pro/Flash | Google | 2M/1M | 0.94/0.85 |
DeepSeek V3 | DeepSeek | 128K | 0.92/0.94 |
DeepSeek R1 | DeepSeek | 128K | 0.96/0.92 |
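To make the Cost (in/out) columns concrete, here is a rough per-request estimate. It assumes the prices are US dollars per one million tokens, which the tables do not state; the `estimate_cost` helper and its `tokens_per_unit` parameter are hypothetical, so adjust the unit if the project quotes prices differently.

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  price_in: float, price_out: float,
                  tokens_per_unit: int = 1_000_000) -> float:
    """Estimate the cost of one request in dollars.

    ASSUMPTION: prices are quoted per `tokens_per_unit` tokens (1M by default).
    """
    return (input_tokens * price_in + output_tokens * price_out) / tokens_per_unit


if __name__ == "__main__":
    # Prices from the Claude 4.5 Sonnet row above: $5.00 in / $25.00 out.
    cost = estimate_cost(input_tokens=20_000, output_tokens=2_000,
                         price_in=5.00, price_out=25.00)
    print(f"Estimated cost: ${cost:.2f}")
```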
Quick Start
1. Install
git clone https://github.com/AI-Castle-Labs/mcp-router.git
cd mcp-router
pip install -r requirements.txt
pip install mcp  # MCP SDK for Cursor integration

2. Configure Cursor
Add to ~/.cursor/mcp.json:
{
  "version": "1.0",
  "mcpServers": {
    "mcp-router": {
      "command": "python3",
      "args": ["/path/to/mcp-router/src/mcp_server.py"],
      "env": {}
    }
  }
}

Note: No API keys needed! Cursor handles all API calls with its own keys.
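If ~/.cursor/mcp.json already lists other servers, the entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that; the `add_router_to_config` helper is hypothetical (the repository's own scripts/setup_cursor.sh may work differently), and it mirrors the "version" key only because the README's example includes it.

```python
import json
import tempfile
from pathlib import Path


def add_router_to_config(config_path: Path, server_script: str) -> dict:
    """Merge an 'mcp-router' entry into an MCP config file, keeping other servers."""
    text = config_path.read_text(encoding="utf-8").strip() if config_path.exists() else ""
    # Start from the existing config, or from the README's skeleton for a new file.
    config = json.loads(text) if text else {"version": "1.0"}
    servers = config.setdefault("mcpServers", {})
    servers["mcp-router"] = {"command": "python3", "args": [server_script], "env": {}}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    # Demo against a temporary file; for real use, pass Path.home() / ".cursor" / "mcp.json".
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "mcp.json"
        add_router_to_config(path, "/path/to/mcp-router/src/mcp_server.py")
        print(path.read_text(encoding="utf-8"))
```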
3. Restart Cursor
The MCP router will appear in your agent tools. Use it with:
@mcp-router get_model_recommendation "your task description"
@mcp-router analyze_query "your query"
@mcp-router list_models
CLI Usage
# Route a query (shows which model would be selected)
python main.py route "Explain how neural networks work"
# Route with strategy
python main.py route "Refactor this codebase" --strategy quality
# List all registered models
python main.py list
# Show routing statistics
python main.py stats

Example Output
============================================================
Routing Decision
============================================================
Query: Refactor this complex authentication system...
Selected Model: Claude 4.5 Sonnet
Model ID: claude-4.5-sonnet
Provider: anthropic
Confidence: 88.45%
Reasoning: Model is optimized for code_edit tasks; Selected for highest quality
Alternatives:
- Composer 1 (composer-1)
- Claude 3.5 Haiku (claude-3-5-haiku-20241022)
- GPT-4o-mini (gpt-4o-mini)

Routing Strategies
Strategy | Description | Best For |
balanced | Optimizes for cost, speed, and quality equally | General use |
quality | Prioritizes highest capability models | Complex tasks, refactoring |
speed | Prioritizes fastest response time | Quick edits, simple tasks |
cost | Prioritizes cheapest models | Budget-conscious usage |
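One way to picture how a strategy changes the outcome is as a set of weights over quality, speed, and cost scores. The weights, the placeholder model profiles, and the `score_model` and `rank_models` helpers below are illustrative assumptions, not values or functions taken from the project. All scores are assumed to lie in 0..1 with higher being better, so a higher cost score means a cheaper model.

```python
# Hypothetical weights per strategy: (quality, speed, cost). Not the project's real values.
STRATEGY_WEIGHTS = {
    "balanced": (1 / 3, 1 / 3, 1 / 3),
    "quality": (0.8, 0.1, 0.1),
    "speed": (0.1, 0.8, 0.1),
    "cost": (0.1, 0.1, 0.8),
}

# Placeholder model profiles with made-up scores in [0, 1]; higher is better.
MODELS = {
    "flagship": {"quality": 0.99, "speed": 0.50, "cost": 0.20},
    "fast": {"quality": 0.80, "speed": 0.95, "cost": 0.85},
    "budget": {"quality": 0.75, "speed": 0.80, "cost": 0.98},
}


def score_model(scores: dict, strategy: str) -> float:
    """Weighted sum of a model's quality, speed, and cost scores."""
    w_quality, w_speed, w_cost = STRATEGY_WEIGHTS[strategy]
    return (w_quality * scores["quality"]
            + w_speed * scores["speed"]
            + w_cost * scores["cost"])


def rank_models(strategy: str) -> list:
    """Return model names ordered from best to worst for the given strategy."""
    return sorted(MODELS, key=lambda name: score_model(MODELS[name], strategy), reverse=True)


if __name__ == "__main__":
    for strategy in STRATEGY_WEIGHTS:
        print(strategy, rank_models(strategy))
```

With these made-up numbers, each single-objective strategy picks the profile that is strongest on its own axis, while balanced favors the profile with no weak dimension.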
Python API
from src.router import MCPRouter

# Initialize router (loads 17 default models)
router = MCPRouter()

# Route a query
decision = router.route(
    "Analyze this codebase architecture",
    strategy="quality"
)

print(f"Selected: {decision.selected_model.name}")
print(f"Model ID: {decision.selected_model.model_id}")
print(f"Confidence: {decision.confidence:.1%}")
print(f"Reasoning: {decision.reasoning}")

# Get alternatives
for alt in decision.alternatives[:3]:
    print(f"  Alternative: {alt.name}")

Project Structure
mcp-router/
├── src/
│   ├── router.py              # Core routing logic + 17 model definitions
│   ├── mcp_server.py          # MCP server for Cursor integration
│   ├── client.py              # API client for model execution
│   └── cursor_wrapper.py      # Cursor-specific utilities
├── config/
│   └── cursor_mcp_config.json # Template for Cursor config
├── scripts/
│   └── setup_cursor.sh        # Automated setup script
├── docs/
│   ├── cursor_integration.md
│   ├── QUICKSTART_CURSOR.md
│   └── AGENT_SETTINGS.md
├── main.py                    # CLI entry point
├── requirements.txt
└── README.md

Adding Custom Models
from src.router import MCPRouter, ModelCapabilities, TaskType

router = MCPRouter()

router.register_model(ModelCapabilities(
    name="My Custom Model",
    provider="custom",
    model_id="custom-model-v1",
    supports_reasoning=True,
    supports_code=True,
    supports_streaming=True,
    max_tokens=8192,
    context_window=32000,
    cost_per_1k_tokens_input=1.0,
    cost_per_1k_tokens_output=2.0,
    avg_latency_ms=600,
    reasoning_quality=0.85,
    code_quality=0.90,
    speed_score=0.80,
    preferred_tasks=[TaskType.CODE_GENERATION],
    api_key_env_var="CUSTOM_API_KEY"
))

Cursor Commands
Create .cursor/commands/route.md:
---
description: "Get model recommendation from MCP router for the current task"
---
Use the MCP router to determine the best model for the task at hand.
1. Analyze the current context
2. Call `@mcp-router get_model_recommendation` with task description
3. Present the recommendation with confidence and alternatives
4. Suggest switching models if needed

MCP Tools Available
Tool | Description |
| Route a query and get model recommendation (supports chat_history) |
get_model_recommendation | Get recommendation without execution (supports chat_history) |
| Analyze chat history text to extract routing signals |
list_models | List all 17 registered models |
| Get usage statistics |
analyze_query | Analyze query characteristics |
Context-Aware Routing with Chat History
The router can now analyze chat history to make smarter routing decisions:
// Example: Using chat history for context-aware routing
{
  "query": "Fix the authentication bug we discussed",
  "strategy": "quality",
  "chat_history": [
    {
      "role": "user",
      "content": "I'm working on auth.py and users can't log in",
      "timestamp": 1704067200
    },
    {
      "role": "assistant",
      "content": "Let me check the authentication flow...",
      "timestamp": 1704067205
    }
  ]
}

The router analyzes chat history to detect:
Context depth: Shallow/medium/deep based on token count
Dominant task type: Code generation, editing, debugging, etc.
Programming languages: Detects Python, JavaScript, Rust, etc.
Files mentioned: Tracks files being worked on
Error patterns: Identifies debugging sessions
Topics: Authentication, database, API, testing, etc.
Complexity: Based on files, languages, and conversation depth
These signals influence model selection:
Deep context → Models with larger context windows
Debugging sessions → High-reasoning models
Multi-file tasks → Code-focused models
Multiple languages → Polyglot-capable models
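The signals listed above could be extracted along the following lines. This is a minimal sketch under stated assumptions, not the router's implementation: the `analyze_chat_history` function, the file-extension map, the error keywords, and the depth thresholds are all hypothetical.

```python
import re

# Hypothetical extension-to-language map and error keywords, for illustration only.
LANGUAGES = {"py": "Python", "js": "JavaScript", "ts": "TypeScript", "rs": "Rust"}
ERROR_WORDS = ("error", "exception", "traceback", "bug", "can't", "fails")
FILE_PATTERN = re.compile(r"\b[\w./-]+\.(?:py|js|ts|rs)\b")


def analyze_chat_history(messages: list) -> dict:
    """Extract simple routing signals from a list of {"role", "content"} messages."""
    text = " ".join(m["content"] for m in messages)
    files = sorted(set(FILE_PATTERN.findall(text)))
    languages = sorted({LANGUAGES[f.rsplit(".", 1)[1]] for f in files})
    # Rough token estimate (about 4 characters per token) drives context depth.
    estimated_tokens = len(text) // 4
    if estimated_tokens < 500:  # assumed threshold
        depth = "shallow"
    elif estimated_tokens < 4000:  # assumed threshold
        depth = "medium"
    else:
        depth = "deep"
    return {
        "context_depth": depth,
        "files": files,
        "languages": languages,
        "debugging": any(word in text.lower() for word in ERROR_WORDS),
        "multi_file": len(files) > 1,
    }


if __name__ == "__main__":
    history = [
        {"role": "user", "content": "I'm working on auth.py and users can't log in"},
        {"role": "assistant", "content": "Let me check the authentication flow..."},
    ]
    print(analyze_chat_history(history))
```

For the two-message history from the example request, this sketch flags a debugging session in a single Python file with shallow context, which under the mapping above would favor a high-reasoning model.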
Contributing
Fork the repository
Create a feature branch
Make your changes
Submit a pull request
License
MIT License - see LICENSE for details.