# MCP Router

Intelligent Model Context Protocol Router for Cursor IDE

Automatically selects the optimal LLM model for each task based on query analysis, complexity, and your preferred strategy.
## System Architecture

### Data Flow
## Features

| Feature | Description |
|---|---|
| Intelligent Routing | Automatically selects the best model based on query analysis |
| Context-Aware Routing | Uses chat history and conversation context for smarter model selection |
| 4 Routing Strategies | Choose how the router trades off cost, speed, and quality (see Routing Strategies below) |
| Query Analysis | Detects task type, complexity, and special requirements |
| Chat History Analysis | Analyzes conversation patterns, topics, files, languages, and complexity |
| Cost Estimation | Estimates costs before execution |
| 17 Models | Latest 2025 models from OpenAI, Anthropic, Google, Cursor, DeepSeek |
| Cursor Native | Zero API keys needed - Cursor handles execution |
## Supported Models (2025)

### Tier 1: Flagship Models (Complex Architecture & Refactoring)

| Model | Provider | Context | Cost (in/out, per 1M tokens) | Quality |
|---|---|---|---|---|
| GPT-5.2 | OpenAI | 256K | $5.00/$15.00 | 0.99/0.98 |
| Claude 4.5 Opus | Anthropic | 200K | $25.00/$75.00 | 0.99/0.99 |
| Claude 4.5 Sonnet | Anthropic | 200K | $5.00/$25.00 | 0.97/0.98 |
### Tier 2: Reasoning Models (Chain of Thought)

| Model | Provider | Context | Cost (in/out, per 1M tokens) | Quality |
|---|---|---|---|---|
| o3 | OpenAI | 200K | $10.00/$40.00 | 0.99/0.95 |
| o3-mini (High) | OpenAI | 128K | $1.50/$6.00 | 0.95/0.92 |
| Claude 3.7 Sonnet | Anthropic | 200K | $4.00/$20.00 | 0.96/0.96 |
### Tier 3: Native & Fast Models

| Model | Provider | Context | Cost (in/out, per 1M tokens) | Quality |
|---|---|---|---|---|
| Composer 1 | Cursor | 128K | $0.10/$0.30 | 0.88/0.92 |
| Gemini 3 Pro | Google | 2M | $2.00/$8.00 | 0.96/0.94 |
| Gemini 3 Flash | Google | 1M | $0.10/$0.40 | 0.88/0.90 |
### Tier 4: Budget/Legacy Models

| Model | Provider | Context | Quality |
|---|---|---|---|
| GPT-4o / GPT-4o-mini | OpenAI | 128K | 0.95/0.85 |
| Claude 3.5 Sonnet/Haiku | Anthropic | 200K | 0.96/0.88 |
| Gemini 2.0 Pro/Flash | Google | 2M/1M | 0.94/0.85 |
| DeepSeek V3 | DeepSeek | 128K | 0.92/0.94 |
| DeepSeek R1 | DeepSeek | 128K | 0.96/0.92 |
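The cost columns above are what drive the router's cost estimation: price per million input tokens and per million output tokens. A minimal sketch of that arithmetic, using a few prices from the tables (the lowercase model identifiers and the function name are illustrative, not the project's actual API):

```python
# Illustrative cost estimation; prices are USD per 1M tokens, from the tables above.
PRICES = {
    "gpt-5.2": (5.00, 15.00),
    "claude-4.5-sonnet": (5.00, 25.00),
    "gemini-3-flash": (0.10, 0.40),
}

def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one call."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# A 10K-token prompt with a 2K-token reply on Gemini 3 Flash:
print(estimate_cost("gemini-3-flash", 10_000, 2_000))  # 0.0018
```

This is why the router can compare a flagship and a fast model before execution: the same request differs in cost by two orders of magnitude between tiers.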
## Quick Start

### 1. Install

### 2. Configure Cursor

Add to `~/.cursor/mcp.json`:
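The original config snippet is not shown here; a typical stdio MCP server entry looks like the following (the server name, command, and module path are assumptions — adjust them to this project's actual entry point):

```json
{
  "mcpServers": {
    "mcp-router": {
      "command": "python",
      "args": ["-m", "mcp_router.server"]
    }
  }
}
```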
**Note:** No API keys needed! Cursor handles all API calls with its own keys.

### 3. Restart Cursor

The MCP router will appear in your agent tools. Use it with:

- `@mcp-router get_model_recommendation "your task description"`
- `@mcp-router analyze_query "your query"`
- `@mcp-router list_models`
## CLI Usage

### Example Output
## Routing Strategies

| Strategy | Description | Best For |
|---|---|---|
| Balanced | Optimizes for cost, speed, and quality equally | General use |
| Performance | Prioritizes highest-capability models | Complex tasks, refactoring |
| Speed | Prioritizes fastest response time | Quick edits, simple tasks |
| Cost | Prioritizes cheapest models | Budget-conscious usage |
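One way to picture how a strategy changes the outcome: score every candidate model by a weighted sum of quality, cheapness, and speed, with the weights set by the strategy. This is an illustrative sketch, not the project's actual scoring code; the weights, the speed numbers, and the price normalization are all assumptions:

```python
# Illustrative strategy-weighted model scoring; all numbers are assumptions.
MODELS = {
    # name: (quality, output_price_per_1m_usd, relative_speed)
    "claude-4.5-opus": (0.99, 75.00, 0.3),
    "gemini-3-flash": (0.90, 0.40, 1.0),
}

WEIGHTS = {
    # strategy: (w_quality, w_cheapness, w_speed)
    "balanced":    (1.0, 1.0, 1.0),
    "performance": (5.0, 0.1, 0.1),
    "speed":       (0.5, 0.5, 3.0),
    "cost":        (0.5, 3.0, 0.5),
}

def pick_model(strategy: str) -> str:
    """Return the highest-scoring model name under the given strategy."""
    wq, wc, ws = WEIGHTS[strategy]
    def score(item):
        quality, price, speed = item[1]
        cheapness = 1.0 / (1.0 + price)  # higher is cheaper
        return wq * quality + wc * cheapness + ws * speed
    return max(MODELS.items(), key=score)[0]

print(pick_model("performance"))  # claude-4.5-opus
print(pick_model("cost"))         # gemini-3-flash
```

The same two candidates flip depending on the strategy: a heavy quality weight selects the flagship model, while a heavy cheapness weight selects the fast, inexpensive one.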
## Python API

## Project Structure
## Adding Custom Models
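The section body is not included above. As an illustration of what registering a custom model could involve, here is a self-contained sketch built around a dataclass registry; every name here (`ModelSpec`, `register_model`, the field names) is hypothetical, not the project's real API:

```python
from dataclasses import dataclass

@dataclass
class ModelSpec:
    name: str
    provider: str
    context_window: int   # tokens
    input_price: float    # USD per 1M tokens
    output_price: float   # USD per 1M tokens
    quality: float        # 0..1, as in the model tables above

REGISTRY: dict[str, ModelSpec] = {}

def register_model(spec: ModelSpec) -> None:
    """Make a model available to the router's selection logic."""
    REGISTRY[spec.name] = spec

# Register a hypothetical local model with zero per-token cost:
register_model(ModelSpec(
    name="my-local-llm",
    provider="local",
    context_window=32_000,
    input_price=0.0,
    output_price=0.0,
    quality=0.80,
))
print("my-local-llm" in REGISTRY)  # True
```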
## Cursor Commands

Create `.cursor/commands/route.md`:
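The file contents are not shown above. A Cursor command file is plain markdown that becomes a reusable prompt; a hedged sketch of what `route.md` might contain (the exact wording is an assumption, though `get_model_recommendation` is one of this server's tools):

```markdown
Use the mcp-router MCP server before answering.

1. Call get_model_recommendation with my request as the task description.
2. Report the recommended model, the strategy used, and the estimated cost.
3. Then carry out the task.
```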
## MCP Tools Available

| Tool | Description |
|---|---|
| | Route a query and get model recommendation (supports chat_history) |
| `get_model_recommendation` | Get recommendation without execution (supports chat_history) |
| | Analyze chat history text to extract routing signals |
| `list_models` | List all 17 registered models |
| | Get usage statistics |
| `analyze_query` | Analyze query characteristics |
### Context-Aware Routing with Chat History

The router can analyze chat history to make smarter routing decisions. It detects:

- **Context depth**: Shallow/medium/deep based on token count
- **Dominant task type**: Code generation, editing, debugging, etc.
- **Programming languages**: Detects Python, JavaScript, Rust, etc.
- **Files mentioned**: Tracks files being worked on
- **Error patterns**: Identifies debugging sessions
- **Topics**: Authentication, database, API, testing, etc.
- **Complexity**: Based on files, languages, and conversation depth

These signals influence model selection:

- Deep context → models with larger context windows
- Debugging sessions → high-reasoning models
- Multi-file tasks → code-focused models
- Multiple languages → polyglot-capable models
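A minimal sketch of how such signal extraction might look. The regex patterns, depth thresholds, and token heuristic are illustrative assumptions, not the project's implementation:

```python
import re

def analyze_chat_history(text: str) -> dict:
    """Extract coarse routing signals from raw chat-history text."""
    # Rough token proxy: ~0.75 words per token is a common rule of thumb.
    approx_tokens = int(len(text.split()) / 0.75)
    depth = ("shallow" if approx_tokens < 1_000
             else "medium" if approx_tokens < 10_000
             else "deep")
    languages = [lang for lang in ("python", "javascript", "rust", "go")
                 if re.search(rf"\b{lang}\b", text, re.IGNORECASE)]
    files = re.findall(r"\b[\w./-]+\.(?:py|js|ts|rs|go)\b", text)
    debugging = bool(re.search(r"\b(traceback|exception|error|stack trace)\b",
                               text, re.IGNORECASE))
    return {
        "depth": depth,
        "languages": languages,
        "files": sorted(set(files)),
        "debugging": debugging,
    }

signals = analyze_chat_history(
    "Getting a TypeError traceback in auth/login.py, Python 3.12"
)
print(signals["debugging"], signals["languages"], signals["files"])
```

A short debugging exchange like the one above would steer the router toward a high-reasoning model, while a long multi-file history would steer it toward a large context window.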
## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request
## License

MIT License - see LICENSE for details.