🧙 SAGE-MCP: Simple AI Guidance Engine for Claude
Universal AI assistant MCP server with intelligent mode selection, conversation continuity, and smart file handling
SAGE-MCP transforms Claude into a multi-talented development assistant that adapts to your needs. Whether you're debugging code, planning architecture, writing tests, or having a technical discussion, SAGE automatically selects the right approach and model for optimal results.
✨ Key Features
🎯 Intelligent Mode System
chat - Natural conversations with context awareness
analyze - Deep code analysis and pattern recognition
review - Comprehensive code reviews with actionable feedback
debug - Systematic debugging and root cause analysis
plan - Strategic project planning and architecture design
test - Test generation with coverage analysis
refactor - Code improvement and modernization
think - Deep reasoning with adjustable thinking depth
🔄 Conversation Continuity
Seamless multi-turn conversations across different modes
Automatic context preservation between tool calls
Smart file deduplication - never re-read the same files
Thread-based memory system for long-running tasks
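How deduplication works is simple to picture: each thread keeps a record of what it has already embedded, and later turns skip those paths. A minimal illustrative sketch of the idea (names hypothetical; the real logic lives in utils/, see Architecture below):
# Illustrative sketch: per-thread file deduplication (names hypothetical)
import os

_seen: dict[str, set[str]] = {}  # continuation_id -> paths already embedded

def files_to_embed(continuation_id: str, paths: list[str]) -> list[str]:
    sent = _seen.setdefault(continuation_id, set())
    fresh = [p for p in (os.path.abspath(x) for x in paths) if p not in sent]
    sent.update(fresh)  # remember them so later turns skip re-reading
    return fresh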
🤖 Smart Model Selection
Auto mode - Intelligent model selection based on task complexity
Support for multiple providers: OpenAI, Anthropic, Google, OpenRouter
Model restrictions via environment variables for cost control
Thinking depth control: minimal (0.5%), low (8%), medium (33%), high (67%), max (100%)
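Each depth is a fraction of the model's maximum thinking-token budget, so the tiers translate directly into token counts. For example, against the 100,000-token O1 budget configured below (MAX_THINKING_TOKENS_O1), the arithmetic works out as in this illustrative sketch:
# Thinking depth tiers as fractions of the thinking-token budget
# (illustrative arithmetic only)
THINKING_DEPTH = {"minimal": 0.005, "low": 0.08, "medium": 0.33, "high": 0.67, "max": 1.0}

def thinking_budget(depth: str, max_thinking_tokens: int = 100_000) -> int:
    return int(THINKING_DEPTH[depth] * max_thinking_tokens)

thinking_budget("medium")  # 33,000 tokens of the 100,000-token budget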
📁 Intelligent File Handling
embedded - Full file content in context (default)
summary - Token-efficient summaries for large codebases
reference - File storage with ID references
output_file - Save output directly to disk (no context pollution)
Automatic directory expansion and smart deduplication
Security validation for all file operations
🌐 Web Search Integration
Real-time documentation lookup
Best practices and current standards
Framework and library research
Error and issue investigation
🎨 Mode Specializations
Mode | Temperature | Description | Best For |
chat | 0.5 | Natural conversations with balanced creativity | Q&A, brainstorming, explanations |
analyze | 0.2 | Focused precision for code analysis | Architecture review, pattern detection |
review | 0.3 | Systematic evaluation with consistent standards | Security audits, best practices |
debug | 0.1 | Deterministic analysis for troubleshooting | Error investigation, root cause analysis |
plan | 0.4 | Strategic thinking for project planning | Architecture design, task breakdown |
test | 0.2 | Accurate test generation with edge cases | Unit tests, integration tests |
refactor | 0.3 | Careful improvements preserving functionality | Code modernization, optimization |
think | 0.7 | Creative problem solving with deep reasoning | Complex algorithms, system design |
🚀 Quick Start
Installation
# Clone the repository
git clone https://github.com/david-strejc/sage-mcp
cd sage-mcp
# Install dependencies
pip install -r requirements.txt
# Configure your API keys
export OPENAI_API_KEY="your-key-here"
export ANTHROPIC_API_KEY="your-key-here"
export GOOGLE_API_KEY="your-key-here"
export OPENROUTER_API_KEY="your-key-here"
Claude Desktop Configuration
Add to your Claude Desktop MCP settings:
{
  "mcpServers": {
    "sage": {
      "command": "python",
      "args": ["/path/to/sage-mcp/server.py"],
      "env": {
        "OPENAI_API_KEY": "your-key",
        "ANTHROPIC_API_KEY": "your-key",
        "DEFAULT_MODEL": "gpt-4o",
        "DEFAULT_PROVIDER": "openai"
      }
    }
  }
}
📖 Usage Examples
Basic Chat
// In Claude:
Use sage tool to explain how async/await works in Python
Code Analysis with Files
// Analyze specific files
Use sage tool in analyze mode to review the architecture of ./src/api/
// With model selection
Use sage with model gpt-4o to analyze performance bottlenecks in server.py
Multi-turn Conversations
// First turn
Use sage to help me design a caching system
// Continue the conversation (Claude will auto-continue)
Now let's implement the LRU cache we discussed
// Files are automatically deduplicated across turns
Deep Thinking Mode
// For complex problems requiring deep reasoning
Use sage in think mode with thinking_mode="high" to solve this algorithmic challenge: [problem description]
Smart File Handling
// Token-efficient mode for large codebases
Use sage with file_handling_mode="summary" to review the entire project structure
// Reference mode for iterative work
Use sage with file_handling_mode="reference" to start refactoring the database layer
⚙️ Configuration
Environment Variables
# Provider Configuration
DEFAULT_PROVIDER=openai # Default: auto
DEFAULT_MODEL=gpt-4o # Default: auto
FALLBACK_MODEL=gpt-4o-mini # Fallback for errors
# Model Restrictions (optional)
ALLOWED_MODELS=gpt-4o,gpt-4o-mini,claude-3-5-sonnet
DISALLOWED_MODELS=o1-preview,o1 # Expensive models to exclude
# Feature Flags
WEBSEARCH_ENABLED=true # Enable web search
FILE_SECURITY_CHECK=true # Validate file paths
AUTO_MODEL_SELECTION=true # Smart model selection
# Token Limits
MAX_TOKENS_GPT4O=128000
MAX_TOKENS_CLAUDE=200000
MAX_THINKING_TOKENS_O1=100000
Mode-Specific Temperatures
Default temperatures optimized for each mode:
chat: 0.5 - Balanced creativity
analyze: 0.2 - Focused precision
review: 0.3 - Systematic evaluation
debug: 0.1 - Deterministic analysis
plan: 0.4 - Strategic thinking
test: 0.2 - Accurate test generation
refactor: 0.3 - Careful improvements
think: 0.7 - Creative problem solving
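Internally this amounts to a simple per-mode lookup; a plausible sketch of the shape (illustrative, not the actual config.py contents):
# Per-mode temperature defaults, mirroring the table above (illustrative)
MODE_TEMPERATURES = {
    "chat": 0.5, "analyze": 0.2, "review": 0.3, "debug": 0.1,
    "plan": 0.4, "test": 0.2, "refactor": 0.3, "think": 0.7,
}

def temperature_for(mode: str) -> float:
    return MODE_TEMPERATURES.get(mode, 0.5)  # assumed fallback for unknown modes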
🔧 Advanced Features
Conversation Continuation
# Start conversation
response = sage(mode="chat", prompt="Let's design a web app")
# Returns: continuation_id: abc123
# Continue in same mode
sage(mode="chat", prompt="What database should we use?", continuation_id="abc123")
# Switch modes seamlessly
sage(mode="analyze", prompt="Review our database schema",
files=["/db/schema.sql"], continuation_id="abc123")Smart File Handling
# Multiple modes available
sage(mode="review",
files=["/src", "/tests"], # Auto-expands directories
file_handling_mode="embedded", # Full content (default)
prompt="Security review")
sage(mode="analyze",
files=["/large/codebase"],
file_handling_mode="summary", # Summaries only (saves tokens)
prompt="Architecture overview")
sage(mode="debug",
files=["/logs"],
file_handling_mode="reference", # Store with IDs
prompt="Analyze error patterns")Save Output to File
For large outputs that would pollute context, save directly to disk:
# Save analysis directly to file instead of returning in response
sage(mode="analyze",
files=["/src"],
prompt="Full codebase analysis",
output_file="/tmp/analysis.md")
# Returns: "Output saved to /tmp/analysis.md (15.2KB)"
# Great for:
# - Large code reviews
# - Documentation generation
# - Analysis reports
# - Any output you want to process later
Features:
Creates parent directories automatically
Prevents accidental overwrites (file must not exist)
Blocks writes to system directories
Returns human-readable file size confirmation
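Put together, the safeguards behave roughly like this sketch (hypothetical code; the actual checks live in utils/files.py and utils/security.py):
# Hypothetical sketch of the output_file safeguards described above
import os

PROTECTED_PREFIXES = ("/etc", "/usr", "/bin", "/sbin", "/boot")  # example deny-list

def save_output(path: str, content: str) -> str:
    path = os.path.abspath(path)
    if path.startswith(PROTECTED_PREFIXES):
        raise PermissionError(f"Refusing to write into a system directory: {path}")
    if os.path.exists(path):
        raise FileExistsError(f"Refusing to overwrite existing file: {path}")
    os.makedirs(os.path.dirname(path), exist_ok=True)  # create parents automatically
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    size_kb = os.path.getsize(path) / 1024
    return f"Output saved to {path} ({size_kb:.1f}KB)"  # human-readable confirmation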
Model Restrictions
# Environment variables for cost control
OPENAI_ALLOWED_MODELS=o3-mini,gpt-4o-mini
GOOGLE_ALLOWED_MODELS=gemini-2.0-flash-exp,gemini-1.5-pro
BLOCKED_MODELS=gpt-4,claude-opus
DISABLED_MODEL_PATTERNS=expensive,legacy
# Auto mode requires model selection when restricted
DEFAULT_MODEL=auto  # Forces explicit model choice
Supported Models
Provider | Models | Configuration |
OpenAI | gpt-4o, gpt-4o-mini, o1, o3-mini | OPENAI_API_KEY |
Anthropic | claude-3-5-sonnet, claude-3-5-haiku | ANTHROPIC_API_KEY |
Google | gemini-2.0-flash-exp, gemini-1.5-pro | GOOGLE_API_KEY |
OpenRouter | 100+ models from all providers | OPENROUTER_API_KEY |
Custom/Ollama | llama3.2, mistral, codestral | CUSTOM_API_URL |
Complete Configuration Reference
Variable | Description |
API Keys | |
OPENAI_API_KEY | OpenAI API key |
ANTHROPIC_API_KEY | Anthropic Claude API key |
GOOGLE_API_KEY | Google Gemini API key |
OPENROUTER_API_KEY | OpenRouter API key |
XAI_API_KEY | xAI (Grok) API key |
CUSTOM_API_URL | Custom/Ollama API endpoint |
CUSTOM_API_KEY | Custom API key (if required) |
Model Selection | |
DEFAULT_MODEL | Default model (auto enables smart selection) |
Model Restrictions | |
OPENAI_ALLOWED_MODELS | Allowed OpenAI models |
GOOGLE_ALLOWED_MODELS | Allowed Google models |
ANTHROPIC_ALLOWED_MODELS | Allowed Anthropic models |
BLOCKED_MODELS | Blocked models (any provider) |
DISABLED_MODEL_PATTERNS | Disable models by name pattern |
Limits & Performance | |
| Maximum file size in bytes |
| MCP transport limit |
| Max turns per conversation |
| Conversation timeout |
Memory & Storage | |
REDIS_URL | Redis connection for conversation memory |
REDIS_DB | Redis database number |
Temperature Overrides | |
| Per-mode temperature overrides for chat, analyze, review, debug, plan, test, refactor, and think (defaults listed under Mode-Specific Temperatures) |
🏗️ Architecture
sage-mcp/
├── server.py              # FastMCP server entry point
├── config.py              # Configuration management
├── tools/
│   └── sage.py            # Universal SAGE tool
├── modes/                 # Specialized AI modes
│   ├── base.py            # Base mode handler
│   ├── chat.py            # Conversational mode
│   ├── analyze.py         # Code analysis mode
│   ├── debug.py           # Debugging mode
│   └── ...
├── providers/             # AI provider integrations
│   ├── openai.py
│   ├── anthropic.py
│   ├── gemini.py
│   └── openrouter.py
├── models/                # Model management
│   ├── manager.py         # Intelligent model selection
│   └── config.yaml        # Model capabilities
└── utils/                 # Utilities
    ├── files.py           # File handling
    ├── memory.py          # Conversation memory
    ├── models.py          # Model restrictions
    └── security.py        # Security validation
🧪 Advanced Features
Model Restrictions
Control which models can be used to manage costs:
# Allow only specific models
export ALLOWED_MODELS="gpt-4o-mini,claude-3-haiku"
# Exclude expensive models
export DISALLOWED_MODELS="o1-preview,claude-3-opus"
Conversation Memory
SAGE maintains conversation context across tool calls:
# Automatically continues conversations
# Previous context and files are preserved
# Smart deduplication prevents re-reading
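With the Redis settings from the configuration reference, thread memory can be pictured as follows; this is a hedged sketch only (the key layout and helper names are hypothetical, not SAGE's actual schema):
# Hypothetical sketch of Redis-backed conversation memory
import json
import redis.asyncio as redis

r = redis.from_url("redis://localhost:6379/0")  # REDIS_URL

async def append_turn(continuation_id: str, role: str, content: str) -> None:
    await r.rpush(f"sage:thread:{continuation_id}",
                  json.dumps({"role": role, "content": content}))

async def load_thread(continuation_id: str) -> list[dict]:
    return [json.loads(t) for t in await r.lrange(f"sage:thread:{continuation_id}", 0, -1)]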
Custom Providers
Add custom AI providers by implementing the base provider interface:
class CustomProvider(BaseProvider):
    async def generate(self, messages, **kwargs):
        # Your implementation
        pass
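For instance, a provider that targets an OpenAI-compatible endpoint (such as Ollama via CUSTOM_API_URL) might look roughly like this; the BaseProvider import path and the exact return shape SAGE expects are assumptions to verify against the providers/ package:
# Hypothetical example: a provider backed by an OpenAI-compatible API
import os
import httpx

class OllamaProvider(BaseProvider):
    async def generate(self, messages, **kwargs):
        async with httpx.AsyncClient(timeout=60.0) as client:
            resp = await client.post(
                f"{os.environ['CUSTOM_API_URL']}/v1/chat/completions",
                json={"model": kwargs.get("model", "llama3.2"), "messages": messages},
            )
            resp.raise_for_status()
            return resp.json()["choices"][0]["message"]["content"]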
🤝 Contributing
We welcome contributions! Please see CONTRIBUTING.md for guidelines.
Development Setup
# Install dev dependencies
pip install -r requirements-dev.txt
# Run tests
pytest
# Format code
black .
ruff check .
📄 License
MIT License - see LICENSE for details.
🙏 Acknowledgments
Built on FastMCP framework
Inspired by zen-mcp-server
Powered by Claude MCP protocol
SAGE-MCP - Your intelligent AI assistant that adapts to how you work 🧙✨