Enables pre-commit validation of git changes across multiple repositories, with the ability to detect incomplete changes, security issues, and ensure implementation matches intent.
Integrates with Gemini 2.5 Pro and 2.0 Flash models for extended thinking capabilities, deep analysis, and ultra-fast responses with 1M token context.
Provides access to OpenAI's O3 and O3-mini models for strong logical reasoning and systematic analysis with a 200K token context window.
Uses Redis for AI-to-AI conversation persistence, enabling multi-turn conversations between Claude and other AI models with full context retention.
Zen MCP: One Context. Many Minds.
https://github.com/user-attachments/assets/8097e18e-b926-4d8b-ba14-a979e4c58bda
The ultimate development partners for Claude - a Model Context Protocol server that gives Claude access to multiple AI models for enhanced code analysis, problem-solving, and collaborative development.
Features true AI orchestration with conversations that continue across tasks - Give Claude a complex task and let it orchestrate between models automatically. Claude stays in control, performs the actual work, but gets perspectives from the best AI for each subtask. Claude can switch between different tools and models mid-conversation, with context carrying forward seamlessly.
Example Workflow - Claude Code:
1. Performs its own reasoning
2. Uses Gemini Pro to deeply analyze the code in question for a second opinion
3. Switches to O3 to continue chatting about its findings
4. Uses Flash to evaluate formatting suggestions from O3
5. Performs the actual work after taking in feedback from all three
6. Returns to Pro for a precommit review
All within a single conversation thread! Gemini Pro in step 6 knows what O3 recommended in step 3, and takes that context and review into account during its pre-commit review.
Think of it as Claude Code for Claude Code. This MCP isn't magic. It's just super-glue.
Remember: Claude stays in full control — but YOU call the shots. Zen is designed to have Claude engage other models only when needed — and to follow through with meaningful back-and-forth. You're the one who crafts the powerful prompt that makes Claude bring in Gemini, Flash, O3 — or fly solo.
You're the guide. The prompter. The puppeteer. You are the AI - Actually Intelligent.
Quick Navigation
- Getting Started
- Quickstart - Get running in 5 minutes with Docker
- Available Tools - Overview of all tools
- AI-to-AI Conversations - Multi-turn conversations
- Tools Reference
- Advanced Usage
- Advanced Features - AI-to-AI conversations, large prompts, web search
- Complete Advanced Guide - Model configuration, thinking modes, workflows, tool parameters
- Setup & Support
- Troubleshooting Guide - Common issues and debugging steps
- License - Apache 2.0
Why This Server?
Claude is brilliant, but sometimes you need:
- Multiple AI perspectives - Let Claude orchestrate between different models to get the best analysis
- Automatic model selection - Claude picks the right model for each task (or you can specify)
- A senior developer partner to validate and extend ideas (chat)
- A second opinion on complex architectural decisions - augment Claude's thinking with perspectives from Gemini Pro, O3, or dozens of other models via custom endpoints (thinkdeep)
- Professional code reviews with actionable feedback across entire repositories (codereview)
- Pre-commit validation with deep analysis using the best model for the job (precommit)
- Expert debugging - O3 for logical issues, Gemini for architectural problems (debug)
- Extended context windows beyond Claude's limits - Delegate analysis to Gemini (1M tokens) or O3 (200K tokens) for entire codebases, large datasets, or comprehensive documentation
- Model-specific strengths - Extended thinking with Gemini Pro, fast iteration with Flash, strong reasoning with O3, local privacy with Ollama
- Local model support - Run models like Llama 3.2 locally via Ollama, vLLM, or LM Studio for privacy and cost control
- Dynamic collaboration - Models can request additional context and follow-up replies from Claude mid-analysis
- Smart file handling - Automatically expands directories, manages token limits based on model capacity
- Bypass MCP's token limits - Work around MCP's 25K limit automatically
This server orchestrates multiple AI models as your development team, with Claude automatically selecting the best model for each task or allowing you to choose specific models for different strengths.
Prompt Used:
The final implementation resulted in a 26% improvement in JSON parsing performance for the selected library, reducing processing time through targeted, collaborative optimizations guided by Gemini’s analysis and Claude’s refinement.
Quickstart (5 minutes)
Prerequisites
- Docker Desktop installed (Download here)
- Git
- Windows users: WSL2 is required for Claude Code CLI
1. Get API Keys (at least one required)
Option A: OpenRouter (Access multiple models with one API)
- OpenRouter: Visit OpenRouter for access to multiple models through one API. Setup Guide
- Control model access and spending limits directly in your OpenRouter dashboard
- Configure model aliases in conf/custom_models.json
Option B: Native APIs
- Gemini: Visit Google AI Studio and generate an API key. For best results with Gemini 2.5 Pro, use a paid API key as the free tier has limited access to the latest models.
- OpenAI: Visit OpenAI Platform to get an API key for O3 model access.
Option C: Custom API Endpoints (Local models like Ollama, vLLM)
Please see the setup guide. With a custom API you can use:
- Ollama: Run models like Llama 3.2 locally for free inference
- vLLM: Self-hosted inference server for high-throughput inference
- LM Studio: Local model hosting with OpenAI-compatible API interface
- Text Generation WebUI: Popular local interface for running models
- Any OpenAI-compatible API: Custom endpoints for your own infrastructure
Note: Using all three options may create ambiguity about which provider / model to use if there is an overlap. If all APIs are configured, native APIs will take priority when there is a clash in model name, such as for gemini and o3. Configure your model aliases and give them unique names in conf/custom_models.json
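The priority rule in the note above can be pictured as a two-level lookup. This is only a minimal sketch under stated assumptions: the dictionaries, the provider labels, and `resolve_provider` are hypothetical names chosen for illustration, not the server's actual code or configuration format.

```python
# Hypothetical registries: the entries are illustrative, not the server's data.
NATIVE_MODELS = {"gemini": "native Gemini API", "o3": "native OpenAI API"}
CUSTOM_ALIASES = {"gemini": "OpenRouter", "local-llama": "custom endpoint"}


def resolve_provider(model_name: str) -> str:
    """Return the provider for a model name, preferring native APIs on a clash."""
    if model_name in NATIVE_MODELS:
        return NATIVE_MODELS[model_name]
    if model_name in CUSTOM_ALIASES:
        return CUSTOM_ALIASES[model_name]
    raise KeyError(f"no provider configured for {model_name!r}")


print(resolve_provider("gemini"))       # prints "native Gemini API"
print(resolve_provider("local-llama"))  # prints "custom endpoint"
```

Because "gemini" appears in both registries, the native entry wins, which is why giving custom aliases unique names avoids surprises.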
2. Clone and Set Up
What this does:
- Builds Docker images with all dependencies (including Redis for conversation threading)
- Creates .env file (automatically uses $GEMINI_API_KEY and $OPENAI_API_KEY if set in environment)
- Starts Redis service for AI-to-AI conversation memory
- Starts MCP server with providers based on available API keys
- Adds Zen to Claude Code automatically
3. Add Your API Keys
4. Configure Claude
If Setting up for Claude Code
Run the following commands on the terminal to add the MCP directly to Claude Code
Now run claude on the terminal for it to connect to the newly added MCP server. If you were already running a claude code session, please exit and start a new session.
If Setting up for Claude Desktop
- Open Claude Desktop
- Go to Settings → Developer → Edit Config
This will open a folder revealing claude_desktop_config.json.
- Update Docker Configuration
The setup script shows you the exact configuration. It looks like this. When you ran run-server.sh it should have produced a configuration for you to copy:
Paste the above into claude_desktop_config.json. If you have several other MCP servers listed, simply add this below the rest after a , comma:
- Restart Claude Desktop: completely quit and restart Claude Desktop for the changes to take effect.
5. Start Using It!
Just ask Claude naturally:
- "Think deeper about this architecture design with zen" → Claude picks best model + thinkdeep
- "Using zen perform a code review of this code for security issues" → Claude might pick Gemini Pro + codereview
- "Use zen and debug why this test is failing, the bug might be in my_class.swift" → Claude might pick O3 + debug
- "With zen, analyze these files to understand the data flow" → Claude picks appropriate model + analyze
- "Use flash to suggest how to format this code based on the specs mentioned in policy.md" → Uses Gemini Flash specifically
- "Think deeply about this and get o3 to debug this logic error I found in the checkOrders() function" → Uses O3 specifically
- "Brainstorm scaling strategies with pro. Study the code, pick your preferred strategy and debate with pro to settle on two best approaches" → Uses Gemini Pro specifically
- "Use local-llama to localize and add missing translations to this project" → Uses local Llama 3.2 via custom URL
- "First use local-llama for a quick local analysis, then use opus for a thorough security review" → Uses both providers in sequence
Available Tools
Quick Tool Selection Guide:
- Need a thinking partner? → chat (brainstorm ideas, get second opinions, validate approaches)
- Need deeper thinking? → thinkdeep (extends analysis, finds edge cases)
- Code needs review? → codereview (bugs, security, performance issues)
- Pre-commit validation? → precommit (validate git changes before committing)
- Something's broken? → debug (root cause analysis, error tracing)
- Want to understand code? → analyze (architecture, patterns, dependencies)
- Server info? → get_version (version and configuration details)
Auto Mode: When DEFAULT_MODEL=auto, Claude automatically picks the best model for each task. You can override with: "Use flash for quick analysis" or "Use o3 to debug this".
Model Selection Examples:
- Complex architecture review → Claude picks Gemini Pro
- Quick formatting check → Claude picks Flash
- Logical debugging → Claude picks O3
- General explanations → Claude picks Flash for speed
- Local analysis → Claude picks your Ollama model
Pro Tip: Thinking modes (for Gemini models) control depth vs token cost. Use "minimal" or "low" for quick tasks, "high" or "max" for complex problems. Learn more
Tools Overview:
- chat - Collaborative thinking and development conversations
- thinkdeep - Extended reasoning and problem-solving
- codereview - Professional code review with severity levels
- precommit - Validate git changes before committing
- debug - Root cause analysis and debugging
- analyze - General-purpose file and code analysis
- get_version - Get server version and configuration
1. chat - General Development Chat & Collaborative Thinking
Your thinking partner - bounce ideas, get second opinions, brainstorm collaboratively
Thinking Mode: Default is medium (8,192 tokens). Use low for quick questions to save tokens, or high for complex discussions when thoroughness matters.
Example Prompt:
Key Features:
- Collaborative thinking partner for your analysis and planning
- Get second opinions on your designs and approaches
- Brainstorm solutions and explore alternatives together
- Validate your checklists and implementation plans
- General development questions and explanations
- Technology comparisons and best practices
- Architecture and design discussions
- Can reference files for context: "Use gemini to explain this algorithm with context from algorithm.py"
- Dynamic collaboration: Gemini can request additional files or context during the conversation if needed for a more thorough response
- Web search capability: Analyzes when web searches would be helpful and recommends specific searches for Claude to perform, ensuring access to current documentation and best practices
2. thinkdeep - Extended Reasoning Partner
Get a second opinion to augment Claude's own extended thinking
Thinking Mode: Default is high (16,384 tokens) for deep analysis. Claude will automatically choose the best mode based on complexity - use low for quick validations, medium for standard problems, high for complex issues (default), or max for extremely complex challenges requiring deepest analysis.
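The two budgets quoted in this README (8,192 tokens for medium, 16,384 for high) can be compared with a small lookup. This is a minimal sketch: the table holds only the values stated here, and `budget_delta` is a hypothetical helper, not part of the server.

```python
# Token budgets as quoted in this README. Budgets for minimal, low, and max
# are not stated here, so they are deliberately left out.
THINKING_BUDGETS = {"medium": 8_192, "high": 16_384}


def budget_delta(from_mode: str, to_mode: str) -> int:
    """Extra thinking tokens spent when moving from one mode to another."""
    return THINKING_BUDGETS[to_mode] - THINKING_BUDGETS[from_mode]


print(budget_delta("medium", "high"))  # prints 8192
```

Moving from medium to high doubles the thinking budget, which is the cost to weigh against the extra depth.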
Example Prompt:
Key Features:
- Uses Gemini's specialized thinking models for enhanced reasoning capabilities
- Provides a second opinion on Claude's analysis
- Challenges assumptions and identifies edge cases Claude might miss
- Offers alternative perspectives and approaches
- Validates architectural decisions and design patterns
- Can reference specific files for context: "Use gemini to think deeper about my API design with reference to api/routes.py"
- Enhanced Critical Evaluation (v2.10.0): After Gemini's analysis, Claude is prompted to critically evaluate the suggestions, consider context and constraints, identify risks, and synthesize a final recommendation - ensuring a balanced, well-considered solution
- Web search capability: When enabled (default: true), identifies areas where current documentation or community solutions would strengthen the analysis and suggests specific searches for Claude
3. codereview - Professional Code Review
Comprehensive code analysis with prioritized feedback
Thinking Mode: Default is medium (8,192 tokens). Use high for security-critical code (worth the extra tokens) or low for quick style checks (saves ~6k tokens).
Example Prompts:
Key Features:
- Issues prioritized by severity (🔴 CRITICAL → 🟢 LOW)
- Supports specialized reviews: security, performance, quick
- Can enforce coding standards: "Use gemini to review src/ against PEP8 standards"
- Filters by severity: "Get gemini to review auth/ - only report critical vulnerabilities"
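Severity-based prioritization and filtering, as described in the features above, might look roughly like this. It is a sketch under assumptions: only CRITICAL and LOW are named in this README, so the HIGH and MEDIUM levels, the `Issue` class, and `prioritize` are hypothetical.

```python
from dataclasses import dataclass

# Only CRITICAL and LOW appear in the README; HIGH and MEDIUM are assumed.
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"]


@dataclass
class Issue:
    severity: str
    message: str


def prioritize(issues, minimum="LOW"):
    """Sort issues most severe first and drop those less severe than `minimum`."""
    cutoff = SEVERITY_ORDER.index(minimum)
    kept = [i for i in issues if SEVERITY_ORDER.index(i.severity) <= cutoff]
    return sorted(kept, key=lambda i: SEVERITY_ORDER.index(i.severity))


found = [
    Issue("LOW", "inconsistent naming"),
    Issue("CRITICAL", "SQL built from user input"),
    Issue("MEDIUM", "query inside a loop"),
]
for issue in prioritize(found, minimum="CRITICAL"):
    print(issue.severity, issue.message)  # prints "CRITICAL SQL built from user input"
```

Passing `minimum="CRITICAL"` mirrors the "only report critical vulnerabilities" prompt.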
4. precommit - Pre-Commit Validation
Comprehensive review of staged/unstaged git changes across multiple repositories
Thinking Mode: Default is medium (8,192 tokens). Use high or max for critical releases when thorough validation justifies the token cost.
Prompt Used:
How beautiful is that? Claude used precommit twice and codereview once and actually found and fixed two critical errors before commit!
Example Prompts:
Key Features:
- Recursive repository discovery - finds all git repos including nested ones
- Validates changes against requirements - ensures implementation matches intent
- Detects incomplete changes - finds added functions never called, missing tests, etc.
- Multi-repo support - reviews changes across multiple repositories in one go
- Configurable scope - review staged, unstaged, or compare against branches
- Security focused - catches exposed secrets, vulnerabilities in new code
- Smart truncation - handles large diffs without exceeding context limits
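Recursive repository discovery with a depth limit can be sketched with the standard library alone. This is an illustration, not the tool's implementation: `find_git_repos` and its default depth of 5 are assumptions.

```python
import os


def find_git_repos(start, max_depth=5):
    """Return directories under `start` that contain a .git entry, nested repos included."""
    start = os.path.abspath(start)
    repos = []
    for root, dirs, files in os.walk(start):
        # .git is a directory in normal clones and a file in worktrees or submodules.
        if ".git" in dirs or ".git" in files:
            repos.append(root)
        if ".git" in dirs:
            dirs.remove(".git")  # never descend into git internals
        rel = os.path.relpath(root, start)
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        if depth >= max_depth:
            dirs[:] = []  # stop descending once the depth limit is reached
    return sorted(repos)


if __name__ == "__main__":
    for repo in find_git_repos("."):
        print(repo)
```

Pruning `dirs` in place is what lets `os.walk` skip subtrees, so the search stays cheap in large workspaces.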
Parameters:
- path: Starting directory to search for repos (default: current directory)
- original_request: The requirements for context
- compare_to: Compare against a branch/tag instead of local changes
- review_type: full|security|performance|quick
- severity_filter: Filter by issue severity
- max_depth: How deep to search for nested repos
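To show how the parameters above fit together, here is a hedged sketch that assembles and validates an argument set. The parameter names and `review_type` values come from the list above; `build_precommit_args` itself and the choice of "full" as the default review type are assumptions for illustration.

```python
REVIEW_TYPES = {"full", "security", "performance", "quick"}


def build_precommit_args(path=".", original_request=None, compare_to=None,
                         review_type="full", severity_filter=None, max_depth=None):
    """Assemble precommit arguments, omitting optional values that were not set."""
    if review_type not in REVIEW_TYPES:
        raise ValueError(f"review_type must be one of {sorted(REVIEW_TYPES)}")
    args = {"path": path, "review_type": review_type}
    optional = {
        "original_request": original_request,
        "compare_to": compare_to,
        "severity_filter": severity_filter,
        "max_depth": max_depth,
    }
    args.update({key: value for key, value in optional.items() if value is not None})
    return args


print(build_precommit_args(original_request="Add rate limiting", compare_to="main"))
```

In practice Claude fills these in from your prompt; the sketch only makes the shape of a call concrete.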
5. debug - Expert Debugging Assistant
Root cause analysis for complex problems
Thinking Mode: Default is medium (8,192 tokens). Use high for tricky bugs (investment in finding root cause) or low for simple errors (save tokens).
Example Prompts:
Basic Usage:
Key Features:
- Generates multiple ranked hypotheses for systematic debugging
- Accepts error context, stack traces, and logs
- Can reference relevant files for investigation
- Supports runtime info and previous attempts
- Provides structured root cause analysis with validation steps
- Can request additional context when needed for thorough analysis
- Web search capability: When enabled (default: true), identifies when searching for error messages, known issues, or documentation would help solve the problem and recommends specific searches for Claude
6. analyze - Smart File Analysis
General-purpose code understanding and exploration
Thinking Mode: Default is medium (8,192 tokens). Use high for architecture analysis (comprehensive insights worth the cost) or low for quick file overviews (save ~6k tokens).
Example Prompts:
Basic Usage:
Key Features:
- Analyzes single files or entire directories
- Supports specialized analysis types: architecture, performance, security, quality
- Uses file paths (not content) for clean terminal output
- Can identify patterns, anti-patterns, and refactoring opportunities
- Web search capability: When enabled with use_websearch (default: true), the model can request Claude to perform web searches and share results back to enhance analysis with current documentation, design patterns, and best practices
7. get_version - Server Information
For detailed tool parameters and configuration options, see the Advanced Usage Guide.
Advanced Features
AI-to-AI Conversation Threading
This server enables true AI collaboration between Claude and multiple AI models (Gemini, O3), where they can coordinate and question each other's approaches:
How it works:
- Gemini can ask Claude follow-up questions to clarify requirements or gather more context
- Claude can respond with additional information, files, or refined instructions
- Claude can work independently between exchanges - implementing solutions, gathering data, or performing analysis
- Claude can return to Gemini with progress updates and new context for further collaboration
- Cross-tool continuation - Start with one tool (e.g., analyze) and continue with another (e.g., codereview) using the same conversation thread
- Both AIs coordinate their approaches - questioning assumptions, validating solutions, and building on each other's insights
- Each conversation maintains full context while only sending incremental updates
- Conversations are automatically managed with Redis for persistence
Example of Multi-Model AI Coordination:
- You: "Debate SwiftUI vs UIKit - which is better for iOS development?"
- Claude (auto mode): "I'll orchestrate a debate between different models for diverse perspectives."
- Gemini Pro: "From an architectural standpoint, SwiftUI's declarative paradigm and state management make it superior for maintainable, modern apps."
- O3: "Logically analyzing the trade-offs: UIKit offers 15+ years of stability, complete control, and proven scalability. SwiftUI has <5 years maturity with ongoing breaking changes."
- Claude: "Let me get Flash's quick take on developer experience..."
- Gemini Flash: "SwiftUI = faster development, less code, better previews. UIKit = more control, better debugging, stable APIs."
- Claude's synthesis: "Based on the multi-model analysis: Use SwiftUI for new projects prioritizing development speed, UIKit for apps requiring fine control or supporting older iOS versions."
Asynchronous workflow example:
- Claude can work independently between exchanges (analyzing code, implementing fixes, gathering data)
- Return to Gemini with progress updates and additional context
- Each exchange shares only incremental information while maintaining full conversation history
- Automatically bypasses MCP's 25K token limits through incremental updates
Enhanced collaboration features:
- Cross-questioning: AIs can challenge each other's assumptions and approaches
- Coordinated problem-solving: Each AI contributes their strengths to complex problems
- Context building: Claude gathers information while Gemini provides deep analysis
- Approach validation: AIs can verify and improve each other's solutions
- Cross-tool continuation: Seamlessly continue conversations across different tools while preserving all context
- Asynchronous workflow: Conversations don't need to be sequential - Claude can work on tasks between exchanges, then return to Gemini with additional context and progress updates
- Incremental updates: Share only new information in each exchange while maintaining full conversation history
- Automatic 25K limit bypass: Each exchange sends only incremental context, allowing unlimited total conversation size
- Up to 10 exchanges per conversation (configurable via MAX_CONVERSATION_TURNS) with 3-hour expiry (configurable via CONVERSATION_TIMEOUT_HOURS)
- Thread-safe with Redis persistence across all tools
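The turn limit and expiry described above can be illustrated with an in-memory stand-in for the Redis-backed store. This is a sketch only: the environment variable names and default values come from this README, while `ThreadStore`, its methods, and the choice to fix the expiry at creation time are assumptions.

```python
import os
import time
import uuid

# Defaults follow the limits quoted above; everything else here is hypothetical.
DEFAULT_MAX_TURNS = int(os.environ.get("MAX_CONVERSATION_TURNS", "10"))
DEFAULT_TIMEOUT_HOURS = float(os.environ.get("CONVERSATION_TIMEOUT_HOURS", "3"))


class ThreadStore:
    """In-memory stand-in for the Redis-backed conversation store."""

    def __init__(self, max_turns=DEFAULT_MAX_TURNS,
                 timeout_hours=DEFAULT_TIMEOUT_HOURS, clock=time.time):
        self.max_turns = max_turns
        self.timeout_seconds = timeout_hours * 3600
        self.clock = clock
        self.threads = {}

    def create(self):
        """Start a new thread and return its id."""
        thread_id = str(uuid.uuid4())
        self.threads[thread_id] = {
            "turns": [],
            # Assumption: expiry is fixed at creation, not refreshed per turn.
            "expires_at": self.clock() + self.timeout_seconds,
        }
        return thread_id

    def add_turn(self, thread_id, tool, content):
        """Append one exchange; any tool may continue the same thread."""
        thread = self.threads.get(thread_id)
        if thread is None or self.clock() >= thread["expires_at"]:
            self.threads.pop(thread_id, None)
            raise KeyError("unknown or expired thread")
        if len(thread["turns"]) >= self.max_turns:
            raise RuntimeError("turn limit reached")
        thread["turns"].append({"tool": tool, "content": content})
        return len(thread["turns"])


store = ThreadStore()
thread_id = store.create()
store.add_turn(thread_id, "analyze", "data flow summary")
print(store.add_turn(thread_id, "codereview", "review of the same files"))  # prints 2
```

Keying turns by a shared thread id is what allows cross-tool continuation: analyze and codereview append to the same history.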
Cross-tool & Cross-Model Continuation Example:
For more advanced features like working with large prompts and dynamic context requests, see the Advanced Usage Guide.
Configuration
Auto Mode (Recommended): Set DEFAULT_MODEL=auto in your .env file and Claude will intelligently select the best model for each task.
Available Models:
- pro (Gemini 2.5 Pro): Extended thinking, deep analysis
- flash (Gemini 2.0 Flash): Ultra-fast responses
- o3: Strong logical reasoning
- o3mini: Balanced speed/quality
- o4-mini: Latest reasoning model, optimized for shorter contexts
- o4-mini-high: Enhanced O4 with higher reasoning effort
- Custom models: via OpenRouter or local APIs (Ollama, vLLM, etc.)
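As a rough illustration of how the DEFAULT_MODEL setting relates to the aliases above, the sketch below reads the value and describes what it selects. The alias descriptions are taken from this list; `describe_default_model` is hypothetical, and treating an unset variable as auto is an assumption.

```python
import os

# Alias descriptions copied from the list above.
MODEL_NOTES = {
    "pro": "Gemini 2.5 Pro, extended thinking and deep analysis",
    "flash": "Gemini 2.0 Flash, ultra-fast responses",
    "o3": "strong logical reasoning",
    "o3mini": "balanced speed/quality",
    "o4-mini": "latest reasoning model, optimized for shorter contexts",
    "o4-mini-high": "enhanced O4 with higher reasoning effort",
}


def describe_default_model(env=None):
    """Describe what the DEFAULT_MODEL setting selects."""
    env = os.environ if env is None else env
    choice = env.get("DEFAULT_MODEL", "auto")  # assumption: unset behaves like auto
    if choice == "auto":
        return "auto: Claude selects a model per task"
    note = MODEL_NOTES.get(choice, "custom model via OpenRouter or a local API")
    return f"{choice}: {note}"


print(describe_default_model({"DEFAULT_MODEL": "flash"}))
# prints "flash: Gemini 2.0 Flash, ultra-fast responses"
```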
For detailed configuration options, see the Advanced Usage Guide.
Testing
For information on running tests and contributing, see the Testing Guide.
License
Apache 2.0 License - see LICENSE file for details.
Acknowledgments
Built with the power of Multi-Model AI collaboration 🤝
- MCP (Model Context Protocol) by Anthropic
- Claude Code - Your AI coding assistant & orchestrator
- Gemini 2.5 Pro & 2.0 Flash - Extended thinking & fast analysis
- OpenAI O3 - Strong reasoning & general intelligence