Consult LLM MCP
An MCP server that lets Claude Code consult stronger AI models (o3, Gemini 2.5 Pro, DeepSeek Reasoner, GPT-5.1 Codex) when Sonnet has you running in circles and you need to bring in the heavy artillery.
Features
Query powerful AI models (o3, Gemini 2.5 Pro, Gemini 3 Pro Preview, DeepSeek Reasoner, GPT-5.1 Codex) with relevant files as context
Direct queries with optional file context
Include git changes for code review and analysis
Comprehensive logging with cost estimation
Gemini CLI mode: Use the `gemini` CLI to take advantage of free quota
Codex CLI mode: Use the `codex` CLI for OpenAI models
Web mode: Copy formatted prompts to clipboard for browser-based LLM services
Simple: provides just one MCP tool so it doesn't clutter the context
Related MCP server: MCP Claude Code
Quick start
Add to Claude Code:
```sh
claude mcp add consult-llm -e GEMINI_API_KEY=your_key -- npx -y consult-llm-mcp
```

For global availability across projects, add `--scope user`:

```sh
claude mcp add consult-llm --scope user \
  -e OPENAI_API_KEY=your_openai_key \
  -e GEMINI_API_KEY=your_gemini_key \
  -e DEEPSEEK_API_KEY=your_deepseek_key \
  -e GEMINI_MODE=cli \
  -- npx -y consult-llm-mcp
```

Verify the connection with `/mcp`:

```
❯ 1. consult-llm ✔ connected
```

Ask a question:
"Consult Gemini about how to fix the race condition in server.ts"
Example workflows
Some real-world examples. Click to expand.
Using web mode is useful when:
You want to use a free browser-based LLM service instead of API credits
You prefer a specific LLM's web interface
You want to review the full prompt before submitting it
This example shows using the /consult slash command to ask multiple LLMs
(Gemini and Codex) about the same problem in parallel and compare their
responses. Both LLMs independently arrived at the same solution, providing
confidence in the approach.
Modes
consult-llm-mcp supports three modes of operation:
| Mode | Description | When to use |
| ---- | ----------- | ----------- |
| API | Queries LLM APIs directly | You have API keys and want the simplest setup |
| CLI | Shells out to local CLI tools | Free quota (Gemini), existing subscriptions, or prefer CLI tools |
| Web | Copies prompt to clipboard | You prefer browser UIs or want to review prompts |
API mode (default)
The default mode. Requires API keys configured via environment variables. See Configuration for details.
CLI mode
Instead of making API calls, the server shells out to local CLI tools. The CLI agents can explore the codebase themselves, so you don't need to pass all relevant files as context, though doing so helps.
Gemini CLI
Use Gemini's local CLI to take advantage of Google's free quota.
Requirements:
- Install the Gemini CLI
- Authenticate via `gemini login`
Setup:
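A minimal setup sketch, assuming the same command shape as the quick start (adjust keys and flags to your environment):

```sh
claude mcp add consult-llm \
  -e GEMINI_MODE=cli \
  -- npx -y consult-llm-mcp
```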
Codex CLI
Use OpenAI's Codex CLI for OpenAI models.
Requirements:
- Install the Codex CLI
- Authenticate via `codex login`
Setup:
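Likewise, a minimal sketch assuming the quick-start command shape:

```sh
claude mcp add consult-llm \
  -e OPENAI_MODE=cli \
  -- npx -y consult-llm-mcp
```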
Set reasoning effort with `-e CODEX_REASONING_EFFORT=high`. Options: `none`, `minimal`, `low`, `medium`, `high`, `xhigh` (gpt-5.1-codex-max only).
Web mode
Copies the formatted prompt to clipboard instead of querying an LLM. Paste into any browser-based LLM (ChatGPT, Claude.ai, Gemini, etc.).
When to use: Prefer a specific web UI, want to review the prompt first, or don't have API keys.
Workflow:
1. Ask Claude to "use consult LLM with web mode"
2. The formatted prompt is copied to your clipboard
3. Paste it into your browser-based LLM
4. Paste the response back into Claude Code
See the "Using web mode..." example above for a concrete transcript.
Configuration
Environment variables
- `OPENAI_API_KEY` - Your OpenAI API key (required for OpenAI models in API mode)
- `GEMINI_API_KEY` - Your Google AI API key (required for Gemini models in API mode)
- `DEEPSEEK_API_KEY` - Your DeepSeek API key (required for DeepSeek models)
- `CONSULT_LLM_DEFAULT_MODEL` - Override the default model (optional)
  - Options: `o3` (default), `gemini-2.5-pro`, `gemini-3-pro-preview`, `deepseek-reasoner`, `gpt-5.2`, `gpt-5.1-codex-max`, `gpt-5.1-codex`, `gpt-5.1-codex-mini`, `gpt-5.1`
- `GEMINI_MODE` - Choose between API and CLI mode for Gemini models (optional)
  - Options: `api` (default), `cli`
  - CLI mode uses the system-installed `gemini` CLI tool
- `OPENAI_MODE` - Choose between API and CLI mode for OpenAI models (optional)
  - Options: `api` (default), `cli`
  - CLI mode uses the system-installed `codex` CLI tool
- `CODEX_REASONING_EFFORT` - Configure reasoning effort for the Codex CLI (optional)
  - See Codex CLI above for details and available options
- `CONSULT_LLM_ALLOWED_MODELS` - Comma-separated list of allowed models (optional), e.g. `o3,gemini-3-pro-preview`
  - If not set, all models are available in the MCP schema (see the example below)
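For example, to expose only two models to the agent (command shape taken from the quick start):

```sh
claude mcp add consult-llm \
  -e CONSULT_LLM_ALLOWED_MODELS=o3,gemini-3-pro-preview \
  -- npx -y consult-llm-mcp
```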
Custom system prompt
You can customize the system prompt used when consulting LLMs by creating a `SYSTEM_PROMPT.md` file in `~/.consult-llm-mcp/`. A placeholder file containing the default system prompt is created for you; edit it to change how the consultant LLM behaves. The custom prompt is read on every request, so changes take effect immediately without restarting the server.
To revert to the default prompt, simply delete the `SYSTEM_PROMPT.md` file.
MCP tool: consult_llm
The server provides a single tool called consult_llm for asking powerful AI
models complex questions.
Parameters
prompt (required): Your question or request for the consultant LLM
files (optional): Array of file paths to include as context
All files are added as context with file paths and code blocks
model (optional): LLM model to use
Options: `o3` (default), `gemini-2.5-pro`, `gemini-3-pro-preview`, `deepseek-reasoner`, `gpt-5.2`, `gpt-5.1-codex`, `gpt-5.1-codex-mini`, `gpt-5.1`
web_mode (optional): Copy prompt to clipboard instead of querying LLM
Default: `false`
When `true`, the formatted prompt (including system prompt and file contents) is copied to the clipboard for manual pasting into browser-based LLM services
git_diff (optional): Include git diff output as context
files (required): Specific files to include in diff
repo_path (optional): Path to git repository (defaults to current directory)
base_ref (optional): Git reference to compare against (defaults to HEAD)
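For illustration, a consult_llm tool call combining these parameters might look like the following (the paths and question are hypothetical):

```json
{
  "prompt": "How do I fix the race condition in the session cache?",
  "files": ["src/server.ts", "src/session-cache.ts"],
  "model": "gemini-2.5-pro",
  "git_diff": {
    "files": ["src/session-cache.ts"],
    "base_ref": "main"
  }
}
```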
Supported models
o3: OpenAI's reasoning model ($2/$8 per million tokens)
gemini-2.5-pro: Google's Gemini 2.5 Pro ($1.25/$10 per million tokens)
gemini-3-pro-preview: Google's Gemini 3 Pro Preview ($2/$12 per million tokens for prompts ≤200k tokens, $4/$18 for prompts >200k tokens)
deepseek-reasoner: DeepSeek's reasoning model ($0.55/$2.19 per million tokens)
gpt-5.2: OpenAI's latest GPT model
gpt-5.1-codex: OpenAI's Codex model optimized for coding
gpt-5.1-codex-mini: Lighter, faster version of gpt-5.1-codex
gpt-5.1: Broad world knowledge with strong general reasoning
Logging
All prompts and responses are logged to ~/.consult-llm-mcp/logs/mcp.log with:
Tool call parameters
Full prompts and responses
Token usage and cost estimates
Activation methods
1. No custom activation (simplest)
When you add an MCP to Claude Code, the tool's schema is injected into the agent's context. This allows Claude to infer when to call the MCP from natural language (e.g., "ask gemini about..."). Works out of the box, but you have less control over how the MCP is invoked.
2. Slash commands (most reliable)
Explicitly invoke with /consult ask gemini about X. Guaranteed activation with
full control over custom instructions, but requires the explicit syntax. For
example, you can instruct Claude to always find related files and pass them as
context via the files parameter. See the
example slash command below.
3. Skills
Automatically triggers when Claude detects matching intent. Like slash commands, supports custom instructions (e.g., always gathering relevant files), but not always reliably triggered. See the example skill below.
Recommendation: Start with no custom activation. Use slash commands if you need reliability or custom instructions.
Example skill
Here's an example Claude Code skill
that uses the consult_llm MCP tool to create commands like "ask gemini" or
"ask codex". See examples/SKILL.md for the full content.
Save it as ~/.claude/skills/consult-llm/SKILL.md and you can then use it by
typing "ask gemini about X" or "ask codex about X" in Claude Code.
This skill is not strictly necessary either; Claude (or another agent) can infer from the schema that "ask gemini" should call this MCP. It is mainly helpful when you want more precise control over how the agent calls the MCP.
Example slash command
Here's an example
Claude Code slash command that
uses the consult_llm MCP tool. See examples/consult.md
for the full content.
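A minimal sketch of what the command file might contain ($ARGUMENTS is Claude Code's placeholder for the text typed after /consult; the repository's examples/consult.md is more complete):

```markdown
Consult a stronger LLM about: $ARGUMENTS

Find the files relevant to the question and pass them via the `files`
parameter of the consult_llm MCP tool. If a specific model is named
("gemini", "codex", ...), set the `model` parameter accordingly.
Report the consultant's answer, then add your own assessment.
```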
Save it as ~/.claude/commands/consult.md and you can then use it by typing
/consult ask gemini about X or /consult ask codex about X in Claude Code.
Development
To work on the MCP server locally and use your development version:
1. Clone the repository and install dependencies:

   ```sh
   git clone https://github.com/yourusername/consult-llm-mcp.git
   cd consult-llm-mcp
   npm install
   ```

2. Build the project:

   ```sh
   npm run build
   ```

3. Install globally from the local directory:

   ```sh
   npm link
   ```

4. Add the MCP server to Claude Code using the global command:

   ```sh
   claude mcp add consult-llm -- consult-llm-mcp
   ```
Now when you make changes:
1. Rebuild: `npm run build`
2. Restart Claude Code to pick up the changes
Alternatively, you can use the dev script for development without building:
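Assuming the script is named `dev` in package.json:

```sh
npm run dev
```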
This runs the TypeScript source directly with tsx, allowing faster iteration
without rebuilding.
To unlink the global version later:
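With npm this is typically:

```sh
npm unlink -g consult-llm-mcp
```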
Related projects
workmux — Git worktrees + tmux windows for parallel AI agent workflows
claude-history — Search and view Claude Code conversation history with fzf
tmux-file-picker — Pop up fzf in tmux to quickly insert file paths, perfect for AI coding assistants