Consult LLM MCP

An MCP server that lets Claude Code consult stronger AI models (o3, Gemini 2.5 Pro, DeepSeek Reasoner) when you need deeper analysis on complex problems.

This SQL query is timing out on large datasets. Can you help optimize it? Ask Gemini

⏺ I'll examine your query and consult Gemini for database optimization strategies.

⏺ consult-llm:consult_llm (MCP)(prompt: "How can I optimize this slow SQL query
                                for better performance on large datasets?", files: ["queries/user
                                -analytics.sql","schema/tables.sql"], model: "gemini-2.5-pro")
  ⎿ The query has significant performance bottlenecks: missing indexes on join
    columns, inefficient subqueries that could be CTEs, and unnecessary DISTINCT
    operations. Add composite indexes on (user_id, created_at), replace correlated
    subqueries with window functions, and eliminate redundant GROUP BY clauses.
    … +287 lines (ctrl+r to expand)

⏺ Great insights! I'll add the suggested indexes, refactor the subqueries to use
  CTEs with window functions, and optimize the GROUP BY operations.

Features

Query powerful AI models (o3, Gemini 2.5 Pro, DeepSeek Reasoner) with relevant files as context
Direct queries with optional file context
Include git changes for code review and analysis
Comprehensive logging with cost estimation

Configuration

OPENAI_API_KEY - Your OpenAI API key (required for o3)
GEMINI_API_KEY - Your Google AI API key (required for Gemini models)
DEEPSEEK_API_KEY - Your DeepSeek API key (required for DeepSeek models)
CONSULT_LLM_DEFAULT_MODEL - Override the default model (optional)
- Options: o3 (default), gemini-2.5-pro, deepseek-reasoner
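
As a rough illustration of how these variables fit together, the sketch below picks the default model and reports which API key it would need. The environment variable names and model identifiers come from the list above; the helper names (`resolveModel`, `missingKey`) and the map are hypothetical and are not taken from the server's source.

```typescript
// Hypothetical helpers: only the env var names and model ids come from the README.
type Model = "o3" | "gemini-2.5-pro" | "deepseek-reasoner";
type Env = Record<string, string | undefined>;

// Which API key each model needs, per the Configuration list.
const KEY_FOR_MODEL: Record<Model, string> = {
  "o3": "OPENAI_API_KEY",
  "gemini-2.5-pro": "GEMINI_API_KEY",
  "deepseek-reasoner": "DEEPSEEK_API_KEY",
};

function isModel(value: string): value is Model {
  return Object.prototype.hasOwnProperty.call(KEY_FOR_MODEL, value);
}

// Falls back to o3 when CONSULT_LLM_DEFAULT_MODEL is unset or empty.
function resolveModel(env: Env): Model {
  const requested = env.CONSULT_LLM_DEFAULT_MODEL || "o3";
  if (!isModel(requested)) {
    throw new Error(`Unsupported model: ${requested}`);
  }
  return requested;
}

// Returns the name of the missing API key variable, or null if it is set.
function missingKey(model: Model, env: Env): string | null {
  const name = KEY_FOR_MODEL[model];
  return env[name] ? null : name;
}
```

In a Node.js process you would pass `process.env` as `env`; taking it as a parameter keeps the sketch easy to test.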

Usage with Claude Code

Installation

Add the MCP server to Claude Code:

claude mcp add consult-llm -- npx -y consult-llm-mcp

Or for global availability:

claude mcp add --scope user consult-llm -- npx -y consult-llm-mcp

Optionally, you can provide environment variables directly in the MCP configuration:

claude mcp add \
  -e OPENAI_API_KEY=your_openai_key \
  -e GEMINI_API_KEY=your_gemini_key \
  -e DEEPSEEK_API_KEY=your_deepseek_key \
  consult-llm -- npx -y consult-llm-mcp

Example workflows

  <Some long explanation of the issue here>

  Consult LLM for assistance, use gemini

⏺ I see the issue! The problem is that the frontend is trying to use the hint
  for "isn't" but the server generated a hint for "isn't it" (the full phrase).
  This is a mismatch between how the frontend segments text and how the server
  generates hints.

  Let me consult the LLM to get assistance on this segmentation mismatch issue.

⏺ Write(hint-segmentation-issue.md)
  ⎿  Wrote 53 lines to hint-segmentation-issue.md
     # Hint Segmentation Mismatch Issue
     ## Problem Description
     There's a mismatch between how the frontend segments text for clickable
     words and how the backend generates hints. This causes hint unlocking to
     fail when the segments don't match exactly.
     ## Example Case
     **Original Text**: "Hello! Nice weather, isn't it?"
     … +43 lines (ctrl+r to expand)

⏺ consult-llm:consult_llm (MCP)(model: "gemini-2.5-pro", files: ["/project/hint
                                -segmentation-issue.md","/project/web/src/compo
                                nents/HintsText.tsx","/project/web/src/utils/hi
                                nts.ts","/project/server/src/routes/public/hint
                                s.ts","/project/server/src/llm/hints.ts"])
  ⎿ An excellent analysis of the problem. The root cause has been correctly
    identified: a race condition within the `handleConfirmGenerateHints`
    function in `HintsText.tsx`. The component state updates in the background
    after `await generateHints()`, but the function continues executing with a
    stale `activeSegment` variable, causing it to request a hint for a smaller
    text segment (e.g., "isn't") instead of the full, newly-available hint
    phrase (e.g., "isn't it").
    … +607 lines (ctrl+r to expand)

⏺ Perfect! The LLM identified the core issue - there's a race condition in
  handleConfirmGenerateHints where the function uses stale state after async
  operations. The solution is to re-target the hint key after generation to
  find the correct matching hint.

  Let me implement this solution:

⏺ Update Todos
  ⎿  ☐ Fix hint segmentation mismatch by implementing smarter hint key
       targeting in handleConfirmGenerateHints

MCP Tool: consult_llm

The server provides a single tool called consult_llm for asking powerful AI models complex questions.

Parameters

prompt (required): Your question or request for the consultant LLM
files (optional): Array of file paths to include as context
- All files are added as context with file paths and code blocks
model (optional): LLM model to use
- Options: o3 (default), gemini-2.5-pro, deepseek-reasoner
git_diff (optional): Include git diff output as context
- files (required): Specific files to include in diff
- repo_path (optional): Path to git repository (defaults to current directory)
- base_ref (optional): Git reference to compare against (defaults to HEAD)
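
The parameter list above can be read as the following TypeScript shape. This is a sketch of the documented inputs, not the server's actual schema: the interface names and `applyDefaults` are hypothetical, and representing "current directory" as `"."` is an assumption.

```typescript
// Hypothetical types mirroring the documented parameters.
interface GitDiffOptions {
  files: string[];     // required: specific files to include in the diff
  repo_path?: string;  // optional: defaults to the current directory
  base_ref?: string;   // optional: defaults to HEAD
}

interface ConsultLlmArgs {
  prompt: string;      // required
  files?: string[];    // optional file paths added as context
  model?: "o3" | "gemini-2.5-pro" | "deepseek-reasoner";
  git_diff?: GitDiffOptions;
}

// Fills in the defaults stated in the parameter list.
function applyDefaults(args: ConsultLlmArgs) {
  return {
    prompt: args.prompt,
    files: args.files ?? [],
    model: args.model ?? "o3",
    git_diff: args.git_diff
      ? {
          files: args.git_diff.files,
          repo_path: args.git_diff.repo_path ?? ".", // assumption: "." stands for the current directory
          base_ref: args.git_diff.base_ref ?? "HEAD",
        }
      : undefined,
  };
}
```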

Example Usage

Basic prompt:

{
  "prompt": "What are the performance implications of using async/await vs Promise.then() in Node.js?"
}

With file context:

{
  "prompt": "How can I optimize this authentication middleware?",
  "files": ["src/auth.ts", "src/middleware.ts"],
  "model": "o3"
}

With git diff:

{
  "prompt": "Review these authentication changes for security issues",
  "files": ["src/auth.ts"],
  "git_diff": {
    "files": ["src/auth.ts", "src/middleware.ts"],
    "base_ref": "main"
  },
  "model": "gemini-2.5-pro"
}
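
This README does not show how the server turns `git_diff` into a git invocation. One plausible reading is `git -C <repo_path> diff <base_ref> -- <files>`. The sketch below only builds that argument list; the function name and the exact command shape are assumptions.

```typescript
// Hypothetical: builds arguments for `git`, assuming the diff is taken
// against base_ref and limited to the listed files.
interface GitDiffRequest {
  files: string[];
  repo_path?: string;
  base_ref?: string;
}

function buildGitDiffArgs(req: GitDiffRequest): string[] {
  const repoPath = req.repo_path ?? "."; // assumption: "." for the current directory
  const baseRef = req.base_ref ?? "HEAD"; // documented default
  // "--" separates the revision from the path list so file names are never
  // mistaken for refs.
  return ["-C", repoPath, "diff", baseRef, "--", ...req.files];
}

const args = buildGitDiffArgs({
  files: ["src/auth.ts", "src/middleware.ts"],
  base_ref: "main",
});
console.log(["git", ...args].join(" "));
// prints: git -C . diff main -- src/auth.ts src/middleware.ts
```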

Supported Models

o3: OpenAI's reasoning model ($2/$8 per million tokens)
gemini-2.5-pro: Google's Gemini 2.5 Pro ($1.25/$10 per million tokens)
deepseek-reasoner: DeepSeek's reasoning model ($0.55/$2.19 per million tokens)

Logging

All prompts and responses are logged to ~/.consult-llm-mcp/logs/mcp.log with:

Tool call parameters
Full prompts and responses
Token usage and cost estimates

[2025-06-22T20:16:04.673Z] TOOL CALL: consult_llm
Arguments: {
  "files": [
    "refactor-analysis.md",
    "src/main.ts",
    "src/schema.ts",
    "src/config.ts",
    "src/llm.ts",
    "src/llm-cost.ts"
  ],
  "model": "deepseek-reasoner"
}
================================================================================
[2025-06-22T20:16:04.675Z] PROMPT (model: deepseek-reasoner):
## Relevant Files

### File: src/main.ts

...

Please provide specific suggestions for refactoring with example code structure
where helpful.
================================================================================
[2025-06-22T20:19:20.632Z] RESPONSE (model: deepseek-reasoner):
Based on the analysis, here are the key refactoring suggestions to improve
separation of concerns and maintainability:

...

This refactoring maintains all existing functionality while significantly
improving maintainability and separation of concerns. The new structure makes
it easier to add features like new LLM providers, additional context sources,
or alternative prompt formats.

Tokens: 3440 input, 5880 output | Cost: $0.014769 (input: $0.001892, output: $0.012877)
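
The cost line in the log above is consistent with the per-million-token rates listed under Supported Models. The sketch below reproduces it. The rates are copied from that list; the `estimateCost` function is a hypothetical illustration and may differ from how the server computes costs.

```typescript
// USD per million tokens, copied from the Supported Models list.
const RATES = {
  "o3": { input: 2, output: 8 },
  "gemini-2.5-pro": { input: 1.25, output: 10 },
  "deepseek-reasoner": { input: 0.55, output: 2.19 },
} as const;

type PricedModel = keyof typeof RATES;

// Hypothetical helper: tokens * rate / 1,000,000 for each direction.
function estimateCost(model: PricedModel, inputTokens: number, outputTokens: number) {
  const rate = RATES[model];
  const input = (inputTokens * rate.input) / 1e6;
  const output = (outputTokens * rate.output) / 1e6;
  return { input, output, total: input + output };
}

const cost = estimateCost("deepseek-reasoner", 3440, 5880);
console.log(
  `Cost: $${cost.total.toFixed(6)} ` +
    `(input: $${cost.input.toFixed(6)}, output: $${cost.output.toFixed(6)})`
);
// prints: Cost: $0.014769 (input: $0.001892, output: $0.012877)
```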

CLAUDE.md example

While not strictly necessary, you can help Claude Code understand when and how to use this tool by adding something like the following to your project's CLAUDE.md file:

## consult-llm-mcp

Use the `consult_llm` MCP tool to ask a more powerful AI for help with complex
problems. Include files in git_diff when asking for feedback on changes.

Use Gemini 2.5 Pro.

CRITICAL: When asking, don't present options, this will bias the answer.

That said, Claude Code seems to know fairly well when to use this MCP server even without these instructions.
