Which integrations are available for this server?

Provides an interface to the Google Gemini CLI, enabling AI agents to leverage Gemini's massive 1M+ token context window for large-scale codebase analysis, architectural reviews, and structured code editing.

Ask LLM

Release GitHub Release License: MIT

Package	Type	Version	Downloads
`@ask-llm/gemini-mcp`	MCP Server
`@ask-llm/codex-mcp`	MCP Server
`@ask-llm/claude-mcp`	MCP Server
`@ask-llm/ollama-mcp`	MCP Server
`@ask-llm/antigravity-mcp`	MCP Server
`@ask-llm/mcp`	MCP Server
`@ask-llm/plugin`	Claude Code Plugin		`/plugin install`

MCP servers + Claude Code plugin for AI-to-AI collaboration

Get a second opinion before you ship. Ask LLM lets your AI assistant — Claude Code, Codex CLI, Cursor, Claude Desktop, or any of 40+ MCP clients — consult a second model to review your code, debate a plan, or catch a bug it might have missed. Pick the reviewer that fits: OpenAI Codex (GPT-5.6 Sol → Terra), Anthropic Claude (Opus → Sonnet), Google Antigravity (agy), a local Ollama model, or Gemini (1M+ token context). Standard MCP, no prompt hacks.

⚠️ Gemini CLI goes enterprise-only on 2026-06-18: From that date Google restricts Gemini CLI to Gemini Code Assist Standard/Enterprise seats, and free, Google AI Pro, and Ultra accounts lose access. @ask-llm/gemini-mcp still installs, but a non-enterprise account then surfaces actionable guidance instead of output. Free/Pro users: switch to ask-antigravity (the Google-sanctioned successor, subscription-backed via Google AI Pro/Ultra), ask-codex, ask-claude, or ask-ollama. Announcement

Why a second opinion?

Your primary AI is confident — but confidence isn't correctness. A second model, with no stake in the first one's answer, catches what it missed.

Second opinion on code — before you commit to an approach, have another model review it independently.
Debate a plan — send an architecture proposal for critique, alternatives, and trade-off analysis.
Review a diff — have a different model analyze your changes to surface issues your primary AI glossed over.
Read more than fits — Gemini and Antigravity's large context windows ingest whole codebases at once.
Keep it local — run reviews through Ollama when nothing can leave your machine.

Related MCP server: SystemPrompt Coding Agent

In action

You:    ask codex to review src/auth.ts for security issues
Codex:  ⚠ verifyToken() compares tokens with === — not timing-safe (line 42)
        ⚠ the session cookie is missing a SameSite attribute
Claude: Good catches — applying both fixes to src/auth.ts.

One prompt. A second model reviews independently; your assistant applies the fix — no copy-paste between tools.

Quick Start

Claude Code

# All-in-one — auto-detects installed providers
claude mcp add --scope user ask-llm -- npx -y @ask-llm/mcp

claude mcp add --scope user gemini -- npx -y @ask-llm/gemini-mcp
claude mcp add --scope user codex -- npx -y @ask-llm/codex-mcp
claude mcp add --scope user ollama -- npx -y @ask-llm/ollama-mcp
claude mcp add --scope user antigravity -- npx -y @ask-llm/antigravity-mcp

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "ask-llm": {
      "command": "npx",
      "args": ["-y", "@ask-llm/mcp"]
    }
  }
}

{
  "mcpServers": {
    "gemini": {
      "command": "npx",
      "args": ["-y", "@ask-llm/gemini-mcp"]
    },
    "codex": {
      "command": "npx",
      "args": ["-y", "@ask-llm/codex-mcp"]
    },
    "ollama": {
      "command": "npx",
      "args": ["-y", "@ask-llm/ollama-mcp"]
    }
  }
}

Cursor (.cursor/mcp.json):

{
  "mcpServers": {
    "ask-llm": { "command": "npx", "args": ["-y", "@ask-llm/mcp"] }
  }
}

Codex CLI (~/.codex/config.toml):

[mcp_servers.ask-llm]
command = "npx"
args = ["-y", "@ask-llm/mcp"]

For the focused Codex → Claude second-opinion path:

codex mcp add claude -- npx -y @ask-llm/claude-mcp

Any MCP Client (STDIO transport):

{ "command": "npx", "args": ["-y", "@ask-llm/mcp"] }

Replace @ask-llm/mcp with @ask-llm/codex-mcp, @ask-llm/claude-mcp, @ask-llm/antigravity-mcp, @ask-llm/ollama-mcp, or @ask-llm/gemini-mcp for a single provider.

Migrating from the old package names

All public MCP packages now live in the @ask-llm npm organization. The old package names are deprecated, but their executable names are unchanged. Update the package argument in your MCP config; commands such as ask-codex-mcp and ask-llm-mcp doctor keep working after a global install.

Old package	Use instead
`ask-gemini-mcp`	`@ask-llm/gemini-mcp`
`ask-codex-mcp`	`@ask-llm/codex-mcp`
`@anton-lykhoyda/ask-claude-mcp`	`@ask-llm/claude-mcp`
`ask-ollama-mcp`	`@ask-llm/ollama-mcp`
`ask-antigravity-mcp`	`@ask-llm/antigravity-mcp`
`ask-llm-mcp`	`@ask-llm/mcp`

See the installation guide for the complete package-to-executable mapping.

Choose your reviewer

Provider	Best for	Model (default → fallback)	Notes
Codex	Code reasoning, targeted reviews, architecture critique	`gpt-5.6-sol` → `gpt-5.6-terra`	Requires an OpenAI/Codex account
Claude	Independent review from Codex or another non-Claude host	`opus` → `sonnet`	Claude Code CLI; native sessions; read-only tools
Antigravity	A subscription-backed second opinion; larger-context reads	`Gemini 3.1 Pro (High)` → `Gemini 3.5 Flash (High)`	Google AI Pro/Ultra plan; one-shot, experimental
Ollama	Private/local review, zero cost, offline	`qwen3.6:27b` (no auto-fallback)	Runs entirely on your machine
Gemini	Whole-codebase reads (1M+ tokens)	`gemini-3.1-pro-preview` → `gemini-3.5-flash`	⚠️ Enterprise-gated from 2026-06-18
Unified (`ask-llm`)	One install for all of the above; fan out in parallel	routes per call	Recommended

Claude Code Plugin

The Ask LLM plugin adds multi-provider code review, brainstorming, and automated hooks directly into Claude Code:

/plugin marketplace add Lykhoyda/ask-llm
/plugin install ask-llm@ask-llm-plugins

What You Get

Feature	Description
`/multi-review`	Parallel Antigravity + Codex review with 4-phase validation pipeline and consensus highlighting (gemini via `/gemini-review`)
`/gemini-review`	Gemini-only review with confidence filtering
`/codex-review`	Codex-only review with confidence filtering
`/fable-review`	Isolated, read-only review that requests the native Fable model and discloses runtime verification limits
`/sol-review`	Model-pinned GPT-5.6 Sol review through Codex
`/ollama-review`	Local review — no data leaves your machine
`/antigravity-review`	Subscription-backed review via Google Antigravity (`agy`) — experimental
`/brainstorm`	Multi-LLM brainstorm: Claude Opus researches the topic against real files in parallel with external providers (Gemini/Codex/Ollama), then synthesizes all findings with verified findings weighted higher
`/compare`	Side-by-side raw responses from multiple providers, no synthesis — for when you want to see how each provider phrases the same answer
`codex-pair` hook	Opt-in continuous review — runs Codex against every Edit/Write/MultiEdit when a `.codex-pair/context.md` marker is present in the project

The review agents use a 4-phase pipeline inspired by Anthropic's code-review plugin: context gathering, prompt construction with explicit false-positive exclusions, synthesis, and source-level validation of each finding.

See the plugin docs for details.

Prerequisites

Node.js v20.0.0 or higher (LTS)
At least one provider:
- Codex CLI — installed and authenticated
- Claude Code CLI — installed and authenticated (for Codex/other clients consulting Claude)
- Antigravity CLI (agy) — installed and logged in once (Google AI Pro/Ultra)
- Ollama — running locally with a model pulled (ollama pull qwen3.6:27b)
- Gemini CLI — npm install -g @google/gemini-cli && gemini login (enterprise-gated from 2026-06-18)

MCP Tools

Tool	Package	Purpose
`ask-gemini`	@ask-llm/gemini-mcp	Send prompts to Gemini CLI with `@` file syntax. 1M+ token context. Live progressive output via `stream-json`
`ask-gemini-edit`	@ask-llm/gemini-mcp	Get structured OLD/NEW code edit blocks from Gemini
`fetch-chunk`	@ask-llm/gemini-mcp	Retrieve chunks from cached large responses
`ask-codex`	@ask-llm/codex-mcp	Send prompts to Codex CLI. GPT-5.6 Sol with Terra fallback; optional reasoning effort; native session resume via `sessionId`
`ask-claude`	@ask-llm/claude-mcp	Send prompts to Claude Code CLI. Opus with Sonnet fallback; native sessions; Read/Glob/Grep-only workspace access
`ask-ollama`	@ask-llm/ollama-mcp	Send prompts to local Ollama. Fully private, zero cost. Server-side conversation replay via `sessionId`
`ask-antigravity`	@ask-llm/antigravity-mcp	Send a prompt to Google Antigravity (`agy`) for a subscription-backed second opinion. Experimental; one-shot
`ask-llm`	@ask-llm/mcp	Unified orchestrator — pick provider per call. Fan out to all installed providers
`multi-llm`	@ask-llm/mcp	Dispatch the same prompt to multiple providers in parallel; returns per-provider responses + usage in one call
`get-usage-stats`	all	Per-session token totals, fallback counts, breakdowns by provider/model — all in-memory, no persistence
`diagnose`	@ask-llm/mcp	Self-diagnosis: Node version, PATH resolution, provider CLI presence + versions. Read-only
`ping`	all	Connection test — verify MCP setup

All ask-* tools accept an optional sessionId parameter for multi-turn conversations and now return a structured AskResponse (provider, response, model, sessionId, usage) via MCP outputSchema alongside the human-readable text. The orchestrator (@ask-llm/mcp) also exposes usage://current-session as an MCP Resource for live JSON snapshots.

Usage Examples

ask codex to review the changes in src/auth.ts for security issues
ask claude for an independent opinion on this architecture (from Codex or another MCP client)
ask antigravity to debate this architecture plan in docs/design.md
ask ollama to explain src/config.ts (runs locally, no data sent anywhere)
ask gemini to summarize @. the current directory (1M+ context, @ is Gemini-only)
use multi-llm to compare what codex and gemini think about this approach

CLI Subcommands

The orchestrator binary (@ask-llm/mcp) supports two CLI modes alongside the default MCP server:

# Interactive multi-provider REPL — switch providers, persist sessions, see usage live
npx @ask-llm/mcp repl

# Diagnose your setup — Node version, PATH, provider CLI versions, env vars
npx @ask-llm/mcp doctor          # human-readable
npx @ask-llm/mcp doctor --json   # machine-readable, exit 1 on error

The REPL ships sessions per provider (/provider gemini, /provider codex, /new, /sessions, /usage) and inherits all the executor behavior (quota fallback, stream-json output for Gemini, native session resume).

Models

Provider	Default	Fallback
Gemini	`gemini-3.1-pro-preview`	`gemini-3.5-flash` (on quota)
Codex	`gpt-5.6-sol`	`gpt-5.6-terra` (on quota)
Claude	`opus`	`sonnet` (on overload/unavailability)
Ollama	`qwen3.6:27b`	— (local; errors if the model isn't pulled)

Gemini, Codex, and Claude automatically fall back to a lighter model on provider errors. Ollama runs locally and never substitutes a model — if the requested model isn't pulled, it returns a clear ollama pull error.

Documentation

Docs site: lykhoyda.github.io/ask-llm
AI-readable: llms.txt | llms-full.txt

Contributing

Contributions are welcome! See open issues for things to work on.

License

MIT License. See LICENSE for details.

Disclaimer: This is an unofficial, third-party tool and is not affiliated with, endorsed, or sponsored by Anthropic, Google, or OpenAI.

ask-gemini-mcp