Consult7 MCP Server

Consult7 is a Model Context Protocol (MCP) server that enables AI agents to consult large context window models via OpenRouter for analyzing file collections that exceed the current agent's context limits: entire codebases, document repositories, or mixed content.

Why Consult7?

Consult7 enables any MCP-compatible agent to offload file analysis to large context models (up to 2M tokens). Useful when:

  • The agent's current context is full

  • The task requires specialized model capabilities

  • You need to analyze large codebases in a single query

  • You want to compare results from different models

"For Claude Code users, Consult7 is a game changer."

How it works

Consult7 collects files from the specific paths you provide (with optional wildcards in filenames), assembles them into a single context, and sends them to a large context window model along with your query. The model's response is fed directly back to the agent you are working with.
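As a rough illustration (not Consult7's actual internals; collect_files and the context format here are hypothetical), the collection step amounts to something like:

import glob

def collect_files(patterns: list[str]) -> str:
    """Expand each pattern (wildcards in filenames only) and join the
    files into one labeled context string."""
    parts = []
    for pattern in patterns:
        # glob.glob handles both literal paths and wildcard filenames
        for path in sorted(glob.glob(pattern)):
            with open(path, encoding="utf-8", errors="replace") as f:
                parts.append(f"=== {path} ===\n{f.read()}")
    return "\n\n".join(parts)

# The assembled context is sent to the chosen model together with the query:
context = collect_files(["/Users/john/project/src/*.py"])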

Example Use Cases

Quick codebase summary

  • Files: ["/Users/john/project/src/*.py", "/Users/john/project/lib/*.py"]

  • Query: "Summarize the architecture and main components of this Python project"

  • Model: "google/gemini-3-flash-preview"

  • Mode: "fast"

Deep analysis with reasoning

  • Files: ["/Users/john/webapp/src/*.py", "/Users/john/webapp/auth/*.py", "/Users/john/webapp/api/*.js"]

  • Query: "Analyze the authentication flow across this codebase. Think step by step about security vulnerabilities and suggest improvements"

  • Model: "anthropic/claude-sonnet-4.5"

  • Mode: "think"

Generate a report saved to file

  • Files: ["/Users/john/project/src/*.py", "/Users/john/project/tests/*.py"]

  • Query: "Generate a comprehensive code review report with architecture analysis, code quality assessment, and improvement recommendations"

  • Model: "google/gemini-2.5-pro"

  • Mode: "think"

  • Output File: "/Users/john/reports/code_review.md"

  • Result: Returns "Result has been saved to /Users/john/reports/code_review.md" instead of flooding the agent's context
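Expressed as an MCP tool call, this report example corresponds to roughly:

{
  "files": ["/Users/john/project/src/*.py", "/Users/john/project/tests/*.py"],
  "query": "Generate a comprehensive code review report with architecture analysis, code quality assessment, and improvement recommendations",
  "model": "google/gemini-2.5-pro",
  "mode": "think",
  "output_file": "/Users/john/reports/code_review.md"
}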

Consult7 supports Google's Gemini 3 family:

  • Gemini 3 Pro (google/gemini-3-pro-preview) - Flagship reasoning model, 1M context

  • Gemini 3 Flash (google/gemini-3-flash-preview) - Ultra-fast model, 1M context

Quick mnemonics for power users:

  • gemt = Gemini 3 Pro + think (flagship reasoning)

  • gemf = Gemini 3 Flash + fast (ultra fast)

  • gptt = GPT-5.2 + think (latest GPT)

  • grot = Grok 4 + think (alternative reasoning)

  • ULTRA = Run GEMT, GPTT, GROT, and OPUT in parallel (4 frontier models)

These mnemonics make it easy to reference model+mode combinations in your queries.

Installation

Claude Code

Simply run:

claude mcp add -s user consult7 uvx -- consult7 your-openrouter-api-key

Claude Desktop

Add to your Claude Desktop configuration file:

{ "mcpServers": { "consult7": { "type": "stdio", "command": "uvx", "args": ["consult7", "your-openrouter-api-key"] } } }

Replace your-openrouter-api-key with your actual OpenRouter API key.

No installation required - uvx automatically downloads and runs consult7 in an isolated environment.

Command Line Options

uvx consult7 <api-key> [--test]
  • <api-key>: Required. Your OpenRouter API key

  • --test: Optional. Test the API connection

The model and mode are specified when calling the tool, not at startup.

Supported Models

Consult7 supports all 500+ models available on OpenRouter. Below are the flagship models with optimized dynamic file size limits:

| Model                          | Context | Use Case                         |
|--------------------------------|---------|----------------------------------|
| openai/gpt-5.2                 | 400k    | Latest GPT, balanced performance |
| google/gemini-3-pro-preview    | 1M      | Flagship reasoning model         |
| google/gemini-2.5-pro          | 1M      | Best for complex analysis        |
| google/gemini-3-flash-preview  | 1M      | Gemini 3 Flash, ultra fast       |
| google/gemini-2.5-flash        | 1M      | Fast, good for most tasks        |
| anthropic/claude-sonnet-4.5    | 1M      | Excellent reasoning              |
| anthropic/claude-opus-4.5      | 200k    | Best quality, slower             |
| x-ai/grok-4                    | 256k    | Alternative reasoning model      |
| x-ai/grok-4-fast               | 2M      | Largest context window           |

Quick mnemonics:

  • gptt = openai/gpt-5.2 + think (latest GPT, deep reasoning)

  • gemt = google/gemini-3-pro-preview + think (Gemini 3 Pro, flagship reasoning)

  • grot = x-ai/grok-4 + think (Grok 4, deep reasoning)

  • oput = anthropic/claude-opus-4.5 + think (Claude Opus, deep reasoning)

  • opuf = anthropic/claude-opus-4.5 + fast (Claude Opus, no reasoning)

  • gemf = google/gemini-3-flash-preview + fast (Gemini 3 Flash, ultra fast)

  • ULTRA = call GEMT, GPTT, GROT, and OPUT in parallel (4 frontier models for maximum insight)

You can use any OpenRouter model ID (e.g., deepseek/deepseek-r1-0528). See the full model list. File size limits are automatically calculated based on each model's context window.
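The exact limit formula is internal to Consult7, but the idea can be sketched as: reserve a token budget for the query and response, then convert the remaining context window to bytes. The reserve and the 4-bytes-per-token ratio below are illustrative assumptions, not Consult7's actual numbers:

def max_upload_bytes(context_tokens: int,
                     reserved_tokens: int = 50_000,
                     bytes_per_token: float = 4.0) -> int:
    """Approximate file budget: tokens left after the reserve, in bytes."""
    return int((context_tokens - reserved_tokens) * bytes_per_token)

max_upload_bytes(2_000_000)  # ~7.8MB, near the ~8MB quoted below for Grok 4 Fast
max_upload_bytes(400_000)    # ~1.4MB, near the ~1.5MB quoted below for GPT-5.2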

Performance Modes

  • fast: No reasoning - quick answers, simple tasks

  • mid: Moderate reasoning - code reviews, bug analysis

  • think: Maximum reasoning - security audits, complex refactoring

File Specification Rules

  • Absolute paths only: /Users/john/project/src/*.py

  • Wildcards in filenames only: /Users/john/project/*.py (not in directory paths)

  • Extension required with wildcards: *.py not *

  • Mix files and patterns: ["/path/src/*.py", "/path/README.md", "/path/tests/*_test.py"]

Common patterns:

  • All Python files: /path/to/dir/*.py

  • Test files: /path/to/tests/*_test.py or /path/to/tests/test_*.py

  • Multiple extensions: ["/path/*.js", "/path/*.ts"]

Automatically ignored: __pycache__, .env, secrets.py, .DS_Store, .git, node_modules

Size limits: Dynamic based on model context window (e.g., Grok 4 Fast: ~8MB, GPT-5.2: ~1.5MB)
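A checker for these rules could look like the following sketch (validate_pattern is a hypothetical helper, not Consult7's API):

import os

def validate_pattern(spec: str) -> None:
    """Raise ValueError if a file spec violates the rules above."""
    if not os.path.isabs(spec):
        raise ValueError(f"absolute paths only: {spec}")
    directory, filename = os.path.split(spec)
    if "*" in directory:
        raise ValueError(f"wildcards are allowed in filenames only: {spec}")
    if "*" in filename and not os.path.splitext(filename)[1]:
        raise ValueError(f"wildcard patterns require an extension: {spec}")

validate_pattern("/Users/john/project/src/*.py")  # passes
# validate_pattern("/Users/john/*/src/*.py")      # would raise: wildcard in directory path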

Tool Parameters

The consultation tool accepts the following parameters:

  • files (required): List of absolute file paths or patterns with wildcards in filenames only

  • query (required): Your question or instruction for the LLM to process the files

  • model (required): The LLM model to use (see Supported Models above)

  • mode (required): Performance mode - fast, mid, or think

  • output_file (optional): Absolute path to save the response to a file instead of returning it

    • If the file exists, the response is saved with an _updated suffix instead (e.g., report.md → report_updated.md)

    • When specified, returns only: "Result has been saved to /path/to/file"

    • Useful for generating reports, documentation, or analyses without flooding the agent's context

  • zdr (optional): Enable Zero Data Retention routing (default: false)

    • When true, routes only to endpoints with ZDR policy (prompts not retained by provider)

    • ZDR available: Gemini 3 Pro/Flash, Claude Opus 4.5, GPT-5

    • Not available: GPT-5.2, Grok 4 (returns error)
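Putting these together, a call that writes a report to disk and requests ZDR routing might look like this (the paths are illustrative):

{
  "files": ["/Users/john/project/src/*.py"],
  "query": "Audit the authentication code for security issues and write up the findings",
  "model": "google/gemini-3-pro-preview",
  "mode": "think",
  "output_file": "/Users/john/reports/security_audit.md",
  "zdr": true
}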

Usage Examples

Via MCP in Claude Code

Claude Code will automatically use the tool with proper parameters:

{ "files": ["/Users/john/project/src/*.py"], "query": "Explain the main architecture", "model": "google/gemini-3-flash-preview", "mode": "fast" }

Via Python API

from consult7.consultation import consultation_impl

result = await consultation_impl(
    files=["/path/to/file.py"],
    query="Explain this code",
    model="google/gemini-3-flash-preview",
    mode="fast",  # fast, mid, or think
    provider="openrouter",
    api_key="sk-or-v1-...",
)
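consultation_impl is a coroutine, so outside an existing event loop you would drive it with asyncio.run:

import asyncio
from consult7.consultation import consultation_impl

result = asyncio.run(consultation_impl(
    files=["/path/to/file.py"],
    query="Explain this code",
    model="google/gemini-3-flash-preview",
    mode="fast",
    provider="openrouter",
    api_key="sk-or-v1-...",
))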

Testing

# Test OpenRouter connection
uvx consult7 sk-or-v1-your-api-key --test

Uninstalling

To remove consult7 from Claude Code:

claude mcp remove consult7 -s user

Version History

v3.3.0

  • Fixed GPT-5.2 thinking mode truncation issue (switched to streaming)

  • Added google/gemini-3-flash-preview (Gemini 3 Flash, ultra fast)

  • Updated gemf mnemonic to use Gemini 3 Flash

  • Added zdr parameter for Zero Data Retention routing

v3.2.0

  • Updated to GPT-5.2 with effort-based reasoning

v3.1.0

  • Added google/gemini-3-pro-preview (1M context, flagship reasoning model)

  • New mnemonics: gemt (Gemini 3 Pro), grot (Grok 4), ULTRA (parallel execution)

v3.0.0

  • Removed Google and OpenAI direct providers - now OpenRouter only

  • Removed |thinking suffix - use mode parameter instead (now required)

  • Clean mode parameter API: fast, mid, think

  • Simplified CLI from consult7 <provider> <key> to consult7 <key>

  • Better MCP integration with enum validation for modes

  • Dynamic file size limits based on model context window

v2.1.0

  • Added output_file parameter to save responses to files

v2.0.0

  • New file list interface with simplified validation

  • Reduced file size limits to realistic values

License

MIT
