Deep Research MCP
Integrates with Ollama's local inference server via an OpenAI-compatible endpoint for running research models locally.
Integrates with OpenAI's Responses API for web search and code interpreter, or Chat Completions API for broad provider compatibility, enabling deep research tasks.
Integrates with Perplexity's Sonar Deep Research model via the OpenAI-compatible Chat Completions endpoint for web-connected research.
A Python-based agent that integrates research providers with Claude Code through the Model Context Protocol (MCP). It supports OpenAI (Responses API with web search and code interpreter, or Chat Completions API for broad provider compatibility), Gemini Deep Research via the Interactions API, Allen AI's DR-Tulu research agent, and the open-source Open Deep Research stack (based on smolagents).
Prerequisites
Python 3.11+
uv installed
One of:
OpenAI API access (Responses API models, e.g., o4-mini-deep-research-2025-06-26)
Gemini API access with the Interactions API / Deep Research agent enabled
DR-Tulu service running locally or remotely (see DR-Tulu setup)
Open Deep Research dependencies (installed via uv sync --extra open-deep-research)
Claude Code, or any other assistant supporting MCP integration
Installation
Recommended setup (resolves the latest compatible versions):
# Install runtime dependencies + project in editable mode
uv sync --upgrade
# Development tooling (pytest, black, pylint, mypy, pre-commit)
uv sync --upgrade --extra dev
# Enable the pre-commit hook so black runs automatically before each commit
uv run pre-commit install
# Optional docs tooling
uv sync --upgrade --extra docs
# Optional Open Deep Research provider dependencies
uv sync --upgrade --extra open-deep-research
Compatibility setup (pip-based):
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install -e .
Code Layout
src/deep_research_mcp/agent.py: orchestration layer; owns clarification, instruction building, callbacks, and delegates provider work to backends
src/deep_research_mcp/backends/: provider-specific implementations for OpenAI, Gemini, DR-Tulu, and Open Deep Research
src/deep_research_mcp/mcp_server.py: FastMCP server and tool entrypoints
src/deep_research_mcp/clarification.py: clarification agents, sessions, and enrichment flow
src/deep_research_mcp/prompts/: YAML prompt templates used by clarification and instruction building
cli/deep-research-cli.py: unified CLI for agent mode, MCP client mode, and configuration viewing
cli/deep-research-tui.py: interactive full-screen terminal UI for clarification, research, status checks, and saving output to disk
tests/: pytest suite covering configuration, MCP integration, prompts, results, and clarification flows
Configuration
Configuration File
Create a ~/.deep_research file in your home directory using TOML format.
Library note: ResearchConfig.load() explicitly reads this file and applies environment variable overrides. ResearchConfig.from_env() reads environment variables only.
Common settings:
[research] # Core Deep Research functionality
provider = "openai" # Available options: "openai", "dr-tulu", "gemini", "open-deep-research" -- defaults to "openai"
api_style = "responses" # Only applies to provider="openai"; use "chat_completions" for Perplexity, Groq, Ollama, etc.
model = "o4-mini-deep-research-2025-06-26" # OpenAI: model identifier; Dr Tulu: logical provider id; Gemini: agent id; ODR: LiteLLM model identifier
api_key = "your-api-key" # API key, optional
base_url = "https://api.openai.com/v1" # OpenAI: OpenAI-compatible endpoint; Dr Tulu: service base URL; Gemini: https://generativelanguage.googleapis.com; ODR: LiteLLM-compatible endpoint
# Task behavior
timeout = 1800
poll_interval = 30
# Largely based on https://cookbook.openai.com/examples/deep_research_api/introduction_to_deep_research_api_agents
[clarification] # Optional query clarification component
enable = true
triage_model = "gpt-5-mini"
clarifier_model = "gpt-5-mini"
instruction_builder_model = "gpt-5-mini"
api_key = "YOUR_OPENAI_API_KEY" # Optional, overrides api_key
base_url = "https://api.openai.com/v1" # Optional, overrides base_url
[logging]
level = "INFO"
OpenAI provider example:
[research]
provider = "openai"
model = "o4-mini-deep-research-2025-06-26" # OpenAI model
api_key = "YOUR_OPENAI_API_KEY" # Defaults to OPENAI_API_KEY
base_url = "https://api.openai.com/v1" # OpenAI-compatible endpoint
timeout = 1800
poll_interval = 30
Gemini Deep Research provider example:
[research]
provider = "gemini"
model = "deep-research-preview-04-2026" # Gemini Deep Research agent id
api_key = "YOUR_GEMINI_API_KEY" # Defaults to GEMINI_API_KEY or GOOGLE_API_KEY
base_url = "https://generativelanguage.googleapis.com"
timeout = 1800
poll_interval = 30
Dr Tulu provider example:
[research]
provider = "dr-tulu"
model = "dr-tulu" # Logical provider model id; currently informational
base_url = "http://localhost:8080/" # Dr Tulu service base URL; the backend calls /chat
api_key = "" # Optional; defaults to RESEARCH_API_KEY / DR_TULU_API_KEY if set
timeout = 1800
poll_interval = 30
Running Dr Tulu locally:
Clone and configure dr-tulu separately.
In dr-tulu/agent/.env, set at least:
SERPER_API_KEY
JINA_API_KEY
S2_API_KEY (optional but recommended)
Start the DR-Tulu model server:
cd /path/to/dr-tulu/agent
conda run -n vllm bash -lc '
CUDA_VISIBLE_DEVICES=0 \
vllm serve rl-research/DR-Tulu-8B \
--port 30001 \
--dtype auto \
--max-model-len 16384 \
--gpu-memory-utilization 0.60 \
--enforce-eager
'
Start the Dr Tulu MCP backend:
cd /path/to/dr-tulu/agent
conda run -n vllm python -m dr_agent.mcp_backend.main --port 8000
Start the Dr Tulu app service:
cd /path/to/dr-tulu/agent
conda run -n vllm python workflows/auto_search_sft.py serve \
--port 8080 \
--config workflows/auto_search_sft.yaml \
--config-overrides "search_agent_max_tokens=12000,browse_agent_max_tokens=12000"
Point deep-research-mcp at that service:
[research]
provider = "dr-tulu"
base_url = "http://localhost:8080/"
timeout = 1800
The dr-tulu backend calls POST {base_url}/chat, so if you run Dr Tulu on a different host or port, or behind a reverse proxy, update base_url accordingly.
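To make the {base_url}/chat relationship concrete, here is a minimal sketch of how a base URL could be turned into the chat endpoint. The helper chat_endpoint is hypothetical, and whether the real backend strips a trailing slash in this way is an assumption; the sketch only shows why a base_url with or without a trailing slash can resolve to the same path.

```python
def chat_endpoint(base_url: str) -> str:
    """Join a service base URL and the /chat path without doubling slashes.

    Hypothetical helper; the actual dr-tulu backend may build the URL differently.
    """
    return base_url.rstrip("/") + "/chat"


# Local default from the example configuration
print(chat_endpoint("http://localhost:8080/"))
# A reverse-proxy deployment under a path prefix (illustrative URL)
print(chat_endpoint("https://proxy.example.com/tulu"))
```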
Perplexity (via Sonar Deep Research and Perplexity's OpenAI-compatible endpoint) provider example:
[research]
provider = "openai"
api_style = "chat_completions" # Required for Perplexity (no Responses API)
model = "sonar-deep-research" # Perplexity's Sonar Deep Research
api_key = "ppl-..." # Defaults to OPENAI_API_KEY
base_url = "https://api.perplexity.ai" # Perplexity's OpenAI-compatible endpoint
timeout = 1800
Open Deep Research provider example:
[research]
provider = "open-deep-research"
model = "openai/qwen/qwen3-coder-30b" # LiteLLM-compatible model id
base_url = "http://localhost:1234/v1" # LiteLLM-compatible endpoint (local or remote)
api_key = "" # Optional if endpoint requires it
timeout = 1800
Ollama (local) provider example:
[research]
provider = "openai"
api_style = "chat_completions"
model = "llama3.1" # Any model available in your Ollama instance
base_url = "http://localhost:11434/v1" # Ollama's OpenAI-compatible endpoint
api_key = "" # Not required for local Ollama
timeout = 600
llama-server (local llama.cpp server) provider example:
[research]
provider = "openai"
api_style = "chat_completions"
model = "qwen2.5-0.5b" # Must match the --alias passed to llama-server
base_url = "http://127.0.0.1:8081/v1" # llama-server OpenAI-compatible endpoint
api_key = "test" # Must match the --api-key passed to llama-server
timeout = 600
Generic OpenAI-compatible Chat Completions provider (Groq, Together AI, vLLM, etc.):
[research]
provider = "openai"
api_style = "chat_completions"
model = "your-model-name"
api_key = "your-api-key"
base_url = "https://api.your-provider.com/v1"
timeout = 600
Optional environment variables for Open Deep Research tools:
SERPAPI_API_KEY or SERPER_API_KEY: enable Google-style search
HF_TOKEN: optional, logs into Hugging Face Hub for gated models
Install The Included Skill In Claude Code Or Codex
This repository also ships a repo-specific skill guide at
skills/deep-research-mcp/SKILL.md.
Installing that file as a local skill gives Claude Code or Codex a focused
playbook for this repository's CLI, Python API, providers, and MCP server.
It does not install Python dependencies, start the MCP server, or replace
the provider configuration in ~/.deep_research. Treat it as complementary to
the MCP setup below, not a substitute for it.
Claude Code skill install
Claude Code's skills docs use these locations:
personal skill: ~/.claude/skills/<skill-name>/SKILL.md
project skill: .claude/skills/<skill-name>/SKILL.md
Personal install:
mkdir -p ~/.claude/skills/deep-research-mcp
cp /path/to/deep-research-mcp/skills/deep-research-mcp/SKILL.md \
~/.claude/skills/deep-research-mcp/SKILL.md
If you prefer to keep the installed skill linked to this checkout:
mkdir -p ~/.claude/skills/deep-research-mcp
ln -sf /path/to/deep-research-mcp/skills/deep-research-mcp/SKILL.md \
~/.claude/skills/deep-research-mcp/SKILL.md
Project-local install from the repository root:
mkdir -p .claude/skills/deep-research-mcp
cp skills/deep-research-mcp/SKILL.md .claude/skills/deep-research-mcp/SKILL.md
After that, restart Claude Code or open a new session if the skill does not
appear immediately. The skill can be invoked directly as
/deep-research-mcp, and Claude can also load it automatically when the task
matches the skill description. For a reusable shared extension, package the
same skill into a Claude Code plugin instead of copying it by hand.
Official docs:
Claude Code skills: https://docs.claude.com/en/docs/claude-code/skills
Claude Code plugins: https://docs.claude.com/en/docs/claude-code/plugins
Codex skill install
Current Codex docs describe skills as skill folders containing SKILL.md,
stored globally in $HOME/.agents/skills or repo-locally in .agents/skills.
Personal install:
mkdir -p ~/.agents/skills/deep-research-mcp
cp /path/to/deep-research-mcp/skills/deep-research-mcp/SKILL.md \
~/.agents/skills/deep-research-mcp/SKILL.md
Or keep the installed skill linked to this checkout:
mkdir -p ~/.agents/skills/deep-research-mcp
ln -sf /path/to/deep-research-mcp/skills/deep-research-mcp/SKILL.md \
~/.agents/skills/deep-research-mcp/SKILL.md
Project-local install from the repository root:
mkdir -p .agents/skills/deep-research-mcp
cp skills/deep-research-mcp/SKILL.md .agents/skills/deep-research-mcp/SKILL.md
If your Codex setup still follows older CLI conventions that use
~/.codex/skills or .codex/skills, mirror the same directory structure
there instead.
Codex can invoke the skill implicitly when the task matches its description, or
explicitly by mentioning $deep-research-mcp in the prompt. If the skill does
not show up immediately, restart Codex.
Official docs:
Codex skills: https://developers.openai.com/codex/skills
Codex customization and skill locations: https://developers.openai.com/codex/concepts/customization#skills
Claude Code Integration
Configure MCP Server
Choose one of the transports below.
Option A: stdio (recommended when Claude Code should spawn the server itself)
If your provider credentials are already stored in ~/.deep_research, the
minimal setup is:
claude mcp add deep-research -- uv run --directory /path/to/deep-research-mcp deep-research-mcp
If you want Claude Code to pass OPENAI_API_KEY through to the spawned MCP
process explicitly, use:
claude mcp add -e OPENAI_API_KEY="$OPENAI_API_KEY" \
deep-research -- \
uv run --directory /path/to/deep-research-mcp deep-research-mcp
Option B: HTTP (recommended when you want to run the server separately)
Start the server in one terminal:
OPENAI_API_KEY="$OPENAI_API_KEY" \
uv run --directory /path/to/deep-research-mcp \
deep-research-mcp --transport http --host 127.0.0.1 --port 8080
Then add the HTTP MCP server in Claude Code:
claude mcp add --transport http deep-research-http http://127.0.0.1:8080/mcp
Replace /path/to/deep-research-mcp/ with the actual path to your cloned repository.
The verified Streamable HTTP endpoint is http://127.0.0.1:8080/mcp.
For multi-hour research, raise Claude Code's tool timeout before launching the CLI and rely on incremental status polls:
export MCP_TOOL_TIMEOUT=14400000 # 4 hours
claude --mcp-config ./.mcp.json
Kick off work with deep_research or research_with_context, note the returned job ID, and call research_status to check progress, so that no single tool call stays open long enough to hit the timeout.
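The kick-off-then-poll pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: poll_until_done is a hypothetical helper, the status strings "completed" and "failed" are assumed rather than taken from the server, and check_status stands in for whatever call your client makes to research_status. The defaults reuse the timeout and poll_interval values from the configuration examples.

```python
import time
from typing import Callable


def poll_until_done(
    check_status: Callable[[], str],
    poll_interval: float = 30.0,
    timeout: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call check_status until it reports a terminal state or the timeout elapses.

    Each call is short; the waiting happens between calls, not inside one.
    """
    waited = 0.0
    while True:
        status = check_status()
        if status in ("completed", "failed"):  # assumed terminal states
            return status
        if waited >= timeout:
            return "timed_out"
        sleep(poll_interval)
        waited += poll_interval


# Stand-in for repeated research_status calls against a job ID.
fake_statuses = iter(["running", "running", "completed"])
result = poll_until_done(lambda: next(fake_statuses), sleep=lambda _seconds: None)
print(result)
```

Injecting the sleep function keeps the loop easy to exercise without real delays; in practice the default time.sleep would be used.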
Use in Claude Code:
The research tools will appear in Claude Code's tool palette
Simply ask Claude to "research [your topic]" and it will use the Deep Research agent
For clarified research, ask Claude to "research [topic] with clarification" to get follow-up questions
OpenAI Codex Integration
Configure MCP Server
Choose one of the transports below.
Option A: stdio (recommended when Codex should spawn the server itself)
Add the MCP server configuration to your ~/.codex/config.toml file:
[mcp_servers.deep-research]
command = "uv"
args = ["run", "--directory", "/path/to/deep-research-mcp", "deep-research-mcp"]
# If your provider credentials live in shell env vars rather than ~/.deep_research,
# pass them through to the MCP subprocess explicitly:
env_vars = ["OPENAI_API_KEY"]
startup_timeout_ms = 30000 # 30 seconds for server startup
request_timeout_ms = 7200000 # 2 hours for long-running research tasks
# Alternatively, set tool_timeout_sec when using newer Codex clients
# tool_timeout_sec = 14400.0 # 4 hours for deep research runs
Replace /path/to/deep-research-mcp/ with the actual path to your cloned repository.
If your credentials are already configured in ~/.deep_research, env_vars is
optional. It is required when you expect the spawned MCP server to inherit
OPENAI_API_KEY from the parent shell.
Option B: HTTP (recommended when you want to run the server separately)
Start the server in one terminal:
OPENAI_API_KEY="$OPENAI_API_KEY" \
uv run --directory /path/to/deep-research-mcp \
deep-research-mcp --transport http --host 127.0.0.1 --port 8080
Then add this to ~/.codex/config.toml:
[mcp_servers.deep-research-http]
url = "http://127.0.0.1:8080/mcp"
tool_timeout_sec = 14400.0
The verified Streamable HTTP endpoint is http://127.0.0.1:8080/mcp.
Important timeout configuration:
startup_timeout_ms: Time allowed for the MCP server to start (default: 30000ms / 30 seconds)
request_timeout_ms: Maximum time for research queries to complete (recommended: 7200000ms / 2 hours for comprehensive research)
tool_timeout_sec: Preferred for newer Codex clients; set this to a large value (e.g., 14400.0 for 4 hours) when you expect long-running research.
Kick off research once to capture the job ID, then poll research_status so each tool call remains short and avoids hitting client timeouts.
Without proper timeout configuration, long-running research queries may fail with "request timed out" errors.
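Because the timeout settings mix milliseconds and seconds, a small conversion helper can guard against an off-by-a-thousand mistake. The sketch below is illustrative only; hours_to_ms and hours_to_sec are hypothetical names, and the printed values reproduce the figures quoted in this README.

```python
def hours_to_ms(hours: float) -> int:
    """Convert hours to milliseconds (for *_timeout_ms and MCP_TOOL_TIMEOUT)."""
    return int(hours * 60 * 60 * 1000)


def hours_to_sec(hours: float) -> float:
    """Convert hours to seconds (for tool_timeout_sec)."""
    return float(hours * 60 * 60)


print(hours_to_ms(2))   # 7200000  -> request_timeout_ms for 2 hours
print(hours_to_ms(4))   # 14400000 -> MCP_TOOL_TIMEOUT for 4 hours
print(hours_to_sec(4))  # 14400.0  -> tool_timeout_sec for 4 hours
```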
Use in OpenAI Codex:
The research tools will be available automatically when you start Codex
Ask Codex to "research [your topic]" and it will use the Deep Research MCP server
For clarified research, ask for "research [topic] with clarification"
Gemini CLI Integration
Configure MCP Server
Add the MCP server using Gemini CLI's built-in command:
gemini mcp add deep-research -- uv run --directory /path/to/deep-research-mcp deep-research-mcp
Or manually add to your ~/.gemini/settings.json file:
{
  "mcpServers": {
    "deep-research": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/deep-research-mcp", "deep-research-mcp"],
      "env": {
        "RESEARCH_PROVIDER": "gemini",
        "GEMINI_API_KEY": "$GEMINI_API_KEY"
      }
    }
  }
}
Replace /path/to/deep-research-mcp/ with the actual path to your cloned repository.
Use in Gemini CLI:
Start Gemini CLI with gemini
The research tools will be available automatically
Ask Gemini to "research [your topic]" and it will use the Deep Research MCP server
Use the /mcp command to view server status and available tools
HTTP transport: If your Gemini environment supports MCP-over-HTTP, you may run
the server with --transport http and configure Gemini with the server URL.
Usage
As a Standalone Python Module
import asyncio
from deep_research_mcp.agent import DeepResearchAgent
from deep_research_mcp.config import ResearchConfig
async def main():
    # Initialize configuration
    config = ResearchConfig.load()
    # Create agent
    agent = DeepResearchAgent(config)
    # Perform research
    result = await agent.research(
        query="What are the latest advances in quantum computing?",
        system_prompt="Focus on practical applications and recent breakthroughs"
    )
    # Print results
    print(f"Report: {result.final_report}")
    print(f"Citations: {result.citations}")
    print(f"Research steps: {result.reasoning_steps}")
    print(f"Execution time: {result.execution_time:.2f}s")
# Run the research
asyncio.run(main())
As an MCP Server
Two transports are supported: stdio (default) and HTTP streaming.
# 1) stdio (default) — for editors/CLIs that spawn a local process
uv run deep-research-mcp
# 2) HTTP streaming — start a local HTTP MCP server
uv run deep-research-mcp --transport http --host 127.0.0.1 --port 8080
Notes:
HTTP mode uses streaming responses provided by FastMCP. The tools in this server return their full results when a research task completes; streaming is still beneficial for compatible clients and for future incremental outputs.
The verified Streamable HTTP endpoint is /mcp, so the default local URL is http://127.0.0.1:8080/mcp.
If you start the server outside the client and rely on environment variables for credentials, export them before launching the server process. If you use stdio and let the client spawn the server, make sure the client passes the required env vars through.
Command-Line Interface
The unified CLI at cli/deep-research-cli.py provides direct access to all
research functionality from the terminal. It supports two modes of operation:
agent mode (default) which runs DeepResearchAgent directly, and MCP
client mode which connects to a running MCP server over HTTP.
Configuration is loaded from ~/.deep_research by default. Every
ResearchConfig parameter can be overridden via CLI flags, which take
precedence over both the TOML file and environment variables.
Terminal UI
The repository also ships with a full-screen terminal UI at
cli/deep-research-tui.py. It presents the same core functionality as the
CLI in a dark, keyboard-driven interface for running clarification, deep
research, task status checks, and saving output to disk.

The TUI features a split-panel layout:
Left panel: Configuration controls for mode selection (Agent/MCP), provider settings, model configuration, query input, and system prompt
Right panel: Output display showing research results, clarification questions, or status information
The animation above demonstrates the TUI workflow: selecting the Chat Completions API style, entering a research query about nuclear fusion, running the research, viewing results in the output panel, and saving the output to file.
Quick Start
# Start in direct agent mode
uv run python cli/deep-research-tui.py
# Start in MCP client mode
uv run python cli/deep-research-tui.py --mode mcp \
--server-url http://127.0.0.1:8080/mcp
# Start with Gemini selected
uv run python cli/deep-research-tui.py --provider gemini
TUI Workflow
Use the left control panel to edit provider settings, query text, system prompt, and save path.
The TUI starts focused on the Mode selector rather than inside the query editor.
Use Tab/Shift+Tab to move through all controls.
Use Up/Down to move between non-editor controls, including single-line inputs.
Use Left/Right to toggle booleans and cycle through choice fields when a Switch or Select has focus.
Press Enter to activate buttons, toggle switches, cycle selects, or move forward from a single-line input.
TextArea widgets such as Query and System Prompt keep normal cursor-key editing behavior.
Use c to run clarification, r to run deep research, t to check task status, s to save the current output, and q to quit.
The right panel shows the latest clarification output, research report, or status response.
Provider Defaults
In agent mode, the TUI applies provider-aware defaults:
openai + responses: model o4-mini-deep-research-2025-06-26, base URL https://api.openai.com/v1
openai + chat_completions: model gpt-5-mini, base URL https://api.openai.com/v1
dr-tulu: model dr-tulu, base URL http://localhost:8080/
gemini: model deep-research-preview-04-2026, base URL https://generativelanguage.googleapis.com
open-deep-research: model openai/qwen/qwen3-coder-30b, base URL http://localhost:1234/v1
Switching provider or OpenAI API style automatically refreshes the model and base URL defaults. You can still override those fields manually afterward.
Clarification, Research, and Saving Output
In agent mode, Run Clarification calls the clarification flow directly through DeepResearchAgent.
In mcp mode, Run Clarification calls the MCP deep_research tool with request_clarification=true.
If clarification questions are returned, the TUI adds answer fields dynamically and uses those answers on the next research run.
Run Deep Research executes either the direct agent flow or the MCP client flow, depending on the selected mode.
Save Output writes the current contents of the output panel to the configured path, creating parent directories if needed.
Notes
In mcp mode, set MCP Server URL to a running Streamable HTTP endpoint such as http://127.0.0.1:8080/mcp.
The TUI reuses the same config-loading behavior as cli/deep-research-cli.py, so ~/.deep_research and any startup overrides still apply.
Direct-agent JSON rendering is available through the JSON Output toggle; MCP mode saves the textual tool response exactly as returned by the server.
The current layout is designed for terminals at least 100 columns wide and 28 rows tall.
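If you want to check the terminal size before launching the TUI, a short standard-library sketch is enough. The function name fits_tui is hypothetical, and the TUI itself is not known to perform this check; the thresholds are the 100-column and 28-row figures from the note above.

```python
import shutil

# Minimum layout size quoted in the notes above.
MIN_COLUMNS, MIN_ROWS = 100, 28


def fits_tui(columns: int, rows: int) -> bool:
    """Return True when a terminal of the given size meets the stated minimum."""
    return columns >= MIN_COLUMNS and rows >= MIN_ROWS


# shutil falls back to 80x24 when the size cannot be detected.
size = shutil.get_terminal_size()
print(f"{size.columns}x{size.lines} fits: {fits_tui(size.columns, size.lines)}")
```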
Quick Start
Gemini Deep Research:
uv run python cli/deep-research-cli.py \
--provider gemini \
--model deep-research-preview-04-2026 \
--base-url https://generativelanguage.googleapis.com \
research "What is the capital of France?"
Expected output (snippet):
============================================================
RESEARCH REPORT
============================================================
Task ID: v1_Chd5YUxjYVpXRUotYWxrZFVQOF9lcTRRVRIX...
Total steps: 1
Execution time: 245.94s
# Comprehensive Analysis of the French Capital: Demographic,
# Economic, and Historical Dimensions of Paris
**Key Points**
* **The capital of France is Paris**, acting as the undisputed
political, economic, and cultural epicenter of the nation.
* Data indicates that the Paris metropolitan area commands a
Gross Domestic Product (GDP) exceeding $1.03 trillion ...
OpenAI o4-mini-deep-research:
uv run python cli/deep-research-cli.py \
--provider openai \
--model o4-mini-deep-research-2025-06-26 \
--base-url https://api.openai.com/v1 \
research "What is the capital of France?"
Expected output:
============================================================
RESEARCH REPORT
============================================================
Task ID: resp_0e55abdae3d3ce390069dca3c7d78c819f91...
Total steps: 22
Search queries: 10
Citations: 1
Execution time: 34.34s
The capital of France is **Paris**
(www.britannica.com/place/Paris)
============================================================
CITATIONS
============================================================
1. Paris | Definition, Map, Population, Facts, & History
https://www.britannica.com/place/Paris
DR-Tulu (requires a running dr-tulu service; see Dr Tulu provider example):
uv run python cli/deep-research-cli.py \
--provider dr-tulu \
--base-url http://localhost:8080/ \
research "What is the capital of France?"
Expected output:
============================================================
RESEARCH REPORT
============================================================
Task ID: 7f3a1b2c-...
Total steps: 4
Citations: 2
Execution time: 18.72s
The capital of France is Paris. Paris has served as the
French capital since the late 10th century ...
============================================================
CITATIONS
============================================================
1. Source 1
https://en.wikipedia.org/wiki/Paris
2. Source 2
https://www.britannica.com/place/Paris
View resolved configuration or all available options:
uv run python cli/deep-research-cli.py config --pretty
uv run python cli/deep-research-cli.py --help
Commands
research QUERY -- perform deep research on a query.
# Simple research (agent mode)
uv run python cli/deep-research-cli.py research "Economic impact of AI adoption"
# Override provider and model for a single run
uv run python cli/deep-research-cli.py --provider gemini research "Climate change policies"
# Use a custom system prompt from a file
uv run python cli/deep-research-cli.py research "Healthcare trends" --system-prompt-file prompts/health.txt
# Or pass the system prompt inline
uv run python cli/deep-research-cli.py research "Healthcare trends" --system-prompt "Focus on peer-reviewed sources only"
# Output as JSON (includes metadata, citations, execution time)
uv run python cli/deep-research-cli.py research "AI safety" --json
# Save the report to a file
uv run python cli/deep-research-cli.py research "Renewable energy" --output-file report.md
# Disable code interpreter / data analysis tools
uv run python cli/deep-research-cli.py research "Simple topic" --no-analysis
# Notify a webhook when research completes
uv run python cli/deep-research-cli.py research "Long query" --callback-url https://example.com/webhook
Tested local OpenAI-compatible backends
The unified CLI works with local servers that expose an OpenAI-compatible
Chat Completions API. The commands below were tested against local Ollama and
local llama-server (from llama.cpp).
Ollama
Basic research flow:
uv run python cli/deep-research-cli.py \
--provider openai \
--api-style chat_completions \
--base-url http://localhost:11434/v1 \
--api-key test \
--model qwen3.5:0.8b \
--timeout 180 \
research "Reply with exactly: ok"
Observed output:
HTTP Request: POST http://localhost:11434/v1/chat/completions "HTTP/1.1 200 OK"
============================================================
RESEARCH REPORT
============================================================
Task ID: chatcmpl-215
Total steps: 1
Execution time: 14.33s
ok
Interactive clarification with Ollama needs the clarification models pinned to
the same local endpoint. In testing, qwen3.5:4b worked for clarification,
while qwen3.5:0.8b was too small to reliably satisfy the structured triage
step.
uv run python cli/deep-research-cli.py \
--provider openai \
--api-style chat_completions \
--base-url http://localhost:11434/v1 \
--api-key test \
--model qwen3.5:0.8b \
--clarification-base-url http://localhost:11434/v1 \
--clarification-api-key test \
--triage-model qwen3.5:4b \
--clarifier-model qwen3.5:4b \
--instruction-builder-model qwen3.5:4b \
--timeout 180 \
research "best laptop" --clarify
Observed interaction:
Starting clarification process...
Please answer the following clarifying questions:
1. What is your budget range?
Your answer (or press Enter to skip): Under $1500
2. What will you primarily use the laptop for (gaming, work, students, creative tasks)?
Your answer (or press Enter to skip): Programming and general work
3. Do you have a preferred operating system (macOS, Windows,)?
Your answer (or press Enter to skip): macOS preferred, Windows acceptable
...
Enriched query: What are the best laptops under $1500 for professional programming work, preferably macOS with 16GB RAM, 512GB SSD, 13-15 inch display, and good battery life?
llama-server
Start a local OpenAI-compatible server and let it download a small GGUF model from Hugging Face automatically:
llama-server \
--host 127.0.0.1 \
--port 8081 \
--api-key test \
--ctx-size 4096 \
--alias qwen2.5-0.5b \
-hf Qwen/Qwen2.5-0.5B-Instruct-GGUF:Q4_K_M
Then point the CLI at the server:
uv run python cli/deep-research-cli.py \
--provider openai \
--api-style chat_completions \
--base-url http://127.0.0.1:8081/v1 \
--api-key test \
--model qwen2.5-0.5b \
--timeout 120 \
research "Reply with exactly: ok"
Observed output:
HTTP Request: POST http://127.0.0.1:8081/v1/chat/completions "HTTP/1.1 200 OK"
============================================================
RESEARCH REPORT
============================================================
Task ID: chatcmpl-uzqSNDzRgYchZRxn1Kq2MNYyc2w6hpc6
Total steps: 1
Execution time: 0.15s
Ok
--clarify is model-sensitive on llama-server. Small tested models
(qwen2.5-0.5b and qwen2.5-3b) completed the basic research command but
did not reliably ask follow-up questions. Example output with
qwen2.5-3b:
Starting clarification process...
Triage assessment: The query is focused on finding the best laptop, which is a common and specific research topic.
Assessment: The query 'best laptop' is clear and specific enough for direct research.
Proceeding with original query
research QUERY --clarify -- interactive clarification before research.
The --clarify flag runs an interactive clarification flow: the agent
analyzes your query, asks follow-up questions to improve specificity, and
then performs research using an enriched query. This works in both agent mode
and MCP client mode.
In agent mode, --clarify automatically enables the clarification pipeline
regardless of the enable_clarification setting in your config file.
# Interactive clarification (agent mode)
uv run python cli/deep-research-cli.py research "Quantum computing applications" --clarify
# Interactive clarification (MCP client mode)
uv run python cli/deep-research-cli.py research "Quantum computing" --clarify \
--server-url http://localhost:8080/mcp
Example session:
Starting clarification process...
Please answer the following clarifying questions:
1. Are you interested in near-term applications or long-term theoretical possibilities?
Your answer (or press Enter to skip): Near-term commercial applications
2. Which industries are you most interested in?
Your answer (or press Enter to skip): Finance and pharmaceuticals
3. Should the report focus on specific hardware platforms?
Your answer (or press Enter to skip):
Enriched query: Quantum computing applications in finance and pharmaceuticals...
Starting research with query: '...'
research QUERY --server-url URL -- use MCP client mode.
Instead of running the agent directly, connect to a running Deep Research MCP server over Streamable HTTP.
# First, start the MCP server in another terminal:
uv run deep-research-mcp --transport http --host 127.0.0.1 --port 8080
# Then run queries against it:
uv run python cli/deep-research-cli.py research "AI trends" \
--server-url http://127.0.0.1:8080/mcp
status TASK_ID -- check the status of a running research task.
# Agent mode
uv run python cli/deep-research-cli.py status abc123-def456-ghi789
# MCP client mode
uv run python cli/deep-research-cli.py status abc123-def456-ghi789 \
--server-url http://127.0.0.1:8080/mcp
config -- display the resolved configuration.
Shows the final configuration after merging the TOML file, environment variables, and any CLI overrides.
# JSON output (default), with secrets masked
uv run python cli/deep-research-cli.py config
# Human-readable output
uv run python cli/deep-research-cli.py config --pretty
# Show full API keys
uv run python cli/deep-research-cli.py config --pretty --show-secrets
# See the effect of CLI overrides
uv run python cli/deep-research-cli.py --provider gemini --timeout 600 config --pretty
# Skip config validation
uv run python cli/deep-research-cli.py config --no-validate
Configuration Overrides
All global flags are placed before the subcommand and override the
corresponding ResearchConfig field:
Flag | Description |
| Path to TOML config file (default: |
| Research provider |
| Model or agent ID |
| Provider API key |
| Provider API base URL |
| OpenAI API style |
| Max research timeout |
| Task poll interval |
| Logging level |
| Enable the clarification pipeline |
| Enable reasoning summaries |
| Model for query triage |
| Model for query enrichment |
| Base URL for clarification models |
| API key for clarification models |
| Model for instruction building |
Configuration precedence (highest to lowest): CLI flags > environment
variables > TOML config file (~/.deep_research) > built-in defaults.
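The precedence order above can be illustrated with a small layered lookup. This is only a sketch of the idea, not the project's actual merge code: the layer contents and the `resolve_config` helper are hypothetical, and only the ordering (CLI flags, then environment variables, then the TOML file, then defaults) comes from the text.

```python
from collections import ChainMap


def resolve_config(cli_flags, environment, toml_file, defaults):
    """Merge configuration layers; earlier arguments win over later ones.

    Hypothetical helper for illustration only.
    """
    # ChainMap looks keys up in its maps from left to right, so the
    # highest-precedence layer is passed first.
    return dict(ChainMap(cli_flags, environment, toml_file, defaults))


# Hypothetical layer contents (the 1800 s and 30 s defaults match the
# documented ResearchConfig defaults).
defaults = {"provider": "openai", "timeout": 1800, "poll_interval": 30}
toml_file = {"provider": "gemini", "timeout": 900}
environment = {"timeout": 1200}
cli_flags = {"timeout": 600}

resolved = resolve_config(cli_flags, environment, toml_file, defaults)
print(resolved)
```

Here `timeout` is set in every layer, so the CLI value is the one that survives, while `provider` falls through to the TOML layer and `poll_interval` to the defaults.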
Exit Codes
Code | Meaning |
0 | Success |
1 | Research or configuration error |
2 | MCP tool error |
3 | Unexpected error |
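When the CLI is driven from a script, the exit codes in the table can be turned into readable messages. The following is a minimal sketch: `describe_exit_code` and `run_cli` are hypothetical helper names, and the command line simply mirrors the invocations shown earlier in this document.

```python
import subprocess

# Meanings copied from the exit-code table above.
EXIT_CODE_MEANINGS = {
    0: "Success",
    1: "Research or configuration error",
    2: "MCP tool error",
    3: "Unexpected error",
}


def describe_exit_code(code: int) -> str:
    """Return the documented meaning of a CLI exit code."""
    return EXIT_CODE_MEANINGS.get(code, f"Unknown exit code {code}")


def run_cli(args: list[str]) -> tuple[int, str]:
    """Run the CLI (hypothetical wrapper) and describe how it exited."""
    completed = subprocess.run(
        ["uv", "run", "python", "cli/deep-research-cli.py", *args],
        check=False,  # inspect the return code instead of raising
    )
    return completed.returncode, describe_exit_code(completed.returncode)
```

For example, `run_cli(["config", "--pretty"])` would return the numeric code together with its description.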
Example Queries
# Basic research query
result = await agent.research("Explain the transformer architecture in AI")
# Research with code analysis
result = await agent.research(
    query="Analyze global temperature trends over the last 50 years",
    include_code_interpreter=True
)
# Custom system instructions
result = await agent.research(
    query="Review the safety considerations for AGI development",
    system_prompt="""
    Provide a balanced analysis including:
    - Technical challenges
    - Current safety research
    - Regulatory approaches
    - Industry perspectives
    Include specific examples and data where available.
    """
)
# With clarification (requires ENABLE_CLARIFICATION=true)
clarification_result = agent.start_clarification("quantum computing applications")
if clarification_result.get("needs_clarification"):
    # Answer questions programmatically or present to user
    answers = ["Hardware applications", "Last 5 years", "Commercial products"]
    agent.add_clarification_answers(clarification_result["session_id"], answers)
    enriched_query = agent.get_enriched_query(clarification_result["session_id"])
    result = await agent.research(enriched_query)
Clarification Features
The agent includes an optional clarification system to improve research quality through follow-up questions.
Configuration
Enable clarification in your ~/.deep_research file:
[clarification]
enable_clarification = true
triage_model = "gpt-5-mini" # Optional, defaults to gpt-5-mini
clarifier_model = "gpt-5-mini" # Optional, defaults to gpt-5-mini
instruction_builder_model = "gpt-5-mini" # Optional, defaults to gpt-5-mini
clarification_api_key = "YOUR_CLARIFICATION_API_KEY" # Optional custom API key for clarification models
clarification_base_url = "https://custom-api.example.com/v1" # Optional custom endpoint for clarification models
Clarification and instruction-building remain OpenAI-compatible chat flows. If your main research provider is dr-tulu, gemini, or open-deep-research, set clarification_api_key / clarification_base_url explicitly, or provide OPENAI_API_KEY / OPENAI_BASE_URL in the environment for those helper models.
Usage Flow
Start Clarification:
result = agent.start_clarification("your research query")
Check if Questions are Needed:
if result.get("needs_clarification"):
    questions = result["questions"]
    session_id = result["session_id"]
Provide Answers:
answers = ["answer1", "answer2", "answer3"]
agent.add_clarification_answers(session_id, answers)
Get Enriched Query:
enriched_query = agent.get_enriched_query(session_id)
final_result = await agent.research(enriched_query)
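The four steps can be wrapped in one coroutine that falls back to the original query when no clarification is needed. The sketch below uses only the method names listed in this section. `research_with_clarification` and `StubAgent` are hypothetical: the stub stands in for a real agent so the example runs without network access, and its return values are invented for illustration.

```python
import asyncio
from typing import Any, Callable, Sequence


async def research_with_clarification(
    agent: Any,
    query: str,
    answer_fn: Callable[[Sequence[str]], list[str]],
) -> Any:
    """Run the clarification flow, then research the (possibly enriched) query."""
    result = agent.start_clarification(query)
    if not result.get("needs_clarification"):
        return await agent.research(query)
    session_id = result["session_id"]
    answers = answer_fn(result["questions"])  # e.g. prompt the user here
    agent.add_clarification_answers(session_id, answers)
    enriched_query = agent.get_enriched_query(session_id)
    return await agent.research(enriched_query)


class StubAgent:
    """Hypothetical stand-in for the real agent, for demonstration only."""

    def __init__(self) -> None:
        self.answers: dict[str, list[str]] = {}

    def start_clarification(self, query: str) -> dict:
        return {
            "needs_clarification": True,
            "questions": ["Which industries are you most interested in?"],
            "session_id": "s1",
        }

    def add_clarification_answers(self, session_id: str, answers: list[str]) -> dict:
        self.answers[session_id] = list(answers)
        return {"session_id": session_id}

    def get_enriched_query(self, session_id: str) -> str:
        return "quantum computing (" + "; ".join(self.answers[session_id]) + ")"

    async def research(self, query: str) -> dict:
        return {"query": query}


report = asyncio.run(
    research_with_clarification(
        StubAgent(), "quantum computing", lambda questions: ["Finance"] * len(questions)
    )
)
print(report)
```

Passing the answers through a callable keeps the flow usable both interactively and from scripts that answer programmatically.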
Integration with AI Assistants
When using the agent with AI assistants via MCP tools:
Request Clarification: Use deep_research() with request_clarification=True
Answer Questions: The AI assistant will present questions to you
Deep Research: The AI assistant will automatically use research_with_context() with your answers
API Reference
DeepResearchAgent
The main class for performing research operations.
Methods
research(query, system_prompt=None, include_code_interpreter=True, callback_url=None)
Performs deep research on a query
callback_url: optional webhook notified when research completes
Returns: Dictionary with final report, citations, and metadata
get_task_status(task_id)
Check the status of a research task
Returns: Task status information
start_clarification(query)
Analyze query and generate clarifying questions if needed
Returns: Dictionary with questions and session ID
add_clarification_answers(session_id, answers)
Add answers to clarification questions
Returns: Session status information
get_enriched_query(session_id)
Generate enriched query from clarification session
Returns: Enhanced query string
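A long-running task can be polled until it finishes. The sketch below is written under explicit assumptions: the shape of the status payload is not documented here, so the `"status"` key and its `"completed"` / `"failed"` values are hypothetical, and `wait_for_task` is an illustrative helper that takes any synchronous callable (for instance a wrapper around `get_task_status`). The default interval and timeout mirror the documented `poll_interval` and `timeout` defaults.

```python
import time
from typing import Any, Callable


def wait_for_task(
    get_status: Callable[[str], dict[str, Any]],
    task_id: str,
    poll_interval: float = 30.0,
    timeout: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict[str, Any]:
    """Poll get_status(task_id) until a terminal state or the timeout."""
    deadline = clock() + timeout
    while True:
        info = get_status(task_id)
        # Assumption: terminal states are reported under a "status" key.
        if info.get("status") in ("completed", "failed"):
            return info
        if clock() >= deadline:
            raise TimeoutError(f"Task {task_id} did not finish within {timeout} s")
        sleep(poll_interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without waiting in real time.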
ResearchConfig
Configuration class for the research agent.
Parameters
provider: Research provider (openai, dr-tulu, gemini, or open-deep-research; default: openai)
api_style: API style for the openai provider (responses or chat_completions; default: responses). Ignored for dr-tulu, gemini, and open-deep-research.
model: Model identifier
OpenAI: Responses model (e.g., gpt-5-mini)
DR-Tulu: logical provider id (default: dr-tulu)
Gemini: Deep Research agent id (for example deep-research-preview-04-2026)
Open Deep Research: LiteLLM model id (e.g., openai/qwen/qwen3-coder-30b)
api_key: API key for the configured endpoint (optional). Defaults to env OPENAI_API_KEY for openai, DR_TULU_API_KEY for dr-tulu, GEMINI_API_KEY / GOOGLE_API_KEY for gemini.
base_url: Provider API base URL (optional). Defaults to https://api.openai.com/v1 for openai, http://localhost:8080/ for dr-tulu, https://generativelanguage.googleapis.com for gemini, and http://localhost:1234/v1 for open-deep-research.
timeout: Maximum time for research in seconds (default: 1800)
poll_interval: Polling interval in seconds (default: 30)
enable_clarification: Enable clarifying questions (default: False)
triage_model: Model for query analysis (default: gpt-5-mini)
clarifier_model: Model for query enrichment (default: gpt-5-mini)
clarification_api_key: Custom API key for clarification models (optional; defaults to the main OpenAI credentials when provider=openai, otherwise falls back to env OPENAI_API_KEY if present)
clarification_base_url: Custom OpenAI-compatible endpoint for clarification models (optional; defaults to the main OpenAI endpoint when provider=openai, otherwise falls back to env OPENAI_BASE_URL if present)
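The per-provider defaults for `api_key` and `base_url` can be summarized as a lookup. This is a sketch of the documented behaviour rather than the library's code: `DEFAULT_BASE_URLS`, `API_KEY_ENV_VARS`, and `resolve_api_key` are hypothetical names, and checking `GEMINI_API_KEY` before `GOOGLE_API_KEY` is an assumption, since the text does not state which of the two wins.

```python
import os
from typing import Mapping, Optional

# Default base URLs as listed in the parameter descriptions above.
DEFAULT_BASE_URLS = {
    "openai": "https://api.openai.com/v1",
    "dr-tulu": "http://localhost:8080/",
    "gemini": "https://generativelanguage.googleapis.com",
    "open-deep-research": "http://localhost:1234/v1",
}

# Environment variables consulted per provider (order for gemini is assumed).
API_KEY_ENV_VARS = {
    "openai": ("OPENAI_API_KEY",),
    "dr-tulu": ("DR_TULU_API_KEY",),
    "gemini": ("GEMINI_API_KEY", "GOOGLE_API_KEY"),
}


def resolve_api_key(
    provider: str,
    explicit: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the explicit key if given, else the first matching env variable."""
    if explicit:
        return explicit
    for name in API_KEY_ENV_VARS.get(provider, ()):
        value = env.get(name)
        if value:
            return value
    return None  # no key configured; the parameter is optional


print(DEFAULT_BASE_URLS["gemini"])
```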
Development
Running Tests
# Install dev dependencies
uv sync --extra dev
# Run all tests
uv run pytest -v
# Run with coverage
uv run pytest --cov=deep_research_mcp tests/
# Run specific test file
uv run pytest tests/test_agents.py
Lint, Format, Type Check
uv run black .
uv run pylint src/deep_research_mcp tests
uv run mypy src/deep_research_mcp