Skip to main content
Glama
hfredrick69

Deep Research MCP Server

by hfredrick69

Deep Research MCP Server

Node.js TypeScript Gemini API MCP License: MIT

Your AI-Powered Research Assistant. Conduct iterative, deep research using Google Gemini 2.5 Flash with Google Search Grounding and URL context. No web-scraping dependency is required.


Table of Contents

The goal of this project is to provide the simplest yet most effective implementation of a deep research agent. It's designed to be easily understood, modified, and extended, aiming for a codebase under 500 lines of code (LoC).

Key Features:

  • MCP Integration: Runs as a Model Context Protocol (MCP) server/tool for seamless agent integration.

  • Gemini 2.5 Flash Pipeline: Long-context reasoning, structured JSON outputs, and tool use (Google Search Grounding, Code Execution, Functions) via env flags.

  • Iterative Deep Dive: Query refinement + result analysis with learned context carried forward.

  • Depth & Breadth Control: Tune exploration scope precisely.

  • Semantic/Recursive Splitting: Token-aware chunking for robust summarization and analysis.

  • Batching + Caching: Concurrency-limited batched model calls with LRU caches across prompts/results.

  • Professional Reports: Generates structured Markdown (Abstract, ToC, Intro, Body, Methodology, Limitations, Key Learnings, References).

Why This Project

  • Gemini-first, modern pipeline: Built around Gemini 2.5 Flash with optional tools (Search Grounding, Code Execution, Functions).

  • Minimal, understandable core: Plain TypeScript; easy to audit and extend.

  • Deterministic outputs: Zod-validated JSON and consistent report scaffolding.

  • Agent-ready: Clean MCP server entry; works with Inspector and MCP-aware clients.

Workflow Diagram

flowchart TB
    subgraph Input
        Q[User Query]
        B[Breadth Parameter]
        D[Depth Parameter]
    end

    DR[Deep Research] -->
    SQ[SERP Queries] -->
    PR[Process Results]

    subgraph Results[Results]
        direction TB
        NL((Learnings))
        ND((Directions))
    end

    PR --> NL
    PR --> ND

    DP{depth > 0?}

    RD["Next Direction:
    - Prior Goals
    - New Questions
    - Learnings"]

    MR[Markdown Report]

    %% Main Flow
    Q & B & D --> DR

    %% Results to Decision
    NL & ND --> DP

    %% Circular Flow
    DP -->|Yes| RD
    RD -->|New Context| DR

    %% Final Output
    DP -->|No| MR

    %% Styling
    classDef input fill:#7bed9f,stroke:#2ed573,color:black
    classDef process fill:#70a1ff,stroke:#1e90ff,color:black
    classDef recursive fill:#ffa502,stroke:#ff7f50,color:black
    classDef output fill:#ff4757,stroke:#ff6b81,color:black
    classDef results fill:#a8e6cf,stroke:#3b7a57,color:black

    class Q,B,D input
    class DR,SQ,PR process
    class DP,RD recursive
    class MR output
    class NL,ND results

Persona Agents

What are Persona Agents?

In deep-research, we utilize the concept of "persona agents" to guide the behavior of the Gemini language models. Instead of simply prompting the LLM with a task, we imbue it with a specific role, skills, personality, communication style, and values. This approach helps to:

  • Focus the LLM's Output: By defining a clear persona, we encourage the LLM to generate responses that are aligned with the desired expertise and perspective.

  • Improve Consistency: Personas help maintain a consistent tone and style throughout the research process.

  • Enhance Task-Specific Performance: Tailoring the persona to the specific task (e.g., query generation, learning extraction, feedback) optimizes the LLM's output for that stage of the research.

Examples of Personas in use:

  • Expert Research Strategist & Query Generator: Used for generating search queries, this persona emphasizes strategic thinking, comprehensive coverage, and precision in query formulation.

  • Expert Research Assistant & Insight Extractor: When processing web page content, this persona focuses on meticulous analysis, factual accuracy, and extracting key learnings relevant to the research query.

  • Expert Research Query Refiner & Strategic Advisor: For generating follow-up questions, this persona embodies strategic thinking, user intent understanding, and the ability to guide users towards clearer and more effective research questions.

  • Professional Doctorate Level Researcher (System Prompt): This overarching persona, applied to the main system prompt, sets the tone for the entire research process, emphasizing expert-level analysis, logical structure, and in-depth investigation.

By leveraging persona agents, deep-research aims to achieve more targeted, consistent, and high-quality research outcomes from the Gemini language models.

How It Works

Core modules:

  • src/deep-research.ts — orchestrates queries, batching, analysis, and synthesis

    • generateSerpQueries() uses Gemini to propose SERP-style queries from your prompt and prior learnings

    • processSerpResult() splits content, batches Gemini calls with tools enabled, extracts learnings and citations

    • conductResearch() runs analysis passes over semantic chunks

    • writeFinalReport() builds the final professional Markdown report

  • src/ai/providers.ts — GoogleGenAI wrapper for Gemini 2.5 Flash, batching, token control, optional tools

  • src/ai/text-splitter.ts — RecursiveCharacter and Semantic splitters

  • src/mcp-server.ts — MCP server entry point and types

  • src/run.ts — CLI entry point

Pipeline highlights:

  • Structured JSON outputs validated with Zod

  • Concurrency-limited batching (generateBatch, generateBatchWithTools)

  • LRU caches for prompts, SERP proposals, and reports

  • Optional Gemini tools via flags: Google Search Grounding, Code Execution, Functions

Project Structure

deep-research-mcp-server/
├─ src/
│  ├─ ai/
│  │  ├─ providers.ts           # Gemini wrapper, tools, batching, caching
│  │  └─ text-splitter.ts       # Semantic/recursive splitters
│  ├─ mcp-server.ts             # MCP server entry/types
│  ├─ deep-research.ts          # Orchestrator: queries → analysis → synthesis
│  ├─ prompt.ts                 # System + templates
│  ├─ feedback.ts               # Refinement/feedback loop
│  ├─ output-manager.ts         # Report/output formatting
│  ├─ progress-manager.ts       # CLI progress
│  ├─ terminal-utils.ts         # CLI helpers
│  ├─ types.ts                  # Zod schemas/types
│  └─ utils/                    # JSON/sanitize helpers
├─ dist/                        # Build output
├─ .env.example                 # Environment template
├─ package.json                 # Scripts/deps
└─ README.md

Requirements

Setup

Node.js

  1. Clone the repository:

    git clone [your-repo-link-here]
  2. Install dependencies:

    npm install
  3. Set up environment variables: Create a .env.local file in the project root:

    # Required
    GEMINI_API_KEY="your_gemini_key"
    
    # Recommended defaults
    GEMINI_MODEL=gemini-2.5-flash
    GEMINI_MAX_OUTPUT_TOKENS=65536
    CONCURRENCY_LIMIT=5
    
    # Gemini tools (enable as needed)
    ENABLE_GEMINI_GOOGLE_SEARCH=true
    ENABLE_GEMINI_CODE_EXECUTION=false
    ENABLE_GEMINI_FUNCTIONS=false
  4. Build the project:

    npm run build

Usage

As MCP Tool

To run deep-research as an MCP tool, start the MCP server:

node --env-file .env.local dist/mcp-server.js

You can then invoke the deep-research tool from any MCP-compatible agent using the following parameters:

  • query (string, required): The research query.

  • depth (number, optional, 1-5): Research depth (default: moderate).

  • breadth (number, optional, 1-5): Research breadth (default: moderate).

  • existingLearnings (string[], optional): Pre-existing research findings to guide research.

Example MCP Tool Arguments (JSON shape):

{
  "name": "deep-research",
  "arguments": {
    "query": "State of multi-agent research agents in 2025",
    "depth": 3,
    "breadth": 3,
    "existingLearnings": [
      "Tool use improves grounding",
      "Batching reduces latency"
    ]
  }
}
const mcp = new ModelContextProtocolClient(); // Assuming MCP client is initialized

async function invokeDeepResearchTool() {
  try {
    const result = await mcp.invoke("deep-research", {
      query: "Explain the principles of blockchain technology",
      depth: 2,
      breadth: 4
    });

    if (result.isError) {
      console.error("MCP Tool Error:", result.content[0].text);
    } else {
      console.log("Research Report:\n", result.content[0].text);
      console.log("Sources:\n", result.metadata.sources);
    }
  } catch (error) {
    console.error("MCP Invoke Error:", error);
  }
}

invokeDeepResearchTool();

Standalone CLI Usage

To run deep-research directly from the command line:

npm run start "your research query"

Example:

npm run start "what are latest developments in ai research agents"

MCP Inspector Testing

For interactive testing and debugging of the MCP server, use the MCP Inspector:

npx @modelcontextprotocol/inspector node --env-file .env.local dist/mcp-server.js

MCP Integration Tips

  • Environment: Provide GEMINI_API_KEY to the MCP server process; model and tool flags via env.

  • Stateless calls: The server derives behavior from env; keep flags in sync with your client profile.

  • Latency: Enable batching and reasonable CONCURRENCY_LIMIT to balance speed vs rate limits.

Deployment Modes

The server supports two deployment modes:

Local Mode (stdio)

For Claude Code, Gemini CLI, and other local MCP clients:

node --env-file .env.local dist/mcp-server.js

Behavior:

  • Small reports (≤50KB): Returned inline in the response

  • Large reports (>50KB): Uploaded to GCS, URL returned with download instructions

HTTP Mode (Cloud Run / Remote)

For Codex and other remote MCP clients:

# Set environment variable
export MCP_HTTP_MODE=true

# Or deploy to Cloud Run
gcloud run deploy deep-research-mcp \
  --source . \
  --region=us-central1 \
  --allow-unauthenticated \
  --set-env-vars="MCP_HTTP_MODE=true,GEMINI_API_KEY=xxx,MCP_API_KEY=xxx,GCS_BUCKET_NAME=xxx"

Behavior:

  • All reports uploaded to GCS

  • Response includes signed URL (7-day expiration) with curl download command

  • API key authentication via x-api-key header

Endpoints:

  • POST /mcp - MCP protocol endpoint

  • GET /health - Health check

  • GET /sse - SSE transport (alternative to StreamableHTTP)

Google Cloud Storage Integration

Large reports are automatically stored in GCS to avoid transport size limits.

Setup:

  1. Create a GCS bucket:

gcloud storage buckets create gs://your-bucket-name --location=us-central1
  1. Grant permissions (for Cloud Run):

# Get service account
SA=$(gcloud run services describe deep-research-mcp --region=us-central1 --format="value(spec.template.spec.serviceAccountName)")

# Grant storage permissions
gcloud storage buckets add-iam-policy-binding gs://your-bucket-name \
  --member="serviceAccount:$SA" \
  --role="roles/storage.objectAdmin"

# Grant URL signing permissions
gcloud iam service-accounts add-iam-policy-binding $SA \
  --member="serviceAccount:$SA" \
  --role="roles/iam.serviceAccountTokenCreator"
  1. Set environment variable:

export GCS_BUCKET_NAME=your-bucket-name

Size-Based Auto-Switching

The server automatically switches to URL mode when reports exceed 50KB:

Report Size

stdio Mode

HTTP Mode

≤ 50 KB

Inline content

GCS URL

> 50 KB

GCS URL

GCS URL

This prevents token limit issues in Claude Code, Gemini CLI, and other clients.

Configuration

Required

Variable

Description

GEMINI_API_KEY

Google Gemini API key

Gemini Settings

Variable

Default

Description

GEMINI_MODEL

gemini-2.5-flash

Gemini model to use

GEMINI_MAX_OUTPUT_TOKENS

65536

Maximum output tokens

CONCURRENCY_LIMIT

5

Concurrent API calls

ENABLE_GEMINI_GOOGLE_SEARCH

false

Enable Google Search Grounding

ENABLE_GEMINI_CODE_EXECUTION

false

Enable code execution tool

ENABLE_GEMINI_FUNCTIONS

false

Enable function calling

Server Settings

Variable

Default

Description

MCP_HTTP_MODE

false

Enable HTTP transport (for Cloud Run)

MCP_API_KEY

(none)

API key for HTTP authentication

PORT

8080

HTTP server port

Storage Settings

Variable

Default

Description

GCS_BUCKET_NAME

(none)

GCS bucket for report storage

Optional providers (planned/behind flags): Exa/Tavily can be integrated later; Firecrawl is not required for the current pipeline.

Quickstart

  1. Clone and install

git clone https://github.com/ssdeanx/deep-research-mcp-server
cd deep-research-mcp-server
npm i && npm run build
  1. Create .env.local (see Setup)

  2. Run as MCP server (Inspector)

npx @modelcontextprotocol/inspector node --env-file .env.local dist/mcp-server.js
  1. Or run as CLI

npm run start "state of multi-agent research agents in 2025"

Example Output

# Abstract
Concise overview of the research goal, scope, method, and key findings.

# Table of Contents
...

# Introduction
Context and framing.

# Body
Evidence-backed sections with citations.

# Methodology
How sources were found and analyzed.

# Limitations
Assumptions and risks.

# Key Learnings
Bulleted insights and takeaways.

# References
Normalized citations to visited URLs.

Support

  • Issues: Use GitHub Issues for bugs and feature requests.

  • Discussions: Propose ideas or ask questions.

  • Security: Do not file public issues for sensitive disclosures; contact maintainers privately.

Contributing

  • PRs welcome: Please open an issue first for significant changes.

  • Standards: TypeScript 5.x, Node.js 22.x, lint/type-check before PRs.

  • Checks: npm run build and tsc --noEmit must pass.

  • Docs: Update README.md and .env.example when changing env/config.

Roadmap

  • Exa search integration (behind ENABLE_EXA_PRIMARY), Google grounding for augmentation.

  • Provider cleanup: Remove Firecrawl after Exa migration (explicit approval required).

  • CI/CD: Add GitHub Actions for build/lint/test and badge.

  • Examples: Add sample reports and prompts.

Troubleshooting

  • Missing API key: Ensure GEMINI_API_KEY is set in .env.local and processes are started with --env-file .env.local.

  • Model/tool flags: If grounding or functions aren’t active, verify ENABLE_GEMINI_GOOGLE_SEARCH, ENABLE_GEMINI_CODE_EXECUTION, ENABLE_GEMINI_FUNCTIONS.

  • Rate limits/latency: Lower CONCURRENCY_LIMIT (e.g., 3) or rerun with fewer simultaneous queries.

  • Output too long: Reduce depth/breadth or lower GEMINI_MAX_OUTPUT_TOKENS.

  • Schema parse errors: Rerun; the pipeline validates/repairs JSON, but extreme prompts may exceed budgets—trim prompt or reduce chunk size.

License

MIT License - Free and Open Source. Use it freely!


🚀 Let's dive deep into research! 🚀

Recent Improvements (v0.3.0)

✨ Highlights of the latest changes. See also Roadmap.

  • ✅ Input validation: Minimum 10 characters + 3 words

  • 📈 Output validation: Citation density (1.5+ per 100 words)

  • 🔍 Recent sources check (3+ post-2019 references)

  • ⚖️ Conflict disclosure enforcement

  • Consolidated on Gemini 2.5 Flash (long-context, structured JSON)

  • Optional tools via env flags: Search Grounding, Code Execution, Functions

  • Semantic + recursive splitting for context management

  • Robust batching with concurrency control and caching

  • Enhanced context management via semantic search

  • Improved error handling and logging

  • 🚀 Added concurrent processing pipeline

  • Removed redundant academic-validators module

  • 🛡️ Enhanced type safety across interfaces

  • 📦 Optimized dependencies (≈30% smaller node_modules)

  • 📊 Research metrics tracking (sources/learnings ratio)

Performance:

  • 🚀 30% faster research cycles

  • ⚡ 40% faster initial research cycles

  • 📉 60% reduction in API errors

  • 🧮 25% more efficient token usage

-
security - not tested
A
license - permissive license
-
quality - not tested

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/hfredrick69/deep-research-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server