PubMed Search MCP

PyPI version Python 3.10+ License: Apache 2.0 MCP Test Coverage

Professional Literature Research Assistant for AI Agents - More than just an API wrapper

PubMed Search MCP research workflow

A Domain-Driven Design (DDD) based MCP server that serves as an intelligent research assistant for AI agents, providing task-oriented literature search and analysis capabilities.

✨ What's Included:

  • 🔧 46 MCP Tools - Streamlined PubMed, Europe PMC, CORE, NCBI database access, and Research Timeline / Context Graph

  • 🖼️ OA Figure Extraction - Pull figure captions, direct image URLs, and PDF links from PMC Open Access articles

  • 📘 Docs Site - Browse language-switchable user and developer guides, architecture, quick reference, pipeline tutorials, source contracts, troubleshooting, and deployment in one place at u9401066.github.io/pubmed-search-mcp

  • 📖 GitHub Wiki - GitHub-native mirror of the same canonical documentation at github.com/u9401066/pubmed-search-mcp/wiki

  • 📚 24 Claude Skills - Ready-to-use workflow guides for AI agents (Claude Code-specific)

  • 📖 Copilot Instructions - VS Code GitHub Copilot integration guide

🌐 Language: English | 繁體中文

📘 Documentation Map: README is the quick project entry point. Use the Docs Site for the best reading experience, the GitHub Wiki for GitHub-native navigation, and source docs for edits: User guide | Advanced workflows | Capability-first guide | Developer guide | Complete index


🚀 Quick Install

Prerequisites

  • Python 3.10+ - Download

  • uv (recommended) - Install uv

    # macOS / Linux
    curl -LsSf https://astral.sh/uv/install.sh | sh
    # Windows
    powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
  • NCBI Email - Required by NCBI API policy. Any valid email address.

  • NCBI API Key (optional) - Get one here for higher rate limits (10 req/s vs 3 req/s)

  • OpenAlex API Key (optional) - Set OPENALEX_API_KEY to use authenticated OpenAlex requests instead of mailto-only polite-pool auth
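The NCBI rate limits listed above translate into a minimum spacing between requests. The following is a minimal sketch of that arithmetic, assuming a simple client-side throttle; `min_request_interval` is a hypothetical helper for illustration and is not part of this server's API.

```python
import os

# Requests per second allowed by NCBI, as stated in the prerequisites.
RATE_WITH_KEY = 10
RATE_WITHOUT_KEY = 3


def min_request_interval(env=None):
    """Return the minimum delay in seconds between NCBI requests.

    Hypothetical helper: it only checks whether NCBI_API_KEY is set
    and non-empty in the given environment mapping.
    """
    env = os.environ if env is None else env
    rate = RATE_WITH_KEY if env.get("NCBI_API_KEY") else RATE_WITHOUT_KEY
    return 1.0 / rate


if __name__ == "__main__":
    print(f"wait at least {min_request_interval():.3f}s between requests")
```

With a key the spacing is 0.1 s; without one it is roughly 0.333 s.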

Install & Run

# Option 1: Zero-install with uvx (recommended for trying out)
uvx pubmed-search-mcp

# Option 2: Add as project dependency
uv add pubmed-search-mcp

# Option 3: pip install
pip install pubmed-search-mcp

⚙️ Configuration

This MCP server works with any MCP-compatible AI tool. Choose your preferred client:

VS Code / Cursor (.vscode/mcp.json)

{
  "servers": {
    "pubmed-search": {
      "type": "stdio",
      "command": "uvx",
      "args": ["pubmed-search-mcp"],
      "env": {
        "NCBI_EMAIL": "your@email.com"
      }
    }
  }
}

Optional: enable browser-session PDF fallback once and let tools auto-use it:

{
  "servers": {
    "pubmed-search": {
      "type": "stdio",
      "command": "uvx",
      "args": ["pubmed-search-mcp"],
      "env": {
        "NCBI_EMAIL": "your@email.com",
        "BROWSER_FETCH_CONFIG": "{\"enabled\":true,\"auto_enabled\":true,\"broker_url\":\"http://127.0.0.1:8766/fetch\",\"token\":\"local-dev-token\",\"allowed_hosts\":[\"jamanetwork.com\",\"*.jamanetwork.com\",\"nejm.org\",\"*.nejm.org\"]}"
      }
    }
  }
}

With this setting, get_fulltext will automatically try the local broker for institutional or publisher landing pages. Pass allow_browser_session=false only when you want to suppress it for a specific call.
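Because BROWSER_FETCH_CONFIG is a JSON document stored inside a JSON string, hand-escaping the inner quotes is error-prone. One possible way to generate the value is sketched below; the keys and values are copied from the example above, and the script itself is only an illustration, not a tool shipped with the project.

```python
import json

# Keys and values copied from the BROWSER_FETCH_CONFIG example above.
browser_fetch_config = {
    "enabled": True,
    "auto_enabled": True,
    "broker_url": "http://127.0.0.1:8766/fetch",
    "token": "local-dev-token",
    "allowed_hosts": ["jamanetwork.com", "*.jamanetwork.com", "nejm.org", "*.nejm.org"],
}

# Compact separators reproduce the single-line form used in the example.
env_value = json.dumps(browser_fetch_config, separators=(",", ":"))

mcp_config = {
    "servers": {
        "pubmed-search": {
            "type": "stdio",
            "command": "uvx",
            "args": ["pubmed-search-mcp"],
            "env": {
                "NCBI_EMAIL": "your@email.com",
                "BROWSER_FETCH_CONFIG": env_value,
            },
        }
    }
}

if __name__ == "__main__":
    # The outer json.dumps escapes the inner quotes of env_value automatically.
    print(json.dumps(mcp_config, indent=2))
```

The printed document can be pasted into .vscode/mcp.json; the nested value is escaped by the serializer rather than by hand.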

Run the local broker with download interception:

uv sync --extra browser-broker
uv run playwright install chromium
uv run pubmed-browser-fetch-broker --token local-dev-token

The broker launches a persistent browser profile with download interception enabled. Log in once inside that broker-controlled browser window, and subsequent PDF downloads will be captured automatically without a native "Save As" dialog.

Claude Desktop (claude_desktop_config.json)

{
  "mcpServers": {
    "pubmed-search": {
      "command": "uvx",
      "args": ["pubmed-search-mcp"],
      "env": {
        "NCBI_EMAIL": "your@email.com"
      }
    }
  }
}

Config file location:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

  • Windows: %APPDATA%\Claude\claude_desktop_config.json

  • Linux: ~/.config/Claude/claude_desktop_config.json
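The three locations above can be resolved programmatically. This is a small sketch, assuming the paths listed here; `claude_desktop_config_path` is a hypothetical helper, and the fallback used when APPDATA is unset is an assumption rather than documented behavior.

```python
import os
import sys
from pathlib import Path

CONFIG_NAME = "claude_desktop_config.json"


def claude_desktop_config_path(platform=None, env=None, home=None):
    """Return the Claude Desktop config path for the given platform."""
    platform = sys.platform if platform is None else platform
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    if platform == "darwin":
        return home / "Library" / "Application Support" / "Claude" / CONFIG_NAME
    if platform.startswith("win"):
        # Assumption: fall back to the usual roaming profile if APPDATA is unset.
        appdata = env.get("APPDATA") or str(home / "AppData" / "Roaming")
        return Path(appdata) / "Claude" / CONFIG_NAME
    return home / ".config" / "Claude" / CONFIG_NAME


if __name__ == "__main__":
    print(claude_desktop_config_path())
```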

Claude Code

claude mcp add pubmed-search -- uvx pubmed-search-mcp

Or add to .mcp.json in your project root:

{
  "mcpServers": {
    "pubmed-search": {
      "command": "uvx",
      "args": ["pubmed-search-mcp"],
      "env": {
        "NCBI_EMAIL": "your@email.com"
      }
    }
  }
}

Zed AI (settings.json)

Zed editor (zed.dev) supports MCP servers natively. Add to your Zed settings.json:

{
  "context_servers": {
    "pubmed-search": {
      "command": "uvx",
      "args": ["pubmed-search-mcp"],
      "env": {
        "NCBI_EMAIL": "your@email.com"
      }
    }
  }
}

Tip: Open Command Palette → zed: open settings to edit, or go to Agent Panel → Settings → "Add Custom Server".

OpenClaw 🦞 (~/.openclaw/openclaw.json)

OpenClaw uses MCP servers via the mcp-adapter plugin. Install the adapter first:

openclaw plugins install mcp-adapter

Then add to ~/.openclaw/openclaw.json:

{
  "plugins": {
    "entries": {
      "mcp-adapter": {
        "enabled": true,
        "config": {
          "servers": [
            {
              "name": "pubmed-search",
              "transport": "stdio",
              "command": "uvx",
              "args": ["pubmed-search-mcp"],
              "env": {
                "NCBI_EMAIL": "your@email.com"
              }
            }
          ]
        }
      }
    }
  }
}

Restart the gateway after configuration:

openclaw gateway restart
openclaw plugins list  # Should show: mcp-adapter | loaded

Cline (cline_mcp_settings.json)

{
  "mcpServers": {
    "pubmed-search": {
      "command": "uvx",
      "args": ["pubmed-search-mcp"],
      "env": {
        "NCBI_EMAIL": "your@email.com",
        "S2_API_KEY": "your_semantic_scholar_key",
        "PUBMED_SEARCH_DISABLED_SOURCES": ""
      },
      "alwaysAllow": [],
      "disabled": false
    }
  }
}

Other MCP Clients

Any MCP-compatible client can use this server via stdio transport:

# Command
uvx pubmed-search-mcp

# With environment variable
NCBI_EMAIL=your@email.com uvx pubmed-search-mcp

Note: NCBI_EMAIL is required by NCBI API policy. Optionally set NCBI_API_KEY for higher rate limits (10 req/s vs 3 req/s).

📖 Detailed Integration Guides: See docs/INTEGRATIONS.md for all environment variables, Copilot Studio setup, Docker deployment, proxy configuration, and troubleshooting.


🎯 Design Philosophy

Core Positioning: The intelligent middleware between AI Agents and academic search engines.

Why This Server?

Other tools give you raw API access. We give you vocabulary translation + intelligent routing + research analysis:

| Challenge | Our Solution |
|---|---|
| Agent uses ICD codes, PubMed needs MeSH | ✅ Auto ICD→MeSH conversion |
| Multiple databases, different APIs | ✅ Unified Search single entry point |
| Clinical questions need structured search | ✅ PICO handoff + pipeline (parse_pico validates agent-provided P/I/C/O and returns a runnable template: pico pipeline) |
| Typos in medical terms | ✅ ESpell auto-correction |
| Too many results from one source | ✅ Parallel multi-source with dedup |
| Need to trace research evolution | ✅ Research Timeline & Tree with landmark detection, diagnostics, and sub-topic branching |
| Citation context is unclear | ✅ Citation Tree forward/backward/network |
| Can't access full text | ✅ Multi-source fulltext (Europe PMC XML, Unpaywall OA locations, institutional direct/EZproxy, CORE, and downloader fallbacks) |
| Gene/drug info scattered across DBs | ✅ NCBI Extended (Gene, PubChem, ClinVar) |
| Need cutting-edge preprints | ✅ Preprint search (arXiv, medRxiv, bioRxiv) with peer-review filtering |
| Export to reference managers | ✅ One-click export (official RIS/MEDLINE/CSL JSON; local RIS/BibTeX/CSV/MEDLINE/JSON) |

Key Differentiators

  1. Vocabulary Translation Layer - Agent speaks naturally, we translate to each database's terminology (MeSH, ICD-10, text-mined entities)

  2. Unified Search Gateway - One unified_search() call, auto-dispatch to PubMed/Europe PMC/CORE/OpenAlex

  3. PICO Handoff + Pipeline - the Agent extracts P/I/C/O, parse_pico() validates that structured handoff, and the backend template: pico pipeline executes O-aware precision/recall searches

  4. Research Timeline & Lineage Tree - Detect milestones with policy-driven heuristics, identify landmark papers via multi-signal scoring, surface timeline diagnostics, and visualize research evolution as branching trees by sub-topic

  5. Citation Network Analysis - Build multi-level citation trees to map an entire research landscape from a single paper

  6. Full Research Lifecycle - From search → discovery → full text → analysis → export, all in one server

  7. Agent-First Design - Output optimized for machine decision-making, not human reading


📡 External APIs & Data Sources

This MCP server integrates with multiple academic databases and APIs:

Core Data Sources

| Source | Coverage | Vocabulary | Auto-Convert | Description |
|---|---|---|---|---|
| NCBI PubMed | 36M+ articles | MeSH | ✅ Native | Primary biomedical literature |
| NCBI Entrez | Multi-DB | MeSH | ✅ Native | Gene, PubChem, ClinVar |
| Europe PMC | 33M+ | Text-mined | ✅ Extraction | Full text XML access |
| CORE | 200M+ | None | ➡️ Free-text | Open access aggregator |
| Semantic Scholar | 200M+ | S2 Fields | ➡️ Free-text | AI-powered recommendations |
| OpenAlex | 250M+ | Concepts | ➡️ Free-text | Open scholarly metadata |
| NIH iCite | PubMed | N/A | N/A | Citation metrics (RCR) |

🔑 Key: ✅ = Full vocabulary support | ➡️ = Query pass-through (no controlled vocabulary)

ICD Codes: Auto-detected and converted to MeSH before PubMed search
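To make the detection step concrete, here is a minimal sketch of how ICD-10 codes in a free-text query could be spotted and rewritten as MeSH terms. The regular expression and the two-entry lookup table are illustrative assumptions only; the server's actual detection rules and mapping data are not shown in this README, and `expand_icd_codes` is a hypothetical name.

```python
import re

# Assumed shape of an ICD-10 code: a letter, a digit, an alphanumeric,
# then an optional ".xxxx" extension. Real validation is stricter.
ICD10_PATTERN = re.compile(r"\b[A-Z][0-9][0-9A-Z](?:\.[0-9A-Z]{1,4})?\b")

# Tiny illustrative lookup; not the server's mapping table.
ICD_TO_MESH = {
    "I10": "Hypertension",
    "E11.9": "Diabetes Mellitus, Type 2",
}


def expand_icd_codes(query):
    """Replace known ICD-10 codes in the query with quoted MeSH terms."""

    def replace(match):
        code = match.group(0)
        term = ICD_TO_MESH.get(code)
        # Unknown codes are left untouched rather than guessed.
        return f'"{term}"[MeSH]' if term else code

    return ICD10_PATTERN.sub(replace, query)


if __name__ == "__main__":
    print(expand_icd_codes("I10 treatment in E11.9 patients"))
```

A pattern this loose also matches strings such as "B12", so a real implementation would need a lookup check or context rules before rewriting.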

Environment Variables

# Required
NCBI_EMAIL=your@email.com          # Required by NCBI policy

# Optional - For higher rate limits
NCBI_API_KEY=your_ncbi_api_key     # Get from: https://www.ncbi.nlm.nih.gov/account/settings/
CORE_API_KEY=your_core_api_key     # Get from: https://core.ac.uk/services/api
CROSSREF_EMAIL=your@email.com      # CrossRef polite pool
UNPAYWALL_EMAIL=your@email.com     # Unpaywall OA resolver
S2_API_KEY=your_s2_api_key         # Alias: SEMANTIC_SCHOLAR_API_KEY
PUBMED_SEARCH_DISABLED_SOURCES=    # Example: semantic_scholar

# Optional - Network settings
HTTP_PROXY=http://proxy:8080       # HTTP proxy for API requests
HTTPS_PROXY=https://proxy:8080     # HTTPS proxy for API requests

# Optional - Institutional fulltext access
INSTITUTIONAL_DIRECT_FETCH=true    # Try DOI publisher pages before CORE fallback
EZPROXY_ENABLED=false              # Enable only after configuring EZPROXY_HOST + cookie
EZPROXY_HOST=ezproxy.example.edu
EZPROXY_COOKIE_FILE=/path/to/cookies.json

# Optional - Local note export
PUBMED_NOTES_DIR=/path/to/wiki/references  # save_literature_notes target folder
PUBMED_WORKSPACE_DIR=/path/to/project       # fallback: references/ under this workspace
PUBMED_DATA_DIR=~/.pubmed-search-mcp        # fallback: references/ under this data dir

Local note export resolves directories in this order: output_dir argument, PUBMED_NOTES_DIR, PUBMED_WORKSPACE_DIR/references, PUBMED_DATA_DIR/references, then ~/.pubmed-search-mcp/references. For LLM wiki compatibility, wiki and foam exports use stable link targets based on PMID, DOI, PMCID, or fallback identifiers; titles remain aliases/display labels, and the response includes wiki_validation for unresolved wikilink checks.
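The resolution order described above can be sketched as a short function. This is an illustration under the stated order only; `resolve_notes_dir` is a hypothetical name and the server's real implementation may differ in details such as path normalization.

```python
import os
from pathlib import Path


def resolve_notes_dir(output_dir=None, env=None):
    """Pick the note export directory using the documented precedence."""
    env = os.environ if env is None else env
    # 1. Explicit output_dir argument wins.
    if output_dir:
        return Path(output_dir).expanduser()
    # 2. PUBMED_NOTES_DIR is used as-is.
    if env.get("PUBMED_NOTES_DIR"):
        return Path(env["PUBMED_NOTES_DIR"]).expanduser()
    # 3. and 4. Workspace, then data dir, each with a references/ subfolder.
    for var in ("PUBMED_WORKSPACE_DIR", "PUBMED_DATA_DIR"):
        if env.get(var):
            return Path(env[var]).expanduser() / "references"
    # 5. Final fallback under the home directory.
    return Path.home() / ".pubmed-search-mcp" / "references"


if __name__ == "__main__":
    print(resolve_notes_dir())
```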

🔄 How It Works: The Middleware Architecture

┌──────────────────────────────────────────────────────────────────────────────┐
│                              AI AGENT                                        │
│                                                                              │
│   "Find papers about I10 hypertension treatment in diabetic patients"        │
│                                                                              │
└─────────────────────────────────┬────────────────────────────────────────────┘
                                  │
                                  ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                     🔄 PUBMED SEARCH MCP (MIDDLEWARE)                        │
│  ┌──────────────────────────────────────────────────────────────────────────┐│
│  │  1️⃣ VOCABULARY TRANSLATION                                               ││
│  │     • ICD-10 "I10" → MeSH "Hypertension"                                 ││
│  │     • "diabetic" → MeSH "Diabetes Mellitus"                              ││
│  │     • ESpell: "hypertention" → "hypertension"                            ││
│  └──────────────────────────────────────────────────────────────────────────┘│
│  ┌──────────────────────────────────────────────────────────────────────────┐│
│  │  2️⃣ INTELLIGENT ROUTING                                                  ││
│  │     ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐               ││
│  │     │ PubMed   │  │Europe PMC│  │   CORE   │  │ OpenAlex │               ││
│  │     │  36M+    │  │   33M+   │  │  200M+   │  │  250M+   │               ││
│  │     │  (MeSH)  │  │(fulltext)│  │  (OA)    │  │(metadata)│               ││
│  │     └────┬─────┘  └────┬─────┘  └────┬─────┘  └────┬─────┘               ││
│  │          └─────────────┴─────────────┴─────────────┘                     ││
│  │                               ▼                                          ││
│  │  3️⃣ RESULT AGGREGATION: Dedupe + Rank + Enrich                           ││
│  └──────────────────────────────────────────────────────────────────────────┘│
└─────────────────────────────────┬────────────────────────────────────────────┘
                                  │
                                  ▼
┌──────────────────────────────────────────────────────────────────────────────┐
│                         UNIFIED RESULTS                                      │
│   • 150 unique papers (deduplicated from 4 sources)                          │
│   • Ranked by relevance + citation impact (RCR)                              │
│   • Full text links enriched from Europe PMC                                 │
└──────────────────────────────────────────────────────────────────────────────┘
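The aggregation stage in the diagram merges records that several sources return for the same paper. A minimal sketch of that idea follows, assuming records are plain dictionaries with hypothetical `source`, `doi`, and `pmid` fields; the server's real record model and merge rules are not described here.

```python
def dedupe_records(records):
    """Merge records that share a DOI or PMID, keeping first-seen order."""
    merged = []
    index = {}  # identifier key -> position in merged
    for record in records:
        keys = []
        if record.get("doi"):
            keys.append(("doi", record["doi"].lower()))  # DOIs are case-insensitive
        if record.get("pmid"):
            keys.append(("pmid", str(record["pmid"])))
        position = next((index[k] for k in keys if k in index), None)
        if position is None:
            position = len(merged)
            merged.append({**record, "sources": [record["source"]]})
        else:
            existing = merged[position]
            if record["source"] not in existing["sources"]:
                existing["sources"].append(record["source"])
            for field, value in record.items():
                # Fill gaps only; the first-seen value wins on conflicts.
                if field != "source" and not existing.get(field):
                    existing[field] = value
        for k in keys:
            index.setdefault(k, position)
    return merged


if __name__ == "__main__":
    demo = [
        {"source": "pubmed", "pmid": "123", "doi": "10.1000/ABC"},
        {"source": "europe_pmc", "doi": "10.1000/abc", "fulltext_url": "https://example.org/ft"},
    ]
    print(dedupe_records(demo))
```

Records without any identifier are never merged in this sketch; a fuller version might fall back to normalized title matching.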

🛠️ MCP Tools Overview

If you want to understand the tool surface as a usable system, do not start by memorizing 46 tool names.

Start with the Tools Usage Guide: it compresses the current 46 tools into 8 capability families, explains the theoretical lower bound, and gives intent-based routing for both humans and agents.

🔍 Search & Query Intelligence

Search and query intelligence workflow

┌──────────────────────────────────────────────────────────────────┐
│                      SEARCH ENTRY POINT                          │
├──────────────────────────────────────────────────────────────────┤
│                                                                  │
│   unified_search()          ← 🌟 Single entry for all sources    │
│        │                                                         │
│        ├── Quick search     → Direct multi-source query          │
│        ├── PICO hints       → Detects comparison, shows P/I/C/O  │
│        └── ICD expansion    → Auto ICD→MeSH conversion           │
│                                                                  │
│   Sources: PubMed · Europe PMC · CORE · OpenAlex                 │
│   Auto: Deduplicate → Rank → Enrich full-text links              │
│                                                                  │
├──────────────────────────────────────────────────────────────────┤
│   QUERY INTELLIGENCE                                             │
│                                                                  │
│   generate_search_queries() → MeSH expansion + synonym discovery │
│   parse_pico()              → Agent-provided PICO handoff        │
│   analyze_search_query()    → Query analysis without execution   │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

🔬 Discovery Tools (After Finding Key Papers)

Article discovery and citation workflow

                        Found important paper (PMID)
                                   │
           ┌───────────────────────┼───────────────────────┐
           │                       │                       │
           ▼                       ▼                       ▼
    ┌─────────────┐         ┌─────────────┐         ┌─────────────┐
    │  BACKWARD   │         │  SIMILAR    │         │  FORWARD    │
    │  ◀──────    │         │  ≈≈≈≈≈≈     │         │  ──────▶    │
    │             │         │             │         │             │
    │ get_article │         │find_related │         │find_citing  │
    │ _references │         │ _articles   │         │ _articles   │
    │             │         │             │         │             │
    │ Foundation  │         │  Similar    │         │ Follow-up   │
    │  papers     │         │   topic     │         │  research   │
    └─────────────┘         └─────────────┘         └─────────────┘

    fetch_article_details()   → Detailed article metadata
    get_citation_metrics()    → iCite RCR, citation percentile
    build_citation_tree()     → Full network visualization (6 formats)

📚 Full Text, Figure Extraction & Export

Full text, figures, and biomedical image workflow

| Category | Tools |
|---|---|
| Full Text | get_fulltext → Europe PMC XML when a PMCID is available; DOI-backed Unpaywall, institutional direct/EZproxy, CORE, and downloader fallbacks when needed |
| Figures | get_article_figures → Extract figure labels, captions, image URLs, and PDF links from PMC Open Access articles |
| Figure-aware Full Text | get_fulltext(include_figures=True) → Embed figure metadata alongside structured fulltext |
| Text Mining | get_text_mined_terms → Extract genes, diseases, chemicals |
| Export | prepare_export → official RIS/MEDLINE/CSL JSON or local RIS/BibTeX/CSV/MEDLINE/JSON; save_literature_notes → local wiki/Foam-compatible/Markdown/MedPaper-style notes plus collection-level CSL JSON |

🖼️ OA Figure-First Exploration

Use the PMC Open Access path when an agent needs evidence figures, not just article text:

  • get_article_figures(identifier="PMC12086443") → Figure labels, captions, image URLs, and PDF/article links

  • get_fulltext(pmcid="PMC7096777", include_figures=True) → Structured fulltext with figures inline

  • Figure output preserves article context, so agents can connect each figure back to the sections where it is mentioned

🧬 NCBI Extended Databases

NCBI extended biomedical data workflow

| Tool | Description |
|---|---|
| search_gene | Search NCBI Gene database |
| get_gene_details | Gene details by NCBI Gene ID |
| get_gene_literature | PubMed articles linked to a gene |
| search_compound | Search PubChem compounds |
| get_compound_details | Compound details by PubChem CID |
| get_compound_literature | PubMed articles linked to a compound |
| search_clinvar | Search ClinVar clinical variants |

🕰️ Research Timeline & Lineage Tree

Evaluation and timeline workflow

| Tool | Description |
|---|---|
| build_research_timeline | Build timeline/tree with landmark detection and formatted diagnostics. Output: text, tree, mermaid, mindmap, json |
| analyze_timeline_milestones | Analyze milestone distribution with diagnostics payload |
| compare_timelines | Compare multiple topic timelines with per-topic diagnostics |

🏥 Institutional Access & ICD Conversion

Institutional access workflow

| Tool | Description |
|---|---|
| configure_institutional_access | Configure institution's link resolver |
| get_institutional_link | Generate OpenURL access link |
| list_resolver_presets | List resolver presets |
| test_institutional_access | Test resolver configuration |
| diagnose_institutional_access | Diagnose direct DOI, EZproxy, and OpenURL handoff paths |
| convert_icd_mesh | Convert between ICD codes and MeSH terms (bidirectional) |
| unified_search | Auto-detect ICD codes in queries and expand them to MeSH |

💾 Session Management

Session and pipeline workflow

| Tool | Description |
|---|---|
| get_session_pmids | Retrieve cached PMID lists |
| get_cached_article | Get article from session cache (no API cost) |
| get_session_summary | Session status overview |
| read_session | Facade for PMIDs, cached articles, history, and persistent artifacts |

Dynamic MCP resources are also available for agents that can read resources directly:

  • session://context - active session status

  • session://last-search - latest search metadata

  • session://last-search/pmids - latest PMID list + CSV form

  • session://last-search/results - cached article payloads for the latest search

Persistent Artifacts

Persistent MCP output artifacts are saved for reusable unified_search and get_fulltext responses when session persistence is configured. Tool responses include a compact artifact locator with artifact_id, artifact_uri, primary_file, summary, file inventory, and an exact read_session(...) retrieval hint. Set PUBMED_ARTIFACT_INCLUDE_LOCAL_PATHS=true when a local MCP client should also receive local_path and manifest_path directly.

Remote clients that cannot read the server filesystem can retrieve the same content through the session facade:

read_session(action="list_artifacts")
read_session(action="artifact", artifact_id="...")
read_session(action="artifact", artifact_uri="artifact://...")
read_session(action="artifact", artifact_id="...", artifact_file="payload.json", offset=0, max_chars=200000)
read_session(action="list_artifacts", include_local_paths=true)

Artifacts are generated from the already-computed result object, so reading an artifact does not rerun searches or fulltext retrieval. read_session redacts local filesystem paths by default; local_path and manifest_path are server-local paths, not portable client paths. Artifacts from get_fulltext may contain article body text, including subscription or institutionally accessed content. Store and share them according to publisher, license, and institutional access terms. Large get_fulltext responses are returned inline as a preview when an artifact is available; use the artifact locator to retrieve the saved full content.
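For large artifacts, the offset and max_chars arguments shown above suggest reading in slices. The loop below is a sketch of that pattern only. It assumes each call returns the requested slice as plain text, which this README does not confirm; `read_artifact_text` and `fake_read_session` are hypothetical names, and the fake stands in for a real MCP tool call so the example runs on its own.

```python
def read_artifact_text(read_session, artifact_id, artifact_file="payload.json", max_chars=200000):
    """Collect an artifact file by requesting consecutive character slices."""
    chunks = []
    offset = 0
    while True:
        chunk = read_session(
            action="artifact",
            artifact_id=artifact_id,
            artifact_file=artifact_file,
            offset=offset,
            max_chars=max_chars,
        )
        if not chunk:
            break
        chunks.append(chunk)
        offset += len(chunk)
        if len(chunk) < max_chars:
            break  # a short slice means the end was reached
    return "".join(chunks)


def fake_read_session(action, artifact_id, artifact_file, offset, max_chars):
    """In-memory stand-in for the MCP tool (assumption: returns a text slice)."""
    content = '{"results": [' + ", ".join(str(i) for i in range(50)) + "]}"
    return content[offset:offset + max_chars]


if __name__ == "__main__":
    text = read_artifact_text(fake_read_session, "demo", max_chars=16)
    print(len(text), "characters read")
```

In a real client the callable would wrap the MCP tool invocation and extract the text field from whatever response structure the server returns.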

When one source fails but the overall search can continue, JSON responses may include source_errors; markdown responses show a Source warnings line. For Semantic Scholar HTTP 429s, set S2_API_KEY / SEMANTIC_SCHOLAR_API_KEY, retry later, or temporarily exclude it with sources="auto,-semantic_scholar" or PUBMED_SEARCH_DISABLED_SOURCES=semantic_scholar.

Pipeline Management


manage_pipeline is the primary facade for pipeline CRUD, history, and scheduling. The more specific pipeline tools remain available as compatibility wrappers.

| Tool | Description |
|---|---|
| manage_pipeline | Primary facade for save, list, load, delete, history, and schedule actions |
| save_pipeline | Save a pipeline config for later reuse (YAML/JSON, auto-validated) |
| list_pipelines | List saved pipelines (filter by tag/scope) |
| load_pipeline | Load pipeline from name or file for review/editing |
| delete_pipeline | Delete pipeline and its execution history |
| get_pipeline_history | View execution history with article diff analysis |
| schedule_pipeline | Create, update, or remove recurring pipeline schedules |

Step-by-step tutorials:

Full text, figures, and biomedical image workflow

| Tool | Description |
|---|---|
| analyze_figure_for_search | Hand off an uploaded image, image URL, or data URI to agent vision for search-term extraction |
| search_biomedical_images | Search biomedical images across Open-i (X-ray, microscopy, photos, diagrams) |

Use analyze_figure_for_search when the user supplies an image and the agent must interpret its meaning first. The tool returns MCP ImageContent plus instructions for the LLM agent to extract English biomedical terms, then continue with search_biomedical_images for similar Open-i images or unified_search for related papers.

Search arXiv, medRxiv, and bioRxiv preprint servers via unified_search options flags:

  • preprints: Search preprint servers and merge preprints into the main aggregated result set with article_type=PREPRINT.

  • all_types: Keep non-peer-reviewed content already returned by selected scholarly sources even without a preprint-server crawl.

Recommended combinations:

  • Empty options: Peer-reviewed results only; preprint-like records are filtered.

  • options="preprints": Searches arXiv, medRxiv, and bioRxiv, then ranks/dedupes those preprints with the main results.

  • options="preprints, all_types": Same preprint-server crawl, plus other non-peer-reviewed records from selected sources are retained.

  • options="all_types": No preprint-server crawl, but non-peer-reviewed items from searched sources are retained.
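The four combinations above reduce to two independent switches. The sketch below illustrates that reading, assuming the options string is a comma-separated list of flags; `parse_options` and `describe_mode` are hypothetical helpers, and the server's own parsing rules (for example case handling) may differ.

```python
def parse_options(options):
    """Split a comma-separated options string into a set of flag names."""
    return {part.strip().lower() for part in options.split(",") if part.strip()}


def describe_mode(options):
    """Summarize what the preprint-related flags ask for."""
    flags = parse_options(options)
    return {
        # "preprints": crawl arXiv, medRxiv, and bioRxiv and merge the hits.
        "crawl_preprint_servers": "preprints" in flags,
        # "all_types": keep non-peer-reviewed records from the searched sources.
        "keep_non_peer_reviewed": "all_types" in flags,
    }


if __name__ == "__main__":
    for example in ("", "preprints", "preprints, all_types", "all_types"):
        print(repr(example), describe_mode(example))
```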

Preprint detection - articles are identified as preprints by:

  • Article type from source API (OpenAlex, CrossRef, Semantic Scholar)

  • arXiv ID present without PubMed ID

  • Known preprint server source or journal name

  • DOI prefix matching preprint servers (e.g., 10.1101/ → bioRxiv/medRxiv, 10.48550/ → arXiv)
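The four signals above can be combined into a single predicate. This is a rough sketch under assumed field names (`article_type`, `arxiv_id`, `pmid`, `source`, `journal`, `doi`); the server's record schema and the exact order of checks are not specified in this README.

```python
# DOI prefixes taken from the list above.
PREPRINT_DOI_PREFIXES = ("10.1101/", "10.48550/")
PREPRINT_SERVERS = {"arxiv", "medrxiv", "biorxiv"}


def is_preprint(record):
    """Return True if any of the documented preprint signals is present."""
    # 1. Article type reported by the source API.
    if (record.get("article_type") or "").lower() == "preprint":
        return True
    # 2. arXiv ID present without a PubMed ID.
    if record.get("arxiv_id") and not record.get("pmid"):
        return True
    # 3. Known preprint server as source or journal name.
    for field in ("source", "journal"):
        if (record.get(field) or "").strip().lower() in PREPRINT_SERVERS:
            return True
    # 4. DOI prefix of a preprint server.
    doi = (record.get("doi") or "").lower()
    return doi.startswith(PREPRINT_DOI_PREFIXES)


if __name__ == "__main__":
    print(is_preprint({"doi": "10.48550/arXiv.0000.00000"}))
```

Prefix matching is a heuristic: a registrant prefix can cover more than preprints, so a production check would likely weigh it together with the other signals.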

🌳 Research Context Graph

unified_search can append a lightweight research lineage view built from PMID-backed ranked results:

| Option Flag | Description |
|---|---|
| context_graph | Append a Research Context Graph preview to Markdown output and include research_context in JSON output |

This is useful when an agent needs quick thematic branching without making a second build_research_timeline call.

📊 Count-First Orientation

unified_search can also front-load the existing source coverage and decision hints for agents that want routing help before reading the ranked list:

| Option Flag | Description |
|---|---|
| counts_first | Add a source-count table, coverage summary, and next-tool recommendations to the response |

Example:

unified_search(query="remimazolam ICU sedation", options="counts_first")

This mode is useful when the agent should decide whether to expand a source, inspect the lead PMID, fetch fulltext, extract figures, or pivot into timeline exploration.

⏱️ MCP Progress Reporting

When the MCP client provides a progress token, unified_search, build_research_timeline, analyze_timeline_milestones, compare_timelines, get_fulltext, and get_text_mined_terms emit progress updates for their major phases. This reduces the "black box" wait time for agents during longer searches.


📋 Agent Usage Examples

1️⃣ Quick Search (Simplest)

# Agent just asks naturally - middleware handles everything
unified_search(query="remimazolam ICU sedation", limit=20)

# Or with clinical codes - auto-converted to MeSH
unified_search(query="I10 treatment in E11.9 patients")
#                     ↑ ICD-10           ↑ ICD-10
#                     Hypertension       Type 2 Diabetes

2️⃣ PICO Clinical Question

PICO clinical search workflow

Simple path - unified_search can search directly (no PICO decomposition):

# unified_search searches as-is; detects "A vs B" pattern and shows PICO hints in metadata
unified_search(query="Is remimazolam better than propofol for ICU sedation?")
# → Multi-source keyword search + PICO hint metadata in output
# ⚠️ This does NOT auto-decompose PICO or expand MeSH!
# For structured PICO search, use the Agent workflow below

Agent workflow - agent-provided PICO + backend pipeline search (recommended for clinical questions):

┌──────────────────────────────────────────────────────────────────────────┐
│  "Is remimazolam better than propofol for ICU sedation?"                 │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
                                  ▼
┌──────────────────────────────────────────────────────────────────────────┐
│                         parse_pico()                                     │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐                      │
│  │    P    │  │    I    │  │    C    │  │    O    │                      │
│  │  ICU    │  │remimaz- │  │propofol │  │sedation │                      │
│  │patients │  │  olam   │  │         │  │outcomes │                      │
│  └────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘                      │
└───────┼────────────┼────────────┼────────────┼───────────────────────────┘
        │            │            │            │
        ▼            ▼            ▼            ▼
┌──────────────────────────────────────────────────────────────────────────┐
│              generate_search_queries() × 4 (parallel)                    │
│                                                                          │
│  P → "Intensive Care Units"[MeSH]                                        │
│  I → "remimazolam" [Supplementary Concept], "CNS 7056"                   │
│  C → "Propofol"[MeSH], "Diprivan"                                        │
│  O → "Conscious Sedation"[MeSH], "Deep Sedation"[MeSH]                   │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
                                  ▼
┌──────────────────────────────────────────────────────────────────────────┐
│              Agent combines with Boolean logic                           │
│                                                                          │
│  (P) AND (I) AND (C) AND (O)  ← High precision                           │
│  (P) AND (I OR C) AND (O)     ← High recall                              │
└─────────────────────────────────┬────────────────────────────────────────┘
                                  │
                                  ▼
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              unified_search() (auto multi-source + dedup)                β”‚
β”‚                                                                          β”‚
β”‚  PubMed + Europe PMC + CORE + OpenAlex β†’ Auto deduplicate & rank         β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
# Step 1: Agent extracts P/I/C/O, then validates the structured handoff
pico = parse_pico(
    description="Is remimazolam better than propofol for ICU sedation?",
    p="ICU patients requiring sedation",
    i="remimazolam",
    c="propofol",
    o="sedation efficacy, delirium, hypotension"
)
# Returns validation plus a ready-to-run `template: pico` pipeline.

# Step 2: Get MeSH for each element (parallel!)
generate_search_queries(topic="ICU patients")   # P
generate_search_queries(topic="remimazolam")    # I
generate_search_queries(topic="propofol")       # C
generate_search_queries(topic="sedation")       # O

# Step 3: Either pass expanded fragments back as p_query/i_query/c_query/o_query
# or let the backend pipeline use the structured P/I/C/O labels.

# Step 4: Search (backend runs O-aware precision/recall searches, dedup, rank)
unified_search(
    query="Is remimazolam better than propofol for ICU sedation?",
    pipeline=pico["pipeline"]
)
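
The Boolean combination step in the workflow above can be sketched in a few lines. This is a minimal illustration, not the server's implementation: `or_group` and `combine_pico` are hypothetical helper names, and the term lists are the MeSH fragments shown in the diagram.

```python
def or_group(terms):
    """Join the synonyms for one PICO element with OR, wrapped in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def combine_pico(p, i, c, o):
    """Build the high-precision and high-recall queries from P/I/C/O term lists."""
    pq, iq, cq, oq = (or_group(terms) for terms in (p, i, c, o))
    return {
        # (P) AND (I) AND (C) AND (O): every element must match
        "precision": f"{pq} AND {iq} AND {cq} AND {oq}",
        # (P) AND (I OR C) AND (O): either drug is enough
        "recall": f"{pq} AND ({iq} OR {cq}) AND {oq}",
    }


queries = combine_pico(
    p=['"Intensive Care Units"[MeSH]'],
    i=['"remimazolam" [Supplementary Concept]', '"CNS 7056"'],
    c=['"Propofol"[MeSH]', '"Diprivan"'],
    o=['"Conscious Sedation"[MeSH]', '"Deep Sedation"[MeSH]'],
)
print(queries["precision"])
print(queries["recall"])
```

Either string can then be passed as the `query` argument of `unified_search()`.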

3️⃣ Explore from Key Paper

# Found landmark paper PMID: 33475315
find_related_articles(pmid="33475315")   # Similar methodology
find_citing_articles(pmid="33475315")    # Who built on this?
get_article_references(pmid="33475315")  # What's the foundation?

# Build complete research map
build_citation_tree(pmid="33475315", depth=2, output_format="mermaid")
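
For readers unfamiliar with the `mermaid` output format, here is a rough sketch of how citation edges map to Mermaid flowchart text. The exact output of `build_citation_tree` is not shown in this README, so `to_mermaid` is a hypothetical helper and the two citing PMIDs are placeholders.

```python
def to_mermaid(edges):
    """Render (citing_pmid, cited_pmid) pairs as a Mermaid flowchart."""
    lines = ["graph TD"]
    for citing, cited in edges:
        # Prefix IDs with a letter so every node ID is a valid identifier
        lines.append(f"    P{citing} --> P{cited}")
    return "\n".join(lines)


# Placeholder citing PMIDs, pointing at the landmark paper used above
edges = [("11111111", "33475315"), ("22222222", "33475315")]
print(to_mermaid(edges))
```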

4️⃣ Gene/Drug Research

# Research a gene
search_gene(query="BRCA1", organism="human")
get_gene_literature(gene_id="672", limit=20)

# Research a drug compound
search_compound(query="propofol")
get_compound_literature(cid="4943", limit=20)

5️⃣ Export Results

# Export last search results
prepare_export(pmids="last", format="ris")      # → EndNote/Zotero
prepare_export(pmids="last", format="bibtex", source="local")  # → LaTeX
prepare_export(pmids="last", format="csl")      # → CSL JSON from the official NCBI Citation API
save_literature_notes(pmids="last")              # → local wiki note + Foam-compatible wikilinks + CSL JSON
save_literature_notes(pmids="last", note_format="medpaper", output_dir="./references")
save_literature_notes(pmids="last", template_file="./reference-template.md")

# Retrieve full text for a selected paper from the last search
get_fulltext(pmid="12345678", extended_sources=True)
# Include preprints alongside peer-reviewed results
unified_search(query="COVID-19 vaccine efficacy", options="preprints")
# → Main aggregated results include labelled arXiv, medRxiv, and bioRxiv preprints

# Include preprints and retain non-peer-reviewed items in main results
unified_search(query="CRISPR gene therapy", options="preprints, all_types")
# → Preprint-server crawl + non-peer-reviewed items retained in main results

# Only peer-reviewed (default behavior)
unified_search("diabetes treatment")
# → Preprints from any source automatically filtered out

# Add a research context graph preview to the same search response
unified_search("remimazolam ICU sedation", options="context_graph")

7️⃣ Pipeline (Reusable Search Plans)

# Save a template-based pipeline through the primary facade
manage_pipeline(
    action="save",
    name="icu_sedation_weekly",
    config="template: pico\nparams:\n  P: ICU patients\n  I: remimazolam\n  C: propofol\n  O: delirium",
    tags="anesthesia,sedation",
    description="Weekly ICU sedation monitoring"
)

# Save a custom DAG pipeline
manage_pipeline(
    action="save",
    name="brca1_comprehensive",
    config="""
steps:
  - id: expand
    action: expand
    params: { topic: BRCA1 breast cancer }
  - id: pubmed
    action: search
    params: { query: BRCA1, sources: pubmed, limit: 50 }
  - id: expanded
    action: search
    inputs: [expand]
    params: { strategy: mesh, sources: pubmed,openalex, limit: 50 }
  - id: merged
    action: merge
    inputs: [pubmed, expanded]
    params: { method: rrf }
  - id: enriched
    action: metrics
    inputs: [merged]
output:
  limit: 30
  ranking: quality
"""
)
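
The `inputs` lists in the custom DAG config above determine which steps must finish before others can start. As a sketch of that ordering (an assumption about how a DAG runner might schedule steps, not the server's actual scheduler), the standard-library `graphlib` module can derive a valid execution order. `execution_order` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Step ids mapped to their `inputs`, copied from the brca1_comprehensive config
steps = {
    "expand": [],
    "pubmed": [],
    "expanded": ["expand"],
    "merged": ["pubmed", "expanded"],
    "enriched": ["merged"],
}


def execution_order(step_inputs):
    """Return step ids ordered so each step comes after all of its inputs."""
    # TopologicalSorter raises graphlib.CycleError if the steps form a cycle
    return list(TopologicalSorter(step_inputs).static_order())


order = execution_order(steps)
print(order)
```

Steps with no dependency between them, such as `expand` and `pubmed`, could run in parallel.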

# Execute a saved pipeline
unified_search(pipeline="saved:icu_sedation_weekly")

# List & manage
manage_pipeline(action="list", tag="anesthesia")
manage_pipeline(action="load", source="brca1_comprehensive")  # Review YAML
manage_pipeline(action="history", name="icu_sedation_weekly")  # View past runs
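
The custom pipeline above merges two result lists with `method: rrf`. Assuming this refers to reciprocal rank fusion, the idea can be sketched as follows. `rrf_merge` is a hypothetical helper, and `k=60` is the constant commonly used for RRF, not a value confirmed by this project.

```python
def rrf_merge(rankings, k=60):
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable result
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


pubmed_hits = ["a", "b", "c"]      # placeholder IDs, best match first
expanded_hits = ["b", "d", "a"]
merged = rrf_merge([pubmed_hits, expanded_hits])
print(merged)
```

Items that rank well in both lists rise to the top, and no score normalization across sources is needed.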

🔍 Search Mode Comparison

┌──────────────────────────────────────────────────────────────────────────┐
│                        SEARCH MODE DECISION TREE                         │
├──────────────────────────────────────────────────────────────────────────┤
│                                                                          │
│   "What kind of search do I need?"                                       │
│         │                                                                │
│         ├── Know exactly what to search?                                 │
│         │   └── unified_search(query="topic keywords")                   │
│         │       → Quick, auto-routing to best sources                    │
│         │                                                                │
│         ├── Have a clinical question (A vs B)?                           │
│         │   └── Agent P/I/C/O → parse_pico() handoff                     │
│         │       → unified_search(template:pico) or expanded Boolean      │
│         │                                                                │
│         ├── Need comprehensive systematic coverage?                      │
│         │   └── generate_search_queries() → parallel search              │
│         │       → MeSH expansion, multiple strategies, merge             │
│         │                                                                │
│         └── Exploring from a key paper?                                  │
│             └── find_related/citing/references → build_citation_tree     │
│                 → Citation network, research context                     │
│                                                                          │
└──────────────────────────────────────────────────────────────────────────┘

| Mode | Entry Point | Best For | Auto-Features |
|------|-------------|----------|---------------|
| Quick | `unified_search()` | Fast topic search | ICD→MeSH, multi-source, dedup |
| PICO | Agent P/I/C/O -> `parse_pico()` | Clinical questions | Validate handoff -> `template:pico` backend search |
| Systematic | `generate_search_queries()` | Literature reviews | MeSH expansion, synonyms |
| Exploration | `find_*_articles()` | From key paper | Citation network, related |


🤖 Claude Skills (AI Agent Workflows)

Pre-built workflow guides in .claude/skills/, divided into Usage Skills (for using the MCP server) and Development Skills (for maintaining the project):

📚 Usage Skills (10): For AI Agents Using This MCP Server

| Skill | Description |
|-------|-------------|
| pubmed-quick-search | Basic search with filters |
| pubmed-systematic-search | MeSH expansion, comprehensive |
| pubmed-pico-search | Clinical question decomposition |
| pubmed-paper-exploration | Citation tree, related articles |
| pubmed-gene-drug-research | Gene/PubChem/ClinVar |
| pubmed-fulltext-access | Europe PMC, CORE full text |
| pubmed-export-citations | RIS/BibTeX/CSV/CSL export guidance |
| pubmed-multi-source-search | Cross-database unified search |
| pubmed-mcp-tools-reference | Complete tool reference guide |
| pipeline-persistence | Save, load, reuse search plans |

🔧 Development Skills (13): For Project Contributors

| Skill | Description |
|-------|-------------|
| changelog-updater | Auto-update CHANGELOG.md |
| code-refactor | DDD architecture refactoring |
| code-reviewer | Code quality & security review |
| ddd-architect | DDD scaffold for new features |
| git-doc-updater | Sync docs before commits |
| git-precommit | Pre-commit workflow orchestration |
| memory-checkpoint | Save context to Memory Bank |
| memory-updater | Update Memory Bank files |
| project-init | Initialize new projects |
| readme-i18n | Multilingual README sync |
| readme-updater | Sync README with code changes |
| roadmap-updater | Update ROADMAP.md status |
| test-generator | Generate test suites |

Location: .claude/skills/*/SKILL.md (Claude Code-specific, and the single source of truth for repo skills). Do not mirror or split repo skills into .github/skills/. These repo skills are project-scoped and should remain version-controlled. Personal cross-project skills belong in a user directory such as ~/.copilot/skills/ or ~/.claude/skills/, not in this repository.


🏗️ Architecture (DDD)

This project uses Domain-Driven Design (DDD) architecture, with literature research domain knowledge as the core model.

src/pubmed_search/
├── domain/                     # Core business logic
│   └── entities/article.py     # UnifiedArticle, Author, etc.
├── application/                # Use cases
│   ├── search/                 # QueryAnalyzer, ResultAggregator
│   ├── export/                 # Citation export (RIS, BibTeX...)
│   └── session/                # SessionManager
├── infrastructure/             # External systems
│   ├── ncbi/                   # Entrez, iCite, Citation Exporter
│   ├── sources/                # Europe PMC, CORE, CrossRef...
│   └── http/                   # HTTP clients
├── presentation/               # User interfaces
│   ├── mcp_server/             # MCP tools, prompts, resources
│   │   └── tools/              # discovery, strategy, pico, export...
│   └── api/                    # REST API (Copilot Studio)
└── shared/                     # Cross-cutting concerns
    ├── exceptions.py           # Unified error handling
    └── async_utils.py          # Rate limiter, retry, circuit breaker

Internal Mechanisms (Transparent to Agent)

| Mechanism | Description |
|-----------|-------------|
| Session | Auto-create, auto-switch |
| Cache | Auto-cache search results, avoid duplicate API calls |
| Rate Limit | Auto-comply with NCBI API limits (0.34 s between requests without an API key, 0.1 s with one) |
| MeSH Lookup | `generate_search_queries()` auto-queries the NCBI MeSH database |
| ESpell | Auto spelling correction (remifentanyl → remifentanil) |
| Query Analysis | Each suggested query shows how PubMed actually interprets it |
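
The rate-limit intervals above (0.34 s and 0.1 s) correspond to NCBI's limits of roughly 3 requests per second without an API key and 10 with one. A minimal sketch of a minimum-interval limiter follows. It is an illustration only: the project's real limiter lives in `shared/async_utils.py` and is presumably asynchronous, and `MinIntervalLimiter` and `limiter_for` are hypothetical names.

```python
import time


class MinIntervalLimiter:
    """Block so that consecutive calls are at least `interval` seconds apart."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock      # injectable for testing
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


def limiter_for(api_key=None):
    """0.1 s spacing with an API key (10 req/s), 0.34 s without (about 3 req/s)."""
    return MinIntervalLimiter(0.1 if api_key else 0.34)
```

Calling `limiter.wait()` before each NCBI request keeps the client inside the applicable limit.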

Vocabulary Translation Layer (Key Feature)

Our core value: this server is the intelligent middleware between the agent and the search engines. It standardizes vocabulary automatically, so the agent does not need to know each database's terminology.

Different data sources use different controlled vocabulary systems. This server provides automatic conversion:

| API / Database | Vocabulary System | Auto-Conversion |
|----------------|-------------------|-----------------|
| PubMed / NCBI | MeSH (Medical Subject Headings) | ✅ Full support via `expand_with_mesh()` |
| ICD Codes | ICD-10-CM / ICD-9-CM | ✅ Auto-detect & convert to MeSH |
| Europe PMC | Text-mined entities (Gene, Disease, Chemical) | ✅ `get_text_mined_terms()` extraction |
| OpenAlex | OpenAlex Concepts (deprecated) | ❌ Free-text only |
| Semantic Scholar | S2 Field of Study | ❌ Free-text only |
| CORE | None | ❌ Free-text only |
| CrossRef | None | ❌ Free-text only |

Automatic ICD → MeSH Conversion

When searching with ICD codes (e.g., I10 for Hypertension), unified_search() automatically:

  1. Detects ICD-10/ICD-9 patterns via detect_and_expand_icd_codes()

  2. Looks up corresponding MeSH terms from internal mapping (ICD10_TO_MESH, ICD9_TO_MESH)

  3. Expands query with MeSH synonyms for comprehensive search

# Agent calls unified_search with clinical terminology
unified_search(query="I10 treatment outcomes")

# Server auto-expands to PubMed-compatible query
"(I10 OR Hypertension[MeSH]) treatment outcomes"
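
The detect-and-expand steps above can be sketched with a regular expression and a lookup table. This is a simplified illustration, not the server's `detect_and_expand_icd_codes()`: the one-entry `ICD10_TO_MESH_SAMPLE` mapping and the `expand_icd_codes` helper are hypothetical, and the pattern covers only the common letter-plus-digits ICD-10 shape.

```python
import re

# Tiny illustrative mapping; the server's ICD10_TO_MESH table is much larger
ICD10_TO_MESH_SAMPLE = {"I10": "Hypertension"}

# Letter, two digits, optional decimal part (simplified ICD-10 shape)
ICD10_PATTERN = re.compile(r"\b[A-Z]\d{2}(?:\.\d{1,4})?\b")


def expand_icd_codes(query, mapping=ICD10_TO_MESH_SAMPLE):
    """Replace each known ICD-10 code with "(code OR term[MeSH])"."""

    def replace(match):
        code = match.group(0)
        term = mapping.get(code)
        # Unknown codes are left untouched
        return f"({code} OR {term}[MeSH])" if term else code

    return ICD10_PATTERN.sub(replace, query)


print(expand_icd_codes("I10 treatment outcomes"))
# (I10 OR Hypertension[MeSH]) treatment outcomes
```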

📖 Full architecture documentation: ARCHITECTURE.md

MeSH Auto-Expansion + Query Analysis

A call to generate_search_queries("remimazolam sedation") runs these steps internally:

  1. ESpell Correction - Fix spelling errors

  2. MeSH Query - Entrez.esearch(db="mesh") to get standard vocabulary

  3. Synonym Extraction - Get synonyms from MeSH Entry Terms

  4. Query Analysis - Analyze how PubMed interprets each query

{
  "mesh_terms": [
    {
      "input": "remimazolam",
      "preferred": "remimazolam [Supplementary Concept]",
      "synonyms": ["CNS 7056", "ONO 2745"]
    }
  ],
  "all_synonyms": ["CNS 7056", "ONO 2745", ...],
  "suggested_queries": [
    {
      "id": "q1_title",
      "query": "(remimazolam sedation)[Title]",
      "purpose": "Exact title match - highest precision",
      "estimated_count": 8,
      "pubmed_translation": "\"remimazolam sedation\"[Title]"
    },
    {
      "id": "q3_and",
      "query": "(remimazolam AND sedation)",
      "purpose": "All keywords required",
      "estimated_count": 561,
      "pubmed_translation": "(\"remimazolam\"[Supplementary Concept] OR \"remimazolam\"[All Fields]) AND (\"sedate\"[All Fields] OR ...)"
    }
  ]
}

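Value of Query Analysis: an agent may assume that remimazolam AND sedation matches only those two words, but PubMed actually expands the query to the Supplementary Concept plus word variants. In the example above, the exact title query has an estimated 8 results, while the expanded AND query has 561. Seeing the translation helps the agent understand the gap between its intent and the search that actually runs.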
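
One way an agent might use the `estimated_count` values is to pick the broadest query that still fits a result budget. The sketch below works on a trimmed copy of the response above. `pick_query` and the `max_results` budget are hypothetical and not part of the server API.

```python
response = {
    "suggested_queries": [
        {"id": "q1_title", "query": "(remimazolam sedation)[Title]", "estimated_count": 8},
        {"id": "q3_and", "query": "(remimazolam AND sedation)", "estimated_count": 561},
    ]
}


def pick_query(suggested, max_results):
    """Return the broadest suggested query whose estimated count fits the budget."""
    fitting = [q for q in suggested if q["estimated_count"] <= max_results]
    if not fitting:
        # Nothing fits: fall back to the narrowest query available
        return min(suggested, key=lambda q: q["estimated_count"])
    return max(fitting, key=lambda q: q["estimated_count"])


print(pick_query(response["suggested_queries"], max_results=100)["id"])
```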


🔒 HTTPS Deployment

Enable HTTPS secure communication for production environments.

Copilot Studio Quick Start

# Step 1: Generate SSL certificates
./scripts/generate-ssl-certs.sh

# Step 2: Start HTTPS service (Docker)
./scripts/start-https-docker.sh up

# Verify deployment
curl -k https://localhost/

HTTPS Endpoints

| Service | URL | Description |
|---------|-----|-------------|
| MCP | https://localhost/mcp | Streamable HTTP MCP endpoint |
| Health | https://localhost/health | Health check |
| Info | https://localhost/info | Runtime transport and endpoint metadata |
| Exports | https://localhost/exports | Prepared export file listing |

Remote MCP Client Configuration

{
  "mcpServers": {
    "pubmed-search": {
      "url": "https://localhost/mcp"
    }
  }
}

🏒 Microsoft Copilot Studio Integration

Integrate PubMed Search MCP with Microsoft 365 Copilot (Word, Teams, Outlook)!

Quick Start

# Start with Streamable HTTP transport (required by Copilot Studio)
uv run python run_server.py --transport streamable-http --port 8765

# Enable Copilot-compatible HTTP semantics while keeping full tool schemas
uv run python run_server.py --transport streamable-http --copilot-compatible --port 8765

# Or use the dedicated script with ngrok
./scripts/start-copilot-studio.sh --with-ngrok

Copilot Studio Configuration

| Field | Value |
|-------|-------|
| Server name | PubMed Search |
| Server URL | https://your-server.com/mcp |
| Authentication | None (or API Key) |

📖 Full documentation: copilot-studio/README.md

Use --copilot-compatible with run_server.py for Copilot HTTP semantics, or run_copilot.py if you also need simplified tool schemas.

⚠️ Note: SSE transport deprecated since Aug 2025. Use streamable-http.



🔐 Security

Security Features

| Layer | Feature | Description |
|-------|---------|-------------|
| HTTPS | TLS 1.2/1.3 encryption | All traffic encrypted via Nginx |
| Rate Limiting | 30 req/s | Nginx level protection |
| Security Headers | XSS/CSRF protection | X-Frame-Options, X-Content-Type-Options |
| Streamable HTTP | /mcp endpoint | Modern MCP transport for remote clients |
| No Database | Stateless | No SQL injection risk |
| No Secrets | In-memory only | No credentials stored |

See DEPLOYMENT.md for detailed deployment instructions.


📤 Export Formats

Export and local notes workflow

Export your search results in formats compatible with major reference managers:

| Format | Source | Compatible With | Use Case |
|--------|--------|-----------------|----------|
| RIS | official or local | EndNote, Zotero, Mendeley | Universal import |
| MEDLINE | official or local | PubMed tools | Native PubMed-style archiving |
| CSL JSON | official | Citation processors | Programmatic citation styling |
| BibTeX | local | LaTeX, Overleaf, JabRef | Academic writing |
| CSV | local | Excel, Google Sheets | Data analysis |
| JSON | local | Programmatic access | Custom processing |
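
To show what a RIS record looks like, here is a rough sketch that formats one article as a RIS journal entry. It is not the project's exporter: `to_ris` is a hypothetical helper, the article dict is placeholder data, and only a handful of standard RIS tags (TY, TI, AU, JO, PY, DO, ER) are covered.

```python
def to_ris(article):
    """Format one article dict as a RIS journal record."""
    lines = ["TY  - JOUR", f"TI  - {article['title']}"]
    # RIS uses one AU line per author
    lines += [f"AU  - {name}" for name in article.get("authors", [])]
    for tag, key in (("JO", "journal"), ("PY", "year"), ("DO", "doi")):
        if article.get(key):
            lines.append(f"{tag}  - {article[key]}")
    lines.append("ER  - ")  # end-of-record marker
    return "\n".join(lines)


# Placeholder article, for illustration only
example = {
    "title": "Example title",
    "authors": ["Hansen, Søren", "Doe, Jane"],
    "journal": "Example Journal",
    "year": 2024,
}
print(to_ris(example))
```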

Exported Fields

  • Core: PMID, Title, Authors, Journal, Year, Volume, Issue, Pages

  • Identifiers: DOI, PMC ID, ISSN

  • Content: Abstract (HTML tags cleaned)

  • Metadata: Language, Publication Type, Keywords

  • Access: DOI URL, PMC URL, Full-text availability

Special Character Handling

  • BibTeX exports use pylatexenc for proper LaTeX encoding

  • Nordic characters (ø, æ, å), umlauts (ü, ö, ä), and accents are correctly converted

  • Example: Søren Hansen → S{\o}ren Hansen
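
The conversion above can be illustrated with a small hand-written mapping. This is a stand-in for the idea only: the project uses pylatexenc, whereas `LATEX_ESCAPES` and `to_latex` below are hypothetical and cover just the six characters listed.

```python
# Hand-written stand-in for pylatexenc, covering only the characters named above
LATEX_ESCAPES = {
    "ø": r"{\o}",
    "æ": r"{\ae}",
    "å": r"{\aa}",
    "ü": r'{\"u}',
    "ö": r'{\"o}',
    "ä": r'{\"a}',
}


def to_latex(text):
    """Replace the mapped characters with LaTeX escapes; leave the rest as is."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


print(to_latex("Søren Hansen"))
# S{\o}ren Hansen
```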


📚 Citation

GitHub will show Cite this repository from CITATION.cff. If you use PubMed Search MCP in research, methods sections, or internal technical reports, prefer the GitHub-generated citation or reuse the repository metadata directly.

@software{pubmed_search_mcp,
  title = {PubMed Search MCP},
  author = {u9401066},
  url = {https://github.com/u9401066/pubmed-search-mcp}
}

📄 License

Apache License 2.0 - see LICENSE

