paperSearch

Chinese documentation (中文文档)

paperSearch is a local stdio MCP server that gives AI coding agents — Claude Code, OpenCode, and Codex — structured, citable access to academic literature. It wraps Semantic Scholar (paper search, citations, references, recommendations) and PubTator3 (biomedical entity normalization, entity relations, PMID evidence) into one cached, reproducible toolset.

Use cases: literature review automation, systematic review support, citation graph analysis, gene–drug–disease relation mining, biomedical evidence retrieval, paper recommendation, and research question answering inside your agent workflow.

Keywords: MCP server, academic search, paper search, Semantic Scholar, PubTator3, PubMed, citation graph, biomedical NLP, entity relation, literature review, bioinformatics, immuno-oncology, single-cell, immunotherapy, research assistant, AI agent tools.


What it does

  • Search academic papers across Semantic Scholar with filters for year, venue, citation count, open access, and sort order.

  • Resolve biomedical entities to canonical PubTator3 IDs: @GENE_JAK1, @CHEMICAL_Pembrolizumab, @DISEASE_Neoplasms.

  • Mine entity relations such as treat, inhibit, cause, associate, interact, with links back to PubMed evidence.

  • Traverse citation graphs (citations/references) and get paper recommendations from examples.

  • Cache everything locally in SQLite so repeated queries are fast and reproducible.
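
The canonical IDs shown in the list above follow an `@TYPE_Name` pattern. The following is a minimal sketch of splitting such an ID into its parts, assuming the type is always a run of uppercase letters before the first underscore. `parse_entity_id` and `EntityId` are hypothetical names for illustration, not part of paperSearch.

```python
import re
from typing import NamedTuple


class EntityId(NamedTuple):
    entity_type: str  # e.g. "GENE"
    name: str         # e.g. "JAK1"


# Assumed shape: "@" + uppercase type + "_" + name, inferred from the
# examples in this README rather than from a PubTator3 specification.
_ENTITY_RE = re.compile(r"^@([A-Z]+)_(.+)$")


def parse_entity_id(raw: str) -> EntityId:
    """Split an ID such as "@GENE_JAK1" into its type and name."""
    match = _ENTITY_RE.match(raw.strip())
    if match is None:
        raise ValueError(f"not a PubTator3-style entity ID: {raw!r}")
    return EntityId(match.group(1), match.group(2))
```

Keeping the type and name separate makes it easy to filter results by entity type before asking for relations.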



Features

Semantic Scholar tools

  • s2_search_papers – general paper search with year, venue, citation count, and sorting filters

  • s2_get_paper – single paper metadata (accepts an S2 paperId, DOI, PMID, or arXiv ID)

  • s2_batch_get_papers – batch metadata lookup (max 100)

  • s2_get_citations / s2_get_references – citation graph traversal

  • s2_recommend_papers – recommendation from positive/negative examples
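
Because s2_batch_get_papers accepts at most 100 IDs per call, a longer ID list has to be split before it is sent. A minimal sketch of that splitting, where `chunk_ids` is a hypothetical client-side helper and not something the server is known to provide:

```python
from typing import Iterator, Sequence

MAX_BATCH = 100  # per-call limit stated for s2_batch_get_papers


def chunk_ids(paper_ids: Sequence[str], size: int = MAX_BATCH) -> Iterator[list[str]]:
    """Yield consecutive slices of paper_ids, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(paper_ids), size):
        yield list(paper_ids[start:start + size])
```

Each yielded list can then be passed as one batch request; the order of IDs is preserved across batches.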

PubTator3 tools

  • pubtator_find_entity – normalize biomedical entities (@GENE_JAK1, @CHEMICAL_Pembrolizumab, …)

  • pubtator_find_relations – entity–entity relations with PMID evidence counts

  • pubtator_search – free-text or relation-query search

Composite helpers

  • biomed_relation_evidence – one-shot resolver from plain names to entity IDs, relations, and PMIDs

  • paper_enrich_with_pubtator – enrich a Semantic Scholar paper with PubTator annotations

  • research_query – unified literature search with optional PubTator enrichment


Quick Start

git clone <repo-url> ~/Publish/paperSearch
cd ~/Publish/paperSearch
uv sync --extra dev

Create your local config file:

mkdir -p ~/.config/paperSearch
cp config.example.json ~/.config/paperSearch/config.json
# edit ~/.config/paperSearch/config.json and add your Semantic Scholar API key

Run the smoke test:

uv run python scripts/smoke_test.py

Configuration

paperSearch reads a user-level JSON config file, then lets environment variables override it.

Path: ~/.config/paperSearch/config.json

{
  "semantic_scholar_api_key": "YOUR_SEMANTIC_SCHOLAR_API_KEY",
  "cache_path": "/tmp/paper_search_cache.db"
}

| Key | Required | Default | Description |
| --- | --- | --- | --- |
| semantic_scholar_api_key | recommended | | See API Tokens below |
| cache_path | no | /tmp/paper_search_cache.db | Local SQLite cache file |

Environment variables

These override the config file if set:

| Variable | Description |
| --- | --- |
| SEMANTIC_SCHOLAR_API_KEY | Semantic Scholar API key |
| PAPER_SEARCH_CACHE | SQLite cache file path |

An .env.example template is included for users who prefer dotenv-style setup. Do not commit .env or config.json; both are already listed in .gitignore.
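
The precedence described in this section (defaults, then the config file, then environment variables) can be sketched as follows. This is an illustration only: `load_settings` is a hypothetical function, and treating an empty environment variable as unset is an assumption, not documented paperSearch behaviour.

```python
import json
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULTS = {
    "semantic_scholar_api_key": None,
    "cache_path": "/tmp/paper_search_cache.db",
}

# Environment variable -> config key, as listed in this section.
ENV_OVERRIDES = {
    "SEMANTIC_SCHOLAR_API_KEY": "semantic_scholar_api_key",
    "PAPER_SEARCH_CACHE": "cache_path",
}


def load_settings(
    config_path: str = "~/.config/paperSearch/config.json",
    environ: Optional[Mapping[str, str]] = None,
) -> dict:
    """Merge defaults, the JSON config file, and environment overrides."""
    env = os.environ if environ is None else environ
    settings = dict(DEFAULTS)

    path = Path(config_path).expanduser()
    if path.is_file():
        with path.open(encoding="utf-8") as fh:
            settings.update(json.load(fh))

    for variable, key in ENV_OVERRIDES.items():
        value = env.get(variable)
        if value:  # assumption: unset and empty both mean "no override"
            settings[key] = value
    return settings
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.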


API Tokens

Semantic Scholar API Key

  • Purpose: Authenticates requests to the Semantic Scholar Academic Graph API. Using a key raises rate limits compared to anonymous access.

  • Source: https://www.semanticscholar.org/product/api

  • Required? No. The server works without a key, but you will hit stricter anonymous rate limits.

  • How to provide: Write it in ~/.config/paperSearch/config.json under semantic_scholar_api_key, or export SEMANTIC_SCHOLAR_API_KEY.

PubTator3

  • Purpose: NCBI PubTator3 provides biomedical entity normalization, relation mining, and article search.

  • Source: https://www.ncbi.nlm.nih.gov/research/pubtator3/

  • Required? No token is required; PubTator3 is a public NCBI service. Respect NCBI rate-limit guidelines.


Agent Setup

Claude Code

Edit ~/.claude/settings.json (global) or ~/.claude.json (project-level) and add an entry under mcpServers:

{
  "mcpServers": {
    "paperSearch": {
      "command": "uv",
      "args": [
        "--project",
        "/path/to/paperSearch",
        "run",
        "python",
        "-m",
        "paper_search.server"
      ],
      "env": {}
    }
  }
}

The API key is read from ~/.config/paperSearch/config.json, so you do not need to put it in the Claude config.

Codex

Edit ~/.codex/.mcp.json:

{
  "mcpServers": {
    "paperSearch": {
      "command": "uv",
      "args": [
        "--project",
        "/path/to/paperSearch",
        "run",
        "python",
        "-m",
        "paper_search.server"
      ],
      "env": {}
    }
  }
}

Or, if you use ~/.codex/config.toml:

[mcp_servers.paperSearch]
type = "stdio"
command = "uv"
args = ["--project", "/path/to/paperSearch", "run", "python", "-m", "paper_search.server"]

[mcp_servers.paperSearch.env]

OpenCode

Edit ~/.config/opencode/opencode.json (global) or create opencode.json in your project root:

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "paperSearch": {
      "type": "local",
      "command": [
        "uv",
        "--project",
        "/path/to/paperSearch",
        "run",
        "python",
        "-m",
        "paper_search.server"
      ],
      "environment": {},
      "enabled": true
    }
  }
}
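
The agent configs above all launch the same command and differ only in how they wrap it. As a rough sketch, both JSON shapes can be generated from one checkout path so they stay in sync. The function names are hypothetical; the keys are copied from the examples above.

```python
import json


def server_command(project_dir: str) -> list[str]:
    """The launch command shared by every agent config in this README."""
    return ["uv", "--project", project_dir, "run", "python", "-m", "paper_search.server"]


def mcp_servers_entry(project_dir: str) -> dict:
    """Shape used in the Claude Code and Codex JSON examples."""
    command = server_command(project_dir)
    return {
        "mcpServers": {
            "paperSearch": {"command": command[0], "args": command[1:], "env": {}}
        }
    }


def opencode_entry(project_dir: str) -> dict:
    """Shape used in the OpenCode example: the whole command is one list."""
    return {
        "$schema": "https://opencode.ai/config.json",
        "mcp": {
            "paperSearch": {
                "type": "local",
                "command": server_command(project_dir),
                "environment": {},
                "enabled": True,
            }
        },
    }


def to_json(entry: dict) -> str:
    """Render an entry as indented JSON ready to paste into a config file."""
    return json.dumps(entry, indent=2)
```

For example, `to_json(opencode_entry("/path/to/paperSearch"))` reproduces the OpenCode block shown above.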

Tool Reference

All tools return a normalized envelope:

{
  "source": "semantic_scholar|pubtator3|composite",
  "query": ...,
  "retrieved_at": "2026-06-30T12:00:00Z",
  "cache_key": "sha256...",
  "from_cache": false,
  "raw_ids": [...],
  "items": [...],
  "warnings": []
}
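
A client can rely on the envelope shape to handle every tool result the same way. The sketch below checks for the keys shown above and builds a one-line summary. `check_envelope` and `describe` are hypothetical helpers, and the set of valid `source` values is assumed from the example rather than taken from a schema.

```python
from typing import Any, Mapping

ENVELOPE_KEYS = (
    "source", "query", "retrieved_at", "cache_key",
    "from_cache", "raw_ids", "items", "warnings",
)
# Assumed from the "semantic_scholar|pubtator3|composite" example value.
KNOWN_SOURCES = {"semantic_scholar", "pubtator3", "composite"}


def check_envelope(envelope: Mapping[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    problems = [f"missing key: {key}" for key in ENVELOPE_KEYS if key not in envelope]
    source = envelope.get("source")
    if source is not None and source not in KNOWN_SOURCES:
        problems.append(f"unexpected source: {source!r}")
    return problems


def describe(envelope: Mapping[str, Any]) -> str:
    """Summarize where a result came from and how much it contains."""
    origin = "cache" if envelope["from_cache"] else "live API"
    return (
        f'{envelope["source"]}: {len(envelope["items"])} item(s) from {origin}, '
        f'{len(envelope["warnings"])} warning(s)'
    )
```

Checking `from_cache` and `retrieved_at` together is one way to decide whether a cached answer is recent enough to cite.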

Semantic Scholar

| Tool | Description |
| --- | --- |
| s2_search_papers | General paper search |
| s2_get_paper | Single paper details |
| s2_batch_get_papers | Batch metadata lookup (max 100) |
| s2_get_citations | Papers citing a given paper |
| s2_get_references | Papers referenced by a given paper |
| s2_recommend_papers | Recommendations from example papers |

PubTator3

| Tool | Description |
| --- | --- |
| pubtator_find_entity | Entity autocomplete/normalization |
| pubtator_find_relations | Biomedical relations between entities |
| pubtator_search | Article/PMID search |

Composite

| Tool | Description |
| --- | --- |
| biomed_relation_evidence | Resolve names → entity IDs → relations → PMIDs |
| paper_enrich_with_pubtator | Enrich a paper with PubTator annotations |
| research_query | Unified search with optional PubTator enrichment |
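
Agents invoke these tools through the MCP `tools/call` JSON-RPC method, sent over stdio as one JSON message per line. The sketch below only builds such a request to show its shape. The argument names (`query`, `year`, `limit`) are hypothetical, since this README does not list each tool's parameters, and a real client must complete the MCP initialize handshake before sending it.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per session


def tool_call_request(tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a single JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical arguments; check the tool's published input schema first.
example = tool_call_request(
    "s2_search_papers",
    {"query": "JAK1 immunotherapy resistance", "year": "2020-", "limit": 5},
)
```

In normal use Claude Code, OpenCode, or Codex builds and sends these messages for you; constructing one by hand is mainly useful when debugging the server.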


Most LLM literature searches are shallow web lookups. paperSearch gives agents:

  • Structured metadata: DOI, PMID, citation counts, venues, authors, open-access PDFs.

  • Entity-level reasoning: genes, chemicals, diseases, variants from PubTator3.

  • Citable evidence: every relation is backed by PubMed PMIDs.

  • Local caching: SQLite-backed TTL cache reduces API calls and keeps results reproducible.

  • Agent-native: stdio MCP means it works out of the box with Claude Code, OpenCode, Codex, and any other MCP client.
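
To make the local caching point above concrete, here is a minimal sketch of a SQLite-backed TTL cache keyed by a SHA-256 hash of the request. The table layout, the one-hour default TTL, and the way the key is derived are all assumptions for illustration and may differ from what paperSearch actually does.

```python
import hashlib
import json
import sqlite3
import time
from typing import Any, Optional


def make_cache_key(tool: str, params: dict) -> str:
    """Hash the tool name and parameters; sort_keys makes the key order-independent."""
    payload = json.dumps({"tool": tool, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TTLCache:
    """Store JSON-serializable values in SQLite and expire them after ttl_seconds."""

    def __init__(self, path: str = ":memory:", ttl_seconds: float = 3600.0) -> None:
        self.ttl = ttl_seconds
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL, stored_at REAL NOT NULL)"
        )

    def get(self, key: str, now: Optional[float] = None) -> Optional[Any]:
        """Return the cached value, or None if it is absent or older than the TTL."""
        now = time.time() if now is None else now
        row = self.db.execute(
            "SELECT value, stored_at FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None or now - row[1] > self.ttl:
            return None
        return json.loads(row[0])

    def put(self, key: str, value: Any, now: Optional[float] = None) -> None:
        """Insert or overwrite an entry, stamping it with the current time."""
        now = time.time() if now is None else now
        self.db.execute(
            "INSERT OR REPLACE INTO cache (key, value, stored_at) VALUES (?, ?, ?)",
            (key, json.dumps(value), now),
        )
        self.db.commit()
```

Because the key depends only on the tool name and its parameters, repeating the same query within the TTL returns the stored response instead of calling the upstream API again.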


Development

uv sync --extra dev
uv run ruff check .
uv run pytest -q
uv run python scripts/smoke_test.py

License

MIT
