🌱 plant-genomics-mcp
32 tools for plant-genomics locus lookup over the Model Context Protocol — 16 single-locus + 12 parallel-batch + 4 cross-source synthesis variants. Free, public sources: Ensembl Plants, Phytozome BioMart, UniProtKB, Europe PMC, QuickGO, NCBI BLAST, Gramene, KEGG, STRING-DB, ATTED-II, and BAR (Bio-Analytic Resource for Plant Biology).
📦 Install
pipx install plant-genomics-mcp
claude mcp add plant-genomics --scope local -- plant-genomics-mcp
# GHCR Docker image
docker pull ghcr.io/musharna/plant-genomics-mcp:latest
claude mcp add plant-genomics --scope local -- \
docker run --rm -i ghcr.io/musharna/plant-genomics-mcp:latest
# From source
git clone https://github.com/musharna/plant-genomics-mcp.git
cd plant-genomics-mcp
python -m venv .venv && .venv/bin/pip install -e .
claude mcp add plant-genomics --scope local -- "$(pwd)/.venv/bin/plant-genomics-mcp"
🛠️ Tools
32 tools across 11 backends — Ensembl Plants, Phytozome BioMart,
UniProtKB, Europe PMC, QuickGO, NCBI BLAST, Gramene, KEGG, STRING-DB,
ATTED-II, BAR. 16 single-locus + 12 parallel-batch + 4 cross-source
synthesis. All take a TAIR-style locus (e.g. AT1G01010) plus
optional organism= (slug / scientific name / common name / NCBI taxid
— 12-plant curated coverage matrix at the pgmcp://organisms/coverage
MCP resource). All publish JSON outputSchema and EDAM ontology tags.
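As a rough illustration of how one organism= value can be given as a slug, scientific name, common name, or NCBI taxid, here is a minimal sketch. The `Organism` record, the `resolve_organism` helper, and the two-row table are hypothetical and are not the server's code; the real 12-plant matrix is the one published at pgmcp://organisms/coverage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Organism:
    slug: str
    scientific_name: str
    common_name: str
    ncbi_taxid: int


# Two illustrative rows only (hypothetical table, not the server's matrix).
_ORGANISMS = (
    Organism("arabidopsis_thaliana", "Arabidopsis thaliana", "thale cress", 3702),
    Organism("oryza_sativa", "Oryza sativa", "rice", 4530),
)


def resolve_organism(value):
    """Map a slug, scientific name, common name, or taxid to one record."""
    key = str(value).strip().lower()
    for org in _ORGANISMS:
        candidates = {
            org.slug,
            org.scientific_name.lower(),
            org.common_name.lower(),
            str(org.ncbi_taxid),
        }
        if key in candidates:
            return org
    raise ValueError(f"unknown organism: {value!r}")


print(resolve_organism("Oryza sativa").slug)
print(resolve_organism(3702).slug)
```

Normalizing every accepted form to one slug up front keeps the per-backend routing independent of how the caller spelled the organism.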
# | Category | Tool | What it does |
1 | Gene metadata (live) | | Fetches gene record from Ensembl Plants REST (any plant species). |
2 | Cross-references (live) | | Fetches cross-DB references (UniProt, NCBI Gene, TAIR, GO, …) from Ensembl. |
3 | Gene metadata (live) | | Fetches gene record from Phytozome BioMart (any Phytozome proteome). |
4 | Protein (live) | | Resolves a locus to its UniProtKB record (Swiss-Prot preferred, TrEMBL OK). |
5 | Literature (live) | | Searches Europe PMC for papers mentioning the locus (free, no API key). |
6 | GO annotations (live) | | Fetches QuickGO GO annotations (locus → UniProt → QuickGO). |
7 | Sequence search (live) | | NCBI BLAST URLAPI: async Put/Get polling with progress notifications. |
8 | Homology (live) | | Fetches Gramene v69 homology entries (ortholog / paralog) with gene_tree_id. |
9 | Pathways (live) | | Fetches KEGG pathway memberships. 7 organisms: Arabidopsis ( |
10 | Interactions (live) | | Fetches STRING-DB first-neighbor interaction partners with per-channel score. |
11 | Coexpression (live) | | Fetches ATTED-II Ath-u.c4-0 top-N coexpression neighbors with z-scores. |
12 | Curator summary (live) | | Fetches BAR ThaleMine + GAIA-aliases curator summary for an Arabidopsis locus. |
13 | Expression (live) | | Fetches BAR eFP-Browser expression profile (mean ± SD per tissue) for a locus. |
14 | Interactions (live) | | Fetches BAR AIV interaction partners (Arabidopsis + rice) with confidence + papers. |
15 | Curator summary (live) | | Silent upgrade: alias of |
16 | Subscription redirect | | Returns subscription notice + redirect to live backends. No upstream call. |
17 | Batch (live) | | Parallel per-locus fanout for tools 1–6, 8–12, 14. Up to 50 loci per call. |
18 | Synthesis (live) | | Compose 2–5 backends in parallel, return a |
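The async Put/Get polling mentioned for the BLAST tool (row 7) can be sketched roughly as below. This is an illustration under stated assumptions, not the server's implementation: the helper names, the injected `fetch_status` callable, and the `on_progress` callback are hypothetical. The `RID`/`RTOE` fields and the `Status=` values follow the public NCBI URLAPI response format.

```python
import re
import time


def parse_put_response(body):
    """Extract the request ID and estimated wait (seconds) from a CMD=Put reply."""
    rid = re.search(r"^\s*RID\s*=\s*(\S+)", body, re.MULTILINE)
    rtoe = re.search(r"^\s*RTOE\s*=\s*(\d+)", body, re.MULTILINE)
    if rid is None or rtoe is None:
        raise ValueError("no RID/RTOE found in Put response")
    return rid.group(1), int(rtoe.group(1))


def parse_status(body):
    """Extract the Status= field from a CMD=Get SearchInfo reply."""
    match = re.search(r"Status=(\w+)", body)
    return match.group(1) if match else "UNKNOWN"


def poll_until_ready(fetch_status, rid, *, interval=60.0, max_polls=20,
                     sleep=time.sleep, on_progress=None):
    """Poll until READY; return the number of polls that were needed.

    fetch_status(rid) is a caller-supplied function that performs the HTTP
    Get and returns the reply body (hypothetical hook, injected for testing).
    """
    for attempt in range(1, max_polls + 1):
        status = parse_status(fetch_status(rid))
        if on_progress is not None:
            on_progress(attempt, max_polls)  # e.g. forward as a progress notification
        if status == "READY":
            return attempt
        if status != "WAITING":
            raise RuntimeError(f"BLAST search {rid} ended with status {status}")
        sleep(interval)
    raise TimeoutError(f"BLAST search {rid} not ready after {max_polls} polls")
```

Injecting `fetch_status` and `sleep` keeps the loop testable offline. The 60-second default interval is a conservative choice for a shared public service, not a value taken from this project.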
⚡ Quickstart
After install, the simplest call returns the Ensembl Plants record for
NAC001 — the canonical worked example used throughout examples/:
// arguments
{ "locus": "AT1G01010" }
// result (truncated)
{
"id": "AT1G01010",
"organism": "arabidopsis_thaliana",
"display_name": "NAC001",
"biotype": "protein_coding",
"seq_region_name": "1",
"start": 3631,
"end": 5899,
"strand": 1,
"assembly_name": "TAIR10",
"description": "NAC domain containing protein 1 ..."
}
For other species, pass organism=:
{ "locus": "Os01g0100100", "organism": "oryza_sativa" }
In Claude Code, the same prompt fans out across Ensembl, UniProtKB, and Europe PMC in a single turn.
Full per-tool walkthroughs (with real upstream-API transcripts) live in
examples/:
Walkthrough | Coverage |
Ensembl → xrefs → UniProt → Europe PMC → QuickGO chain (5 tools). | |
BLAST + per-hit UniProt enrichment. | |
Gramene + KEGG + UniProt + STRING + ATTED-II (5 tools). | |
All 4 v0.8 synthesis tools ( | |
v0.9 multi-organism resolver against rice + maize — per-backend routing on PyPI v1.0.4. |
📚 Resources & prompts
Clients discover them via resources/list and prompts/list.
Resources (resources/read):
URI | What |
| Per-backend |
| Slug → Phytozome |
| Per-backend liveness rollup — |
| Markdown table of all 12 supported plants × 5 ID slots (ncbi_taxid / ensembl / phytozome / …). |
Prompts (prompts/get):
Name | Required | Optional | Chains |
|
|
| Ensembl → xrefs → UniProt → Europe PMC → QuickGO. |
|
|
|
|
|
|
| Gramene → KEGG → UniProt → STRING → ATTED-II. |
🔌 Transports
Transport | How to launch |
stdio (default) |
|
streamable-HTTP |
|
The HTTP transport is stateless and emits JSON responses by default — the right shape for registry indexers and remote hosting.
Hosted endpoint
A small personal demo runs at:
https://mjarnoldgt76.tail86d19d.ts.net/mcp
Intended for registry indexers, one-off evaluation, and quick interactive testing, not for production workloads. No SLA, no uptime commitment, and the URL may change without notice (it runs on a single laptop on a residential connection).
# liveness probe
curl https://mjarnoldgt76.tail86d19d.ts.net/healthz
# {"status":"ok"}
# connect from Claude Code
claude mcp add --transport http plant-genomics-mcp \
https://mjarnoldgt76.tail86d19d.ts.net/mcp
For anything beyond casual evaluation, self-host. The HTTP transport
is the same binary; self-hosting buys deterministic uptime, your own
bearer-token gate (PLANT_GENOMICS_MCP_HTTP_TOKEN), and NCBI BLAST
etiquette under your own contact email.
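For a self-hosted instance behind the bearer-token gate, a client request might be assembled along these lines. This is a sketch under assumptions: the `build_mcp_request` helper is hypothetical, the `Authorization: Bearer` header scheme is the conventional one rather than something this README spells out, and `127.0.0.1:8000` is a placeholder address. The request is only built here, not sent.

```python
import json
import os
import urllib.request


def build_mcp_request(url, payload, token=None):
    """Build (but do not send) a JSON-RPC POST for the streamable-HTTP transport."""
    headers = {
        "Content-Type": "application/json",
        # Streamable HTTP clients advertise both response types.
        "Accept": "application/json, text/event-stream",
    }
    if token:
        # Assumed scheme: standard bearer token in the Authorization header.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_mcp_request(
    "http://127.0.0.1:8000/mcp",  # placeholder bind address and port
    {"jsonrpc": "2.0", "id": 1, "method": "tools/list"},
    token=os.environ.get("PLANT_GENOMICS_MCP_HTTP_TOKEN"),
)
print(request.get_method(), request.full_url)
```

Reading the token from the same environment variable the server uses keeps the secret out of shell history and source files.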
⚙️ Configuration
Stdio needs no configuration. The two env vars that matter:
Variable | When | Effect |
| HTTP transport only | Bearer token for |
| If you use BLAST | NCBI etiquette contact. Unset → placeholder + per-call warning; NCBI may throttle. |
Variable | Default | Effect |
|
| HTTP bind address. |
|
| HTTP TCP port. |
|
| Reject POSTs with |
|
|
|
|
|
|
|
| Max in-flight BLAST searches per process (NCBI per-IP rate limit). |
|
| Per-backend TTL+LRU cache entry lifetime, in seconds. 200-only. |
|
| Max entries per backend before LRU eviction. |
| unset | Any non-empty value makes every cache a no-op. |
The cache is process-local — restart the server to drop all entries.
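To make the TTL+LRU behaviour concrete, here is a minimal process-local sketch. The `TTLLRUCache` class is hypothetical and not the server's code, and reading "200-only" as "only HTTP 200 responses are stored" is an assumption.

```python
import time
from collections import OrderedDict


class TTLLRUCache:
    """Process-local cache: entries expire after a TTL and are LRU-evicted."""

    def __init__(self, ttl_seconds, max_entries, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._max = max_entries
        self._clock = clock
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self._clock() >= expires_at:
            del self._entries[key]  # expired: drop and report a miss
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value, status=200):
        if status != 200:
            return  # assumed meaning of "200-only": never cache failures
        self._entries[key] = (self._clock() + self._ttl, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max:
            self._entries.popitem(last=False)  # evict least recently used
```

Because the entries live in an ordinary in-memory dictionary, they disappear when the process exits, which matches the restart-to-clear behaviour described above.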
Long-running calls (retry storms, multi-second Phytozome BioMart POSTs)
emit MCP notifications/progress over the active session; clients opt
in via progressToken in the request _meta.
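The opt-in can be sketched as follows: the client attaches a `progressToken` under `_meta` in the request params, then matches incoming `notifications/progress` messages against it. The helper names and the tool name `example_lookup_tool` are hypothetical; the message shapes follow the MCP specification.

```python
import json


def with_progress_token(method, params, request_id, progress_token):
    """Return a JSON-RPC request whose params carry _meta.progressToken."""
    merged = dict(params)
    meta = dict(merged.get("_meta", {}))
    meta["progressToken"] = progress_token
    merged["_meta"] = meta
    return {"jsonrpc": "2.0", "id": request_id, "method": method, "params": merged}


def is_progress_for(notification, progress_token):
    """True if a server message is a progress update for our token."""
    return (
        notification.get("method") == "notifications/progress"
        and notification.get("params", {}).get("progressToken") == progress_token
    )


request = with_progress_token(
    "tools/call",
    # Hypothetical tool name, used only for illustration.
    {"name": "example_lookup_tool", "arguments": {"locus": "AT1G01010"}},
    request_id=7,
    progress_token="lookup-7",
)
print(json.dumps(request, indent=2))
```

Clients that omit the token simply receive the final result with no intermediate notifications.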
⚠️ Error model
All live tools raise PlantGenomicsError subclasses; the MCP SDK
stringifies them into the wire content with a [ClassName] prefix so
clients can route on failure kind without parsing the message:
Wire prefix | When |
| 404 / empty BioMart row / invalid locus identifier |
| 429 retry budget exhausted — back off and retry |
| 5xx past retry budget — service outage, try a peer backend |
| Other (BioMart |
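Routing on the `[ClassName]` prefix needs only a small parser on the client side. The sketch below is illustrative: `split_error` is a hypothetical helper, and `ExampleNotFoundError` is a made-up class name standing in for whichever PlantGenomicsError subclass the server reports.

```python
import re

# Leading "[ClassName]" followed by the human-readable message.
_PREFIX = re.compile(
    r"^\[(?P<kind>[A-Za-z_][A-Za-z0-9_]*)\]\s*(?P<detail>.*)$", re.DOTALL
)


def split_error(message):
    """Return (class_name, detail), or (None, message) when no prefix is present."""
    match = _PREFIX.match(message)
    if match is None:
        return None, message
    return match.group("kind"), match.group("detail")


kind, detail = split_error("[ExampleNotFoundError] no record for locus AT0G00000")
print(kind)
print(detail)
```

Branching on the returned class name lets a client choose between backing off, trying a peer backend, or surfacing the message without inspecting its wording.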
Batch tools return {tool, count, results, errors} where
results[locus] is the same shape as the single-locus tool and
errors[locus] is the same [ClassName] message string. Ensembl's
batch uses the native POST /lookup/id endpoint (one HTTP round-trip);
everything else fans out via asyncio.gather.
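The asyncio.gather fan-out and the `{tool, count, results, errors}` envelope can be sketched like this. The `run_batch` helper and the `fetch_one` callable are hypothetical, and treating `count` as the number of requested loci is an assumption.

```python
import asyncio

MAX_LOCI = 50  # per-call cap stated for the batch tools


async def run_batch(tool, loci, fetch_one):
    """Fan out fetch_one over loci and split successes from failures."""
    if len(loci) > MAX_LOCI:
        raise ValueError(f"at most {MAX_LOCI} loci per call")
    # return_exceptions=True keeps one failing locus from cancelling the rest.
    outcomes = await asyncio.gather(
        *(fetch_one(locus) for locus in loci), return_exceptions=True
    )
    results, errors = {}, {}
    for locus, outcome in zip(loci, outcomes):
        if isinstance(outcome, Exception):
            errors[locus] = f"[{type(outcome).__name__}] {outcome}"
        else:
            results[locus] = outcome
    return {"tool": tool, "count": len(loci), "results": results, "errors": errors}
```

A caller can then iterate over `results` for the loci that resolved and inspect `errors` per locus, without one upstream failure discarding the whole batch.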
🧪 Development
.venv/bin/pip install -e '.[dev]'
.venv/bin/pytest -q # unit tests
PLANT_GENOMICS_MCP_LIVE=1 .venv/bin/pytest -q # adds live network probes
PLANT_GENOMICS_MCP_STDIO_SMOKE=1 .venv/bin/pytest -q # adds stdio smoke
.venv/bin/ruff check .
CI runs the unit suite and the stdio smoke test on every push/PR (matrix: Python 3.11, 3.12). The live-network gate is not run in CI to avoid flakes from upstream availability.
Scientific validation / drift detection. scripts/benchmark_annotations.py
drives a curated corpus of canonical loci (27, spanning all 12 organisms)
through every backend + synthesis pipeline and compares results to a frozen
baseline, emitting PASS / DRIFT / FAIL plus cross-source consistency
invariants. It's how upstream data drift is caught. A scheduled GitHub Actions
workflow (.github/workflows/benchmark.yml) runs it weekly and pages on a
confirmed regression. Operator guide: docs/benchmarking.md.
.venv/bin/python scripts/benchmark_annotations.py # full live sweep (~3-5 min)
See CHANGELOG.md for release notes, including the
v0.8 → v0.9 species=/organism_id= → organism= migration and the
v1.0.1 HTTP-token enforcement change.
MCP registry
Listed in the official MCP registry
under the namespace below (ownership-verification token for mcp-publisher):
mcp-name: io.github.musharna/plant-genomics-mcp
License
MIT — see LICENSE. Underlying services (Ensembl Plants,
Phytozome, TAIR, PlantCyc, BAR) have their own terms of use; consult
each before bulk querying.