Skip to main content
Glama
cyanheads

protein-mcp-server

Version License Docker MCP SDK npm TypeScript Bun

Install in Claude Desktop Install in Cursor Install in VS Code

Framework

Public Hosted Server: https://protein.caseyjhand.com/mcp


Tools

Seven tools spanning the structure-research arc — discover, fetch, find homologs, track ligands, compare, profile the corpus, and annotate — over experimental (PDB) and predicted (AlphaFold) structures from one surface:

Tool

Description

protein_search_structures

Search experimental and predicted structures by free text, sequence, or organism/method/resolution filters, with optional facet breakdowns.

protein_get_structure

Fetch metadata and coordinate-file URLs by ID — experimental (PDB), predicted (AlphaFold), or best-available — with batch partial success and optional coordinate inlining.

protein_find_similar

Find sequence homologs (RCSB mmseqs2) or fold homologs (Foldseek) from a sequence, PDB ID, or UniProt accession.

protein_track_ligands

Resolve ligand names/formulas to component IDs, find structures containing a ligand, or map binding-site residues.

protein_compare_structures

Structurally align 2–10 structures (TM-align / jFATCAT) to a reference or as a full pairwise matrix.

protein_analyze_collection

Profile the PDB into distributions and trends with server-side facets — counts, histograms, timelines, and cross-tabs.

protein_get_annotations

Fetch UniProt features and natural variants plus InterPro domain/family memberships with GO terms.

protein_search_structures

Federated search across experimental (PDB) and predicted (computed-model) structures via RCSB Search v2.

  • Free-text, protein-sequence (triggers an mmseqs2 similarity search), and organism / method / resolution filters

  • content_type scopes the search to experimental, predicted, or all

  • Experimental hits are enriched with title, method, resolution, and organism

  • Optional facets return a method / organism / release-year breakdown alongside the hits at no extra call

  • Chain hit IDs straight into protein_get_structure


protein_get_structure

Fetch structures with metadata and coordinate-file URLs, resolving across providers by source.

  • source: experimental takes PDB entry IDs, batched in one RCSB GraphQL call

  • source: predicted takes UniProt accessions and returns the AlphaFold model with pLDDT/PAE confidence

  • source: best_available takes UniProt accessions and returns the top federated model (experimental if one exists, else the best prediction)

  • Per-ID partial success — unresolved IDs are listed in failed[], not a batch-level error

  • include_coords inlines coordinate content; when a batch overflows the response budget it returns a per-structure size outline, so you can re-call with sections: [ids] for specific structures


protein_find_similar

Find structurally or evolutionarily related proteins, by sequence or by fold.

  • by: sequence runs a synchronous RCSB mmseqs2 search; by: structure runs an asynchronous Foldseek search against experimental and predicted databases

  • Query from a raw one-letter sequence, a PDB ID, or a UniProt accession

  • Foldseek targets default to pdb100 + afdb50; override via databases (e.g. afdb-swissprot, BFVD)

  • Async jobs that exceed the poll budget return status: computing with a ticket — re-call to resume

  • Each hit names the engine and source database it came from


protein_track_ligands

Ligand discovery and binding-site analysis across the PDB.

  • mode: find_ligand resolves a name or formula to chemical component IDs with formula, weight, SMILES, and InChIKey

  • mode: structures_with_ligand returns PDB entries containing a ligand by exact component ID

  • mode: binding_site returns the protein residues lining a ligand's pocket in a structure, with contact distances

  • Binding sites are experimental-only — computed from deposited coordinates (predicted models carry no bound ligands)


protein_compare_structures

Structural alignment of 2–10 structures via the RCSB Structural Comparison service.

  • Methods: tm-align, fatcat-rigid, fatcat-flexible

  • reference: first aligns every structure to the first; reference: all_pairs computes the full pairwise matrix

  • Optional per-structure chain restricts the alignment to a single chain

  • Each pair is an independent async job, fanned out with a concurrency cap and per-pair partial success — a pair still computing when the budget elapses returns status: computing with its job UUID, and a failed pair degrades its row without sinking the others

  • Returns TM-score, RMSD, and aligned-residue count per pair


protein_analyze_collection

Profile the PDB into distributions and trends over an optional scoping query — backed by RCSB's server-side facet engine (one call, compact buckets, no row pull).

  • Group by method, organism, polymer_type, resolution, release_year, or molecular_weight

  • One group_by dimension for a breakdown, or two for a cross-tab (the first nests the second)

  • interval sets the bin width for value histograms or the period for date histograms (year / month / quarter)

  • Scope with a free-text query, organism, method, or max_resolution; content_type selects the structure universe

  • bucket_limit caps buckets per dimension; truncation is flagged in the response


protein_get_annotations

Sequence and functional annotation for a protein.

  • UniProt features (domains, binding sites, PTMs) and natural sequence variants

  • InterPro domain/family memberships (Pfam, PROSITE, …) with associated GO terms

  • Provide a UniProt accession directly, or a PDB ID — resolved to its accession via the structure's sequence cross-reference

  • include scopes which annotation classes are fetched: features, domains, variants, or all

Related MCP server: STRING-db MCP Server

Resources

Type

Name

Description

Resource

pdb://{entry_id}

Experimental structure summary for a PDB entry — title, method, resolution, organism, chains, and bound ligands.

Resource

af://{uniprot}

Predicted-structure summary for a UniProt accession from AlphaFold DB — mean pLDDT, confidence-band fractions, model URLs, and version.

All resource data is also reachable via tools — pdb://{entry_id} mirrors protein_get_structure for source: experimental, and af://{uniprot} mirrors it for source: predicted. Many MCP clients are tool-only and don't surface resources; the summaries remain reachable through the tools.

Features

Built on @cyanheads/mcp-ts-core:

  • Declarative tool and resource definitions — single file per primitive, framework handles registration and validation

  • Unified error handling — handlers throw, framework catches, classifies, and formats

  • Pluggable auth: none, jwt, oauth

  • Swappable storage backends: in-memory, filesystem, Supabase, Cloudflare KV/R2/D1

  • Structured logging with optional OpenTelemetry tracing

  • STDIO and Streamable HTTP transports

Protein-specific:

  • One federated surface over experimental (PDB) and predicted (AlphaFold / 3D-Beacons) structures — search, fetch, and compare treat both universes the same

  • Keyless across every upstream — RCSB, AlphaFold DB, 3D-Beacons, UniProt, InterPro, and Foldseek, no API keys to provision

  • Corpus analytics run server-side on RCSB's facet engine — distributions, histograms, and cross-tabs in one call, no row pull and no SQL workspace

  • Async alignment and Foldseek jobs poll within a bounded budget and hand back a resumable ticket instead of blocking

Agent-friendly output:

  • Provenance on every response — each hit carries a source (experimental / predicted), the engine and database that produced it, and effective-query / total-count echoes so agents can reason about coverage

  • Graceful partial failure — batch fetches and pairwise comparisons return per-item rows (failed[], per-pair status) instead of failing the whole request, each with actionable recovery text

  • Discriminated output contracts — typed source and status unions, computing results with resume tickets, and budget-overflow outlines let callers branch on data, not string parsing

Getting started

Public Hosted Instance

A public instance is available at https://protein.caseyjhand.com/mcp — no installation required. Point any MCP client at it via Streamable HTTP:

{
  "mcpServers": {
    "protein": {
      "type": "streamable-http",
      "url": "https://protein.caseyjhand.com/mcp"
    }
  }
}

Self-hosted

Add the following to your MCP client configuration file. No API key is required — every upstream provider is keyless.

{
  "mcpServers": {
    "protein-mcp-server": {
      "type": "stdio",
      "command": "bunx",
      "args": ["@cyanheads/protein-mcp-server@latest"],
      "env": {
        "MCP_TRANSPORT_TYPE": "stdio",
        "MCP_LOG_LEVEL": "info"
      }
    }
  }
}

Or with npx (no Bun required):

{
  "mcpServers": {
    "protein-mcp-server": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@cyanheads/protein-mcp-server@latest"],
      "env": {
        "MCP_TRANSPORT_TYPE": "stdio",
        "MCP_LOG_LEVEL": "info"
      }
    }
  }
}

Or with Docker:

{
  "mcpServers": {
    "protein-mcp-server": {
      "type": "stdio",
      "command": "docker",
      "args": ["run", "-i", "--rm", "-e", "MCP_TRANSPORT_TYPE=stdio", "ghcr.io/cyanheads/protein-mcp-server:latest"]
    }
  }
}

For Streamable HTTP, set the transport and start the server:

MCP_TRANSPORT_TYPE=http MCP_HTTP_PORT=3010 bun run start:http
# Server listens at http://localhost:3010/mcp

Prerequisites

  • Bun v1.3.2 or higher (or Node.js v24+).

  • No accounts or API keys — RCSB, AlphaFold DB, 3D-Beacons, UniProt, InterPro, and Foldseek are all public and keyless.

Installation

  1. Clone the repository:

git clone https://github.com/cyanheads/protein-mcp-server.git
  1. Navigate into the directory:

cd protein-mcp-server
  1. Install dependencies:

bun install

Configuration

All upstream providers are keyless, so the server runs out of the box with no configuration. Every variable below is optional.

Variable

Description

Default

PROTEIN_ASYNC_POLL_TIMEOUT_MS

Max wall-clock to poll an async job (alignment / Foldseek) before returning a computing result.

30000

PROTEIN_MAX_BATCH_IDS

Cap on IDs accepted by protein_get_structure in one batch (1–100).

25

PROTEIN_MAX_COMPARE_STRUCTURES

Cap on structures per protein_compare_structures call (2–25).

10

PROTEIN_FACET_BUCKET_CAP

Default cap on buckets per protein_analyze_collection dimension (1–500).

50

PROTEIN_FANOUT_CONCURRENCY

Max concurrent upstream requests for per-ID / per-pair fan-out (1–16).

5

RCSB_SEARCH_BASE_URL

Base URL for the RCSB Search API v2.

https://search.rcsb.org

ALPHAFOLD_BASE_URL

Base URL for the AlphaFold Protein Structure Database API.

https://alphafold.ebi.ac.uk

FOLDSEEK_BASE_URL

Base URL for the Foldseek structural-similarity search service.

https://search.foldseek.com

MCP_TRANSPORT_TYPE

Transport: stdio or http.

stdio

MCP_HTTP_PORT

Port for the HTTP server.

3010

MCP_AUTH_MODE

Auth mode: none, jwt, or oauth.

none

MCP_LOG_LEVEL

Log level (RFC 5424).

info

OTEL_ENABLED

Enable OpenTelemetry instrumentation.

false

See .env.example for the full list of provider base-URL overrides and tuning limits.

Running the server

Local development

  • Build and run:

    # One-time build
    bun run rebuild
    
    # Run the built server
    bun run start:stdio
    # or
    bun run start:http
  • Run checks and tests:

    bun run devcheck   # Lint, format, typecheck, security
    bun run test       # Vitest test suite
    bun run lint:mcp   # Validate MCP definitions against spec

Docker

docker build -t protein-mcp-server .
docker run --rm -e MCP_TRANSPORT_TYPE=http -p 3010:3010 protein-mcp-server

The Dockerfile defaults to HTTP transport, stateless session mode, and logs to /var/log/protein-mcp-server. OpenTelemetry peer dependencies are installed by default — build with --build-arg OTEL_ENABLED=false to omit them.

Project structure

Directory

Purpose

src/index.ts

createApp() entry point — registers tools/resources and inits the provider services.

src/config

Server-specific environment variable parsing and validation with Zod.

src/mcp-server/tools

Tool definitions (*.tool.ts).

src/mcp-server/resources

Resource definitions (*.resource.ts).

src/services

Provider service layer — RCSB, AlphaFold, 3D-Beacons, UniProt, InterPro, Foldseek, and shared HTTP/identifier helpers.

tests/

Unit and integration tests mirroring src/.

Development guide

See CLAUDE.md/AGENTS.md for development guidelines and architectural rules. The short version:

  • Handlers throw, framework catches — no try/catch in tool logic

  • Use ctx.log for request-scoped logging, ctx.state for tenant-scoped storage

  • Register new tools and resources via the barrels in src/mcp-server/*/definitions/index.ts

  • Wrap external API calls: validate raw → normalize to domain type → return output schema; never fabricate missing fields

Contributing

Issues and pull requests are welcome. Run checks and tests before submitting:

bun run devcheck
bun run test

License

Apache-2.0 — see LICENSE for details.

A
license - permissive license
-
quality - not tested
A
maintenance

Maintenance

Maintainers
1hResponse time
2dRelease cycle
5Releases (12mo)
Commit activity
Issues opened vs closed

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/cyanheads/protein-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server