# BioMCP Data Flow Diagram

This document illustrates how BioMCP (Biomedical Model Context Protocol) works, showing the interaction between AI clients, the MCP server, domains, and external data sources.

## High-Level Architecture

```mermaid
graph TB
    subgraph "AI Client Layer"
        AI[AI Assistant<br/>e.g., Claude, GPT]
    end

    subgraph "MCP Server Layer"
        MCP[MCP Server<br/>router.py]
        SEARCH[search tool]
        FETCH[fetch tool]
    end

    subgraph "Domain Routing Layer"
        ROUTER[Query Router]
        PARSER[Query Parser]
        UNIFIED[Unified Query<br/>Language]
    end

    subgraph "Domain Handlers"
        ARTICLES[Articles Domain<br/>Handler]
        TRIALS[Trials Domain<br/>Handler]
        VARIANTS[Variants Domain<br/>Handler]
        THINKING[Thinking Domain<br/>Handler]
    end

    subgraph "External APIs"
        subgraph "Article Sources"
            PUBMED[PubTator3/<br/>PubMed]
            BIORXIV[bioRxiv/<br/>medRxiv]
            EUROPEPMC[Europe PMC]
        end

        subgraph "Clinical Data"
            CLINICALTRIALS[ClinicalTrials.gov]
        end

        subgraph "Variant Sources"
            MYVARIANT[MyVariant.info]
            TCGA[TCGA]
            KG[1000 Genomes]
            CBIO[cBioPortal]
        end
    end

    %% Connections
    AI -->|MCP Protocol| MCP
    MCP --> SEARCH
    MCP --> FETCH

    SEARCH --> ROUTER
    ROUTER --> PARSER
    PARSER --> UNIFIED

    ROUTER --> ARTICLES
    ROUTER --> TRIALS
    ROUTER --> VARIANTS
    ROUTER --> THINKING

    ARTICLES --> PUBMED
    ARTICLES --> BIORXIV
    ARTICLES --> EUROPEPMC
    ARTICLES -.->|Gene enrichment| CBIO

    TRIALS --> CLINICALTRIALS

    VARIANTS --> MYVARIANT
    MYVARIANT --> TCGA
    MYVARIANT --> KG
    VARIANTS --> CBIO

    THINKING -->|Internal| THINKING

    classDef clientClass fill:#e1f5fe,stroke:#01579b,stroke-width:2px
    classDef serverClass fill:#f3e5f5,stroke:#4a148c,stroke-width:2px
    classDef domainClass fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px
    classDef apiClass fill:#fff3e0,stroke:#e65100,stroke-width:2px

    class AI clientClass
    class MCP,SEARCH,FETCH serverClass
    class ARTICLES,TRIALS,VARIANTS,THINKING domainClass
    class PUBMED,BIORXIV,EUROPEPMC,CLINICALTRIALS,MYVARIANT,TCGA,KG,CBIO apiClass
```

## Detailed Search Flow

```mermaid
sequenceDiagram
    participant AI as AI Client
    participant MCP as MCP Server
    participant Router as Query Router
    participant Domain as Domain Handler
    participant API as External API

    AI->>MCP: search(query="gene:BRAF AND disease:melanoma")
    MCP->>Router: Parse & route query

    alt Unified Query
        Router->>Router: Parse field syntax
        Router->>Router: Create routing plan

        par Search Articles
            Router->>Domain: Search articles (BRAF, melanoma)
            Domain->>API: PubTator3 API call
            API-->>Domain: Article results
            Domain->>API: cBioPortal enrichment
            API-->>Domain: Mutation data
        and Search Trials
            Router->>Domain: Search trials (melanoma)
            Domain->>API: ClinicalTrials.gov API
            API-->>Domain: Trial results
        and Search Variants
            Router->>Domain: Search variants (BRAF)
            Domain->>API: MyVariant.info API
            API-->>Domain: Variant results
        end
    else Domain-specific
        Router->>Domain: Direct domain search
        Domain->>API: Single API call
        API-->>Domain: Domain results
    else Sequential Thinking
        Router->>Domain: Process thought
        Domain->>Domain: Update session state
        Domain-->>Router: Thought response
    end

    Domain-->>Router: Formatted results
    Router-->>MCP: Aggregated results
    MCP-->>AI: Standardized response
```

## Search Tool Parameters

```mermaid
graph LR
    subgraph "Search Tool Input"
        PARAMS[Parameters]
        QUERY[query: string]
        DOMAIN[domain: article/trial/variant/thinking]
        GENES[genes: list]
        DISEASES[diseases: list]
        CONDITIONS[conditions: list]
        LAT[lat/long: coordinates]
        THOUGHT[thought parameters]
    end

    subgraph "Search Modes"
        MODE1[Unified Query Mode<br/>Uses 'query' param]
        MODE2[Domain-Specific Mode<br/>Uses domain + params]
        MODE3[Thinking Mode<br/>Uses thought params]
    end

    PARAMS --> MODE1
    PARAMS --> MODE2
    PARAMS --> MODE3
```

## Domain-Specific Data Sources

```mermaid
graph TD
    subgraph "Articles Domain"
        A1[PubTator3/PubMed<br/>- Published articles<br/>- Annotations]
        A2[bioRxiv/medRxiv<br/>- Preprints<br/>- Early research]
        A3[Europe PMC<br/>- Open access<br/>- Full text]
        A4[cBioPortal Integration<br/>- Auto-enrichment when genes specified<br/>- Mutation summaries & hotspots]
    end

    subgraph "Trials Domain"
        T1[ClinicalTrials.gov<br/>- Active trials<br/>- Trial details<br/>- Location search]
    end

    subgraph "Variants Domain"
        V1[MyVariant.info<br/>- Variant annotations<br/>- Clinical significance]
        V2[TCGA<br/>- Cancer variants<br/>- Somatic mutations]
        V3[1000 Genomes<br/>- Population frequency<br/>- Allele data]
        V4[cBioPortal<br/>- Cancer mutations<br/>- Hotspots]
    end

    A1 -.->|When genes present| A4
    A2 -.->|When genes present| A4
    A3 -.->|When genes present| A4
```

## Unified Query Language

```mermaid
graph TD
    QUERY[Unified Query<br/>"gene:BRAF AND disease:melanoma"]

    QUERY --> PARSE[Query Parser]

    PARSE --> F1[Field: gene<br/>Value: BRAF]
    PARSE --> F2[Field: disease<br/>Value: melanoma]

    F1 --> D1[Articles Domain]
    F1 --> D2[Variants Domain]
    F2 --> D1
    F2 --> D3[Trials Domain]

    D1 --> R1[PubMed Results]
    D2 --> R2[Variant Results]
    D3 --> R3[Trial Results]

    R1 --> AGG[Aggregated Results]
    R2 --> AGG
    R3 --> AGG
```

## Example: Location-Based Trial Search

```mermaid
sequenceDiagram
    participant User as User
    participant AI as AI Client
    participant MCP as BioMCP
    participant GEO as Geocoding Service
    participant CT as ClinicalTrials.gov

    User->>AI: Find active trials in Cleveland for NSCLC
    AI->>AI: Recognize location needs geocoding
    AI->>GEO: Geocode "Cleveland"
    GEO-->>AI: lat: 41.4993, long: -81.6944

    AI->>MCP: search(domain="trial",<br/>diseases=["NSCLC"],<br/>lat=41.4993,<br/>long=-81.6944,<br/>distance=50)

    MCP->>CT: API call with geo filter
    CT-->>MCP: Trials near Cleveland
    MCP-->>AI: Formatted trial results
    AI-->>User: Here are X active NSCLC trials in Cleveland area
```

## Key Features

1. **Parallel Execution**: Multiple domains are searched simultaneously for unified queries
2. **Smart Enrichment**: Article searches automatically include cBioPortal mutation summaries when genes are specified, providing clinical context alongside literature results
3. **Location Awareness**: Trial searches support geographic filtering with lat/long coordinates
4. **Sequential Thinking**: Built-in reasoning system for complex biomedical questions
5. **Standardized Output**: All results follow OpenAI MCP format for consistency

## Response Format

All search results follow this standardized structure:

```json
{
  "results": [
    {
      "id": "PMID12345678",
      "title": "BRAF V600E mutation in melanoma",
      "text": "This study investigates BRAF mutations...",
      "url": "https://pubmed.ncbi.nlm.nih.gov/12345678"
    }
  ]
}
```

Fetch results include additional domain-specific metadata in the response.
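
## Sketch: Parsing and Routing a Unified Query

The unified query flow described above (parse fields, build a routing plan, search domains in parallel, aggregate) can be illustrated with a minimal Python sketch. This is not BioMCP's actual implementation: `FIELD_DOMAINS`, `parse_query`, `plan_routes`, `search_domain`, and `unified_search` are hypothetical names, the field-to-domain map is read off the Unified Query Language diagram, and the domain handler is a stub that makes no network calls.

```python
import asyncio
import re

# Hypothetical field-to-domain map, taken from the Unified Query Language
# diagram: gene -> articles + variants, disease -> articles + trials.
FIELD_DOMAINS = {
    "gene": ["article", "variant"],
    "disease": ["article", "trial"],
}


def parse_query(query: str) -> dict[str, list[str]]:
    """Split 'field:value AND field:value' into {field: [values]}."""
    fields: dict[str, list[str]] = {}
    for term in re.split(r"\s+AND\s+", query.strip()):
        field, sep, value = term.partition(":")
        if not sep or not value.strip():
            raise ValueError(f"expected field:value, got {term!r}")
        fields.setdefault(field.strip().lower(), []).append(value.strip())
    return fields


def plan_routes(fields: dict[str, list[str]]) -> dict[str, dict[str, list[str]]]:
    """Group parsed fields by the domains that should receive them."""
    plan: dict[str, dict[str, list[str]]] = {}
    for field, values in fields.items():
        for domain in FIELD_DOMAINS.get(field, []):
            plan.setdefault(domain, {})[field] = values
    return plan


async def search_domain(domain: str, params: dict[str, list[str]]) -> list[dict]:
    """Stub handler; a real one would call the domain's external API."""
    await asyncio.sleep(0)  # stand-in for network latency
    label = ", ".join(f"{k}={'/'.join(v)}" for k, v in sorted(params.items()))
    return [{
        "id": f"{domain}-stub-1",
        "title": f"{domain} search for {label}",
        "text": "placeholder result",
        "url": "",
    }]


async def unified_search(query: str) -> dict:
    """Run every planned domain search concurrently and merge the results."""
    plan = plan_routes(parse_query(query))
    batches = await asyncio.gather(
        *(search_domain(domain, params) for domain, params in plan.items())
    )
    return {"results": [item for batch in batches for item in batch]}


if __name__ == "__main__":
    print(asyncio.run(unified_search("gene:BRAF AND disease:melanoma")))
```

`asyncio.gather` returns batches in the order the coroutines were passed, so the merged list is deterministic even though the searches run concurrently. Each stub result uses the same `id`/`title`/`text`/`url` keys as the response format shown above.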
