Skip to main content
Glama
02-biothings-suite.md10.6 kB
# BioThings Suite API Reference The BioThings Suite provides unified access to biomedical annotations across genes, variants, diseases, and drugs through a consistent API interface. ## Usage Examples For practical examples using the BioThings APIs, see: - [How to Find Trials with NCI and BioThings](../how-to-guides/02-find-trials-with-nci-and-biothings.md#biothings-integration-for-enhanced-search) - [Get Comprehensive Variant Annotations](../how-to-guides/03-get-comprehensive-variant-annotations.md#integration-with-other-biomcp-tools) ## Overview BioMCP integrates with four BioThings APIs: - **MyGene.info**: Gene annotations and functional information - **MyVariant.info**: Genetic variant annotations and clinical significance - **MyDisease.info**: Disease ontology and terminology mappings - **MyChem.info**: Drug/chemical properties and mechanisms All APIs share: - RESTful JSON interface - No authentication required - Elasticsearch-based queries - Comprehensive data aggregation ## MyGene.info ### Base URL `https://mygene.info/v1/` ### Key Endpoints #### Gene Query ``` GET /query?q={query} ``` **Parameters:** - `q`: Query string (gene symbol, name, or ID) - `fields`: Specific fields to return - `species`: Limit to species (default: human, mouse, rat) - `size`: Number of results (default: 10) **Example:** ```bash curl "https://mygene.info/v1/query?q=BRAF&fields=symbol,name,summary,type_of_gene" ``` #### Gene Annotation ``` GET /gene/{geneid} ``` **Gene ID formats:** - Entrez Gene ID: `673` - Ensembl ID: `ENSG00000157764` - Gene Symbol: `BRAF` **Example:** ```bash curl "https://mygene.info/v1/gene/673?fields=symbol,name,summary,genomic_pos,pathway,go" ``` ### Important Fields | Field | Description | Example | | ------------- | ---------------------- | --------------------------------------- | | `symbol` | Official gene symbol | "BRAF" | | `name` | Full gene name | "B-Raf proto-oncogene" | | `entrezgene` | NCBI Entrez ID | 673 | | `summary` | Functional description | "This gene encodes..." | | `genomic_pos` | Chromosomal location | {"chr": "7", "start": 140433812} | | `pathway` | Pathway memberships | {"kegg": [...], "reactome": [...]} | | `go` | Gene Ontology terms | {"BP": [...], "MF": [...], "CC": [...]} | ## MyVariant.info ### Base URL `https://myvariant.info/v1/` ### Key Endpoints #### Variant Query ``` GET /query?q={query} ``` **Query syntax:** - Gene + variant: `dbnsfp.genename:BRAF AND dbnsfp.hgvsp:p.V600E` - rsID: `dbsnp.rsid:rs121913529` - Genomic: `_id:chr7:g.140453136A>T` **Example:** ```bash curl "https://myvariant.info/v1/query?q=dbnsfp.genename:TP53&fields=_id,clinvar,gnomad_exome" ``` #### Variant Annotation ``` GET /variant/{variant_id} ``` **ID formats:** - HGVS genomic: `chr7:g.140453136A>T` - dbSNP: `rs121913529` ### Important Fields | Field | Description | Example | | -------------- | ---------------------- | --------------------------------------- | | `clinvar` | Clinical significance | {"clinical_significance": "Pathogenic"} | | `dbsnp` | dbSNP annotations | {"rsid": "rs121913529"} | | `cadd` | CADD scores | {"phred": 35} | | `gnomad_exome` | Population frequency | {"af": {"af": 0.00001}} | | `dbnsfp` | Functional predictions | {"polyphen2": "probably_damaging"} | ### Query Filters ```python # Clinical significance q = "clinvar.clinical_significance:pathogenic" # Frequency filters q = "gnomad_exome.af.af:<0.01" # Rare variants # Gene-specific q = "dbnsfp.genename:BRCA1 AND cadd.phred:>20" ``` ## MyDisease.info ### Base URL `https://mydisease.info/v1/` ### Key Endpoints #### Disease Query ``` GET /query?q={query} ``` **Example:** ```bash curl "https://mydisease.info/v1/query?q=melanoma&fields=mondo,disease_ontology,synonyms" ``` #### Disease Annotation ``` GET /disease/{disease_id} ``` **ID formats:** - MONDO: `MONDO:0007254` - DOID: `DOID:1909` - OMIM: `OMIM:155600` ### Important Fields | Field | Description | Example | | ------------------ | ----------------- | -------------------------------------------- | | `mondo` | MONDO ontology | {"id": "MONDO:0007254", "label": "melanoma"} | | `disease_ontology` | Disease Ontology | {"id": "DOID:1909"} | | `synonyms` | Alternative names | ["malignant melanoma", "MM"] | | `xrefs` | Cross-references | {"omim": ["155600"], "mesh": ["D008545"]} | | `phenotypes` | HPO terms | [{"hpo_id": "HP:0002861"}] | ## MyChem.info ### Base URL `https://mychem.info/v1/` ### Key Endpoints #### Drug Query ``` GET /query?q={query} ``` **Example:** ```bash curl "https://mychem.info/v1/query?q=imatinib&fields=drugbank,chembl,chebi" ``` #### Drug Annotation ``` GET /drug/{drug_id} ``` **ID formats:** - DrugBank: `DB00619` - ChEMBL: `CHEMBL941` - Name: `imatinib` ### Important Fields | Field | Description | Example | | -------------- | -------------- | -------------------------------------------- | | `drugbank` | DrugBank data | {"id": "DB00619", "name": "Imatinib"} | | `chembl` | ChEMBL data | {"molecule_chembl_id": "CHEMBL941"} | | `chebi` | ChEBI ontology | {"id": "CHEBI:45783"} | | `drugcentral` | Indications | {"indications": [...]} | | `pharmacology` | Mechanism | {"mechanism_of_action": "BCR-ABL inhibitor"} | ## Common Query Patterns ### 1. Gene to Variant Pipeline ```python # Step 1: Get gene info gene_response = requests.get( "https://mygene.info/v1/gene/BRAF", params={"fields": "symbol,genomic_pos"} ) # Step 2: Find variants in gene variant_response = requests.get( "https://myvariant.info/v1/query", params={ "q": "dbnsfp.genename:BRAF", "fields": "clinvar.clinical_significance,gnomad_exome.af", "size": 100 } ) ``` ### 2. Disease Synonym Expansion ```python # Get all synonyms for a disease disease_response = requests.get( "https://mydisease.info/v1/query", params={ "q": "melanoma", "fields": "mondo,synonyms,xrefs" } ) # Extract all names all_names = ["melanoma"] for hit in disease_response.json()["hits"]: if "synonyms" in hit: all_names.extend(hit["synonyms"]) ``` ### 3. Drug Target Lookup ```python # Find drugs targeting a gene drug_response = requests.get( "https://mychem.info/v1/query", params={ "q": "drugcentral.targets.gene_symbol:BRAF", "fields": "drugbank.name,chembl.pref_name", "size": 50 } ) ``` ## Rate Limits and Best Practices ### Rate Limits - **Default**: 1,000 requests/hour per IP - **Batch queries**: Up to 1,000 IDs per request - **No authentication**: Public access ### Best Practices #### 1. Use Field Filtering ```python # Good - only request needed fields params = {"fields": "symbol,name,summary"} # Bad - returns all fields params = {} ``` #### 2. Batch Requests ```python # Good - single request for multiple genes response = requests.post( "https://mygene.info/v1/gene", json={"ids": ["BRAF", "KRAS", "EGFR"]} ) # Bad - multiple individual requests for gene in ["BRAF", "KRAS", "EGFR"]: requests.get(f"https://mygene.info/v1/gene/{gene}") ``` #### 3. Handle Missing Data ```python # Check for field existence if "clinvar" in variant and "clinical_significance" in variant["clinvar"]: significance = variant["clinvar"]["clinical_significance"] else: significance = "Not available" ``` ## Error Handling ### Common Errors #### 404 Not Found ```json { "success": false, "error": "ID not found" } ``` #### 400 Bad Request ```json { "success": false, "error": "Invalid query syntax" } ``` #### 429 Rate Limited ```json { "success": false, "error": "Rate limit exceeded" } ``` ### Error Handling Code ```python def query_biothings(api_url, query_params): try: response = requests.get(api_url, params=query_params) response.raise_for_status() return response.json() except requests.exceptions.HTTPError as e: if e.response.status_code == 404: return {"error": "Not found", "query": query_params} elif e.response.status_code == 429: # Implement exponential backoff time.sleep(60) return query_biothings(api_url, query_params) else: raise ``` ## Data Sources Each BioThings API aggregates data from multiple sources: ### MyGene.info Sources - NCBI Entrez Gene - Ensembl - UniProt - KEGG, Reactome, WikiPathways - Gene Ontology ### MyVariant.info Sources - dbSNP - ClinVar - gnomAD - CADD - PolyPhen-2, SIFT - COSMIC ### MyDisease.info Sources - MONDO - Disease Ontology - OMIM - MeSH - HPO ### MyChem.info Sources - DrugBank - ChEMBL - ChEBI - PubChem - DrugCentral ## Advanced Features ### Full-Text Search ```python # Search across all fields params = { "q": "lung cancer EGFR", # Searches all text fields "fields": "symbol,name,summary" } ``` ### Faceted Search ```python # Get aggregations params = { "q": "clinvar.clinical_significance:pathogenic", "facets": "dbnsfp.genename", "size": 0 # Only return facets } ``` ### Scrolling Large Results ```python # For results > 10,000 params = { "q": "dbnsfp.genename:TP53", "fetch_all": True, "fields": "_id" } ``` ## Integration Tips ### 1. Caching Strategy - Cache gene/drug/disease lookups (stable) - Don't cache variant queries (frequently updated) - Use ETags for conditional requests ### 2. Parallel Requests ```python import asyncio import aiohttp async def fetch_all(session, urls): tasks = [] for url in urls: tasks.append(session.get(url)) return await asyncio.gather(*tasks) ``` ### 3. Data Normalization ```python def normalize_gene_symbol(symbol): # Query MyGene to get official symbol response = requests.get( f"https://mygene.info/v1/query?q={symbol}" ) if response.json()["hits"]: return response.json()["hits"][0]["symbol"] return symbol ```

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/genomoncology/biomcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server