TogoMCP
Server Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| NCBI_API_KEY | Yes | Your NCBI API key (obtain from https://www.ncbi.nlm.nih.gov/datasets/docs/v2/api/api-keys/). Required for NCBI tools. | |
| TOGOMCP_QUERY_LOG | No | Optional file path to enable query logging. Example: /var/log/togomcp/togomcp.jsonl | |
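As a rough illustration, a launcher script could check these variables before starting the server. This is only a sketch: the helper name `load_togomcp_env` and the shape of the returned dict are hypothetical and not part of TogoMCP; only the two variable names come from the table above.

```python
import os
from typing import Mapping, Optional


def load_togomcp_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Validate the environment variables listed in the table above.

    Hypothetical helper for illustration; TogoMCP does not ship it.
    """
    env = os.environ if env is None else env

    api_key = env.get("NCBI_API_KEY", "").strip()
    if not api_key:
        # The table marks NCBI_API_KEY as required (used by the NCBI tools).
        raise RuntimeError("NCBI_API_KEY is not set")

    # TOGOMCP_QUERY_LOG is optional; a missing or empty value means no logging.
    query_log = env.get("TOGOMCP_QUERY_LOG", "").strip() or None

    return {"ncbi_api_key": api_key, "query_log": query_log}
```

Passing a plain dict instead of relying on `os.environ` keeps the check easy to exercise in isolation.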
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | {"listChanged": true} |
| prompts | {"listChanged": false} |
| resources | {"subscribe": false, "listChanged": false} |
| extensions | {"io.modelcontextprotocol/ui": {}} |
| experimental | {} |
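In the Model Context Protocol, `listChanged` indicates that the server may notify clients when the corresponding list changes, and `subscribe` indicates support for per-resource update subscriptions. A client might read the flags above roughly as follows; the helper `has_flag` is a hypothetical name used only for this sketch.

```python
import json

# Capability object as advertised in the table above.
CAPABILITIES = json.loads("""
{
  "tools": {"listChanged": true},
  "prompts": {"listChanged": false},
  "resources": {"subscribe": false, "listChanged": false},
  "extensions": {"io.modelcontextprotocol/ui": {}},
  "experimental": {}
}
""")


def has_flag(capabilities: dict, kind: str, flag: str) -> bool:
    """Return True only if `kind` is present and `flag` is explicitly true."""
    return bool(capabilities.get(kind, {}).get(flag, False))


# Only the tool list is advertised as able to change at runtime.
tools_can_change = has_flag(CAPABILITIES, "tools", "listChanged")
resources_subscribable = has_flag(CAPABILITIES, "resources", "subscribe")
```

With these values, a client would presumably refresh its tool list on a change notification and would not attempt resource subscriptions.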
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| TogoMCP_Usage_Guide | ⚠️ CALL THIS TOOL FIRST every turn, before any other TogoMCP tool. Returns the v4 Usage Guide, which enforces the empirically-validated workflow: Why this matters (measured): questions with ≥3 consecutive run_sparql calls score 1.26 points lower than compliant ones; jumping to text search before reading the MIE schema accounts for ~95% of silent SPARQL failures. The guide also documents the controlled category taxonomy used by find_databases() and the EXPLORATION habits (Seed Definition, concierge check, prioritized Next Steps) for open-ended deep dives. Re-run GATE 0 every turn; prior workflow does not carry forward. Returns: str: The content of the TogoMCP v4 usage guide. |
| get_sparql_endpoints | Get the available SPARQL endpoints for RDF Portal. Returns: Dict with two keys: - databases: Dict mapping database -> {url, endpoint_name, keyword_search} - endpoints: Dict mapping endpoint_name -> {url, databases} |
| run_sparql | Run a SPARQL query on an RDF database. Specify database (valid values: uniprot, rhea, pubchem, pdb, chembl, chebi, reactome, ensembl, amrportal, mesh, go, taxonomy, mondo, nando, bacdive, mediadive, clinvar, pubmed, pubtator, ncbigene, medgen, ddbj, glycosmos, supercon, bgee, oma, brenda, hgnc, jpostdb) for single-database queries, or endpoint_name (valid values: sib, pubchem, pdb, ebi, primary, ncbi, ddbj, glycosmos, nims) / endpoint_url for cross-database queries on shared endpoints. Invalid database/endpoint_name values fail immediately with a deterministic error; do not retry. |
| get_graph_list | Get a list of named graphs on a SPARQL endpoint. Virtuoso/OpenLink internal graphs are filtered out. If |
| get_MIE_file | At the start of any task, identify ALL databases needed and call this tool for EACH of them before writing any SPARQL queries. Do not query a database until its MIE file has been read. Get the MIE (Metadata Interoperability Exchange) file containing the ShEx schema, RDF and SPARQL examples of a specific RDF database. |
| list_databases | Supplementary: full catalog dump (browse only, no filtering). Returns every available RDF database with For every normal workflow, call Returns:
A list of dicts with keys |
| find_databases | Database discovery: REQUIRED first step for any TogoMCP workflow. Always call this BEFORE Workflow:
Common keywords to try: "MANE" (Ensembl), "drug targets" (ChEMBL), "clinical variants" (ClinVar), "pathways" (Reactome), "variants" (gnomAD), "ortholog" (OMA), "expression" (Bgee). If you have no search terms and want to browse the full catalog instead, see
Returns:
List of dicts: |
| list_categories | Coarse-grained index of database categories with member database names. Use this when you don't yet have specific keywords; drill down with
Returns: Dict mapping category name -> sorted list of database names. Returns an empty dict if no databases have been annotated with categories yet. |
| search_uniprot_entity | Search for a UniProt entity ID by query. ⚠️ Only the search string and The search string can be passed as any of: Args: query (str): The Solr-style query string for the UniProtKB /search endpoint. Returns: str: TSV-formatted results with columns: accession, protein_name, organism_name. |
| search_chembl_id_lookup | Search for ChEMBL ID by query. The search string can be passed as any of: Returns: str: A JSON-formatted string containing the search results. |
| search_chembl_target | Search for a biological TARGET (protein/receptor/enzyme) in ChEMBL. ⚠️ DO NOT use this tool to look up drugs, compounds, or molecules by name.
For drug/compound/molecule names (e.g., "sorafenib", "imatinib", "aspirin"),
use This tool searches for biological entities that drugs act upon — proteins, protein complexes, nucleic acids, organisms, tissues, and cell lines. "Target" here means drug target, NOT "the thing I am looking up". Only the search string and Args: query (str): Search query string referring to a biological target. Examples: - Target name (e.g., "Thrombin", "EGFR", "Dopamine receptor") - Gene name (e.g., "BRCA1", "TP53") - UniProt accession (e.g., "P00734") - Organism name (e.g., "Homo sapiens") limit (int, optional): Maximum number of results to return. Defaults to 20. Returns: dict: Dictionary containing: - 'total_count' (int): Total number of matching targets found - 'results' (list): List of target dictionaries, each containing: - 'chembl_id' (str): ChEMBL target identifier (e.g., "CHEMBL1824") - 'name' (str): Preferred target name - 'organism' (str): Organism name (e.g., "Homo sapiens") - 'type' (str): Target type (e.g., "SINGLE PROTEIN", "PROTEIN COMPLEX") - 'score' (float): Relevance score for the search query Example: >>> results = await search_chembl_target("EGFR human", limit=5) >>> print(f"Found {results['total_count']} targets") >>> for target in results['results']: ... print(f"{target['chembl_id']}: {target['name']} ({target['organism']})") Target Types: - SINGLE PROTEIN: Individual protein target - PROTEIN COMPLEX: Multi-protein complex - PROTEIN FAMILY: Group of related proteins - NUCLEIC-ACID: DNA/RNA targets - TISSUE: Tissue-level target - CELL-LINE: Cell line target - ORGANISM: Whole organism target Raises: httpx.HTTPError: If the API request fails |
| search_chembl_molecule | Search for a DRUG / COMPOUND / MOLECULE by name or structure in ChEMBL. ✅ Use this tool for drug, compound, or molecule names
(e.g., "sorafenib", "imatinib", "aspirin", "Gleevec").
⚠️ For biological targets (proteins, receptors, enzymes, genes such as
EGFR, BRCA1, TP53), use Molecules in ChEMBL are small-molecule drugs, drug candidates, and bioactive compounds — including approved drugs, clinical candidates, and research compounds. Only the search string and Args: query (str): Search query string referring to a drug or compound. Examples: - Generic or brand drug name (e.g., "Aspirin", "Gleevec", "Paracetamol") - Research compound name - Synonyms or alternative names - SMILES notation (chemical structure string) - InChI or InChI Key limit (int, optional): Maximum number of results to return. Defaults to 20. Returns: dict: Dictionary containing: - 'total_count' (int): Total number of matching molecules found - 'results' (list): List of molecule dictionaries, each containing: - 'chembl_id' (str): ChEMBL molecule identifier (e.g., "CHEMBL25") - 'name' (str): Preferred molecule name (may be None for some compounds) - 'score' (float): Relevance score for the search query Example: >>> results = await search_chembl_molecule("aspirin", limit=5) >>> print(f"Found {results['total_count']} molecules") >>> for molecule in results['results']: ... print(f"{molecule['chembl_id']}: {molecule['name']} (score: {molecule['score']})") Use Cases: - Finding ChEMBL IDs for known drugs or compounds - Discovering molecules with similar names - Searching for bioactive compounds by structure (using SMILES/InChI) - Identifying research compounds and clinical candidates Note: - Some molecules may not have a preferred name and 'name' field will be None - Higher scores indicate better matches to the query - For structure-based searches, use SMILES or InChI notation Raises: httpx.HTTPError: If the API request fails |
| get_pubchem_compound_id | Get a PubChem compound ID Args: Compound name example: "resveratrol" Returns: PubChem Compound ID in the JSON format |
| get_compound_attributes_from_pubchem | Get compound attributes from PubChem RDF Args: PubChem Compound ID example: "445154" Returns: Compound attributes in the JSON format |
| search_pdb_entity | Search for PDBj entry information by keywords. Args:
db (str): The database to search in. Allowed values are:
- "pdb" (Protein Data Bank, protein structures)
- "cc" (Chemical Component Dictionary, chemical components or small molecules in PDB)
- "prd" (BIRD, Biologically Interesting Reference Molecule Dictionary, mostly peptides).
query (str): Query string, any keywords that can be used to search for PDB entries.
Accepts aliases: Note: The PDBj search hits multiple fields (title, authors, keywords, citation metadata), not just the title. An entry can appear even if its title does not contain the query. Always verify relevance against the returned name/title before relying on a hit. Returns: str: A JSON-formatted string containing the search results. |
| search_mesh_descriptor | Search for MeSH ID by query. Args:
query (str): The query string to search for. Accepts aliases:
Returns: str: A JSON-formatted string containing the search results. |
| search_reactome_entity | Search the Reactome knowledgebase using keyword search. Args:
query: The search query string (e.g., "apoptosis", "TP53", "cell cycle").
Accepts aliases: Returns: List of results with 'id', 'name', and 'type' fields. Example: [ {'id': 'R-HSA-109581', 'name': 'Apoptosis', 'type': 'Pathway'}, {'id': 'R-HSA-204981', 'name': '14-3-3epsilon...', 'type': 'Reaction'} ] Example: >>> results = search_reactome("apoptosis", rows=5) >>> for entry in results: ... print(f"{entry['type']:10} {entry['id']}: {entry['name']}") Raises: httpx.HTTPError: If the API request fails. |
| search_rhea_entity | Search Rhea database for biochemical reactions using keyword search. Args:
query (str): Search query string. Examples:
- "ATP" - find reactions involving ATP
- "glucose" - find reactions with glucose
- "uniprot:*" - reactions with UniProt annotations
- "" - retrieve all reactions
Accepts aliases: Returns: List[Dict[str, str]]: List of reactions, each containing: - 'rhea_id': Reaction identifier (e.g., "RHEA:10000") - 'equation': Reaction equation text Example: >>> results = search_rhea_entity("ATP", limit=5) >>> for reaction in results: ... print(f"{reaction['rhea_id']}: {reaction['equation']}") |
| togoid_getAllRelation | Discover all available ID conversion routes between databases. ⚡ PLANNING TOOL: Call this EARLY when a question involves 2+ databases that are on DIFFERENT SPARQL endpoints and you need to map IDs between them. Returns a map of all source→target database pairs that TogoID can convert. Use this to plan your cross-database strategy BEFORE attempting SPARQL joins or manual ID lookups. Common conversion routes include: - ncbigene ↔ uniprot (Gene IDs to/from protein accessions) - uniprot ↔ pdb (Protein accessions to/from 3D structure IDs) - ncbigene ↔ ensembl_gene (NCBI Gene to/from Ensembl gene IDs) - chembl_target ↔ uniprot (Drug targets to/from protein accessions) - ncbigene ↔ hgnc (Gene IDs to/from HGNC symbols) - pubchem_compound ↔ chembl_compound (Compound IDs across databases) When to use: - Question references 2+ databases on different SPARQL endpoints - You need to bridge identifiers (e.g., "find UniProt proteins for these NCBI Gene IDs") - Before writing complex multi-step SPARQL to join databases manually When NOT to use: - Both databases share a SPARQL endpoint (use a single SPARQL query) - You only need data from one database - NCBI esearch can already cross-reference what you need Returns: Dictionary mapping database pairs to their relationship metadata. Each entry shows source, target, and the nature of the link. |
| togoid_getRelation | Check if a specific ID conversion route exists and get its details. Use this to verify that a particular source→target conversion is available before calling convertId. Also reveals the nature of the relationship (e.g., "encoded by", "has structure", "is target of"). Args: source: Source database key (e.g., 'uniprot', 'ncbigene', 'chembl_target') target: Target database key (e.g., 'pdb', 'ensembl_gene', 'hgnc') Returns: List of relationship objects with: - forward: relationship label from source to target - reverse: relationship label from target to source - description: explanation of the link Example: >>> getRelation('ncbigene', 'uniprot') # Shows: ncbigene → uniprot via "encoded by" relationship |
| togoid_getAllDataset | List all databases registered in TogoID with their ID formats. Returns configuration for every dataset TogoID knows about, including:
Useful for: - Discovering which databases are available for ID conversion - Checking the expected ID format (e.g., UniProt accession vs entry name) - Finding example IDs to test with countId before bulk conversion Returns: Dictionary mapping dataset keys (e.g., 'uniprot', 'ncbigene', 'pdb') to their configuration objects. |
| togoid_getDataset | Get configuration for a specific database in TogoID. Retrieves detailed metadata about a single dataset, including its ID format, URI prefix, example IDs, and available annotations. Args: dataset: Dataset key (e.g., 'uniprot', 'ncbigene', 'pdb', 'chembl_target', 'ensembl_gene', 'hgnc', 'pubchem_compound') Returns: Dictionary with: - label: Human-readable name - regex: ID validation pattern (use to verify your IDs are correctly formatted) - prefix: URI prefixes for linking - examples: Sample IDs (use with countId to test before bulk conversion) - annotations: Available annotation types for this dataset |
| togoid_getDescription | Get human-readable descriptions for all databases in TogoID. Returns names, descriptions (in English and Japanese), and organization info for each registered database. Useful for understanding what each database contains when planning cross-database queries. Returns: Dictionary keyed by dataset name with description metadata. |
| togoid_convertId | Convert identifiers from one database to another. Maps IDs between biological databases, e.g., NCBI Gene IDs to UniProt accessions, or UniProt accessions to PDB structure IDs. IMPORTANT WORKFLOW: 1. First call getAllRelation() or getRelation() to verify the conversion route exists 2. Optionally call countId() to check how many IDs will convert 3. Then call convertId() with your IDs Args: ids: Source IDs. Accepts either a list of strings (e.g., ["672", "675", "7157"]) or a comma-separated string ("672,675,7157"). Examples: "672,675,7157" (NCBI Gene IDs), "P38398,P04637" (UniProt) route: Comma-separated pair of dataset keys: 'source,target'. Examples: - 'ncbigene,uniprot' (Gene → Protein) - 'uniprot,pdb' (Protein → 3D Structure) - 'ncbigene,ensembl_gene' (NCBI Gene → Ensembl Gene) - 'chembl_target,uniprot' (Drug Target → Protein) - 'uniprot,chembl_target' (Protein → Drug Target) - 'ncbigene,hgnc' (Gene → HGNC symbol) Multi-hop routes are also supported: - 'ncbigene,uniprot,pdb' (Gene → Protein → Structure) limit: Maximum number of results (default 10000) offset: Pagination offset for large result sets Returns: List of [source_id, target_id] pairs. Example: [["672", "P38398"], ["675", "O15129"], ...] Common use cases: - Bridging databases on different SPARQL endpoints - Mapping gene IDs to protein accessions for UniProt SPARQL queries - Finding PDB structures for a set of proteins - Identifying ChEMBL drug targets for a list of genes |
| togoid_countId | Check how many of your IDs can be converted before doing bulk conversion. A lightweight pre-check: tells you how many source IDs have mappings in the target database WITHOUT actually returning the mapped IDs. Use this to: - Verify your IDs are in the correct format - Estimate result size before a large convertId call - Check if a conversion route works for your specific IDs Args: source: Source database key (e.g., 'ncbigene', 'uniprot') target: Target database key (e.g., 'uniprot', 'pdb') ids: Source IDs to check. Accepts either a list of strings or a comma-separated string (e.g., ["672", "675"] or "672,675"). Returns: Dictionary with: - source count: number of input IDs recognized - target count: number of target IDs found Example: >>> countId('ncbigene', 'uniprot', '672,675,7157') # Returns: {"source": 3, "target": 5} # (3 genes map to 5 UniProt entries; some genes have multiple proteins) |
| ncbi_esearch | Search NCBI databases using E-utilities esearch API. ⚠️ CRITICAL FOR COMPREHENSIVE RESULTS ⚠️ ALWAYS use NCBI field tags for Gene, ClinVar, and similar databases! Without field tags, you may miss 70-80% of relevant results. MANDATORY FIELD TAGS FOR GENE DATABASE: • [Organism] - Taxonomic filtering (e.g., "Homo sapiens[Organism]", "Archaea[Organism]") • [Gene Name] - Gene symbols (e.g., "TP53[Gene Name]", "nifH[Gene Name]") • [All Fields] - Broad keyword search (e.g., "nitrogenase[All Fields]") IMPACT OF FIELD TAGS (Gene Database): • Without field tags: ~300 results (20-30% recall) ❌ • With field tags: ~1,300 results (100% recall) ✅ • Performance loss: Missing field tags = 70-80% data loss! Args:
database: NCBI database name (alias: Returns: Formatted search results with database-specific IDs Examples - GENE DATABASE (CRITICAL): ✅ CORRECT (finds ~1,300 archaeal nifH genes, 100% recall): database="gene" query="Archaea[Organism] AND (nifH[Gene Name] OR nitrogenase[All Fields])" Examples - OTHER DATABASES: MeSH: database="mesh", query="asthma[MeSH Terms]" PubMed: database="pubmed", query="CRISPR[Title/Abstract] AND gene editing" Taxonomy: database="taxonomy", query="Escherichia coli[Scientific Name]" ClinVar: database="clinvar", query="BRCA1[Gene Name] AND pathogenic[Clinical Significance]" PubChem: database="pccompound", query="aspirin[All Fields]" Learn more: https://www.ncbi.nlm.nih.gov/books/NBK3837/ |
| ncbi_list_databases | List all supported NCBI databases with descriptions and example queries. Returns: Formatted list of available databases |
| ncbi_esummary | Fetch summary information for given IDs using esummary. Useful for getting detailed info after esearch. Args:
database: NCBI database name (alias: Returns: Parsed JSON response with summary data |
| ncbi_efetch | Fetch full records using efetch. Returns actual data (sequences, records, etc.) Args:
database: NCBI database name (alias: Returns: Response text in requested format |
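Two of the tool descriptions above are strict about argument shape: ncbi_esearch expects NCBI field tags in the query, and togoid_convertId accepts IDs either as a list or as a comma-separated string. The sketch below shows one way a caller might prepare such arguments before invoking the tools. The helper names (`tagged`, `any_of`, `all_of`, `normalize_ids`) are hypothetical and are not part of TogoMCP; the code only builds strings and does not call the server.

```python
from typing import Iterable, Union


def tagged(term: str, field: str) -> str:
    """Attach an NCBI field tag, e.g. "TP53" + "Gene Name" -> "TP53[Gene Name]"."""
    return f"{term}[{field}]"


def any_of(*clauses: str) -> str:
    """Join clauses with OR and wrap them in parentheses."""
    return "(" + " OR ".join(clauses) + ")"


def all_of(*clauses: str) -> str:
    """Join clauses with AND."""
    return " AND ".join(clauses)


def normalize_ids(ids: Union[str, Iterable[str]]) -> str:
    """Turn a list or a comma-separated string of IDs into a clean
    comma-separated string, dropping blanks and surrounding whitespace."""
    parts = ids.split(",") if isinstance(ids, str) else list(ids)
    return ",".join(str(p).strip() for p in parts if str(p).strip())


# Reproduces the Gene database example query from the ncbi_esearch description.
gene_query = all_of(
    tagged("Archaea", "Organism"),
    any_of(tagged("nifH", "Gene Name"), tagged("nitrogenase", "All Fields")),
)

# IDs and route in the form the togoid_convertId description shows.
convert_args = {
    "ids": normalize_ids(["672", " 675", "7157 "]),
    "route": "ncbigene,uniprot",
}
```

Building the query from tagged clauses makes it harder to forget a field tag, which the ncbi_esearch description identifies as the main cause of missing results.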
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/dbcls/togomcp'
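The same request can be sketched in Python with the standard library. The function names `build_request` and `fetch_server_info` are hypothetical, and treating the response body as JSON is an assumption that the text above does not confirm.

```python
import json
import urllib.request

API_URL = "https://glama.ai/api/mcp/v1/servers/dbcls/togomcp"


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Build the GET request that the curl command above sends."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = API_URL, timeout: float = 10.0):
    """Send the request and decode the body.

    Assumes the endpoint returns JSON; adjust if it does not.
    """
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_server_info()` performs the network request; `build_request()` alone can be inspected without any network access.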