BioContextAI Knowledgebase MCP

Official

Server Configuration

Describes the environment variables required to run the server.

Name | Required | Description | Default
PORT | No | Port number on which to run the server when in PRODUCTION mode | 8000
MCP_ENVIRONMENT | No | Environment setting for the server; can be set to PRODUCTION or DEVELOPMENT | (none listed)
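
A minimal sketch of how these two variables could be read at startup. The function name `load_config` and the returned dictionary shape are hypothetical. Because no default is documented for MCP_ENVIRONMENT, the sketch returns None when it is unset rather than guessing one.

```python
import os
from typing import Mapping, Optional

VALID_ENVIRONMENTS = {"PRODUCTION", "DEVELOPMENT"}


def load_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Read PORT and MCP_ENVIRONMENT from a mapping (os.environ by default)."""
    env = os.environ if env is None else env
    # 8000 is the documented default for PORT.
    port = int(env.get("PORT", "8000"))
    # No default is documented for MCP_ENVIRONMENT, so None means "unset".
    environment = env.get("MCP_ENVIRONMENT")
    if environment is not None and environment not in VALID_ENVIRONMENTS:
        raise ValueError(
            f"MCP_ENVIRONMENT must be one of {sorted(VALID_ENVIRONMENTS)}"
        )
    return {"port": port, "environment": environment}


print(load_config({"PORT": "9000", "MCP_ENVIRONMENT": "PRODUCTION"}))
```

Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to test.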

Schema

Prompts

Interactive templates invoked by user choice

Name | Description

No prompts

Resources

Contextual data attached and managed by the client

Name | Description

No resources

Tools

Functions exposed to the LLM to take actions

Name | Description
bc_get_uniprot_id_by_protein_symbol

Query the UniProt database for the UniProt ID using the protein name.

Args: protein_symbol (str): The name of the protein to search for (e.g., "SYNPO"). species (str): The organism ID (e.g., "9606" for human). Default is "9606".

Returns: str: The UniProt ID of the protein.

Raises: ValueError: If no results are found for the given protein name.
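
As a rough illustration of how a client might invoke this tool, the sketch below builds a Model Context Protocol `tools/call` request as a plain dictionary. The helper name `make_tool_call` and the request id are hypothetical, and transport (stdio or HTTP) and session setup are left out.

```python
import json


def make_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = make_tool_call(
    "bc_get_uniprot_id_by_protein_symbol",
    {"protein_symbol": "SYNPO", "species": "9606"},
)
print(json.dumps(request, indent=2))
```

The same helper works for any tool in this list; only the tool name and the arguments dictionary change.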

bc_get_uniprot_protein_info

Query the UniProt database for protein information.

Provide either protein_id or protein_name to search for a specific protein. Always provide the species parameter to ensure the correct protein is returned.

Args: protein_id (str, optional): The protein identifier or accession number (e.g., "P04637"). Only provide if protein_name is None. protein_name (str, optional): The name of the protein to search for (e.g., "P53"). gene_symbol (str, optional): The gene name to search for (e.g., "TP53"). species (str, optional): Taxonomy ID (e.g., 10090) as string. include_references (bool, optional): Whether to include references and cross-references in the response. Defaults to False.

Returns: dict: Protein data or error message

bc_get_alphafold_info_by_protein_symbol

Query the AlphaFold database for the protein structure information using the protein name.

This function constructs a query URL to fetch data from the AlphaFold database based on the provided protein name. The response contains links to the PDB and CIF files for the protein structure, as well as general information about the protein.

Args: protein_symbol (str): The name of the protein to search for (e.g., "SYNPO"). species (str): The organism ID (e.g., "9606" for human). Default is "9606".

Returns: dict: Protein structure information or an error message.

bc_get_antibody_information

Get detailed information for a specific antibody by its ID.

This function retrieves comprehensive information about a single antibody from the Antibody Registry using its unique antibody ID (abId). The antibody ID is typically obtained from the results of the bc_get_antibody_list tool, where each antibody entry contains an 'abId' field that can be passed to this function.

Note: Some information provided by the Antibody Registry is for non-commercial use only. Users should refer to antibodyregistry.org for complete terms of use and licensing details.

Args: ab_id (str): The unique antibody ID from the Antibody Registry. This is typically obtained from the 'abId' field in the results of bc_get_antibody_list, unless the ID is directly provided by the user.

Returns: dict: Detailed antibody information including catalog number, vendor, clonality, epitope, applications, target species, isotype, source organism, citations, and other metadata, or error message if the request fails.

bc_get_antibody_list

Query the Antibody Registry for available antibodies.

This function searches the Antibody Registry database for antibodies matching the search term. Common search parameters include gene symbols (e.g., 'TRPC6'), protein names, UniProt IDs, or other relevant identifiers.

Note: Some information provided by the Antibody Registry is for non-commercial use only. Users should refer to antibodyregistry.org for complete terms of use and licensing details.

Args: search (str): Search term for antibodies. Can be a gene symbol, protein name, UniProt ID, or similar identifier.

Returns: dict: Antibody search results including catalog numbers, vendor information, clonality, applications, and other antibody metadata, or error message if the request fails.

bc_get_biorxiv_preprint_details

Get detailed information about a specific preprint by DOI.

This tool retrieves detailed metadata for a single preprint from bioRxiv or medRxiv using its DOI identifier.

Args: doi (str): DOI of the preprint (e.g., '10.1101/2020.09.09.20191205'). server (str): Server to search - 'biorxiv' or 'medrxiv' (default: 'biorxiv').

Returns: dict: Detailed preprint information or error message

bc_get_recent_biorxiv_preprints

Get recent preprints from bioRxiv or medRxiv.

This tool searches the bioRxiv and medRxiv preprint servers for research papers. You can search by date range, recent posts, or most recent papers. Results are paginated with up to 100 papers per API call.

Args: server (str): Server to search - 'biorxiv' or 'medrxiv' (default: 'biorxiv'). start_date (str, optional): Start date in YYYY-MM-DD format. end_date (str, optional): End date in YYYY-MM-DD format. days (int, optional): Number of recent days to search (1-365). recent_count (int, optional): Number of most recent preprints (1-1000). category (str, optional): Subject category filter (e.g., 'cell biology', 'neuroscience'). cursor (int): Starting position for pagination (default: 0). max_results (int): Maximum number of results to return (default: 100, max: 500).

Returns: dict: Preprint search results or error message
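
The interaction between `cursor`, the 100-papers-per-call page limit, and `max_results` can be sketched as below. How the server batches its upstream calls is not documented here, so `plan_pages` is a hypothetical helper that only shows the arithmetic; the lower bound of 1 for `max_results` is also an assumption.

```python
from typing import Iterator, Tuple


def plan_pages(
    max_results: int = 100, page_size: int = 100, cursor: int = 0
) -> Iterator[Tuple[int, int]]:
    """Yield (cursor, count) pairs that together cover max_results papers."""
    if not 1 <= max_results <= 500:  # 500 is the documented maximum
        raise ValueError("max_results must be between 1 and 500")
    remaining = max_results
    while remaining > 0:
        take = min(page_size, remaining)  # at most 100 papers per API call
        yield cursor, take
        cursor += take
        remaining -= take


print(list(plan_pages(max_results=250)))
```

With `max_results=250`, the plan is two full pages followed by a final page of 50.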

bc_get_recruiting_studies_by_location

Find recruiting clinical trials in a specific geographic location.

This function helps patients and healthcare providers find clinical trials that are currently recruiting participants in their area.

Args: location_country (str): Country name where studies are conducted. location_state (str, optional): State or province name. location_city (str, optional): City name. condition (str, optional): Medical condition to filter by. study_type (str, optional): Type of study filter (default: "ALL"). age_range (str, optional): Age group filter (default: "ALL"). page_size (int): Number of results to return (default: 50, max: 1000).

Returns: dict: Recruiting studies in the specified location or error message

bc_get_studies_by_condition

Search for clinical trials by medical condition with simplified parameters.

This function provides a focused search for clinical trials related to a specific medical condition, with common filters that biomedical researchers typically use.

Args: condition (str): Medical condition or disease name to search for. status (str, optional): Study status filter (default: "ALL"). study_type (str, optional): Type of study filter (default: "ALL"). location_country (str, optional): Country where studies are conducted. page_size (int): Number of results to return (default: 50, max: 1000). sort (str): Sort order for results (default: most recently updated).

Returns: dict: Study search results with summary statistics or error message

bc_get_studies_by_intervention

Search for clinical trials by drug or intervention name.

This function helps biomedical researchers find clinical trials testing specific drugs, therapies, or treatments, with optional filters for condition and phase.

Args: intervention (str): Drug, therapy, or treatment name to search for. condition (str, optional): Medical condition to filter by. phase (str, optional): Clinical trial phase to filter by. status (str, optional): Study status filter (default: "ALL"). intervention_type (str, optional): Type of intervention filter (default: "ALL"). page_size (int): Number of results to return (default: 50, max: 1000). sort (str): Sort order for results (default: most recently updated).

Returns: dict: Study search results with summary statistics or error message

bc_get_study_details

Get detailed information about a specific clinical trial by its NCT ID.

This function retrieves comprehensive data about a single clinical trial, including study design, eligibility criteria, outcomes, locations, and contact information.

Args: nct_id (str): NCT ID of the clinical trial (e.g., "NCT01234567"). fields (str): Comma-separated list of fields to return, or "all" for complete data. Default includes key modules for biomedical researchers.

Returns: dict: Detailed study information or error message

bc_search_studies

Search for clinical trial studies based on various criteria.

This function allows biomedical researchers to find relevant clinical trials by searching across conditions, interventions, sponsors, and other study characteristics.

Args: condition (str, optional): Medical condition or disease to search for. intervention (str, optional): Drug, therapy, or treatment name to search for. sponsor (str, optional): Study sponsor organization. status (str, optional): Current status of the study. phase (str, optional): Clinical trial phase. study_type (str, optional): Type of study (interventional, observational, etc.). location_country (str, optional): Country where study is conducted. min_age (int, optional): Minimum age of participants in years. max_age (int, optional): Maximum age of participants in years. sex (str, optional): Sex of participants. page_size (int): Number of results to return (default: 25, max: 1000). sort (str): Sort order for results (default: most recently updated).

Returns: dict: Study search results or error message

bc_get_ensembl_id_from_gene_symbol

Query the Ensembl database for the Ensembl ID of a given gene name.

Always provide the species parameter to ensure the correct gene is returned.

Args: gene_symbol (str): The name of the gene to search for (e.g., "TP53"). species (str): Taxonomy ID (e.g., 10090) as string (default: "9606").

Returns: dict: Gene data or error message

bc_get_europepmc_articles

Query the Europe PMC database for scientific articles.

Use 'recent' sort for current research queries and 'cited' sort for comprehensive career overviews or well-established topics (e.g., "what has author X published on in their career").

Provide at least one of the following search parameters:

  • query: General search query string
  • title: Search term for article titles
  • abstract: Search term for article abstracts
  • author: Author name (e.g., "last_name,first_name"). Should not contain spaces.

These will be combined with the specified search type ("and" or "or"). For a broad search, prefer the "query" parameter and "or" search type. Only use the "and" search type if you want to ensure all terms must match.

Args: query (str, optional): General search query string. title (str, optional): Search term for article titles. abstract (str, optional): Search term for article abstracts. author (str, optional): Author name (e.g., "last_name,first_name"). Should not contain spaces. search_type (str): Search type - "and" or "or" (default: "or"). sort_by (str): Sort by - "recent" for most recent, "cited" for most cited or None for no specific sorting (default: None). page_size (int): Number of results to return (default: 25, max: 1000).

Returns: dict: Article search results or error message
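
The argument rules above (at least one search field, no spaces in the author name, a fixed set of values for `search_type` and `sort_by`) can be checked before calling the tool. This is a client-side sketch only; `validate_europepmc_args` is a hypothetical helper, and the lower bound of 1 for `page_size` is an assumption.

```python
from typing import Optional


def validate_europepmc_args(
    query: Optional[str] = None,
    title: Optional[str] = None,
    abstract: Optional[str] = None,
    author: Optional[str] = None,
    search_type: str = "or",
    sort_by: Optional[str] = None,
    page_size: int = 25,
) -> dict:
    """Return a cleaned arguments dict or raise ValueError."""
    terms = {"query": query, "title": title, "abstract": abstract, "author": author}
    provided = {key: value for key, value in terms.items() if value}
    if not provided:
        raise ValueError("Provide at least one of query, title, abstract, author")
    if author and " " in author:
        raise ValueError('author must not contain spaces, e.g. "last_name,first_name"')
    if search_type not in ("and", "or"):
        raise ValueError('search_type must be "and" or "or"')
    if sort_by not in (None, "recent", "cited"):
        raise ValueError('sort_by must be "recent", "cited", or None')
    if not 1 <= page_size <= 1000:  # 1000 is the documented maximum
        raise ValueError("page_size must be between 1 and 1000")
    return {**provided, "search_type": search_type, "sort_by": sort_by, "page_size": page_size}


print(validate_europepmc_args(author="doe,jane", sort_by="cited"))
```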

bc_get_europepmc_fulltext

Get the full text XML for a given PMC ID from Europe PMC.

Args: pmc_id (str): PMC ID starting with "PMC" (e.g., "PMC11629965").

Returns: dict: Full text XML content or error message

bc_search_grants_gov

Search for grants from grants.gov using the Search2 API.

Args:

  • keyword: Keyword to search for
  • opp_num: Opportunity number
  • eligibilities: Eligibility criteria (comma-separated)
  • agencies: Agency codes (comma-separated)
  • rows: Number of results to return
  • opp_statuses: Opportunity statuses (pipe-separated, e.g. 'forecasted|posted')
  • aln: Assistance Listing Number
  • funding_categories: Funding categories (comma-separated)

Returns: dict: Search results from grants.gov or error message

bc_get_interpro_entry

Get detailed information about a specific InterPro entry.

InterPro entries represent protein families, domains, and functional sites. Each entry integrates information from multiple member databases.

Args: interpro_id (str): The InterPro entry identifier (e.g., "IPR000001"). include_interactions (bool, optional): Whether to include protein-protein interactions data. Defaults to False. include_pathways (bool, optional): Whether to include pathway information. Defaults to False. include_cross_references (bool, optional): Whether to include cross-references to other databases. Defaults to False.

Returns: dict: InterPro entry data including description, type, member databases, and optional additional data

bc_get_protein_domains

Get domain architecture and InterPro matches for a specific protein.

This function retrieves all InterPro domain matches for a given protein, providing insight into the protein's functional domains and architecture.

To get the protein's UniProt ID, use the bc_get_uniprot_id_by_protein_symbol tool first.

Args: protein_id (str): The protein identifier or accession (e.g., "P04637" or "CYC_HUMAN"). source_db (str, optional): The protein database source. Defaults to "uniprot". include_structure_info (bool, optional): Whether to include structural information. Defaults to False. species_filter (str, optional): Taxonomy ID to filter results (e.g., "9606" for human). Defaults to None.

Returns: dict: Protein domain information including InterPro matches, domain architecture, and optional structural data

bc_search_interpro_entries

Search InterPro entries by various criteria.

This function allows searching the InterPro database using different filters such as entry type, source database, GO terms, and species.

Args: query (str, optional): Search term for InterPro entry names or descriptions. entry_type (str, optional): Filter by entry type (family, domain, etc.). source_database (str, optional): Filter by member database (pfam, prosite, etc.). go_term (str, optional): Filter by GO term (e.g., "GO:0006122"). species_filter (str, optional): Filter by taxonomy ID (e.g., "9606" for human). page_size (int, optional): Number of results to return (max 200). Defaults to 20.

Returns: dict: Search results with InterPro entries matching the criteria

bc_get_available_ontologies

Query the Ontology Lookup Service (OLS) for all available ontologies.

This function retrieves a list of all ontologies available in OLS, including their names, descriptions, and metadata. Use this function first to discover which ontologies are available before using other search functions.

Returns: dict: Dictionary containing available ontologies and their information or error message

bc_get_cell_ontology_terms

Query the Ontology Lookup Service (OLS) for Cell Ontology (CL) terms.

This function searches for Cell Ontology terms associated with cell types using the OLS API. The Cell Ontology provides a controlled vocabulary for cell types.

Args: cell_type (str): The cell type to search for (e.g., "T cell"). size (int): Maximum number of results to return (default: 10). exact_match (bool): Whether to perform an exact match search (default: False).

Returns: dict: Dictionary containing Cell Ontology terms and information or error message

bc_get_chebi_terms_by_chemical

Query the Ontology Lookup Service (OLS) for ChEBI terms related to a chemical name.

This function searches for ChEBI (Chemical Entities of Biological Interest) terms associated with a given chemical name using the OLS API.

Args: chemical_name (str): The chemical or drug name to search for (e.g., "aspirin"). size (int): Maximum number of results to return (default: 10). exact_match (bool): Whether to perform an exact match search (default: False).

Returns: dict: Dictionary containing ChEBI terms and information or error message

bc_get_efo_id_by_disease_name

Query the Ontology Lookup Service (OLS) for EFO/Mondo/HP IDs related to a disease name.

This function searches for EFO IDs associated with a given disease name using the OLS API. Always use this function if you need EFO IDs, e.g., for use in the Open Targets API.

Args: disease_name (str): The name of the disease to search for (e.g., "SIDS"). size (int): Maximum number of results to return (default: 5). exact_match (bool): Whether to perform an exact match search (default: False).

Returns: dict: Dictionary containing EFO IDs and information or error message

bc_get_go_terms_by_gene

Query the Ontology Lookup Service (OLS) for Gene Ontology (GO) terms related to a gene name.

This function searches for GO terms associated with a given gene name using the OLS API. Gene Ontology provides structured vocabularies for gene and gene product attributes.

Args: gene_name (str): The gene name or symbol to search for (e.g., "TP53"). size (int): Maximum number of results to return (default: 10). exact_match (bool): Whether to perform an exact match search (default: False).

Returns: dict: Dictionary containing GO terms and information or error message

bc_get_term_details

Query the Ontology Lookup Service (OLS) for detailed information about a specific term.

This function retrieves comprehensive information about a specific ontology term, including its definition, synonyms, hierarchical relationships, and cross-references.

Args: term_id (str): The term ID in CURIE format (e.g., "EFO:0000001"). ontology_id (str): The ontology ID (e.g., "efo").

Returns: dict: Dictionary containing detailed term information or error message
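
A CURIE such as "EFO:0000001" consists of a prefix and a local identifier separated by a colon. The sketch below splits one and derives a candidate `ontology_id` by lowercasing the prefix. That derivation is an assumption that matches the example above ("EFO:0000001" with "efo") but may not hold for terms hosted in a different ontology. Both helper names are hypothetical.

```python
from typing import Tuple


def split_curie(term_id: str) -> Tuple[str, str]:
    """Split "PREFIX:LOCAL" into (prefix, local_id)."""
    prefix, sep, local_id = term_id.partition(":")
    if not sep or not prefix or not local_id:
        raise ValueError(f"Not a CURIE: {term_id!r}")
    return prefix, local_id


def term_details_arguments(term_id: str) -> dict:
    """Build arguments for bc_get_term_details from a CURIE."""
    prefix, _ = split_curie(term_id)
    # Assumption: the ontology ID is the lowercased CURIE prefix.
    return {"term_id": term_id, "ontology_id": prefix.lower()}


print(term_details_arguments("EFO:0000001"))
```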

bc_get_term_hierarchical_children

Query the Ontology Lookup Service (OLS) for hierarchical children of a term.

This function retrieves the hierarchical children of a specific ontology term, including subclasses and terms related via hierarchical properties like 'part of'.

Args: term_id (str): The term ID in CURIE format (e.g., "EFO:0000001"). ontology_id (str): The ontology ID (e.g., "efo"). size (int): Maximum number of children to return (default: 20).

Returns: dict: Dictionary containing hierarchical children or error message

bc_search_ontology_terms

Query the Ontology Lookup Service (OLS) for terms across multiple ontologies.

This function provides a general search across ontologies in OLS, allowing you to find terms from multiple ontologies or search all ontologies at once.

TIP: Use bc_get_available_ontologies first to discover which ontologies are available and their IDs before searching.

Args: search_term (str): The term to search for. ontologies (str): Comma-separated ontology IDs (e.g., "efo,go,chebi"). Leave empty to search all ontologies. Use bc_get_available_ontologies to see available options. size (int): Maximum number of results to return (default: 20). exact_match (bool): Whether to perform an exact match search (default: False).

Returns: dict: Dictionary containing terms from various ontologies or error message

bc_get_available_pharmacologic_classes

Get available pharmacologic classes from the FDA database.

This function retrieves the actual pharmacologic class values available in the FDA database, which can then be used with bc_search_drugs_by_therapeutic_class. Always call this function first to see available options before searching.

Args: class_type (str): Type of classification - epc, moa, pe, or cs. limit (int): Maximum number of unique classes to return.

Returns: dict: Available pharmacologic class values in the FDA database.

bc_search_drugs_by_therapeutic_class

Search for drugs by their therapeutic or pharmacologic class.

IMPORTANT: Use bc_get_available_pharmacologic_classes first to see the exact class terms available in the FDA database. This function requires exact matches of the pharmacologic class terms as they appear in the FDA data.

Args: therapeutic_class (str): The exact therapeutic class term from FDA database. class_type (str): Type of classification - epc, moa, pe, or cs. limit (int): Maximum number of results to return.

Returns: dict: Search results for drugs in the specified therapeutic class.

bc_get_generic_equivalents

Find generic equivalents for a brand name drug.

This function searches for ANDA (Abbreviated New Drug Application) entries that are generic equivalents of a specified brand name drug.

Args: brand_name (str): The brand name drug to find generics for.

Returns: dict: Generic drug equivalents and their manufacturers.

bc_count_drugs_by_field

Count unique values in a specific field across FDA-approved drugs.

This function is useful for statistical analysis and getting overviews of the drug database. Common fields to count include:

  • sponsor_name: Count drugs by pharmaceutical company
  • products.dosage_form: Count by dosage forms (tablet, injection, etc.)
  • products.route: Count by administration routes (oral, injection, etc.)
  • products.marketing_status: Count by marketing status
  • openfda.pharm_class_epc: Count by pharmacologic class

Args: field (str): The field to count unique values for. search_filter (str, optional): Search filter to apply before counting. limit (int): Maximum number of count results to return.

Returns: dict: Count results showing terms and their frequencies.

bc_get_drug_statistics

Get general statistics about the FDA Drugs@FDA database.

This function provides an overview of the database including:

  • Top pharmaceutical sponsors by number of approved drugs
  • Most common dosage forms
  • Most common routes of administration
  • Distribution of marketing statuses

Returns: dict: Statistical overview of the FDA drugs database.

bc_get_drug_by_application_number

Get detailed information about a specific FDA-approved drug by its application number.

Application numbers follow the format: NDA, ANDA, or BLA followed by 6 digits.

  • NDA: New Drug Application (brand name drugs)
  • ANDA: Abbreviated New Drug Application (generic drugs)
  • BLA: Biologics License Application (biological products)

Args: application_number (str): The FDA application number.

Returns: dict: Detailed drug information from the FDA Drugs@FDA API.
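
The stated format (NDA, ANDA, or BLA followed by 6 digits) translates directly into a regular expression. The sketch below is a client-side check with a hypothetical helper name. Trimming whitespace and uppercasing the input before matching is a leniency chosen for this example, not documented tool behavior.

```python
import re

# NDA, ANDA, or BLA followed by exactly six digits.
APPLICATION_NUMBER = re.compile(r"(NDA|ANDA|BLA)\d{6}")


def is_valid_application_number(value: str) -> bool:
    """Return True if value looks like an FDA application number."""
    return APPLICATION_NUMBER.fullmatch(value.strip().upper()) is not None


for candidate in ("NDA021436", "anda123456", "BLA12345", "XYZ123456"):
    print(candidate, is_valid_application_number(candidate))
```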

bc_get_drug_label_info

Get drug labeling information including active ingredients, dosage, and usage instructions.

This function retrieves comprehensive drug label information from the FDA's drug labeling database, which includes detailed product information, active ingredients, dosage forms, and administration routes.

Args: brand_name (str, optional): Brand name of the drug. generic_name (str, optional): Generic name of the drug. ndc (str, optional): National Drug Code number.

Returns: dict: Drug labeling information from the FDA API.

bc_search_drugs_fda

Search the FDA Drugs@FDA database for approved drug products.

This function searches for FDA-approved drugs based on various criteria including brand names, generic names, active ingredients, sponsors, and regulatory information.

Args: brand_name (str, optional): Brand or trade name of the drug. generic_name (str, optional): Generic name of the drug. active_ingredient (str, optional): Active ingredient name. sponsor_name (str, optional): Company or sponsor name. application_number (str, optional): FDA application number (NDA, ANDA, or BLA). marketing_status (str, optional): Marketing status of the drug. dosage_form (str, optional): Dosage form of the drug. route (str, optional): Route of administration. search_type (str): How to combine search terms - "and" or "or". sort_by (str, optional): Field to sort results by. limit (int): Maximum number of results to return (1-1000). skip (int): Number of results to skip for pagination (0-25000).

Returns: dict: Search results from the FDA Drugs@FDA API.

bc_get_open_targets_graphql_schema

Fetch the Open Targets GraphQL schema.

bc_get_open_targets_query_examples

Get example GraphQL queries for the Open Targets API.

Returns a dictionary of named example queries that can be used with the bc_query_open_targets_graphql tool. These examples demonstrate common use cases for retrieving data about targets, diseases, drugs, and their associations.

bc_query_open_targets_graphql

Execute a GraphQL query against the Open Targets API after fetching the schema.

Important: Always fetch example queries first using bc_get_open_targets_query_examples. If the examples are not sufficient, also fetch the schema using the bc_get_open_targets_graphql_schema tool before executing a query. Either option provides the context needed to write a valid query.

Queries should use the Ensembl gene ID (e.g., "ENSG00000141510"). If necessary, first use bc_get_ensembl_id_from_gene_symbol to convert gene symbols (e.g., "TP53") to Ensembl IDs.

If a disease ID is needed, use the bc_get_efo_id_by_disease_name tool to get the EFO ID (e.g., "EFO_0004705") for a disease name (e.g., "Hypothyroidism").

Make sure to always start the query string with the keyword query followed by the query name. The query string should be a valid GraphQL query, and the variables should be a dictionary of parameters that the query requires.

Open Targets provides data on:

  • target: annotations, tractability, mouse models, expression, disease/phenotype associations, available drugs.
  • disease: annotations, ontology, drugs, symptoms, target associations.
  • drug: annotations, mechanisms, indications, pharmacovigilance.
  • variant: annotations, frequencies, effects, consequences, credible sets.
  • studies: annotations, traits, publications, cohorts, credible sets.
  • credibleSet: annotations, variant sets, gene assignments, colocalization.
  • search: index of all platform entities.

Args: query_string (str): The GraphQL query string. variables (dict): The variables for the GraphQL query.

Returns: dict: The response data from the GraphQL API.
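
The sketch below shows one way to assemble the two arguments and check that the query string starts with the keyword `query` followed by a query name. The helper name is hypothetical. The example query (a `target` lookup returning `id` and `approvedSymbol`) is written from memory of the public Open Targets schema and should be verified against the schema or example tools before use.

```python
import re

# "query" keyword followed by a GraphQL name.
QUERY_HEADER = re.compile(r"^\s*query\s+[_A-Za-z][_0-9A-Za-z]*")


def build_open_targets_arguments(query_string: str, variables: dict) -> dict:
    """Check the query header and return the tool arguments."""
    if not QUERY_HEADER.match(query_string):
        raise ValueError("query_string must start with 'query <QueryName>'")
    return {"query_string": query_string, "variables": variables}


# Illustrative query; verify field names against the fetched schema.
QUERY = """
query targetSymbol($ensemblId: String!) {
  target(ensemblId: $ensemblId) {
    id
    approvedSymbol
  }
}
"""

arguments = build_open_targets_arguments(QUERY, {"ensemblId": "ENSG00000141510"})
print(arguments["variables"])
```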

bc_get_panglaodb_marker_genes

Retrieves marker genes from the PanglaoDB dataset based on specified filters.

Args: species: The species ('Hs' for Human or 'Mm' for Mouse). min_sensitivity: Minimum sensitivity score (0-1). min_specificity: Minimum specificity score (0-1). organ: Filter by organ name (case-insensitive). cell_type: Filter by cell type name (case-insensitive). gene_symbol: Filter by gene symbol (case-insensitive).

Returns: A dictionary containing a list of matching marker gene records or an error message.

bc_get_panglaodb_options

Retrieves the available options for filtering marker genes in the PanglaoDB dataset.

Returns: A dictionary containing lists of unique values for species, organ, cell type, and gene symbols.

bc_get_pride_project

Get detailed information about a specific PRIDE project.

PRIDE (PRoteomics IDEntifications) is a public repository for mass spectrometry proteomics data. This function retrieves comprehensive information about a specific project including metadata, experimental details, and optionally associated files and similar projects.

Args: project_accession (str): The PRIDE project accession (e.g., "PRD000001"). include_files (bool, optional): Whether to include file information. Defaults to False. include_similar_projects (bool, optional): Whether to include similar projects. Defaults to False.

Returns: dict: Project information including metadata, experimental details, and optional file/similar project data

bc_search_pride_projects

Search PRIDE Archive projects by various criteria.

This function searches the PRIDE database for mass spectrometry proteomics projects using keywords and filters. Useful for finding relevant datasets for comparative analysis or method validation.

Args: keyword (str, optional): Search keywords for project titles/descriptions. organism_filter (str, optional): Filter by organism name. instrument_filter (str, optional): Filter by mass spectrometer instrument. experiment_type_filter (str, optional): Filter by experimental approach. page_size (int, optional): Number of results (max 100). Defaults to 20. sort_field (str, optional): Sort field. Defaults to "submissionDate". sort_direction (str, optional): Sort direction. Defaults to "DESC".

Returns: dict: Search results with matching PRIDE projects and metadata

bc_search_pride_proteins

Search proteins identified in a specific PRIDE project.

This function searches for proteins identified in a specific PRIDE mass spectrometry project. Useful for finding specific proteins of interest in proteomics datasets.

Args: project_accession (str): The PRIDE project accession to search in. keyword (str, optional): Search keyword for protein names or accessions. page_size (int, optional): Number of results (max 100). Defaults to 20. sort_field (str, optional): Sort field. Defaults to "accession". sort_direction (str, optional): Sort direction. Defaults to "ASC".

Returns: dict: Search results with proteins found in the specified project

bc_get_human_protein_atlas_info

Query the Human Protein Atlas API for target general information, genetic constraint, and tractability.

bc_get_reactome_info_by_identifier

Query the Reactome API identifier endpoint.

Use this endpoint to retrieve pathways associated with a given identifier. Always provide the species parameter to ensure the correct protein is returned.

Args:

  • identifier (str): The identifier of the element to be retrieved.
  • base_url (str): Base URL for the Reactome API.
  • interactors (bool): Include interactors.
  • species (str or list): List of species to filter the result (accepts taxonomy IDs, species names, and dbId).
  • page_size (int): Pathways per page.
  • page (int): Page number.
  • sort_by (str): Field to sort results by (e.g., "ENTITIES_PVALUE", "ENTITIES_FDR").
  • order (str): Sort order ("ASC" or "DESC").
  • resource (str): Resource to filter by (TOTAL includes all molecule types).
  • p_value (float): P-value threshold (only pathways with p-value <= threshold will be returned).
  • include_disease (bool): Set to False to exclude disease pathways.
  • min_entities (int): Minimum number of contained entities per pathway.
  • max_entities (int): Maximum number of contained entities per pathway.
  • importable_only (bool): Filter to only include importable resources.
  • timeout (int): Request timeout in seconds.

Returns: dict: API response data or error information

bc_get_string_id

Map a protein identifier to STRING database IDs.

This function helps resolve common gene names, synonyms, or UniProt identifiers to the STRING-specific identifiers. Using STRING IDs in subsequent API calls improves reliability and performance.

Args: protein_symbol (str): The name of the protein to search for (e.g., "TP53"). species (str): The species taxonomy ID (e.g., "9606" for human). Optional. return_field (str): The field to return. Either stringId or preferredName (default: stringId). limit (int): Limit the number of matches returned per query (default: 1).

Returns: str: The STRING ID or preferred name if found, otherwise an error message.

bc_get_string_interactions

Get all protein-protein interactions for a given protein with a combined score above the threshold.

Always provide the species parameter to ensure the correct protein is returned.

Args: protein_symbol (str): The name of the protein to search for (e.g., "TP53"). species (str): The species taxonomy ID (e.g., "10090" for mouse). min_score (int): Minimum combined score threshold (default: 700).

Returns: list: A list of dictionaries containing interacting proteins and their scores.

bc_get_string_network_image

Get a network image for a given protein from the STRING database.

Always provide the species parameter to ensure the correct protein is returned.

Args: protein_symbol (str): The name of the protein to search for (e.g., "TP53"). species (str): The species taxonomy ID (e.g., "10090" for mouse). flavor (str): The network flavor to use (default: "confidence"). min_score (int): Minimum combined score threshold (default: 700).

Returns: Image: The network image for the protein.

bc_get_string_similarity_scores

Get similarity scores between proteins from the STRING database.

The scores represent protein homology based on Smith-Waterman bit scores. Only scores above 50 are reported, and only half of the similarity matrix (since it's symmetric) plus self-hits are returned.

Args: protein_symbol (str): The protein symbol of the first protein (e.g., "TP53"). protein_symbol_comparison (str): The protein symbol of the second protein (e.g., "MKI67"). species (str): The species taxonomy ID (e.g., "9606" for human). Optional.

Returns: list: A list of dictionaries containing protein pairs and their bit scores.
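
Because only half of the symmetric matrix is returned, a pair may appear as (A, B) but not (B, A). One way to handle this is an order-independent lookup, sketched below. The field names `stringId_A`, `stringId_B`, and `bitscore` are assumptions about the returned dictionaries, and the identifiers are placeholders.

```python
def build_similarity_lookup(rows):
    """Index bit scores by an unordered pair of identifiers."""
    lookup = {}
    for row in rows:
        # frozenset ignores order; a self-hit collapses to a single-element set.
        key = frozenset((row["stringId_A"], row["stringId_B"]))
        lookup[key] = row["bitscore"]
    return lookup


def bitscore(lookup, protein_a, protein_b):
    """Return the bit score for a pair, or None if it was not reported."""
    return lookup.get(frozenset((protein_a, protein_b)))


# Placeholder rows, for illustration only.
rows = [
    {"stringId_A": "protA", "stringId_B": "protA", "bitscore": 900.0},
    {"stringId_A": "protA", "stringId_B": "protB", "bitscore": 75.5},
]
scores = build_similarity_lookup(rows)
```

A `None` result means the pair was not reported, which per the description above includes pairs scoring 50 or below.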

bc_get_kegg_id_by_gene_symbol

Get KEGG ID by gene symbol.

This function converts a gene symbol (like TP53) to a KEGG gene ID (like hsa:7157) for use in the KEGG API. The KEGG API typically requires KEGG IDs rather than gene symbols for most operations.

This is often the first step in a workflow - get the KEGG ID, then use it in subsequent API calls.

Common organism codes:

  • Human: 9606 (KEGG code: hsa)
  • Mouse: 10090 (KEGG code: mmu)
  • Rat: 10116 (KEGG code: rno)
  • E. coli: 562 (KEGG code: eco)
  • Yeast: 4932 (KEGG code: sce)
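
The organism codes listed above can be captured in a small lookup table. This is only a sketch: the dictionary covers just the five organisms listed, and the helper names are hypothetical rather than part of the server.

```python
# Taxonomy ID -> KEGG organism code, taken from the list above.
TAXONOMY_TO_KEGG = {
    "9606": "hsa",   # Human
    "10090": "mmu",  # Mouse
    "10116": "rno",  # Rat
    "562": "eco",    # E. coli
    "4932": "sce",   # Yeast
}


def kegg_organism_code(taxonomy_id):
    """Translate a taxonomy ID into a KEGG organism code."""
    try:
        return TAXONOMY_TO_KEGG[str(taxonomy_id)]
    except KeyError:
        raise ValueError(f"No KEGG code known for taxonomy ID {taxonomy_id}") from None


def format_kegg_id(taxonomy_id, entry):
    """Combine an organism code and a gene entry into a KEGG ID such as 'hsa:7157'."""
    return f"{kegg_organism_code(taxonomy_id)}:{entry}"
```

For example, `format_kegg_id("9606", "7157")` produces `hsa:7157`, the KEGG ID form used in the rest of this section.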

Args: gene_symbol (str): The gene symbol to search for (e.g., "TP53" for human, "Trp53" for mouse). organism_code (str): The organism code as taxonomy ID (e.g., "9606" for human, "10090" for mouse).

Returns: str | dict: The KEGG ID (e.g., "hsa:7157") or an error message.

Examples:

>>> get_kegg_id_by_gene_symbol(gene_symbol="TP53", organism_code="9606")
"hsa:7157"

>>> get_kegg_id_by_gene_symbol(gene_symbol="Trp53", organism_code="10090")
"mmu:22059"

bc_query_kegg

Execute a KEGG API query.

This function provides access to the KEGG API, allowing you to query biological data across pathways, genes, compounds, diseases, and more. The function can perform all KEGG API operations and accepts various parameters depending on the operation.

When searching for genes in KEGG, you typically need KEGG IDs rather than gene symbols. Use the get_kegg_id_by_gene_symbol function first to convert gene symbols to KEGG IDs.

Common operations:

  • info: Get database metadata (e.g., operation=info, database=PATHWAY)
  • list: List entries in a database (e.g., operation=list, database=PATHWAY, query="hsa")
  • get: Retrieve specific entries (e.g., operation=get, entries=["hsa:7157"])
  • find: Search for entries by keyword (e.g., operation=find, database=COMPOUND, query="glucose")
  • link: Find related entries (e.g., operation=link, target_db=PATHWAY, entries=["hsa:7157"])
  • conv: Convert between identifiers (e.g., operation=conv, target_db=NCBI_GENEID, entries=["hsa:7157"])
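
The operations above correspond to path segments in KEGG's REST interface. The sketch below shows how such URLs might be assembled. It assumes the public base URL `https://rest.kegg.jp`; the `build_kegg_url` helper is hypothetical and is not the server's own code. No request is sent.

```python
# Assumed base URL of the public KEGG REST API.
KEGG_REST_BASE = "https://rest.kegg.jp"


def build_kegg_url(operation, *parts, entries=None, option=None):
    """Join an operation, database/query parts, entries, and option into a URL.

    Multiple entries are joined with '+'. Values are not URL-encoded here,
    so callers should pass plain identifiers and keywords.
    """
    segments = [operation, *parts]
    if entries:
        segments.append("+".join(entries))
    if option:
        segments.append(option)
    return "/".join([KEGG_REST_BASE, *segments])


info_url = build_kegg_url("info", "pathway")
list_url = build_kegg_url("list", "pathway", "hsa")
get_url = build_kegg_url("get", entries=["hsa:7157"], option="aaseq")
find_url = build_kegg_url("find", "compound", "glucose")
link_url = build_kegg_url("link", "pathway", entries=["hsa:7157"])
conv_url = build_kegg_url("conv", "ncbi-geneid", entries=["hsa:7157"])
```

Each resulting URL can be fetched with any HTTP client; KEGG returns plain text for most operations.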

Args: operation (KeggOperation): The KEGG operation to perform. database (KeggDatabase | KeggOutsideDb | str, optional): The database to query. target_db (KeggDatabase | KeggOutsideDb | str, optional): The target database for conversion. source_db (KeggDatabase | KeggOutsideDb | str, optional): The source database for conversion. query (str, optional): The query string for FIND or LIST operations. option (KeggOption | KeggFindOption | KeggRdfFormat, optional): Additional options for the operation. entries (List[str], optional): List of entries for GET or LINK operations.

Returns: str | dict: The result of the KEGG query or an error message.

Examples:

# List human pathways
>>> query_kegg(operation=KeggOperation.LIST, database=KeggDatabase.PATHWAY, query="hsa")

# Get data for the glycolysis pathway
>>> query_kegg(operation=KeggOperation.GET, entries=["hsa00010"])

# Get data for the TP53 gene
>>> query_kegg(operation=KeggOperation.GET, entries=["hsa:7157"])

# Get amino acid sequence for TP53
>>> query_kegg(operation=KeggOperation.GET, entries=["hsa:7157"], option=KeggOption.AASEQ)

# Find compounds with formula C7H10O5
>>> query_kegg(operation=KeggOperation.FIND, database=KeggDatabase.COMPOUND, query="C7H10O5", option="formula")

# Find pathways related to TP53
>>> query_kegg(operation=KeggOperation.LINK, target_db=KeggDatabase.PATHWAY, entries=["hsa:7157"])

# Convert KEGG ID to NCBI Gene ID
>>> query_kegg(operation=KeggOperation.CONV, target_db="ncbi-geneid", source_db="hsa:7157")

# Get information about a specific pathway
>>> query_kegg(operation=KeggOperation.GET, entries=["hsa00010"])

# Get the compound ID for caffeine
>>> query_kegg(operation=KeggOperation.FIND, database=KeggDatabase.COMPOUND, query="caffeine")

# Get the drug ID for acetaminophen
>>> query_kegg(operation=KeggOperation.FIND, database=KeggDatabase.DRUG, query="acetaminophen")

# Check if two drugs interact (ibuprofen and aspirin)
>>> query_kegg(operation=KeggOperation.DDI, entries=["dr:D00126", "dr:D00109"])

bc_search_google_scholar_publications

Search for publications on Google Scholar.

Supports advanced search operators including author search using 'author:"Name"' syntax.

Examples:

  • 'machine learning' - General topic search
  • 'author:"John Smith"' - Publications by specific author
  • 'author:"John Smith" neural networks' - Author's work on specific topic

WARNING: Google Scholar may block requests and IP addresses for excessive queries. Publication searches are particularly prone to triggering anti-bot measures. This tool automatically uses free proxies to mitigate blocking, but use responsibly.

For academic research, consider using alternative databases like PubMed/EuropePMC when possible to reduce load on Google Scholar.

Args: query (str): Search query for publications. Use 'author:"Name"' to search by author. max_results (int): Maximum number of publications to return (default: 10, max: 50). use_proxy (bool): Whether to use free proxies to avoid rate limiting (default: True).

Returns: dict: Publication search results or error message
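
The query syntax and the result limit described above can be sketched with two small helpers. Both function names are hypothetical, and the clamping behavior is an assumption about how a caller might respect the stated maximum of 50; the tool itself may handle out-of-range values differently.

```python
def build_scholar_query(topic="", author=None):
    """Compose a query string, placing the author operator before the topic."""
    parts = []
    if author:
        parts.append(f'author:"{author}"')
    if topic:
        parts.append(topic)
    return " ".join(parts)


def clamp_max_results(max_results=10, upper=50):
    """Keep the requested result count within 1..upper (assumed behavior)."""
    return max(1, min(max_results, upper))


query = build_scholar_query("neural networks", author="John Smith")
```

Here `query` holds `author:"John Smith" neural networks`, matching the third example in the list above.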

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/biocontext-ai/knowledgebase-mcp'
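
The same request can be issued from Python using only the standard library. This is a sketch that assumes the endpoint returns JSON; the function names are illustrative, and `fetch_server_info` performs a live network call only when invoked.

```python
import json
from urllib.request import Request, urlopen

GLAMA_SERVER_URL = "https://glama.ai/api/mcp/v1/servers/biocontext-ai/knowledgebase-mcp"


def build_request(url=GLAMA_SERVER_URL):
    """Create the GET request without sending it."""
    return Request(url, method="GET", headers={"Accept": "application/json"})


def fetch_server_info(url=GLAMA_SERVER_URL, timeout=10):
    """Send the request and decode the body, assuming it is JSON."""
    with urlopen(build_request(url), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server_info()` would return the decoded server record as a Python object.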
