biocontext-ai

BioContextAI Knowledgebase MCP

Official

bc_get_uniprot_protein_info

Retrieve protein details from the UniProt database by protein ID, protein name, or gene symbol, including accession, sequence, functions, and organism information.

Instructions

Retrieve protein information from UniProt database. Provide at least one of protein_id, protein_name, or gene_symbol.

Returns: dict: Protein information with accession, proteinDescription, genes, organism, sequence, functions, keywords, references or error message.
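Because the tool returns either a protein record or a dict with an `error` key, a caller has to branch on the result shape. The following is a minimal sketch of that check. The helper name `summarize_protein_result` is hypothetical, and the nested key names (`primaryAccession`, `organism.scientificName`, `sequence.length`) are assumptions about the UniProtKB JSON entry, so they are read defensively.

```python
from typing import Any


def summarize_protein_result(result: dict[str, Any]) -> str:
    """Turn the tool's dict result into a one-line summary (hypothetical helper)."""
    # The handler reports failures as {"error": "..."} instead of raising.
    if "error" in result:
        return f"lookup failed: {result['error']}"

    # Assumed UniProtKB JSON key names; fall back gracefully if they are absent.
    accession = result.get("primaryAccession", "unknown accession")
    organism = result.get("organism", {}).get("scientificName", "unknown organism")
    length = result.get("sequence", {}).get("length")
    length_text = f"{length} aa" if length is not None else "length unknown"
    return f"{accession} ({organism}, {length_text})"
```

Checking for `error` first keeps the success path free of exception handling, which matches how the handler itself reports problems.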

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| protein_id | No | Protein accession number (e.g., 'P04637') | None |
| protein_name | No | Protein name to search for (e.g., 'P53') | None |
| gene_symbol | No | Gene symbol to search for (e.g., 'TP53') | None |
| species | No | Taxonomy ID (e.g., '10090') or species name | None |
| include_references | No | Include references and cross-references in response | False |
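Every parameter is marked optional in the schema, yet the tool rejects calls that supply none of the three identifiers. A client could enforce that rule before sending a request. The sketch below assumes a hypothetical helper, `build_tool_arguments`, which is not part of the server; it only assembles the argument dict a caller would pass to `bc_get_uniprot_protein_info`.

```python
from typing import Optional


def build_tool_arguments(
    protein_id: Optional[str] = None,
    protein_name: Optional[str] = None,
    gene_symbol: Optional[str] = None,
    species: Optional[str] = None,
    include_references: bool = False,
) -> dict[str, object]:
    """Assemble arguments for bc_get_uniprot_protein_info (hypothetical helper)."""
    identifiers = {
        "protein_id": protein_id,
        "protein_name": protein_name,
        "gene_symbol": gene_symbol,
    }
    # Mirror the tool's own rule: at least one identifier is required.
    if not any(identifiers.values()):
        raise ValueError(
            "Provide at least one of protein_id, protein_name, or gene_symbol."
        )

    # Send only the parameters that were actually set.
    arguments: dict[str, object] = {k: v for k, v in identifiers.items() if v}
    if species:
        arguments["species"] = species
    if include_references:
        arguments["include_references"] = True
    return arguments
```

For example, `build_tool_arguments(gene_symbol="TP53", species="10090")` yields a dict with just those two keys, while a call with only `species` raises `ValueError` locally instead of producing an error dict from the server.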

Output Schema

No output fields are listed for this tool.

Implementation Reference

  • The handler function `get_uniprot_protein_info`, decorated with `@core_mcp.tool()`. It queries the UniProt REST API by accession, protein name, or gene symbol (using the first one supplied, in that order), optionally filters by species, and strips references and cross-references from the first result unless `include_references` is set.
    @core_mcp.tool()
    def get_uniprot_protein_info(
        protein_id: Annotated[
            Optional[str],
            Field(description="Protein accession number (e.g., 'P04637')"),
        ] = None,
        protein_name: Annotated[
            Optional[str],
            Field(description="Protein name to search for (e.g., 'P53')"),
        ] = None,
        gene_symbol: Annotated[
            Optional[str],
            Field(description="Gene symbol to search for (e.g., 'TP53')"),
        ] = None,
        species: Annotated[
            Optional[str],
            Field(description="Taxonomy ID (e.g., '10090') or species name"),
        ] = None,
        include_references: Annotated[
            bool,
            Field(description="Include references and cross-references in response"),
        ] = False,
    ) -> dict:
        """Retrieve protein information from UniProt database. Provide at least one of protein_id, protein_name, or gene_symbol.
    
        Returns:
            dict: Protein information with accession, proteinDescription, genes, organism, sequence, functions, keywords, references or error message.
        """
        base_url = "https://rest.uniprot.org/uniprotkb/search"
    
        # Ensure at least one search parameter was provided
        if not protein_id and not protein_name and not gene_symbol:
            return {"error": "At least one of protein_id or protein_name or gene_symbol must be provided."}
    
        query_parts = []
    
        if protein_id:
            query_parts.append(f"accession:{protein_id}")
    
        elif protein_name:
            query_parts.append(f"protein_name:{protein_name}")
    
        elif gene_symbol:
            query_parts.append(f"gene:{gene_symbol}")
    
        if species:
            species = str(species).strip()
    
            # Try to determine if it's a taxonomy ID (numeric) or a name
            if species.isdigit():
                query_parts.append(f"organism_id:{species}")
            else:
                query_parts.append(f'taxonomy_name:"{species}"')
    
        query = " AND ".join(query_parts)
    
        params: dict[str, str | int] = {
            "query": query,
            "format": "json",
        }
    
        try:
            response = requests.get(base_url, params=params)
            response.raise_for_status()
    
            result = response.json()
            if not result.get("results"):
                return {"error": "No results found for the given query."}
    
            first_result = result["results"][0]
    
            # Remove references and cross-references by default to reduce response size
            if not include_references:
                first_result.pop("references", None)
                first_result.pop("uniProtKBCrossReferences", None)
    
            return first_result
        except Exception as e:
            return {"error": f"Exception occurred: {e!s}"}
  • Registration of the `core_mcp` server (containing the uniprot tool) into the main `mcp_app` with namespace prefix `slugify('BC') = 'bc'`, resulting in the tool name `bc_get_uniprot_protein_info`.
    for mcp in [core_mcp, *(await get_openapi_mcps())]:
        await mcp_app.import_server(
            mcp,
            slugify(mcp.name),
        )
  • Definition of the `core_mcp` FastMCP server instance named 'BC'. Tools such as `get_uniprot_protein_info` are registered on it via the `@core_mcp.tool()` decorator and later namespaced with the `bc_` prefix.
    core_mcp = FastMCP(  # type: ignore
        "BC",
        instructions="Provides access to biomedical knowledge bases.",
    )
  • Input schema for the tool, expressed through the `Annotated` and `Field` type hints on `protein_id`, `protein_name`, `gene_symbol`, `species`, and `include_references`. The signature is identical to the one shown in the handler above.
  • Exports the `get_uniprot_protein_info` function, allowing it to be imported into the core module for registration.
    from ._get_uniprot_id_by_protein_symbol import get_uniprot_id_by_protein_symbol
    from ._get_uniprot_protein_info import get_uniprot_protein_info
    
    __all__ = [
        "get_uniprot_id_by_protein_symbol",
        "get_uniprot_protein_info",
    ]
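The handler in the Implementation Reference above builds its UniProt query with an `if`/`elif` chain, so only one identifier is ever used, and it treats an all-digit `species` value as a taxonomy ID. The sketch below isolates that logic as a pure function so the precedence is easy to see. The function name `build_uniprot_query` is hypothetical; the field prefixes are copied from the handler.

```python
from typing import Optional


def build_uniprot_query(
    protein_id: Optional[str] = None,
    protein_name: Optional[str] = None,
    gene_symbol: Optional[str] = None,
    species: Optional[str] = None,
) -> str:
    """Reproduce the handler's query construction (hypothetical helper)."""
    parts = []

    # Only the first identifier present is used: ID, then name, then gene symbol.
    if protein_id:
        parts.append(f"accession:{protein_id}")
    elif protein_name:
        parts.append(f"protein_name:{protein_name}")
    elif gene_symbol:
        parts.append(f"gene:{gene_symbol}")

    if species:
        species = str(species).strip()
        # All-digit input is treated as a taxonomy ID, anything else as a name.
        if species.isdigit():
            parts.append(f"organism_id:{species}")
        else:
            parts.append(f'taxonomy_name:"{species}"')

    return " AND ".join(parts)
```

One consequence worth noting: passing both `protein_id` and `gene_symbol` does not narrow the search, because the gene symbol is ignored once an accession is given. The handler also returns only the first search hit, so a broad name query without a species filter may not return the entry the caller had in mind.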
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses that the tool returns protein information or an error message, which is useful behavioral context. However, it says nothing about rate limits, authentication requirements, or failure modes such as failed or empty lookups, which leaves gaps in transparency for a database query tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, followed by a concise usage note and return format. Every sentence is necessary and efficient, with no wasted words, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (5 parameters, no annotations, but with an output schema), the description is mostly complete. It covers purpose, usage constraints, and return values, but lacks behavioral details like error handling or performance considerations, which could enhance completeness for a database retrieval tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description adds minimal value by reiterating the need for at least one identifier but does not provide additional syntax or format details beyond what the schema specifies, aligning with the baseline for high coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Retrieve protein information') and resource ('from UniProt database'), distinguishing it from siblings like 'bc_get_alphafold_info_by_protein_symbol' or 'bc_get_uniprot_id_by_protein_symbol' which focus on different data or identifiers. It precisely defines the tool's scope without redundancy.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context by specifying that at least one of protein_id, protein_name, or gene_symbol must be provided, guiding usage. However, it does not explicitly state when to use this tool versus alternatives like 'bc_get_uniprot_id_by_protein_symbol' or other protein-related tools, missing explicit sibling differentiation.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/biocontext-ai/knowledgebase-mcp'
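The same request can be made from Python with the standard library. This is a minimal sketch: the helper names `server_url` and `fetch_server` are hypothetical, and the code assumes the endpoint responds with a JSON object, which the page does not state explicitly.

```python
import json
import urllib.request

# Base path taken from the curl example above.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(namespace: str, slug: str) -> str:
    """Build the directory URL for one MCP server entry."""
    return f"{API_BASE}/{namespace}/{slug}"


def fetch_server(namespace: str, slug: str, timeout: float = 10.0) -> dict:
    """GET the directory entry and decode it, assuming a JSON object response."""
    request = urllib.request.Request(server_url(namespace, slug), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example call (performs a network request):
# info = fetch_server("biocontext-ai", "knowledgebase-mcp")
```

The explicit `timeout` keeps a slow or unreachable endpoint from blocking the caller indefinitely.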
