Glama
biocontext-ai

BioContextAI Knowledgebase MCP

Official

bc_get_protein_domains

Retrieve the domain architecture and InterPro matches for a UniProt protein to analyze its functional sites and structural organization.

Instructions

Get protein domain architecture and InterPro matches. Returns all InterPro domains, functional sites, and domain architecture.

Returns: dict: Protein metadata with interpro_matches array, interpro_match_count, domain_architecture, optionally structure data or error message.
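
A caller has to tell apart three outcomes in the returned dict: a top-level error, a failed matches lookup, and a normal result. The following is a minimal sketch of that check, assuming only the keys named in the description above. The helper name `summarize_domain_result` is hypothetical and not part of the server.

```python
from typing import Any


def summarize_domain_result(result: dict[str, Any]) -> str:
    """Summarize a dict shaped like the tool's documented return value.

    Hypothetical helper for illustration; it only reads the documented keys.
    """
    # A top-level "error" key means the protein lookup itself failed.
    if "error" in result:
        return f"lookup failed: {result['error']}"

    matches = result.get("interpro_matches", [])
    # The handler stores an error dict here (instead of a list) when the
    # matches request raises, so check the type before counting.
    if isinstance(matches, dict):
        return f"matches unavailable: {matches.get('error', 'unknown error')}"

    count = result.get("interpro_match_count", len(matches))
    # "domain_architecture" is only present when the IDA lookup succeeded.
    architecture = result.get("domain_architecture", "not reported")
    return f"{count} InterPro match(es); domain architecture: {architecture}"
```

Checking the type of `interpro_matches` before iterating keeps an agent from treating an error dict as a list of domains.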

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| protein_id | Yes | UniProt ID/accession (e.g., 'P04637' or 'CYC_HUMAN') | |
| source_db | No | Database source ('uniprot', 'reviewed', or 'unreviewed') | uniprot |
| include_structure_info | No | Include structural information | |
| species_filter | No | Taxonomy ID filter (e.g., '9606' for human) | |
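
The schema types `source_db` as a plain string, so the three listed values are documented but not enforced. As a sketch under that assumption, a client could validate the arguments before calling the tool. `build_arguments` and `ALLOWED_SOURCE_DBS` are hypothetical names introduced for illustration.

```python
from typing import Optional

# Values taken from the source_db description; the schema itself does not
# enforce them, so this check is an assumption layered on top.
ALLOWED_SOURCE_DBS = {"uniprot", "reviewed", "unreviewed"}


def build_arguments(
    protein_id: str,
    source_db: str = "uniprot",
    include_structure_info: bool = False,
    species_filter: Optional[str] = None,
) -> dict:
    """Build an argument dict matching the input schema (hypothetical helper)."""
    if not protein_id.strip():
        raise ValueError("protein_id is required")
    if source_db not in ALLOWED_SOURCE_DBS:
        raise ValueError(f"source_db must be one of {sorted(ALLOWED_SOURCE_DBS)}")

    arguments = {
        "protein_id": protein_id.strip(),
        "source_db": source_db,
        "include_structure_info": include_structure_info,
    }
    # species_filter is optional; send it only when set, and as a string
    # because the schema describes it as a taxonomy ID string like '9606'.
    if species_filter is not None:
        arguments["species_filter"] = str(species_filter)
    return arguments
```

For example, `build_arguments("P04637", species_filter="9606")` yields the defaults plus the human taxonomy filter.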

Output Schema

No arguments

Implementation Reference

  • Handler function, decorated with @core_mcp.tool(), that retrieves protein domains from the InterPro API, including matches, domain architecture, and optional structure info. The function is named 'get_protein_domains'; this page lists the tool as 'bc_get_protein_domains'.
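    from typing import Annotated, Optional

    import requests
    from pydantic import Field

    # core_mcp is the server's MCP instance, defined elsewhere in the package.
    @core_mcp.tool()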
    def get_protein_domains(
        protein_id: Annotated[
            str,
            Field(description="UniProt ID/accession (e.g., 'P04637' or 'CYC_HUMAN')"),
        ],
        source_db: Annotated[
            str,
            Field(description="Database source ('uniprot', 'reviewed', or 'unreviewed')"),
        ] = "uniprot",
        include_structure_info: Annotated[
            bool,
            Field(description="Include structural information"),
        ] = False,
        species_filter: Annotated[
            Optional[str],
            Field(description="Taxonomy ID filter (e.g., '9606' for human)"),
        ] = None,
    ) -> dict:
        """Get protein domain architecture and InterPro matches. Returns all InterPro domains, functional sites, and domain architecture.
    
        Returns:
            dict: Protein metadata with interpro_matches array, interpro_match_count, domain_architecture, optionally structure data or error message.
        """
        base_url = f"https://www.ebi.ac.uk/interpro/api/protein/{source_db}/{protein_id}"
    
        # Build query parameters
        params = {}
        extra_fields = ["description", "sequence"]
    
        if include_structure_info:
            extra_fields.append("structure")
    
        if species_filter:
            params["tax_id"] = species_filter
    
        params["extra_fields"] = ",".join(extra_fields)
    
        try:
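            # Get protein information with InterPro matches.
            # The timeout (30 s, an arbitrary choice) keeps a stalled connection from hanging the tool.
            response = requests.get(base_url, params=params, timeout=30)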
            response.raise_for_status()
    
            protein_data = response.json()
    
            if not protein_data.get("metadata"):
                return {"error": f"No data found for protein {protein_id}"}
    
            result = protein_data["metadata"]
    
            # Get InterPro domain matches for this protein
            try:
                domains_url = f"https://www.ebi.ac.uk/interpro/api/entry/interpro/protein/{source_db}/{protein_id}"
                domains_response = requests.get(domains_url, timeout=30)
    
                if domains_response.status_code == 200:
                    domains_data = domains_response.json()
                    result["interpro_matches"] = domains_data.get("results", [])
                    result["interpro_match_count"] = len(domains_data.get("results", []))
                else:
                    result["interpro_matches"] = []
                    result["interpro_match_count"] = 0
    
            except Exception as e:
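                result["interpro_matches"] = {"error": f"Could not fetch InterPro matches: {e}"}
                # Keep the documented count key present even when the lookup fails.
                result["interpro_match_count"] = 0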
    
            # Get domain architecture ID (IDA) if available
            try:
                ida_url = f"{base_url}?ida"
                ida_response = requests.get(ida_url, timeout=30)
                if ida_response.status_code == 200:
                    ida_data = ida_response.json()
                    if ida_data.get("metadata", {}).get("ida"):
                        result["domain_architecture"] = ida_data["metadata"]["ida"]
            except Exception:
                pass  # IDA is optional
    
            return result
    
        except requests.exceptions.HTTPError as e:
            if e.response.status_code == 404:
                return {"error": f"Protein {protein_id} not found in {source_db}"}
            return {"error": f"HTTP error: {e}"}
        except Exception as e:
            return {"error": f"Exception occurred: {e!s}"}
  • Input schema defined using Pydantic Annotated types and Field descriptions for the tool parameters.
  • Imports the get_protein_domains function; importing the module runs the @core_mcp.tool() decorator, which registers the tool.
    from ._get_protein_domains import get_protein_domains
  • Imports all interpro tools, including get_protein_domains, into the core namespace for registration on core_mcp.
    from .interpro import *
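
The handler above issues up to three requests: the protein record, the InterPro matches, and the domain architecture (IDA). As a rough sketch of how those URLs and query parameters are assembled, the following mirrors the handler's string construction without touching the network. `plan_requests` and `INTERPRO_API` are hypothetical names, not part of the server code.

```python
from typing import Optional

INTERPRO_API = "https://www.ebi.ac.uk/interpro/api"


def plan_requests(
    protein_id: str,
    source_db: str = "uniprot",
    include_structure_info: bool = False,
    species_filter: Optional[str] = None,
) -> dict:
    """Return the URLs and query parameters the handler would request."""
    base_url = f"{INTERPRO_API}/protein/{source_db}/{protein_id}"

    # The handler always asks for description and sequence, and adds
    # structure only when include_structure_info is set.
    extra_fields = ["description", "sequence"]
    if include_structure_info:
        extra_fields.append("structure")

    params = {}
    if species_filter:
        params["tax_id"] = species_filter
    params["extra_fields"] = ",".join(extra_fields)

    return {
        "protein_url": base_url,
        "protein_params": params,
        "matches_url": f"{INTERPRO_API}/entry/interpro/protein/{source_db}/{protein_id}",
        # The IDA lookup reuses the protein URL with a bare ?ida flag.
        "architecture_url": f"{base_url}?ida",
    }
```

In the handler as shown, `species_filter` and the extra fields are sent only with the first request; the matches and IDA requests carry no query parameters from the caller.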
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses the return type ('dict') and content ('Protein metadata with interpro_matches array, interpro_match_count, domain_architecture, optionally structure data or error message'), which adds value beyond the input schema. However, it omits details like rate limits, authentication needs, or error handling specifics, leaving behavioral gaps for a tool with no annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized with two sentences: one stating the purpose and one detailing the return. It is front-loaded with the core function and avoids unnecessary elaboration. The second sentence could be slightly more streamlined, but overall the description is efficient with minimal waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (4 parameters, 1 required), 100% schema coverage, and the presence of an output schema (implied by 'Returns: dict'), the description is reasonably complete. It covers the purpose and return values adequately, though it could benefit from more behavioral context (e.g., error cases or performance notes) to fully compensate for the lack of annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description does not add any parameter-specific semantics beyond what the schema provides (e.g., it doesn't explain interactions between parameters like 'source_db' and 'include_structure_info'). This meets the baseline for high schema coverage but doesn't enhance understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Get protein domain architecture and InterPro matches.' It specifies the verb ('Get') and resource ('protein domain architecture and InterPro matches'), making the function evident. However, it does not explicitly differentiate itself from sibling tools such as 'bc_get_interpro_entry' or 'bc_get_uniprot_protein_info', which may overlap in domain, so it falls short of full sibling distinction.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It does not mention prerequisites, context, or comparisons to sibling tools such as 'bc_get_interpro_entry' or 'bc_get_uniprot_protein_info', which could offer similar or complementary data. This leaves the agent without usage direction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/biocontext-ai/knowledgebase-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.