bc_get_uniprot_protein_info
Retrieve protein details from the UniProt database by protein ID, protein name, or gene symbol, including accession, sequence, functions, and organism information.
Instructions
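Retrieve protein information from the UniProt database. Provide at least one of protein_id, protein_name, or gene_symbol. If more than one is given, only one is used: protein_id takes precedence, then protein_name, then gene_symbol.
Returns: a dict of protein information (accession, proteinDescription, genes, organism, sequence, functions, keywords, references), or a dict containing an error message.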
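
Because failures are reported as a dict with an `error` key rather than raised, a caller has to inspect the result before using it. The following is a minimal sketch of such a check. `ProteinLookupError` and `unwrap_protein_result` are hypothetical names that are not part of the tool, and the sample dicts are illustrative stand-ins rather than real UniProt output.

```python
class ProteinLookupError(RuntimeError):
    """Hypothetical exception type for failed lookups (not part of the tool)."""


def unwrap_protein_result(result: dict) -> dict:
    """Return the protein record, or raise if the tool reported an error."""
    # Assumption: a successful record never carries a top-level "error" key.
    if "error" in result:
        raise ProteinLookupError(result["error"])
    return result


# Illustrative stand-ins only; real responses contain many more fields.
record = unwrap_protein_result({"genes": [], "keywords": []})

try:
    unwrap_protein_result({"error": "No results found for the given query."})
except ProteinLookupError as exc:
    message = str(exc)
```

Raising is only one option; a caller could equally branch on the `error` key and continue.
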
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| protein_id | No | Protein accession number (e.g., 'P04637') | |
| protein_name | No | Protein name to search for (e.g., 'P53') | |
| gene_symbol | No | Gene symbol to search for (e.g., 'TP53') | |
| species | No | Taxonomy ID (e.g., '10090') or species name | |
| include_references | No | Include references and cross-references in response | False |
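
How these parameters are used can be sketched in isolation from the network call. The two helpers below mirror the logic of the handler listed under Implementation Reference, but `build_uniprot_query` and `strip_references` are hypothetical names introduced here. Unlike the handler, the sketch raises `ValueError` instead of returning an error dict, and it returns a filtered copy instead of modifying the record in place.

```python
from typing import Optional


def build_uniprot_query(
    protein_id: Optional[str] = None,
    protein_name: Optional[str] = None,
    gene_symbol: Optional[str] = None,
    species: Optional[str] = None,
) -> str:
    """Compose a UniProt search query from the tool's inputs (hypothetical helper)."""
    if not (protein_id or protein_name or gene_symbol):
        raise ValueError("Provide protein_id, protein_name, or gene_symbol.")

    # Only one identifier is used: protein_id wins, then protein_name.
    if protein_id:
        parts = [f"accession:{protein_id}"]
    elif protein_name:
        parts = [f"protein_name:{protein_name}"]
    else:
        parts = [f"gene:{gene_symbol}"]

    if species:
        species = str(species).strip()
        # All-digit values are treated as taxonomy IDs, anything else as a name.
        if species.isdigit():
            parts.append(f"organism_id:{species}")
        else:
            parts.append(f'taxonomy_name:"{species}"')
    return " AND ".join(parts)


def strip_references(record: dict, include_references: bool = False) -> dict:
    """Drop the two reference fields unless the caller asked for them."""
    if include_references:
        return record
    dropped = ("references", "uniProtKBCrossReferences")
    return {key: value for key, value in record.items() if key not in dropped}


human_query = build_uniprot_query(gene_symbol="TP53", species="9606")
# -> 'gene:TP53 AND organism_id:9606'
named_query = build_uniprot_query(
    protein_id="P04637", gene_symbol="TP53", species="Homo sapiens"
)
# -> 'accession:P04637 AND taxonomy_name:"Homo sapiens"' (gene_symbol is ignored)
```
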
Implementation Reference
- The handler function `get_uniprot_protein_info`, decorated with `@core_mcp.tool()`. It queries the UniProt REST API by accession, protein name, or gene symbol, optionally filtered by species, and returns the first match. References and cross-references are removed unless `include_references` is set.

      @core_mcp.tool()
      def get_uniprot_protein_info(
          protein_id: Annotated[
              Optional[str],
              Field(description="Protein accession number (e.g., 'P04637')"),
          ] = None,
          protein_name: Annotated[
              Optional[str],
              Field(description="Protein name to search for (e.g., 'P53')"),
          ] = None,
          gene_symbol: Annotated[
              Optional[str],
              Field(description="Gene symbol to search for (e.g., 'TP53')"),
          ] = None,
          species: Annotated[
              Optional[str],
              Field(description="Taxonomy ID (e.g., '10090') or species name"),
          ] = None,
          include_references: Annotated[
              bool,
              Field(description="Include references and cross-references in response"),
          ] = False,
      ) -> dict:
          """Retrieve protein information from UniProt database. Provide at least one of protein_id, protein_name, or gene_symbol.

          Returns:
              dict: Protein information with accession, proteinDescription, genes, organism, sequence, functions, keywords, references or error message.
          """
          base_url = "https://rest.uniprot.org/uniprotkb/search"

          # Ensure at least one search parameter was provided
          if not protein_id and not protein_name and not gene_symbol:
              return {"error": "At least one of protein_id or protein_name or gene_symbol must be provided."}

          query_parts = []

          if protein_id:
              query_parts.append(f"accession:{protein_id}")
          elif protein_name:
              query_parts.append(f"protein_name:{protein_name}")
          elif gene_symbol:
              query_parts.append(f"gene:{gene_symbol}")

          if species:
              species = str(species).strip()

              # Try to determine if it's a taxonomy ID (numeric) or a name
              if species.isdigit():
                  query_parts.append(f"organism_id:{species}")
              else:
                  query_parts.append(f'taxonomy_name:"{species}"')

          query = " AND ".join(query_parts)

          params: dict[str, str | int] = {
              "query": query,
              "format": "json",
          }

          try:
              response = requests.get(base_url, params=params)
              response.raise_for_status()

              result = response.json()
              if not result.get("results"):
                  return {"error": "No results found for the given query."}

              first_result = result["results"][0]

              # Remove references and cross-references by default to reduce response size
              if not include_references:
                  first_result.pop("references", None)
                  first_result.pop("uniProtKBCrossReferences", None)

              return first_result
          except Exception as e:
              return {"error": f"Exception occurred: {e!s}"}

- src/biocontext_kb/app.py:35-39 (registration): Registration of the `core_mcp` server (containing the UniProt tool) into the main `mcp_app` with the namespace prefix `slugify('BC')`, which yields `'bc'`, resulting in the tool name `bc_get_uniprot_protein_info`.

      for mcp in [core_mcp, *(await get_openapi_mcps())]:
          await mcp_app.import_server(
              mcp,
              slugify(mcp.name),
          )

- src/biocontext_kb/core/_server.py:3-6 (registration): Definition of the `core_mcp` FastMCP server instance named 'BC', into which tools such as `get_uniprot_protein_info` are registered via `@tool()` decorators and later namespaced as 'bc_'.

      core_mcp = FastMCP(  # type: ignore
          "BC",
          instructions="Provides access to biomedical knowledge bases.",
      )

- Pydantic schema definition for the tool inputs, using `Annotated` and `Field` for protein_id, protein_name, gene_symbol, species, and include_references.

      def get_uniprot_protein_info(
          protein_id: Annotated[
              Optional[str],
              Field(description="Protein accession number (e.g., 'P04637')"),
          ] = None,
          protein_name: Annotated[
              Optional[str],
              Field(description="Protein name to search for (e.g., 'P53')"),
          ] = None,
          gene_symbol: Annotated[
              Optional[str],
              Field(description="Gene symbol to search for (e.g., 'TP53')"),
          ] = None,
          species: Annotated[
              Optional[str],
              Field(description="Taxonomy ID (e.g., '10090') or species name"),
          ] = None,
          include_references: Annotated[
              bool,
              Field(description="Include references and cross-references in response"),
          ] = False,
      ) -> dict:

- Exports the `get_uniprot_protein_info` function, allowing it to be imported into the core module for registration.

      from ._get_uniprot_id_by_protein_symbol import get_uniprot_id_by_protein_symbol
      from ._get_uniprot_protein_info import get_uniprot_protein_info

      __all__ = [
          "get_uniprot_id_by_protein_symbol",
          "get_uniprot_protein_info",
      ]
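
The registration entries above explain how the public tool name is derived from the server name. The sketch below reproduces that derivation without FastMCP. `simple_slugify` is a hypothetical stand-in for the project's `slugify`, and joining prefix and tool name with an underscore is an assumption inferred from the resulting name `bc_get_uniprot_protein_info`.

```python
import re


def simple_slugify(name: str) -> str:
    """Stand-in slugify: lowercase, runs of non-alphanumerics become hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def namespaced_tool_name(server_name: str, tool_name: str) -> str:
    """Join the slugified server name and the tool name with an underscore (assumed)."""
    return f"{simple_slugify(server_name)}_{tool_name}"


full_name = namespaced_tool_name("BC", "get_uniprot_protein_info")
# -> 'bc_get_uniprot_protein_info'
```
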