bc_get_string_similarity_scores
Retrieve homology-based similarity scores (Smith-Waterman bit scores) between two proteins from the STRING database. Only pairs with scores above 50 are reported.
Instructions
Get similarity scores between proteins from the STRING database.
The scores represent protein homology based on Smith-Waterman bit scores. Only scores above 50 are reported, and only half of the similarity matrix (since it's symmetric) plus self-hits are returned.
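Because only one half of the symmetric matrix is returned, a given pair may be stored as (A, B) or as (B, A). A minimal sketch of an order-insensitive lookup follows. The field names stringId_A, stringId_B, and bitscore are taken from the handler's docstring; the helper name find_bitscore and the identifiers and scores in the sample are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, List, Optional


def find_bitscore(
    records: List[Dict[str, Any]], id_a: str, id_b: str
) -> Optional[float]:
    """Return the bit score for a pair regardless of the order it was stored in."""
    wanted = frozenset((id_a, id_b))
    for record in records:
        pair = frozenset((record["stringId_A"], record["stringId_B"]))
        if pair == wanted:
            return record["bitscore"]
    # No entry found: the pair was not reported (for example, score not above 50).
    return None


# Hypothetical half-matrix: two self-hits plus one cross-hit (made-up values).
sample = [
    {"stringId_A": "9606.PROT_X", "stringId_B": "9606.PROT_X", "bitscore": 800.0},
    {"stringId_A": "9606.PROT_X", "stringId_B": "9606.PROT_Y", "bitscore": 120.5},
    {"stringId_A": "9606.PROT_Y", "stringId_B": "9606.PROT_Y", "bitscore": 650.0},
]

# Querying (Y, X) finds the entry that was stored as (X, Y).
print(find_bitscore(sample, "9606.PROT_Y", "9606.PROT_X"))
```

Comparing unordered pairs avoids having to rebuild the full matrix when only a few lookups are needed.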
Args:
- protein_symbol (str): The protein symbol of the first protein (e.g., "TP53").
- protein_symbol_comparison (str): The protein symbol of the second protein (e.g., "MKI67").
- species (str): The species taxonomy ID (e.g., "9606" for human). Optional.
Returns: list: A list of dictionaries containing protein pairs and their bit scores. On failure, a dictionary with an "error" key is returned instead.
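The handler listed under Implementation Reference is annotated as returning either a list or a dict, and it uses a dictionary with an "error" key to report failures. Callers may therefore want to separate the two cases before iterating. The sketch below is one possible way to do that; the helper name unwrap_scores and the choice of RuntimeError are assumptions, not part of the tool.

```python
from typing import Any, Dict, List, Union


def unwrap_scores(
    result: Union[List[Dict[str, Any]], Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Return the score list, or raise if the tool reported an error."""
    if isinstance(result, dict):
        # The handler signals failure with a dictionary holding an "error" key.
        raise RuntimeError(result.get("error", "Unknown error"))
    return result


# Usage with a hypothetical error result, shaped like the handler's error dict.
try:
    unwrap_scores({"error": "Could not extract STRING IDs"})
except RuntimeError as exc:
    print(f"Lookup failed: {exc}")
```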
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| protein_symbol | Yes | The protein symbol of the first protein (e.g., 'TP53') | |
| protein_symbol_comparison | Yes | The protein symbol of the second protein (e.g., 'MKI67') | |
| species | No | The species taxonomy ID (e.g., '9606' for human) | |
Implementation Reference
- The handler function for bc_get_string_similarity_scores (registered as get_string_similarity_scores under the 'BC' namespace). Resolves STRING IDs for two proteins and fetches Smith-Waterman homology bit scores from the STRING DB API.

      @core_mcp.tool()
      def get_string_similarity_scores(
          protein_symbol: Annotated[str, Field(description="First protein symbol (e.g., 'TP53')")],
          protein_symbol_comparison: Annotated[str, Field(description="Second protein symbol (e.g., 'MKI67')")],
          species: Annotated[str, Field(description="Species taxonomy ID (e.g., '9606' for human)")] = "",
      ) -> Union[List[Dict[str, Any]], dict]:
          """Retrieve protein homology similarity scores from STRING database based on Smith-Waterman bit scores.

          Only scores above 50 reported.

          Returns:
              list or dict: Similarity scores array with stringId_A, stringId_B, bitscore or error message.
          """
          # Resolve both protein symbols to STRING IDs
          try:
              string_id1 = get_string_id.fn(protein_symbol=protein_symbol, species=species)
              string_id2 = get_string_id.fn(protein_symbol=protein_symbol_comparison, species=species)

              if not all(isinstance(string_id, str) for string_id in [string_id1, string_id2]):
                  return {"error": "Could not extract STRING IDs"}

              identifiers = f"{string_id1}%0d{string_id2}"
              url = f"https://string-db.org/api/json/homology?identifiers={identifiers}"
              if species:
                  url += f"&species={species}"

              response = requests.get(url)
              response.raise_for_status()

              return response.json()
          except requests.exceptions.RequestException as e:
              return {"error": f"Failed to fetch similarity scores: {e!s}"}
          except Exception as e:
              return {"error": f"An error occurred: {e!s}"}
- src/biocontext_kb/core/_server.py:3-6 (registration) Defines the core_mcp FastMCP server instance named 'BC', under which this tool is registered. It is later imported into the main app with the prefix 'bc_'.

      core_mcp = FastMCP(  # type: ignore
          "BC",
          instructions="Provides access to biomedical knowledge bases.",
      )
- src/biocontext_kb/core/__init__.py:19-19 (registration) Imports all STRINGdb tools (including get_string_similarity_scores) into the core_mcp scope for registration.

      from .stringdb import *
- Exposes the tool function for import in higher modules.

      from ._get_string_similarity_scores import get_string_similarity_scores
- Pydantic schema defined inline via Annotated and Field for the input parameters and return type.

      protein_symbol: Annotated[str, Field(description="First protein symbol (e.g., 'TP53')")],
      protein_symbol_comparison: Annotated[str, Field(description="Second protein symbol (e.g., 'MKI67')")],
      species: Annotated[str, Field(description="Species taxonomy ID (e.g., '9606' for human)")] = "",
      ) -> Union[List[Dict[str, Any]], dict]:
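The URL construction in the handler above can be isolated and checked without any network access. The sketch below mirrors the handler's logic: identifiers are joined with "%0d" (a URL-encoded carriage return) and the species parameter is appended only when it is non-empty. The function name build_homology_url and the placeholder identifiers are hypothetical; real STRING IDs would come from get_string_id.

```python
from typing import Sequence

BASE_URL = "https://string-db.org/api/json/homology"


def build_homology_url(string_ids: Sequence[str], species: str = "") -> str:
    """Build the homology request URL the same way the handler does."""
    # "%0d" is a URL-encoded carriage return, used here as the ID separator.
    identifiers = "%0d".join(string_ids)
    url = f"{BASE_URL}?identifiers={identifiers}"
    if species:
        # The species filter is optional and omitted when empty.
        url += f"&species={species}"
    return url


# Placeholder identifiers for illustration only.
print(build_homology_url(["9606.PROT_X", "9606.PROT_Y"], species="9606"))
```

Keeping this step as a pure function makes it straightforward to unit-test separately from the HTTP request.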