Skip to main content
Glama
biocontext-ai

BioContextAI Knowledgebase MCP

Official

bc_search_pride_proteins

Search for specific proteins within PRIDE mass spectrometry proteomics datasets using project accession numbers and optional keywords to identify proteins, genes, and modifications.

Instructions

Search for proteins identified in a specific PRIDE mass spectrometry project. Useful for finding specific proteins in proteomics datasets.

Returns: dict: Proteins list with accessions, names, genes, sequences, modifications, associated projects or error message.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
project_accessionYesPRIDE project accession to search proteins in
keywordNoSearch keyword for protein names or accessions
page_sizeNoNumber of results to return (max 100)
sort_fieldNoSort field: accession, proteinName, or geneaccession
sort_directionNoSort direction: ASC or DESCASC

Implementation Reference

  • Main handler function for the bc_search_pride_proteins tool (registered as search_pride_proteins). Queries PRIDE API for proteins in a project, supports keyword search, sorting, pagination. Includes input schema via Annotated Fields.
    @core_mcp.tool()
    def search_pride_proteins(
        project_accession: Annotated[
            str,
            Field(description="PRIDE project accession to search proteins in"),
        ],
        keyword: Annotated[
            Optional[str],
            Field(description="Search keyword for protein names or accessions"),
        ] = None,
        page_size: Annotated[
            int,
            Field(description="Number of results to return (max 100)"),
        ] = 20,
        sort_field: Annotated[
            str,
            Field(description="Sort field: accession, proteinName, or gene"),
        ] = "accession",
        sort_direction: Annotated[
            str,
            Field(description="Sort direction: ASC or DESC"),
        ] = "ASC",
    ) -> dict:
        """Search for proteins identified in a specific PRIDE mass spectrometry project. Useful for finding specific proteins in proteomics datasets.
    
        Returns:
            dict: Proteins list with accessions, names, genes, sequences, modifications, associated projects or error message.
        """
        base_url = "https://www.ebi.ac.uk/pride/ws/archive/v3/pride-ap/search/proteins"
    
        # Build query parameters
        params: dict[str, str | int] = {"projectAccession": project_accession}
    
        if page_size > 100:
            page_size = 100
        params["pageSize"] = page_size
        params["page"] = 0
    
        # Add keyword search
        if keyword:
            params["keyword"] = keyword
    
        # Validate and set sort parameters
        valid_sort_fields = ["accession", "proteinName", "gene"]
        if sort_field not in valid_sort_fields:
            sort_field = "accession"
        params["sortField"] = sort_field
    
        valid_sort_directions = ["ASC", "DESC"]
        if sort_direction.upper() not in valid_sort_directions:
            sort_direction = "ASC"
        params["sortDirection"] = sort_direction.upper()
    
        try:
            response = requests.get(base_url, params=params)
            response.raise_for_status()
    
            search_results = response.json()
    
            if not search_results:
                return {"results": [], "count": 0, "message": f"No proteins found in PRIDE project {project_accession}"}
    
            # Process results to include key information
            processed_results = []
            for protein in search_results:
                processed_protein = {
                    "protein_accession": protein.get("proteinAccession"),
                    "protein_name": protein.get("proteinName"),
                    "gene": protein.get("gene"),
                    "project_count": protein.get("projectCount", 0),
                }
                processed_results.append(processed_protein)
    
            return {
                "results": processed_results,
                "count": len(processed_results),
                "project_accession": project_accession,
                "search_criteria": {"keyword": keyword, "sort_field": sort_field, "sort_direction": sort_direction},
            }
    
        except requests.exceptions.HTTPError as e:
            if e.response.status_code == 404:
                return {"error": f"PRIDE project {project_accession} not found or has no protein data"}
            return {"error": f"HTTP error: {e}"}
        except Exception as e:
            return {"error": f"Exception occurred: {e!s}"}
  • Registers pride tools by importing from .pride which includes search_pride_proteins decorated with @core_mcp.tool().
    from .pride import *
  • Explicit import of the search_pride_proteins function in pride module for higher-level import.
    from ._search_pride_proteins import search_pride_proteins
  • Defines the FastMCP instance 'core_mcp' with prefix 'BC' used to register all tools including search_pride_proteins via @core_mcp.tool() decorator. Tool names likely prefixed as 'bc_search_pride_proteins'.
    core_mcp = FastMCP(  # type: ignore
        "BC",
        instructions="Provides access to biomedical knowledge bases.",
    )

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/biocontext-ai/knowledgebase-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server