cBioPortal MCP Server

by pickleton89

get_clinical_data

Retrieve patient clinical data from cancer studies with pagination support, enabling researchers to access specific attributes or comprehensive datasets for analysis.

Instructions

Get clinical data for patients in a study with pagination support. Can fetch specific attributes or all.

Input Schema

Name           Required  Description  Default
study_id       Yes       -            -
attribute_ids  No        -            -
page_number    No        -            -
page_size      No        -            -
sort_by        No        -            -
direction      No        -            ASC
limit          No        -            -

Implementation Reference

  • Registers 'get_clinical_data' (line 119) and the other listed methods as MCP tools by adding them to the FastMCP instance.
    """Register tool methods as MCP tools."""
    # List of methods to register as tools (explicitly defined)
    tool_methods = [
        # Pagination utilities
        "paginate_results",
        "collect_all_results",
        # Studies endpoints
        "get_cancer_studies",
        "get_cancer_types",
        "search_studies",
        "get_study_details",
        "get_multiple_studies",
        # Genes endpoints
        "search_genes",
        "get_genes",
        "get_multiple_genes",
        "get_mutations_in_gene",
        # Samples endpoints
        "get_samples_in_study",
        "get_sample_list_id",
        # Molecular profiles endpoints
        "get_molecular_profiles",
        "get_clinical_data",
        "get_gene_panels_for_study",
        "get_gene_panel_details",
    ]
    
    for method_name in tool_methods:
        if hasattr(self, method_name):
            method = getattr(self, method_name)
            self.mcp.add_tool(method)
            logger.debug(f"Registered tool: {method_name}")
        else:
            logger.warning(f"Method {method_name} not found for tool registration")
  • MCP-exposed handler for get_clinical_data tool. Delegates execution to the molecular_profiles endpoint implementation.
    async def get_clinical_data(
        self,
        study_id: str,
        attribute_ids: Optional[List[str]] = None,
        page_number: int = 0,
        page_size: int = 50,
        sort_by: Optional[str] = None,
        direction: str = "ASC",
        limit: Optional[int] = None,
    ) -> Dict:
        """Get clinical data for patients in a study with pagination support. Can fetch specific attributes or all."""
        return await self.molecular_profiles.get_clinical_data(
            study_id, attribute_ids, page_number, page_size, sort_by, direction, limit
        )
  • Primary implementation of get_clinical_data: calls the cBioPortal API (POST when attribute_ids is given, GET otherwise), checks the response type, groups the raw rows into a nested patient-to-attribute structure, and applies the limit and pagination metadata.
    @handle_api_errors("get clinical data")
    async def get_clinical_data(
        self,
        study_id: str,
        attribute_ids: Optional[List[str]] = None,
        page_number: int = 0,
        page_size: int = 50,
        sort_by: Optional[str] = None,
        direction: str = "ASC",
        limit: Optional[int] = None,
    ) -> Dict:
        """
        Get clinical data for patients in a study with pagination support. Can fetch specific attributes or all.
        """
        try:
            api_call_params = {
                "pageNumber": page_number,
                "pageSize": page_size,
                "direction": direction,
                "clinicalDataType": "PATIENT",  # Assuming PATIENT level data
            }
            if sort_by:
                api_call_params["sortBy"] = sort_by
            if limit == 0:
                api_call_params["pageSize"] = FETCH_ALL_PAGE_SIZE
    
            clinical_data_from_api = []
            if attribute_ids:
                endpoint = f"studies/{study_id}/clinical-data/fetch"
                payload = {"attributeIds": attribute_ids, "clinicalDataType": "PATIENT"}
                clinical_data_from_api = await self.api_client.make_api_request(
                    endpoint, method="POST", params=api_call_params, json_data=payload
                )
            else:
                endpoint = f"studies/{study_id}/clinical-data"
                clinical_data_from_api = await self.api_client.make_api_request(
                    endpoint, method="GET", params=api_call_params
                )
    
            if (
                isinstance(clinical_data_from_api, dict)
                and "api_error" in clinical_data_from_api
            ):
                return {
                    "error": "API error fetching clinical data",
                    "details": clinical_data_from_api,
                    "request_params": api_call_params,
                }
            if not isinstance(clinical_data_from_api, list):
                return {
                    "error": "Unexpected API response type for clinical data (expected list)",
                    "details": clinical_data_from_api,
                    "request_params": api_call_params,
                }
    
            api_might_have_more = (
                len(clinical_data_from_api) == api_call_params["pageSize"]
            )
            if (
                api_call_params["pageSize"] == FETCH_ALL_PAGE_SIZE
                and len(clinical_data_from_api) < FETCH_ALL_PAGE_SIZE
            ):
                api_might_have_more = False
    
            # Apply server-side limit to the data that will be processed and returned
            data_to_process = clinical_data_from_api
            if limit and limit > 0 and len(clinical_data_from_api) > limit:
                data_to_process = clinical_data_from_api[:limit]
    
            by_patient = {}
            for item in data_to_process:
                patient_id = item.get("patientId")
                if patient_id:
                    if patient_id not in by_patient:
                        by_patient[patient_id] = {}
                    by_patient[patient_id][item.get("clinicalAttributeId")] = item.get(
                        "value"
                    )
    
            # Update total_found to be the number of unique patients, not raw data items
            # This makes the count consistent with the actual returned data structure
            total_patients = len(by_patient)
    
            return {
                "clinical_data_by_patient": by_patient,  # This contains unique patients with their attributes
                "pagination": {
                    "page": page_number,
                    "page_size": page_size,
                    "total_found": total_patients,  # Now using patient count for consistency
                    "has_more": api_might_have_more,
                },
            }
        except Exception as e:
            return {
                "error": f"Failed to get clinical data for study {study_id}: {str(e)}"
            }
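
The reshaping step in the primary implementation can be isolated as a small pure function. The sketch below assumes the API returns flat rows with `patientId`, `clinicalAttributeId`, and `value` keys, as the code above reads them. The name `group_by_patient` and the sample rows are hypothetical and are not part of the server.

```python
from typing import Any, Dict, List, Optional


def group_by_patient(
    rows: List[Dict[str, Any]], limit: Optional[int] = None
) -> Dict[str, Dict[str, Any]]:
    """Fold flat clinical-data rows into {patient_id: {attribute_id: value}}."""
    # Mirror the implementation: a positive limit truncates the raw rows,
    # not the number of patients.
    if limit and limit > 0:
        rows = rows[:limit]
    by_patient: Dict[str, Dict[str, Any]] = {}
    for row in rows:
        patient_id = row.get("patientId")
        if not patient_id:
            continue  # rows without a patient ID are skipped
        attributes = by_patient.setdefault(patient_id, {})
        attributes[row.get("clinicalAttributeId")] = row.get("value")
    return by_patient


# Made-up rows for illustration only.
sample_rows = [
    {"patientId": "P1", "clinicalAttributeId": "AGE", "value": "61"},
    {"patientId": "P1", "clinicalAttributeId": "SEX", "value": "Female"},
    {"patientId": "P2", "clinicalAttributeId": "AGE", "value": "47"},
]
grouped = group_by_patient(sample_rows)
```

Because the limit counts rows rather than patients, `group_by_patient(sample_rows, limit=2)` keeps only P1 here. For the same reason, `total_found` in the real response counts the patients present in the returned page, not the patients in the study.
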
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It mentions pagination support, which is useful, but doesn't describe authentication needs, rate limits, error conditions, or what the return format looks like. For a data retrieval tool with 7 parameters, this leaves significant gaps in understanding how the tool behaves.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately concise with two sentences that each add value. It's front-loaded with the core purpose and efficiently adds supporting details. No wasted words or redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 7 parameters, 0% schema description coverage, no annotations, and no output schema, the description is incomplete. It doesn't explain return values, error handling, or fully document parameter semantics. The context signals indicate high complexity that the description doesn't adequately address.
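
The return values and error shapes that the description omits can be read off the implementation listed in the Implementation Reference: every failure path visible there returns a dict with an `error` key, and success returns `clinical_data_by_patient` plus a `pagination` block. The following is a minimal sketch of a caller-side check under that reading. `summarize_result` and the sample dicts are hypothetical, and the `handle_api_errors` decorator may produce other shapes that the source does not show.

```python
from typing import Any, Dict


def summarize_result(result: Dict[str, Any]) -> str:
    """Return a one-line summary of a get_clinical_data result dict."""
    # Failure paths in the listed implementation carry an "error" key.
    if "error" in result:
        return f"error: {result['error']}"
    patients = result.get("clinical_data_by_patient", {})
    pagination = result.get("pagination", {})
    if pagination.get("has_more"):
        more = "more pages may exist"
    else:
        more = "no further pages expected"
    return f"{len(patients)} patient(s) on page {pagination.get('page')}; {more}"


# Hand-written examples of the two shapes, not real API output.
ok_result = {
    "clinical_data_by_patient": {"P1": {"AGE": "61"}},
    "pagination": {"page": 0, "page_size": 50, "total_found": 1, "has_more": False},
}
error_result = {"error": "API error fetching clinical data"}

ok_summary = summarize_result(ok_result)
error_summary = summarize_result(error_result)
```
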

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It mentions 'pagination support' (hinting at page_number/page_size) and 'specific attributes or all' (hinting at attribute_ids), but doesn't explain the other 4 parameters (study_id, sort_by, direction, limit) or provide format details. This partial coverage is insufficient given the low schema coverage.
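
Some parameter semantics can be inferred from the listed implementation: a non-empty `attribute_ids` switches the request to the POST fetch endpoint, `limit == 0` replaces the page size with the `FETCH_ALL_PAGE_SIZE` constant (its value is not shown), and a positive `limit` truncates the returned rows. The sketch below assembles an argument dict using the defaults from the handler signature. `build_arguments` is hypothetical, and its range checks, including the ASC/DESC restriction, are assumptions and not something the source code enforces.

```python
from typing import Any, Dict, List, Optional


def build_arguments(
    study_id: str,
    attribute_ids: Optional[List[str]] = None,
    page_number: int = 0,
    page_size: int = 50,
    sort_by: Optional[str] = None,
    direction: str = "ASC",
    limit: Optional[int] = None,
) -> Dict[str, Any]:
    """Assemble get_clinical_data arguments, omitting unset optionals."""
    # Assumed sanity checks; the listed implementation does not perform them.
    if not study_id:
        raise ValueError("study_id is required")
    if page_number < 0 or page_size < 1:
        raise ValueError("page_number must be >= 0 and page_size >= 1")
    if direction not in ("ASC", "DESC"):
        raise ValueError("direction must be 'ASC' or 'DESC'")
    if limit is not None and limit < 0:
        raise ValueError("limit must be >= 0 (0 requests the fetch-all page size)")

    arguments: Dict[str, Any] = {
        "study_id": study_id,
        "page_number": page_number,
        "page_size": page_size,
        "direction": direction,
    }
    if attribute_ids:
        arguments["attribute_ids"] = list(attribute_ids)
    if sort_by:
        arguments["sort_by"] = sort_by
    if limit is not None:
        arguments["limit"] = limit
    return arguments


# "demo_study" and the attribute IDs are placeholders for illustration.
arguments = build_arguments("demo_study", attribute_ids=["AGE", "SEX"], limit=0)
```
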

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Get clinical data for patients in a study' specifies both the verb ('Get') and resource ('clinical data'), and adds scope ('for patients in a study'). However, it doesn't explicitly differentiate from sibling tools like 'get_samples_in_study' or 'get_study_details', which reduces it from a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions 'pagination support' and 'can fetch specific attributes or all', which provides some context for usage. However, it lacks explicit guidance on when to use this tool versus alternatives like 'paginate_results' or 'collect_all_results', and doesn't specify prerequisites or exclusions.
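
One usage pattern that follows from the response's `has_more` flag is a manual paging loop. The sketch below is a rough illustration under stated assumptions: `fetch_all_pages` and `fake_tool` are hypothetical, `fake_tool` only imitates the response shape of the listed implementation, and nothing here shows how 'paginate_results' or 'collect_all_results' behave, since the source does not describe them.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

ToolCall = Callable[..., Awaitable[Dict[str, Any]]]


async def fetch_all_pages(
    call_tool: ToolCall, study_id: str, page_size: int = 50, max_pages: int = 100
) -> Dict[str, Dict[str, Any]]:
    """Request successive pages until has_more is False, merging by patient."""
    merged: Dict[str, Dict[str, Any]] = {}
    for page_number in range(max_pages):  # hard cap guards against endless loops
        result = await call_tool(
            study_id=study_id, page_number=page_number, page_size=page_size
        )
        if "error" in result:
            raise RuntimeError(result["error"])
        # Pages are cut over rows, so one patient can span two pages.
        for patient_id, attributes in result["clinical_data_by_patient"].items():
            merged.setdefault(patient_id, {}).update(attributes)
        if not result["pagination"]["has_more"]:
            break
    return merged


# Stand-in for the real tool, backed by three made-up rows.
ROWS = [("P1", "AGE", "61"), ("P1", "SEX", "Female"), ("P2", "AGE", "47")]


async def fake_tool(study_id: str, page_number: int, page_size: int) -> Dict[str, Any]:
    chunk = ROWS[page_number * page_size : (page_number + 1) * page_size]
    by_patient: Dict[str, Dict[str, Any]] = {}
    for patient_id, attribute_id, value in chunk:
        by_patient.setdefault(patient_id, {})[attribute_id] = value
    return {
        "clinical_data_by_patient": by_patient,
        "pagination": {
            "page": page_number,
            "page_size": page_size,
            "total_found": len(by_patient),
            "has_more": len(chunk) == page_size,  # same heuristic as the source
        },
    }


merged = asyncio.run(fetch_all_pages(fake_tool, "demo_study", page_size=2))
```

Because `has_more` is only a "page was full" heuristic, a result set that ends exactly on a page boundary costs one extra, empty request before the loop stops.
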

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/pickleton89/cbioportal-mcp'
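
The same request can be sketched in Python with only the standard library. This assumes the endpoint returns a JSON body, which the page does not state. `server_url` and `fetch_server_info` are hypothetical helper names, and the network call is left uncalled so the snippet can run offline.

```python
import json
import urllib.request

BASE_URL = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory URL for one MCP server."""
    return f"{BASE_URL}/{owner}/{name}"


def fetch_server_info(owner: str, name: str) -> dict:
    """GET the server record and decode it, assuming a JSON response."""
    with urllib.request.urlopen(server_url(owner, name), timeout=10) as response:
        return json.load(response)


url = server_url("pickleton89", "cbioportal-mcp")
# Calling fetch_server_info("pickleton89", "cbioportal-mcp") performs the request.
```
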

If you have feedback or need assistance with the MCP directory API, please join our Discord server.