Glama

get_variant_disruptions

Identifies the top biological annotation disruptions for a genetic variant, ranking molecular features by magnitude of change to explain pathogenic or benign effects.

Instructions

Get the top biological annotation disruptions for a variant.

Shows which molecular features are most affected by the variant, ranked by magnitude of change. Each disruption shows what the Evo 2 model predicts for the reference vs. alternate allele across 325 biological annotations spanning protein structure, chromatin state, regulatory elements, splice sites, and more.

This is the key tool for understanding WHY a variant is predicted pathogenic or benign — e.g., a splice-site variant might show large disruptions in splice donor/acceptor annotations, while a missense variant might show disruptions in protein domain and secondary structure annotations.

Categories: amino_acid, atacseq, ccre, chipseq, chromhmm, elm, fstack, protein_feature, interpro, genomic_feature, ptm, region, secondary_structure.

Args:
    variant_id: Variant identifier in chr:pos:ref:alt format.
    top_n: Number of top disruptions to return (default 15, max 100).
    category: Optional category filter that restricts ranking to one category (e.g. category='genomic_feature' to see only splice-related disruptions).

Input Schema

Name        Required  Description  Default
variant_id  Yes       -            -
top_n       No        -            -
category    No        -            -
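
As a rough illustration of the argument constraints described above, the sketch below checks a variant_id against the chr:pos:ref:alt format and clamps top_n to the documented range of 1 to 100. The helper names (parse_variant_id, clamp_top_n) and the example identifier are hypothetical and not part of the server. Which chromosome names and allele strings the API actually accepts is an assumption here.

```python
from typing import NamedTuple


class ParsedVariant(NamedTuple):
    chrom: str
    pos: int
    ref: str
    alt: str


def parse_variant_id(variant_id: str) -> ParsedVariant:
    """Split a 'chr:pos:ref:alt' string into four fields (hypothetical helper)."""
    parts = variant_id.split(":")
    if len(parts) != 4:
        raise ValueError(f"expected chr:pos:ref:alt, got {variant_id!r}")
    chrom, pos, ref, alt = parts
    if not pos.isdecimal():
        raise ValueError(f"position must be a non-negative integer, got {pos!r}")
    if not chrom or not ref or not alt:
        raise ValueError("chromosome, ref, and alt must be non-empty")
    return ParsedVariant(chrom, int(pos), ref, alt)


def clamp_top_n(top_n: int) -> int:
    # Same clamping expression as the handler: below 1 becomes 1, above 100 becomes 100.
    return min(max(1, top_n), 100)


# Placeholder identifier for illustration only, not a real variant of interest.
parsed = parse_variant_id("chr1:12345:A:G")
```

A client could run a check like this before calling the tool so that malformed identifiers fail locally instead of producing a "Variant not found" response.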

Implementation Reference

  • The main tool handler function for 'get_variant_disruptions'. It fetches variant data from the API, optionally validates the category, calls _extract_top_disruptions to rank disruptions, and returns a structured response with variant info, pathogenicity score, and disruption list.
    def get_variant_disruptions(variant_id: str, top_n: int = 15, category: str | None = None) -> dict:
        """Get the top biological annotation disruptions for a variant.
    
        Shows which molecular features are most affected by the variant, ranked by
        magnitude of change. Each disruption shows what the Evo 2 model predicts
        for the reference vs. alternate allele across 325 biological annotations
        spanning protein structure, chromatin state, regulatory elements, splice
        sites, and more.
    
        This is the key tool for understanding WHY a variant is predicted pathogenic
        or benign — e.g., a splice-site variant might show large disruptions in
        splice donor/acceptor annotations, while a missense variant might show
        disruptions in protein domain and secondary structure annotations.
    
        Categories: amino_acid, atacseq, ccre, chipseq, chromhmm, elm, fstack,
        protein_feature, interpro, genomic_feature, ptm, region, secondary_structure.
    
        Args:
            variant_id: Variant identifier in chr:pos:ref:alt format.
            top_n: Number of top disruptions to return (default 15, max 100).
            category: Optional category filter — restrict ranking to one category
                      (e.g. to see only splice-related disruptions:
                      category='genomic_feature').
        """
        if category is not None and category not in ANNOTATION_CATEGORIES:
            return {"error": f"Unknown category '{category}'. Valid categories: {', '.join(sorted(ANNOTATION_CATEGORIES))}"}
    
        top_n = min(max(1, top_n), 100)
    
        with _get_client() as client:
            resp = client.get(f"/variants/{variant_id}")
            if resp.status_code == 404:
                return {"error": "Variant not found", "variant_id": variant_id}
            resp.raise_for_status()
            data = resp.json()
    
        disruptions = _extract_top_disruptions(data, top_n, category)
    
        vid = data.get("variant_id")
        result = {
            "variant_id": vid,
            "evee_url": _evee_url(vid),
            "gene": data.get("gene_name"),
            "consequence": data.get("consequence_display"),
            "pathogenicity_score": data.get("pathogenicity") or data.get("score"),
            "disruption_count": len(disruptions),
            "disruptions": disruptions,
        }
        if category:
            result["category"] = category
        return result
  • Helper function that extracts and ranks annotation disruptions by magnitude of change (abs_delta). Filters by category prefix if specified, computes delta = alt - ref for each annotation, sorts by absolute delta descending, and returns the top N disruptions.
    def _extract_top_disruptions(data: dict, top_n: int, category: str | None = None) -> list[dict]:
        """Extract and rank annotation disruptions by magnitude of change."""
        filter_prefixes = ANNOTATION_CATEGORIES[category]["prefixes"] if category else None
        # Keys that start with "ref_" can never start with a digit, so one prefix check suffices.
        ref_keys = sorted(k for k in data if k.startswith("ref_"))
        disruptions = []
        for rk in ref_keys:
            name = rk[4:]
            if filter_prefixes and not any(name.startswith(p) for p in filter_prefixes):
                continue
            vk = f"var_{name}"
            mk = f"maxpos_{name}"
            ref_val = data.get(rk)
            var_val = data.get(vk)
            if ref_val is None or var_val is None:
                continue
            if not isinstance(ref_val, (int, float)) or not isinstance(var_val, (int, float)):
                continue
            delta = var_val - ref_val
            abs_delta = abs(delta)
            if abs_delta < 0.001:
                continue
    
            cat = "other"
            for cat_name, cat_info in ANNOTATION_CATEGORIES.items():
                if any(name.startswith(p) for p in cat_info["prefixes"]):
                    cat = cat_name
                    break
    
            disruptions.append({
                "annotation": name,
                "category": cat,
                "ref": round(ref_val, 4),
                "alt": round(var_val, 4),
                "delta": round(delta, 4),
                "abs_delta": round(abs_delta, 4),
                "max_disruption_position": data.get(mk),
            })
    
        disruptions.sort(key=lambda x: x["abs_delta"], reverse=True)
        return disruptions[:top_n]
  • Defines ANNOTATION_CATEGORIES, the mapping of category names to prefixes and descriptions. Used by the handler for validating the category parameter and by _extract_top_disruptions for filtering and labeling disruptions.
    ANNOTATION_CATEGORIES = {
        "amino_acid": {
            "prefixes": ["amino_acid_"],
            "description": "Predicted amino acid probabilities at the variant position (20 standard amino acids)",
        },
        "atacseq": {
            "prefixes": ["atacseq_"],
            "description": "ATAC-seq chromatin accessibility peaks across 7 tissues/cell types",
        },
        "ccre": {
            "prefixes": ["ccre_"],
            "description": "ENCODE candidate cis-regulatory element annotations (enhancers, promoters, CTCF, etc.)",
        },
        "chipseq": {
            "prefixes": ["chipseq_"],
            "description": "ChIP-seq histone modification peaks (H3K27ac, H3K27me3, H3K36me3, H3K4me1, H3K4me3, H3K9me3) across tissues",
        },
        "chromhmm": {
            "prefixes": ["chromhmm_"],
            "description": "ChromHMM chromatin state predictions (active TSS, bivalent, enhancer, repressed, transcribed) across 10 cell types",
        },
        "elm": {
            "prefixes": ["elm_"],
            "description": "Eukaryotic Linear Motif predictions (DOC, LIG, MOD, TRG)",
        },
        "fstack": {
            "prefixes": ["fstack_"],
            "description": "FStack functional state predictions (enhancer, promoter, quiescent, repressed, transcribed)",
        },
        "protein_feature": {
            "prefixes": ["in_"],
            "description": "Protein structural features from UniProt (domain, disulfide bond, transmembrane, coiled coil, active/binding sites, etc.)",
        },
        "interpro": {
            "prefixes": ["interpro_"],
            "description": "InterPro protein domain family predictions (117 domain types)",
        },
        "genomic_feature": {
            "prefixes": ["is_"],
            "description": "Binary genomic feature flags (splice donor/acceptor, CpG island, repeat elements, exon-intron boundaries, etc.)",
        },
        "ptm": {
            "prefixes": ["ptm_"],
            "description": "Post-translational modification predictions (acetylation, glycosylation, methylation, phosphorylation, sumoylation, ubiquitination)",
        },
        "region": {
            "prefixes": ["region_"],
            "description": "Genomic region annotations (CDS, intron, 3'UTR, 5'UTR)",
        },
        "secondary_structure": {
            "prefixes": ["secondary_structure_"],
            "description": "Protein secondary structure predictions (C=coil, E=strand, H=helix)",
        },
    }
  • server.py:523-524 (registration)
    The tool is registered via the @mcp.tool() decorator on the get_variant_disruptions function. FastMCP automatically registers all decorated functions as MCP tools.
    @mcp.tool()
    def get_variant_disruptions(variant_id: str, top_n: int = 15, category: str | None = None) -> dict:
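
The ranking step in _extract_top_disruptions can be illustrated on a toy payload. The sketch below is a simplified re-implementation, not the server code: it omits category filtering and labeling, and the annotation names and values are invented for illustration. Only the ref_/var_/maxpos_ key convention, the 0.001 threshold, and the sort by absolute delta are taken from the source above.

```python
# Toy payload: hypothetical annotation names and values, not real API output.
payload = {
    "ref_is_splice_donor": 0.92,
    "var_is_splice_donor": 0.10,
    "maxpos_is_splice_donor": 3,
    "ref_secondary_structure_H": 0.40,
    "var_secondary_structure_H": 0.55,
    "ref_region_CDS": 0.8000,
    "var_region_CDS": 0.8004,         # |delta| < 0.001, so it is dropped
    "ref_ptm_phosphorylation": 0.30,  # no matching var_ key, so it is skipped
}


def rank_disruptions(data: dict, top_n: int) -> list[dict]:
    """Rank ref/var annotation pairs by absolute change (simplified sketch)."""
    rows = []
    for key, ref_val in data.items():
        if not key.startswith("ref_"):
            continue
        name = key[len("ref_"):]
        var_val = data.get(f"var_{name}")
        if var_val is None:
            continue
        delta = var_val - ref_val  # positive means the alternate allele scores higher
        if abs(delta) < 0.001:
            continue
        rows.append({
            "annotation": name,
            "delta": round(delta, 4),
            "abs_delta": round(abs(delta), 4),
            "max_disruption_position": data.get(f"maxpos_{name}"),
        })
    rows.sort(key=lambda r: r["abs_delta"], reverse=True)
    return rows[:top_n]


top = rank_disruptions(payload, top_n=5)
for row in top:
    print(row["annotation"], row["delta"])
```

With this payload only two annotations survive the filters, and the splice donor entry ranks first because its absolute delta is the largest.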
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No structured annotations are provided, so the description carries the full burden, and it does well: it discloses that the tool returns ranked disruptions predicted by the Evo 2 model across 325 annotations, and it lists the categories. It does not address error behavior (e.g., an invalid variant_id) or describe the output structure, but it is otherwise transparent about what the tool does.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
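
The error behavior that the description leaves out can be read from the handler in the Implementation Reference: an unknown category and a 404 from the API both come back as a dict with an "error" key, while other non-success HTTP statuses are raised by resp.raise_for_status(). The sketch below shows one way a caller might branch on those shapes. The function name summarize_result and the sample dicts are hypothetical.

```python
def summarize_result(result: dict) -> str:
    """Turn a get_variant_disruptions result into a one-line summary (hypothetical helper)."""
    if "error" in result:
        # Covers both the unknown-category and variant-not-found responses.
        return f"failed: {result['error']}"
    if result.get("disruption_count", 0) == 0:
        # Every annotation changed by less than the 0.001 threshold, or none matched the filter.
        return f"{result.get('variant_id')}: no disruptions above threshold"
    top = result["disruptions"][0]
    return f"{result.get('variant_id')}: top disruption {top['annotation']} (delta {top['delta']:+.4f})"


# Sample inputs shaped like the handler's return values; the contents are invented.
not_found = {"error": "Variant not found", "variant_id": "chr1:12345:A:G"}
success = {
    "variant_id": "chr1:12345:A:G",
    "disruption_count": 1,
    "disruptions": [{"annotation": "is_splice_donor", "delta": -0.82}],
}

print(summarize_result(not_found))
print(summarize_result(success))
```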

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear sections: purpose, explanation, categories list, and parameters. Slightly verbose with the categories list and example, but all information is useful. The front-loaded sentence effectively states the action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Parameter descriptions are solid and the output concept is explained, but with no output schema the description should detail the returned fields or format. It says only that the tool "shows which molecular features are most affected", which is vague for a tool that returns structured data.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
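
One way to close the gap noted above would be to document the return shape explicitly. The sketch below writes the fields built by the handler in the Implementation Reference as TypedDict classes. The class names are hypothetical, and the value types of fields copied from the API payload (gene, consequence, pathogenicity score, URL, position) are assumptions, since the source does not show them.

```python
from typing import Any, Optional, TypedDict


class Disruption(TypedDict):
    annotation: str        # annotation name with the "ref_" prefix removed
    category: str          # matching category name, or "other"
    ref: float             # reference-allele value, rounded to 4 places
    alt: float             # alternate-allele value, rounded to 4 places
    delta: float           # alt minus ref
    abs_delta: float       # ranking key
    max_disruption_position: Any  # taken from the "maxpos_" key; type not shown in the source


class _ResultBase(TypedDict):
    variant_id: Optional[str]
    evee_url: Any                 # produced by _evee_url; type assumed
    gene: Optional[str]           # type assumed
    consequence: Optional[str]    # type assumed
    pathogenicity_score: Any      # type assumed
    disruption_count: int
    disruptions: list[Disruption]


class DisruptionResult(_ResultBase, total=False):
    category: str  # present only when a category filter was passed


print(sorted(DisruptionResult.__optional_keys__))
```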

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, but the description fully compensates: specifies variant_id format, top_n default and max, and category filter with an example. This adds essential meaning beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool retrieves the top biological annotation disruptions for a variant, and the explanation of ranking and categories adds clarity. However, it does not explicitly differentiate the tool from siblings such as get_variant_annotations, which might also provide disruption information, so the purpose is clear but not fully distinguished.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains that this is the key tool for understanding why a variant is predicted pathogenic or benign, and it includes an example of the category filter. However, it does not say when not to use the tool or compare it directly with sibling tools, which leaves some ambiguity when choosing among similar tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/goodfire-ai/evee-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.