get_variant_disruptions
Identifies the top biological annotation disruptions for a genetic variant, ranking molecular features by magnitude of change to explain pathogenic or benign effects.
Instructions
Get the top biological annotation disruptions for a variant.
Shows which molecular features are most affected by the variant, ranked by magnitude of change. Each disruption shows what the Evo 2 model predicts for the reference vs. alternate allele across 325 biological annotations spanning protein structure, chromatin state, regulatory elements, splice sites, and more.
This is the key tool for understanding WHY a variant is predicted pathogenic or benign — e.g., a splice-site variant might show large disruptions in splice donor/acceptor annotations, while a missense variant might show disruptions in protein domain and secondary structure annotations.
Categories: amino_acid, atacseq, ccre, chipseq, chromhmm, elm, fstack, protein_feature, interpro, genomic_feature, ptm, region, secondary_structure.
Args:
- variant_id: Variant identifier in chr:pos:ref:alt format.
- top_n: Number of top disruptions to return (default 15, max 100).
- category: Optional category filter that restricts ranking to one category (for example, category='genomic_feature' to see only splice-related disruptions).
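
As a rough illustration of what "ranked by magnitude of change" means, the sketch below ranks a few annotation scores by the absolute difference between the alternate and reference values. The annotation names and numbers are invented placeholders, not real model output, and `rank_by_magnitude` is a hypothetical name used only for this example.

```python
# Toy reference/alternate scores; names and values are invented for illustration.
ref_scores = {"is_splice_donor": 0.92, "region_cds": 0.10, "ptm_phosphorylation": 0.30}
alt_scores = {"is_splice_donor": 0.05, "region_cds": 0.12, "ptm_phosphorylation": 0.55}


def rank_by_magnitude(ref: dict[str, float], alt: dict[str, float], top_n: int = 15) -> list[dict]:
    """Rank annotations by absolute change between reference and alternate scores."""
    rows = []
    for name, ref_val in ref.items():
        if name not in alt:
            continue  # skip annotations without an alternate-allele score
        delta = alt[name] - ref_val
        rows.append({"annotation": name, "delta": round(delta, 4), "abs_delta": round(abs(delta), 4)})
    # Largest absolute change first, regardless of direction.
    rows.sort(key=lambda row: row["abs_delta"], reverse=True)
    return rows[:top_n]


ranked = rank_by_magnitude(ref_scores, alt_scores, top_n=2)
for row in ranked:
    print(row["annotation"], row["delta"])
```

Sorting on the absolute value means a large drop and a large gain rank equally high; the sign of `delta` still tells you which direction the prediction moved.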
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
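| variant_id | Yes | Variant identifier in chr:pos:ref:alt format. | |
| top_n | No | Number of top disruptions to return (max 100). | 15 |
| category | No | Optional category filter that restricts ranking to one category. | None |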
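
The arguments in the table could be checked on the caller's side before invoking the tool, along the lines of the sketch below. This is only an assumption-laden example: `parse_variant_id` and `clamp_top_n` are hypothetical helper names, the identifier is a placeholder, and whether the chromosome field carries a `chr` prefix is not stated in this document. The clamping expression mirrors the one used in the handler.

```python
def parse_variant_id(variant_id: str) -> dict:
    """Split a chr:pos:ref:alt identifier into its four fields (hypothetical helper)."""
    parts = variant_id.split(":")
    if len(parts) != 4:
        raise ValueError(f"Expected chr:pos:ref:alt, got {variant_id!r}")
    chrom, pos, ref, alt = parts
    if not pos.isdigit():
        raise ValueError(f"Position must be a non-negative integer, got {pos!r}")
    return {"chrom": chrom, "pos": int(pos), "ref": ref, "alt": alt}


def clamp_top_n(top_n: int = 15) -> int:
    """Keep top_n within 1..100, matching the documented maximum."""
    return min(max(1, top_n), 100)


# Placeholder arguments; the identifier is not a real variant of interest.
args = {"variant_id": "chr1:12345:A:G", "top_n": 250, "category": "genomic_feature"}
parsed = parse_variant_id(args["variant_id"])
top_n = clamp_top_n(args["top_n"])
print(parsed, top_n)
```

Out-of-range `top_n` values are clamped rather than rejected, so a request for 250 results quietly becomes a request for 100.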
Implementation Reference
- server.py:524-574 (handler) The main tool handler function for 'get_variant_disruptions'. It validates the category when one is given, clamps top_n to the range 1-100, fetches variant data from the API, calls _extract_top_disruptions to rank disruptions, and returns a structured response with variant info, pathogenicity score, and disruption list.

      def get_variant_disruptions(variant_id: str, top_n: int = 15, category: str | None = None) -> dict:
          """Get the top biological annotation disruptions for a variant.

          Shows which molecular features are most affected by the variant, ranked by
          magnitude of change. Each disruption shows what the Evo 2 model predicts for
          the reference vs. alternate allele across 325 biological annotations spanning
          protein structure, chromatin state, regulatory elements, splice sites, and more.

          This is the key tool for understanding WHY a variant is predicted pathogenic
          or benign — e.g., a splice-site variant might show large disruptions in
          splice donor/acceptor annotations, while a missense variant might show
          disruptions in protein domain and secondary structure annotations.

          Categories: amino_acid, atacseq, ccre, chipseq, chromhmm, elm, fstack,
          protein_feature, interpro, genomic_feature, ptm, region, secondary_structure.

          Args:
              variant_id: Variant identifier in chr:pos:ref:alt format.
              top_n: Number of top disruptions to return (default 15, max 100).
              category: Optional category filter — restrict ranking to one category
                  (e.g. to see only splice-related disruptions: category='genomic_feature').
          """
          if category is not None and category not in ANNOTATION_CATEGORIES:
              return {"error": f"Unknown category '{category}'. Valid categories: {', '.join(sorted(ANNOTATION_CATEGORIES))}"}

          top_n = min(max(1, top_n), 100)

          with _get_client() as client:
              resp = client.get(f"/variants/{variant_id}")
              if resp.status_code == 404:
                  return {"error": "Variant not found", "variant_id": variant_id}
              resp.raise_for_status()
              data = resp.json()

          disruptions = _extract_top_disruptions(data, top_n, category)
          vid = data.get("variant_id")
          result = {
              "variant_id": vid,
              "evee_url": _evee_url(vid),
              "gene": data.get("gene_name"),
              "consequence": data.get("consequence_display"),
              "pathogenicity_score": data.get("pathogenicity") or data.get("score"),
              "disruption_count": len(disruptions),
              "disruptions": disruptions,
          }
          if category:
              result["category"] = category
          return result

- server.py:182-221 (helper) Helper function that extracts and ranks annotation disruptions by magnitude of change (abs_delta). Filters by category prefix if specified, computes delta = alt - ref for each annotation, sorts by absolute delta descending, and returns the top N disruptions.

      def _extract_top_disruptions(data: dict, top_n: int, category: str | None = None) -> list[dict]:
          """Extract and rank annotation disruptions by magnitude of change."""
          filter_prefixes = ANNOTATION_CATEGORIES[category]["prefixes"] if category else None
          ref_keys = sorted(k for k in data if k.startswith("ref_") and not k[0].isdigit())

          disruptions = []
          for rk in ref_keys:
              name = rk[4:]
              if filter_prefixes and not any(name.startswith(p) for p in filter_prefixes):
                  continue
              vk = f"var_{name}"
              mk = f"maxpos_{name}"
              ref_val = data.get(rk)
              var_val = data.get(vk)
              if ref_val is None or var_val is None:
                  continue
              if not isinstance(ref_val, (int, float)) or not isinstance(var_val, (int, float)):
                  continue
              delta = var_val - ref_val
              abs_delta = abs(delta)
              if abs_delta < 0.001:
                  continue
              cat = "other"
              for cat_name, cat_info in ANNOTATION_CATEGORIES.items():
                  if any(name.startswith(p) for p in cat_info["prefixes"]):
                      cat = cat_name
                      break
              disruptions.append({
                  "annotation": name,
                  "category": cat,
                  "ref": round(ref_val, 4),
                  "alt": round(var_val, 4),
                  "delta": round(delta, 4),
                  "abs_delta": round(abs_delta, 4),
                  "max_disruption_position": data.get(mk),
              })

          disruptions.sort(key=lambda x: x["abs_delta"], reverse=True)
          return disruptions[:top_n]

- server.py:63-116 (schema) Defines ANNOTATION_CATEGORIES, the mapping of category names to prefixes and descriptions. Used by the handler for validating the category parameter and by _extract_top_disruptions for filtering and labeling disruptions.

      ANNOTATION_CATEGORIES = {
          "amino_acid": {
              "prefixes": ["amino_acid_"],
              "description": "Predicted amino acid probabilities at the variant position (20 standard amino acids)",
          },
          "atacseq": {
              "prefixes": ["atacseq_"],
              "description": "ATAC-seq chromatin accessibility peaks across 7 tissues/cell types",
          },
          "ccre": {
              "prefixes": ["ccre_"],
              "description": "ENCODE candidate cis-regulatory element annotations (enhancers, promoters, CTCF, etc.)",
          },
          "chipseq": {
              "prefixes": ["chipseq_"],
              "description": "ChIP-seq histone modification peaks (H3K27ac, H3K27me3, H3K36me3, H3K4me1, H3K4me3, H3K9me3) across tissues",
          },
          "chromhmm": {
              "prefixes": ["chromhmm_"],
              "description": "ChromHMM chromatin state predictions (active TSS, bivalent, enhancer, repressed, transcribed) across 10 cell types",
          },
          "elm": {
              "prefixes": ["elm_"],
              "description": "Eukaryotic Linear Motif predictions (DOC, LIG, MOD, TRG)",
          },
          "fstack": {
              "prefixes": ["fstack_"],
              "description": "FStack functional state predictions (enhancer, promoter, quiescent, repressed, transcribed)",
          },
          "protein_feature": {
              "prefixes": ["in_"],
              "description": "Protein structural features from UniProt (domain, disulfide bond, transmembrane, coiled coil, active/binding sites, etc.)",
          },
          "interpro": {
              "prefixes": ["interpro_"],
              "description": "InterPro protein domain family predictions (117 domain types)",
          },
          "genomic_feature": {
              "prefixes": ["is_"],
              "description": "Binary genomic feature flags (splice donor/acceptor, CpG island, repeat elements, exon-intron boundaries, etc.)",
          },
          "ptm": {
              "prefixes": ["ptm_"],
              "description": "Post-translational modification predictions (acetylation, glycosylation, methylation, phosphorylation, sumoylation, ubiquitination)",
          },
          "region": {
              "prefixes": ["region_"],
              "description": "Genomic region annotations (CDS, intron, 3'UTR, 5'UTR)",
          },
          "secondary_structure": {
              "prefixes": ["secondary_structure_"],
              "description": "Protein secondary structure predictions (C=coil, E=strand, H=helix)",
          },
      }

- server.py:523-524 (registration) The tool is registered via the @mcp.tool() decorator on the get_variant_disruptions function. FastMCP automatically registers all decorated functions as MCP tools.

      @mcp.tool()
      def get_variant_disruptions(variant_id: str, top_n: int = 15, category: str | None = None) -> dict:
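
The prefix-based labeling that ties annotation names to categories can be tried in isolation with the sketch below. It uses a reduced copy of the prefix map taken from ANNOTATION_CATEGORIES; `CATEGORY_PREFIXES` and `categorize` are hypothetical names for this example, and the annotation names passed in are illustrative guesses rather than names confirmed by the API.

```python
# Subset of the category-to-prefix mapping, reduced to prefixes for brevity.
CATEGORY_PREFIXES = {
    "protein_feature": ["in_"],
    "interpro": ["interpro_"],
    "genomic_feature": ["is_"],
    "region": ["region_"],
}


def categorize(annotation: str) -> str:
    """Return the first category whose prefix matches, or "other" if none does."""
    for category, prefixes in CATEGORY_PREFIXES.items():
        if any(annotation.startswith(prefix) for prefix in prefixes):
            return category
    return "other"


# Annotation names below are illustrative guesses, not taken from the API.
for name in ["is_splice_donor", "interpro_kinase", "in_domain", "unknown_feature"]:
    print(name, "->", categorize(name))
```

The first matching category wins, so the order of the mapping would matter if two prefixes overlapped. In the mapping shown above, `in_` and `interpro_` differ at the third character, so an `interpro_` annotation is not mislabeled as a protein feature.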