rank_genes_groups
Rank differentially expressed genes for each group of cells in a single-cell RNA sequencing dataset, so that groups such as clusters can be characterized by their marker genes.
Instructions
Rank genes to characterize groups and perform differential expression analysis.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| corr_method | No | p-value correction method. Used only for 't-test', 't-test_overestim_var', and 'wilcoxon'. | benjamini-hochberg |
| groupby | Yes | The key of the observations grouping to consider. | |
| groups | No | Subset of groups to which comparison shall be restricted, or 'all' for all groups. | all |
| key_added | No | The key in adata.uns under which the results are saved. | |
| layer | No | Key from adata.layers whose value will be used to perform tests on. | |
| mask_var | No | Select subset of genes to use in statistical tests. | |
| method | No | Method for differential expression analysis. Default is 't-test'. | |
| n_genes | No | The number of genes that appear in the returned tables. Defaults to all genes. | |
| pts | No | Compute the fraction of cells expressing the genes. | False |
| rankby_abs | No | Rank genes by the absolute value of the score, not by the score. | False |
| reference | No | If 'rest', compare each group to the union of all other groups. If a group identifier, compare with respect to this group. | rest |
| tie_correct | No | Use tie correction for 'wilcoxon' scores. Used only for 'wilcoxon'. | False |
| use_raw | No | Use raw attribute of adata if present. | |
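
As a rough illustration of how the parameters above fit together, the sketch below builds the `arguments` object for a call and wraps it in an MCP `tools/call` request. The group key `leiden` and the request `id` are assumptions made up for this example; only `groupby` is required, and omitted parameters fall back to the defaults in the table.

```python
import json

# Hypothetical arguments: "leiden" is an assumed key in adata.obs,
# not something defined by this tool.
arguments = {
    "groupby": "leiden",          # required
    "method": "wilcoxon",         # one of the methods accepted by the schema
    "corr_method": "bonferroni",  # default would be "benjamini-hochberg"
    "n_genes": 50,                # must be a positive integer
    "pts": True,                  # also report the fraction of expressing cells
}

# JSON-RPC envelope used by MCP for tool invocations.
request = {
    "jsonrpc": "2.0",
    "id": 1,  # arbitrary request id chosen for this example
    "method": "tools/call",
    "params": {"name": "rank_genes_groups", "arguments": arguments},
}

payload = json.dumps(request, indent=2)
print(payload)
```

How the request is actually transported depends on the MCP client in use; the sketch only shows the shape of the payload.
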
Implementation Reference
- src/scmcp/tool/tl.py:164-177 (handler)
  Generic handler function for all tl tools, including 'rank_genes_groups'. It retrieves the Scanpy function from the tl_func dict and calls it with validated arguments on the active AnnData object.

  ```python
  def run_tl_func(ads, func, arguments):
      adata = ads.adata_dic[ads.active]
      if func not in tl_func:
          raise ValueError(f"Unsupported function: {func}")
      run_func = tl_func[func]
      parameters = inspect.signature(run_func).parameters
      kwargs = {k: arguments.get(k) for k in parameters if k in arguments}
      try:
          res = run_func(adata, **kwargs)
          add_op_log(adata, run_func, kwargs)
      except Exception as e:
          logger.error(f"Error running function {func}: {e}")
          raise
      return
  ```
- src/scmcp/tool/tl.py:125-142 (handler)
  Mapping dictionary tl_func that associates the 'rank_genes_groups' tool name with the scanpy.tl.rank_genes_groups function, used by the handler.

  ```python
  tl_func = {
      "tsne": sc.tl.tsne,
      "umap": sc.tl.umap,
      "draw_graph": sc.tl.draw_graph,
      "diffmap": sc.tl.diffmap,
      "embedding_density": sc.tl.embedding_density,
      "leiden": sc.tl.leiden,
      "louvain": sc.tl.louvain,
      "dendrogram": sc.tl.dendrogram,
      "dpt": sc.tl.dpt,
      "paga": sc.tl.paga,
      "ingest": sc.tl.ingest,
      "rank_genes_groups": sc.tl.rank_genes_groups,
      "filter_rank_genes_groups": sc.tl.filter_rank_genes_groups,
      "marker_gene_overlap": sc.tl.marker_gene_overlap,
      "score_genes": sc.tl.score_genes,
      "score_genes_cell_cycle": sc.tl.score_genes_cell_cycle,
  }
  ```
- src/scmcp/tool/tl.py:89-94 (registration)
  MCP Tool object registration for 'rank_genes_groups', specifying name, description, and input schema.

  ```python
  # Add rank_genes_groups tool
  rank_genes_groups_tool = types.Tool(
      name="rank_genes_groups",
      description="Rank genes for characterizing groups, perform differentially expressison analysis",
      inputSchema=RankGenesGroupsModel.model_json_schema(),
  )
  ```
- src/scmcp/tool/tl.py:145-162 (registration)
  tl_tools dictionary that registers rank_genes_groups_tool for export and use in the MCP server.

  ```python
  tl_tools = {
      "tsne": tsne_tool,
      "umap": umap_tool,
      "draw_graph": draw_graph_tool,
      "diffmap": diffmap_tool,
      "embedding_density": embedding_density_tool,
      "leiden": leiden_tool,
      "louvain": louvain_tool,
      "dendrogram": dendrogram_tool,
      "dpt": dpt_tool,
      "paga": paga_tool,
      "ingest": ingest_tool,
      "rank_genes_groups": rank_genes_groups_tool,
      "filter_rank_genes_groups": filter_rank_genes_groups_tool,
      "marker_gene_overlap": marker_gene_overlap_tool,
      "score_genes": score_genes_tool,
      "score_genes_cell_cycle": score_genes_cell_cycle_tool,
  }
  ```
- src/scmcp/schema/tl.py:607-687 (schema)
  Pydantic model defining the input schema and validation for the rank_genes_groups tool.

  ```python
  class RankGenesGroupsModel(JSONParsingModel):
      """Input schema for the rank_genes_groups tool."""

      groupby: str = Field(
          ...,  # Required field
          description="The key of the observations grouping to consider."
      )
      mask_var: Optional[Union[str, List[bool]]] = Field(
          default=None,
          description="Select subset of genes to use in statistical tests."
      )
      use_raw: Optional[bool] = Field(
          default=None,
          description="Use raw attribute of adata if present."
      )
      groups: Union[Literal['all'], List[str]] = Field(
          default='all',
          description="Subset of groups to which comparison shall be restricted, or 'all' for all groups."
      )
      reference: str = Field(
          default='rest',
          description="If 'rest', compare each group to the union of the rest of the group. If a group identifier, compare with respect to this group."
      )
      n_genes: Optional[int] = Field(
          default=None,
          description="The number of genes that appear in the returned tables. Defaults to all genes.",
          gt=0
      )
      rankby_abs: bool = Field(
          default=False,
          description="Rank genes by the absolute value of the score, not by the score."
      )
      pts: bool = Field(
          default=False,
          description="Compute the fraction of cells expressing the genes."
      )
      key_added: Optional[str] = Field(
          default=None,
          description="The key in adata.uns information is saved to."
      )
      method: Optional[str] = Field(
          default=None,
          description="Method for differential expression analysis. Default is 't-test'."
      )
      corr_method: str = Field(
          default='benjamini-hochberg',
          description="p-value correction method. Used only for 't-test', 't-test_overestim_var', and 'wilcoxon'."
      )
      tie_correct: bool = Field(
          default=False,
          description="Use tie correction for 'wilcoxon' scores. Used only for 'wilcoxon'."
      )
      layer: Optional[str] = Field(
          default=None,
          description="Key from adata.layers whose value will be used to perform tests on."
      )

      @field_validator('method')
      def validate_method(cls, v: Optional[str]) -> Optional[str]:
          """Validate method is supported"""
          if v is not None:
              valid_methods = ['t-test', 't-test_overestim_var', 'wilcoxon', 'logreg']
              if v not in valid_methods:
                  raise ValueError(f"method must be one of {valid_methods}")
          return v

      @field_validator('corr_method')
      def validate_corr_method(cls, v: str) -> str:
          """Validate correction method is supported"""
          valid_methods = ['benjamini-hochberg', 'bonferroni']
          if v not in valid_methods:
              raise ValueError(f"corr_method must be one of {valid_methods}")
          return v

      @field_validator('n_genes')
      def validate_n_genes(cls, v: Optional[int]) -> Optional[int]:
          """Validate n_genes is positive"""
          if v is not None and v <= 0:
              raise ValueError("n_genes must be a positive integer")
          return v
  ```
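
The references above can be tied together in a small, dependency-free sketch: check the arguments against the same rules the validators enforce, look the tool up in a dispatch dictionary, and forward only the keyword arguments the target function accepts. `check_arguments`, `fake_rank_genes_groups`, `tool_funcs`, and `run_tool` are hypothetical stand-ins written for this example. They are not part of scmcp or Scanpy, and the real server relies on Pydantic and `sc.tl.rank_genes_groups` instead.

```python
import inspect

VALID_METHODS = ["t-test", "t-test_overestim_var", "wilcoxon", "logreg"]
VALID_CORR_METHODS = ["benjamini-hochberg", "bonferroni"]


def check_arguments(arguments):
    """Plain-Python approximation of the schema's validators (hypothetical helper)."""
    if "groupby" not in arguments:
        raise ValueError("groupby is required")
    method = arguments.get("method")
    if method is not None and method not in VALID_METHODS:
        raise ValueError(f"method must be one of {VALID_METHODS}")
    corr_method = arguments.get("corr_method", "benjamini-hochberg")
    if corr_method not in VALID_CORR_METHODS:
        raise ValueError(f"corr_method must be one of {VALID_CORR_METHODS}")
    n_genes = arguments.get("n_genes")
    if n_genes is not None and n_genes <= 0:
        raise ValueError("n_genes must be a positive integer")
    return arguments


def fake_rank_genes_groups(adata, groupby, method=None, n_genes=None):
    """Stand-in for sc.tl.rank_genes_groups; it only echoes what it received."""
    return {"groupby": groupby, "method": method, "n_genes": n_genes}


# Dispatch table in the style of tl_func, with the stand-in as its only entry.
tool_funcs = {"rank_genes_groups": fake_rank_genes_groups}


def run_tool(adata, func, arguments):
    """Look up the tool, then pass only the arguments its signature names."""
    if func not in tool_funcs:
        raise ValueError(f"Unsupported function: {func}")
    run_func = tool_funcs[func]
    parameters = inspect.signature(run_func).parameters
    kwargs = {k: arguments.get(k) for k in parameters if k in arguments}
    return run_func(adata, **kwargs)


# "unknown_key" is not a parameter of the stand-in, so it is filtered out.
args = check_arguments(
    {"groupby": "leiden", "method": "wilcoxon", "unknown_key": 1}
)
result = run_tool(None, "rank_genes_groups", args)
print(result)
```

In this sketch, filtering the keyword arguments against the function signature means an unrecognized key such as `unknown_key` is dropped quietly instead of causing a `TypeError` at call time.
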