filter_genes
Filter genes in single-cell RNA sequencing data by setting a minimum or maximum threshold on the number of cells that express a gene or on the gene's total counts, to refine analysis inputs.
Instructions
Filter genes based on number of cells or counts
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| max_cells | No | Maximum number of cells expressed required for a gene to pass filtering. | |
| max_counts | No | Maximum number of counts required for a gene to pass filtering. | |
| min_cells | No | Minimum number of cells expressed required for a gene to pass filtering. | |
| min_counts | No | Minimum number of counts required for a gene to pass filtering. | |
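
To make the four thresholds concrete, here is a minimal sketch in plain Python of what each one measures. It is an illustration under stated assumptions, not scanpy's implementation: the toy `counts` matrix and the helpers `gene_stats` and `keep_genes` are hypothetical, and the comparisons are assumed to be inclusive. scanpy's documentation for `sc.pp.filter_genes` states that only one of the four thresholds should be provided per call, so the sketch enforces that as well.

```python
# Hypothetical toy data: rows are cells, columns are genes (raw counts).
counts = [
    [0, 3, 1, 0],
    [0, 5, 0, 0],
    [2, 4, 0, 0],
]
gene_names = ["g0", "g1", "g2", "g3"]


def gene_stats(matrix):
    """Return (total counts, number of expressing cells) for each gene."""
    n_genes = len(matrix[0])
    totals = [sum(row[j] for row in matrix) for j in range(n_genes)]
    n_cells = [sum(1 for row in matrix if row[j] > 0) for j in range(n_genes)]
    return totals, n_cells


def keep_genes(matrix, names, *, min_counts=None, min_cells=None,
               max_counts=None, max_cells=None):
    """Return the names of genes that pass the single threshold supplied."""
    options = {
        "min_counts": min_counts,
        "min_cells": min_cells,
        "max_counts": max_counts,
        "max_cells": max_cells,
    }
    given = {k: v for k, v in options.items() if v is not None}
    if len(given) != 1:
        # Assumption: exactly one threshold per call, as scanpy documents.
        raise ValueError("provide exactly one threshold per call")
    (name, value), = given.items()
    if value <= 0:
        # Mirrors the schema's rule that thresholds must be positive.
        raise ValueError("must be a positive integer")

    totals, n_cells = gene_stats(matrix)
    measured = totals if name.endswith("counts") else n_cells
    if name.startswith("min"):
        return [g for g, m in zip(names, measured) if m >= value]
    return [g for g, m in zip(names, measured) if m <= value]


# Genes expressed in at least two cells.
passed = keep_genes(counts, gene_names, min_cells=2)
```

Applying several thresholds under this assumption means calling the filter once per threshold and passing the result of one call to the next.
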
Implementation Reference
- src/scmcp/schema/pp.py:46-75 (schema): Pydantic model defining the input schema for the filter_genes tool. All four thresholds (min_counts, min_cells, max_counts, max_cells) are optional integers, and a validator rejects any supplied value that is not positive.

      class FilterGenes(JSONParsingModel):
          """Input schema for the filter_genes preprocessing tool."""

          min_counts: Optional[int] = Field(
              default=None,
              description="Minimum number of counts required for a gene to pass filtering."
          )
          min_cells: Optional[int] = Field(
              default=None,
              description="Minimum number of cells expressed required for a gene to pass filtering."
          )
          max_counts: Optional[int] = Field(
              default=None,
              description="Maximum number of counts required for a gene to pass filtering."
          )
          max_cells: Optional[int] = Field(
              default=None,
              description="Maximum number of cells expressed required for a gene to pass filtering."
          )

          @field_validator('min_counts', 'min_cells', 'max_counts', 'max_cells')
          def validate_positive_integers(cls, v: Optional[int]) -> Optional[int]:
              """Validate that integer parameters are positive."""
              if v is not None and v <= 0:
                  raise ValueError("must be positive_integers")
              return v
- src/scmcp/tool/pp.py:21-25 (registration): Registers the filter_genes tool using mcp.types.Tool with name, description, and schema reference.

      filter_genes = types.Tool(
          name="filter_genes",
          description="Filter genes based on number of cells or counts",
          inputSchema=FilterGenes.model_json_schema(),
      )
- src/scmcp/tool/pp.py:88-101 (handler): Maps the 'filter_genes' tool name to scanpy's sc.pp.filter_genes function, which the handler looks up to execute the logic.

      pp_func = {
          "filter_genes": sc.pp.filter_genes,
          "filter_cells": sc.pp.filter_cells,
          "calculate_qc_metrics": partial(sc.pp.calculate_qc_metrics, inplace=True),
          "log1p": sc.pp.log1p,
          "normalize_total": sc.pp.normalize_total,
          "pca": sc.pp.pca,
          "highly_variable_genes": sc.pp.highly_variable_genes,
          "regress_out": sc.pp.regress_out,
          "scale": sc.pp.scale,
          "combat": sc.pp.combat,
          "scrublet": sc.pp.scrublet,
          "neighbors": sc.pp.neighbors,
      }
- src/scmcp/tool/pp.py:120-136 (handler): run_pp_func executes the preprocessing tools, including filter_genes. It looks up the mapped scanpy function, sets inplace=True, passes only the arguments that appear in the function's signature, logs the operation, and re-raises errors.

      def run_pp_func(ads, func, arguments):
          adata = ads.adata_dic[ads.active]
          if func not in pp_func:
              raise ValueError(f"Unsupported function: {func}")
          run_func = pp_func[func]
          parameters = inspect.signature(run_func).parameters
          arguments["inplace"] = True
          # Keep only the arguments that the target function accepts.
          kwargs = {k: arguments.get(k) for k in parameters if k in arguments}
          try:
              res = run_func(adata, **kwargs)
              add_op_log(adata, run_func, kwargs)
          except KeyError as e:
              raise KeyError(f"Cannot find {e} column in adata.obs or adata.var")
          except Exception as e:
              raise e
          return res
- src/scmcp/tool/pp.py:105-105 (registration): Adds the filter_genes Tool object to the pp_tools dictionary, which is imported and likely used for server registration.

      "filter_genes": filter_genes,
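
The dispatch pattern used by the handler (look up a function by tool name, then pass only the arguments its signature accepts) can be sketched with the standard library alone. This is a hedged illustration, not the project's code: `fake_filter_genes`, `dispatch`, and `run` are hypothetical stand-ins for `sc.pp.filter_genes`, `pp_func`, and `run_pp_func`, and unlike the excerpt above the sketch copies the arguments dictionary instead of mutating it.

```python
import inspect


def fake_filter_genes(data, *, min_counts=None, min_cells=None,
                      max_counts=None, max_cells=None, inplace=True):
    """Hypothetical stand-in for a preprocessing function.

    It only reports the keyword arguments it received.
    """
    return {
        "min_counts": min_counts,
        "min_cells": min_cells,
        "max_counts": max_counts,
        "max_cells": max_cells,
        "inplace": inplace,
    }


# Hypothetical dispatch table, analogous to a name-to-function mapping.
dispatch = {"filter_genes": fake_filter_genes}


def run(func_name, data, arguments):
    """Call the function registered under func_name with filtered kwargs."""
    if func_name not in dispatch:
        raise ValueError(f"Unsupported function: {func_name}")
    run_func = dispatch[func_name]
    parameters = inspect.signature(run_func).parameters
    # Copy the caller's dict and force in-place operation.
    arguments = {**arguments, "inplace": True}
    # Drop any key that the target function does not declare.
    kwargs = {k: arguments[k] for k in parameters if k in arguments}
    return run_func(data, **kwargs)


# "not_a_parameter" is silently dropped because it is not in the signature.
received = run("filter_genes", object(), {"min_cells": 3, "not_a_parameter": 1})
```

Filtering against the signature lets one handler serve many functions with different parameters, at the cost of silently ignoring misspelled argument names.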