calculate_qc_metrics
Calculate quality control metrics for AnnData, including total counts, gene numbers, and percentages of ribosomal and mitochondrial counts, to evaluate single-cell RNA sequencing data.
Instructions
Calculate quality control metrics for AnnData. Common metrics are total counts, number of genes, and the percentage of counts from ribosomal and mitochondrial genes.
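As a rough illustration of what these metrics mean, the sketch below computes them by hand for a tiny count matrix. It is not how scanpy computes them. The gene names, the counts, the `MT-`/`RPS`/`RPL` prefixes, and the dictionary keys are all assumptions made for the example.

```python
# Toy count matrix: rows are cells, columns are genes (all values are made up).
genes = ["MT-CO1", "RPS3", "ACTB", "GAPDH"]
counts = [
    [10, 5, 20, 15],  # cell 0
    [0, 2, 0, 8],     # cell 1
]

# Boolean masks playing the role of the qc_vars columns in .var.
# The name prefixes used here are an assumption, not taken from the tool.
is_mt = [g.startswith("MT-") for g in genes]
is_ribo = [g.startswith(("RPS", "RPL")) for g in genes]


def qc_for_cell(row):
    """Return total counts, detected genes, and mt/ribo percentages for one cell."""
    total = sum(row)
    n_genes = sum(1 for c in row if c > 0)

    def pct(mask):
        # Share of this cell's counts that falls on the masked genes.
        if total == 0:
            return 0.0
        return 100.0 * sum(c for c, m in zip(row, mask) if m) / total

    return {
        "total_counts": total,
        "n_genes": n_genes,
        "pct_mt": pct(is_mt),
        "pct_ribo": pct(is_ribo),
    }


metrics = [qc_for_cell(row) for row in counts]
```

The real tool writes its results into the AnnData object rather than returning plain dictionaries, and the column names it produces may differ from the keys used here.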
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| expr_type | No | Name of kind of values in X. | counts |
| layer | No | If provided, use adata.layers[layer] for expression values instead of adata.X. | None |
| log1p | No | Set to False to skip computing log1p-transformed annotations. | True |
| percent_top | No | List of ranks (where genes are ranked by expression) at which the cumulative proportion of expression will be reported as a percentage. | [50, 100, 200, 500] |
| qc_vars | No | Keys for boolean columns of .var that identify variables you may want to control for. To calculate mt, ribo, or hb metrics, call the mark_var tool first and check its output for the resulting .var columns. | [] |
| use_raw | No | If True, use adata.raw.X for expression values instead of adata.X. | False |
| var_type | No | The kind of thing the variables are. | genes |
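A call to the tool might pass arguments along the following lines. This is a minimal sketch: the `qc_vars` names `mt` and `ribo` assume that boolean columns with those names already exist in `adata.var`, and `check_percent_top` is a hypothetical stand-in that mirrors the positive-integer check in the schema rather than the schema itself.

```python
import json

# Hypothetical arguments for a calculate_qc_metrics call. Omitted fields
# (expr_type, var_type, layer, use_raw) fall back to their defaults.
arguments = {
    "qc_vars": ["mt", "ribo"],
    "percent_top": [50, 100, 200, 500],
    "log1p": True,
}


def check_percent_top(ranks):
    """Stand-in for the schema's check: every rank must be a positive integer."""
    if ranks is None:
        return ranks
    for rank in ranks:
        if not isinstance(rank, int) or rank <= 0:
            raise ValueError("All values in percent_top must be positive integers")
    return ranks


check_percent_top(arguments["percent_top"])

# Tool arguments travel as JSON, so the payload must be serializable.
payload = json.dumps(arguments)
```

A rank of `0` or a negative rank would be rejected by this check before the call reaches the handler.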
Implementation Reference
- src/scmcp/tool/pp.py:120-136 (handler)
  Shared handler function that executes the calculate_qc_metrics tool by dispatching to the mapped scanpy.pp.calculate_qc_metrics function on the active AnnData.

      def run_pp_func(ads, func, arguments):
          # Look up the currently active AnnData object.
          adata = ads.adata_dic[ads.active]
          if func not in pp_func:
              raise ValueError(f"Unsupported function: {func}")
          run_func = pp_func[func]
          parameters = inspect.signature(run_func).parameters
          arguments["inplace"] = True
          # Pass only the arguments that the target function accepts.
          kwargs = {k: arguments.get(k) for k in parameters if k in arguments}
          try:
              res = run_func(adata, **kwargs)
              add_op_log(adata, run_func, kwargs)
          except KeyError as e:
              raise KeyError(f"Cannot find {e} column in adata.obs or adata.var")
          except Exception as e:
              raise e
          return res
- src/scmcp/schema/pp.py:76-126 (schema)
  Input schema definition using the Pydantic model CalculateQCMetrics, with fields for the QC metrics calculation parameters and their validation.

      class CalculateQCMetrics(JSONParsingModel):
          """Input schema for the calculate_qc_metrics preprocessing tool."""

          expr_type: str = Field(
              default="counts",
              description="Name of kind of values in X."
          )
          var_type: str = Field(
              default="genes",
              description="The kind of thing the variables are."
          )
          qc_vars: Optional[Union[List[str], str]] = Field(
              default=[],
              description=(
                  "Keys for boolean columns of .var which identify variables you could want to control for "
                  "mark_var tool should be called first when you want to calculate mt, ribo, hb, and check tool output for var columns"
              )
          )
          percent_top: Optional[List[int]] = Field(
              default=[50, 100, 200, 500],
              description="List of ranks (where genes are ranked by expression) at which the cumulative proportion of expression will be reported as a percentage."
          )
          layer: Optional[str] = Field(
              default=None,
              description="If provided, use adata.layers[layer] for expression values instead of adata.X"
          )
          use_raw: bool = Field(
              default=False,
              description="If True, use adata.raw.X for expression values instead of adata.X"
          )
          log1p: bool = Field(
              default=True,
              description="Set to False to skip computing log1p transformed annotations."
          )

          @field_validator('percent_top')
          def validate_percent_top(cls, v: Optional[List[int]]) -> Optional[List[int]]:
              """Validate that the values in percent_top are positive integers."""
              if v is not None:
                  for rank in v:
                      if not isinstance(rank, int) or rank <= 0:
                          raise ValueError("All values in percent_top must be positive integers")
              return v
- src/scmcp/tool/pp.py:27-31 (registration)
  Registration/definition of the MCP tool 'calculate_qc_metrics' with name, description, and schema reference.

      calculate_qc_metrics = types.Tool(
          name="calculate_qc_metrics",
          description="Calculate quality control metrics(common metrics: total counts, gene number, percentage of counts in ribosomal and mitochondrial) for AnnData.",
          inputSchema=CalculateQCMetrics.model_json_schema(),
      )
- src/scmcp/tool/pp.py:88-101 (helper)
  Mapping dictionary pp_func that associates 'calculate_qc_metrics' with functools.partial(scanpy.pp.calculate_qc_metrics, inplace=True), used by the handler.

      pp_func = {
          "filter_genes": sc.pp.filter_genes,
          "filter_cells": sc.pp.filter_cells,
          "calculate_qc_metrics": partial(sc.pp.calculate_qc_metrics, inplace=True),
          "log1p": sc.pp.log1p,
          "normalize_total": sc.pp.normalize_total,
          "pca": sc.pp.pca,
          "highly_variable_genes": sc.pp.highly_variable_genes,
          "regress_out": sc.pp.regress_out,
          "scale": sc.pp.scale,
          "combat": sc.pp.combat,
          "scrublet": sc.pp.scrublet,
          "neighbors": sc.pp.neighbors,
      }
- src/scmcp/tool/pp.py:107-107 (helper)
  Inclusion of the calculate_qc_metrics tool in the pp_tools dictionary for collection with other tools.

      "calculate_qc_metrics": calculate_qc_metrics,
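The dispatch pattern referenced above (a name-to-function mapping plus signature-based argument filtering) can be sketched without scanpy. In this simplified sketch, `fake_qc`, `registry`, and `dispatch` are hypothetical stand-ins and not part of scmcp. It also shows that a keyword bound through `functools.partial` still appears in the signature, so it survives the filter.

```python
import inspect
from functools import partial


def fake_qc(adata, *, qc_vars=(), percent_top=None, inplace=False):
    """Hypothetical stand-in for the wrapped scanpy function."""
    return {"qc_vars": list(qc_vars), "percent_top": percent_top, "inplace": inplace}


# Registry keyed by tool name, in the spirit of the pp_func mapping.
registry = {"calculate_qc_metrics": partial(fake_qc, inplace=True)}


def dispatch(adata, name, arguments):
    """Look up a function by name and call it with only the arguments it accepts."""
    if name not in registry:
        raise ValueError(f"Unsupported function: {name}")
    func = registry[name]
    accepted = inspect.signature(func).parameters
    # Drop arguments the target does not declare (here: expr_type).
    kwargs = {k: arguments[k] for k in accepted if k in arguments}
    return func(adata, **kwargs)


result = dispatch(
    object(),  # placeholder for an AnnData object
    "calculate_qc_metrics",
    {"qc_vars": ["mt"], "expr_type": "counts"},
)
```

Filtering by signature lets one handler serve many functions with different parameters, at the cost of silently ignoring any argument the target function does not declare.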