mark_var
Identify genes meeting specific conditions like mitochondrial, ribosomal, or hemoglobin patterns and store boolean results in adata.var for quality control analysis.
Instructions
Determine whether each gene meets a specific condition and store the result in adata.var as a boolean column. For example, mitochondrial genes start with MT-. Call this tool first when calculating quality control metrics for mitochondrial, ribosomal, or hemoglobin genes, or for other qc_vars.
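
To illustrate what "marking" means here, the following is a minimal pure-Python sketch that flags gene names by class. It reuses the prefixes and the regular expression from the handler quoted under Implementation Reference, but the function name `mark_genes` and the plain-list input are assumptions made for illustration; the actual tool works on adata.var_names through pandas string methods.

```python
import re

# Patterns copied from the mark_var handler quoted on this page.
MT_PREFIXES = ("MT-", "Mt", "mt-")
RIBO_PREFIXES = ("RPS", "RPL")
HB_REGEX = re.compile(r"^HB[^(P)]", re.IGNORECASE)


def mark_genes(gene_names, gene_class):
    """Return one boolean per gene name (hypothetical helper, not part of scmcp)."""
    if gene_class == "mitochondrion":
        return [name.startswith(MT_PREFIXES) for name in gene_names]
    if gene_class == "ribosomal":
        return [name.startswith(RIBO_PREFIXES) for name in gene_names]
    if gene_class == "hemoglobin":
        return [HB_REGEX.search(name) is not None for name in gene_names]
    raise ValueError(f"Unsupported gene_class: {gene_class}")


genes = ["MT-CO1", "RPS6", "RPL13", "HBB", "HBP1", "ACTB"]
mt_flags = mark_genes(genes, "mitochondrion")
ribo_flags = mark_genes(genes, "ribosomal")
hb_flags = mark_genes(genes, "hemoglobin")

# Count how many genes were flagged per class, similar in spirit to value_counts().
summary = {"mt": sum(mt_flags), "ribo": sum(ribo_flags), "hb": sum(hb_flags)}
```
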
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| var_name | No | Column name to add to adata.var. Do not set unless the user asks for it. | |
| pattern_type | No | Pattern matching type (startswith/endswith/contains). Should be None when gene_class is set. | |
| patterns | No | Gene pattern to match; must be a string. Should be None when gene_class is set. | |
| gene_class | No | Gene class type (mitochondrion/ribosomal/hemoglobin) | |
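
The table implies two calling modes: a predefined gene_class, or a custom pattern_type plus patterns. The sketch below shows example argument dictionaries for each mode together with a small consistency check. The helper `check_mark_var_args` and the example column name are hypothetical and not part of scmcp. The lowercase gene_class values follow the Literal in the schema quoted under Implementation Reference, and the sketch assumes var_name is supplied in custom mode because the handler writes custom results to `adata.var[var_name]`.

```python
GENE_CLASSES = {"mitochondrion", "ribosomal", "hemoglobin"}
PATTERN_TYPES = {"startswith", "endswith", "contains"}


def check_mark_var_args(args):
    """Return a list of problems found in a mark_var argument dict (illustrative only)."""
    problems = []
    gene_class = args.get("gene_class")
    pattern_type = args.get("pattern_type")
    patterns = args.get("patterns")

    if gene_class is not None:
        if gene_class not in GENE_CLASSES:
            problems.append(f"unknown gene_class: {gene_class}")
        if pattern_type is not None or patterns is not None:
            problems.append("pattern_type and patterns should be None when gene_class is set")
    else:
        if pattern_type not in PATTERN_TYPES:
            problems.append(f"unknown pattern_type: {pattern_type}")
        if not isinstance(patterns, str):
            problems.append("patterns must be a string")
        if args.get("var_name") is None:
            problems.append("var_name is needed to name the custom column")
    return problems


# Mode 1: predefined gene class; the handler picks the column name ("mt").
by_class = {"gene_class": "mitochondrion"}

# Mode 2: custom pattern; the column name and prefix are arbitrary examples.
by_pattern = {"var_name": "is_spikein", "pattern_type": "startswith", "patterns": "ERCC-"}

# Mixed arguments, which the parameter descriptions advise against.
mixed = {"gene_class": "ribosomal", "pattern_type": "contains", "patterns": "RP"}
```
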
Implementation Reference
- src/scmcp/tool/util.py:47-69 (handler) The handler function implementing the core logic of the 'mark_var' tool. It adds boolean columns to adata.var for specific gene classes (mt, ribo, hb) or based on custom patterns.

      def mark_var(adata, var_name: str = None, gene_class: str = None, pattern_type: str = None, patterns: str = None):
          if gene_class is not None:
              if gene_class == "mitochondrion":
                  adata.var["mt"] = adata.var_names.str.startswith(('MT-', 'Mt','mt-'))
                  var_name = "mt"
              elif gene_class == "ribosomal":
                  adata.var["ribo"] = adata.var_names.str.startswith(("RPS", "RPL"))
                  var_name = "ribo"
              elif gene_class == "hemoglobin":
                  adata.var["hb"] = adata.var_names.str.contains("^HB[^(P)]", case=False)
                  var_name = "hb"

          if pattern_type is not None and patterns is not None:
              if pattern_type == "startswith":
                  adata.var[var_name] = adata.var_names.str.startswith(patterns)
              elif pattern_type == "endswith":
                  adata.var[var_name] = adata.var_names.str.endswith(patterns)
              elif pattern_type == "contains":
                  adata.var[var_name] = adata.var_names.str.contains(patterns)
              else:
                  raise ValueError(f"Did not support pattern_type: {pattern_type}")
          return {var_name: adata.var[var_name].value_counts().to_dict(), "msg": f"add '{var_name}' column in adata.var"}

- src/scmcp/schema/util.py:15-35 (schema) Pydantic model (MarkVarModel) providing input schema validation for the mark_var tool parameters.

      class MarkVarModel(JSONParsingModel):
          """Determine or mark if each gene meets specific conditions and store results in adata.var as boolean values"""

          var_name: str = Field(
              default=None,
              description="Column name that will be added to adata.var, do not set if user does not ask"
          )
          pattern_type: Optional[Literal["startswith", "endswith", "contains"]] = Field(
              default=None,
              description="Pattern matching type (startswith/endswith/contains), it should be None when gene_class is not None"
          )
          patterns: str = Field(
              default=None,
              description="gene pattern to match, must be a string, it should be None when gene_class is not None"
          )
          gene_class: Optional[Literal["mitochondrion", "ribosomal", "hemoglobin"]] = Field(
              default=None,
              description="Gene class type (Mitochondrion/Ribosomal/Hemoglobin)"
          )

- src/scmcp/tool/util.py:11-19 (registration) MCP Tool object creation for 'mark_var', which is included in the util_tools dict and exposed via server.list_tools().

      mark_var_tool = types.Tool(
          name="mark_var",
          description=(
              "Determine if each gene meets specific conditions and store results in adata.var as boolean values."
              "for example: mitochondrion genes startswith MT-."
              "the tool should be call first when calculate quality control metrics for mitochondrion, ribosomal, harhemoglobin genes. or other qc_vars"
          ),
          inputSchema=MarkVarModel.model_json_schema(),
      )

- src/scmcp/server.py:35-56 (registration) Server's list_tools handler that includes util_tools.values() (containing mark_var_tool) when MODULE=='util' or 'all'.

      @server.list_tools()
      async def list_tools() -> list[types.Tool]:
          if MODULE == "io":
              tools = io_tools.values()
          elif MODULE == "pp":
              tools = pp_tools.values()
          elif MODULE == "tl":
              tools = tl_tools.values()
          elif MODULE == "pl":
              tools = pl_tools.values()
          elif MODULE == "util":
              tools = util_tools.values()
          else:
              tools = [
                  *io_tools.values(),
                  *pp_tools.values(),
                  *tl_tools.values(),
                  *pl_tools.values(),
                  *util_tools.values(),
                  *ccc_tools.values(),
              ]
          return tools

- src/scmcp/server.py:59-92 (registration) Server's call_tool handler that dispatches to run_util_func for tools in util_tools, invoking the mark_var handler.

      @server.call_tool()
      async def call_tool(
          name: str, arguments
      ):
          try:
              logger.info(f"Running {name} with {arguments}")
              if name in io_tools.keys():
                  res = run_io_func(ads, name, arguments)
              elif name in pp_tools.keys():
                  res = run_pp_func(ads, name, arguments)
              elif name in tl_tools.keys():
                  res = run_tl_func(ads, name, arguments)
              elif name in pl_tools.keys():
                  res = run_pl_func(ads, name, arguments)
              elif name in util_tools.keys():
                  res = run_util_func(ads, name, arguments)
              elif name in ccc_tools.keys():
                  res = run_ccc_func(ads.adata_dic[ads.active], name, arguments)
              output = str(res) if res is not None else str(ads.adata_dic[ads.active])
              return [
                  types.TextContent(
                      type="text",
                      text=str({"output": output})
                  )
              ]
          except Exception as error:
              logger.error(f"{name} with {error}")
              return [
                  types.TextContent(
                      type="text",
                      text=str({"Error": error})
                  )
              ]