pca
Perform principal component analysis on single-cell RNA sequencing data to reduce dimensionality, extract key patterns, and enable efficient visualization and analysis.
Instructions
Principal component analysis
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| chunk_size | No | Number of observations to include in each chunk. | |
| chunked | No | If True, perform an incremental PCA on segments. | False |
| dtype | No | Numpy data type string for the result. | float32 |
| layer | No | If provided, which element of layers to use for PCA. | |
| mask_var | No | Boolean mask or string referring to var column for subsetting genes. | |
| n_comps | No | Number of principal components to compute. Defaults to 50, or to one less than the minimum dimension size if that is smaller. | |
| svd_solver | No | SVD solver to use. | |
| zero_center | No | If True, compute standard PCA from covariance matrix. | True |
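
The constraints in the table can be checked on the client side before the tool is called. The following is a minimal sketch, not part of the project: `check_pca_arguments` and `build_tool_call` are hypothetical helper names, the allowed values are copied from the schema on this page, and the request envelope assumes the standard MCP `tools/call` JSON-RPC shape.

```python
import json

# Constraints copied from the PCAModel schema shown on this page.
ALLOWED_DTYPES = {"float32", "float64"}
ALLOWED_SOLVERS = {"arpack", "randomized", "auto", "lobpcg", "tsqr"}


def check_pca_arguments(args: dict) -> dict:
    """Hypothetical helper: reject arguments the schema would not accept."""
    for key in ("n_comps", "chunk_size"):
        value = args.get(key)
        if value is None:
            continue
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            raise ValueError(f"{key} must be a positive integer")
    if args.get("dtype", "float32") not in ALLOWED_DTYPES:
        raise ValueError("dtype must be either 'float32' or 'float64'")
    solver = args.get("svd_solver")
    if solver is not None and solver not in ALLOWED_SOLVERS:
        raise ValueError(f"unsupported svd_solver: {solver}")
    return args


def build_tool_call(args: dict, request_id: int = 1) -> str:
    """Wrap checked arguments in an MCP `tools/call` JSON-RPC request."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "pca", "arguments": check_pca_arguments(args)},
    })


request = build_tool_call({"n_comps": 30, "svd_solver": "arpack", "zero_center": True})
```

Parameters left out of `arguments` fall back to the defaults listed in the table, so a request only needs to carry the values that differ from them.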
Implementation Reference
- src/scmcp/tool/pp.py:120-137 (handler): Handler function `run_pp_func`, which executes `sc.pp.pca` (via `pp_func['pca']`) for the 'pca' tool, filtering the arguments and logging the operation.

      import inspect

      def run_pp_func(ads, func, arguments):
          # Operate on the currently active AnnData object.
          adata = ads.adata_dic[ads.active]
          if func not in pp_func:
              raise ValueError(f"Unsupported function: {func}")
          run_func = pp_func[func]
          parameters = inspect.signature(run_func).parameters
          arguments["inplace"] = True
          # Pass only the arguments that the target function accepts.
          kwargs = {k: arguments.get(k) for k in parameters if k in arguments}
          try:
              res = run_func(adata, **kwargs)
              add_op_log(adata, run_func, kwargs)
          except KeyError as e:
              raise KeyError(f"Cannot find {e} column in adata.obs or adata.var")
          except Exception as e:
              raise e
          return res
- src/scmcp/schema/pp.py:163-218 (schema): Pydantic input schema `PCAModel` for the 'pca' tool, validating parameters such as `n_comps`, `layer`, and `zero_center`.

      from typing import Literal, Optional, Union

      from pydantic import Field, field_validator

      # JSONParsingModel is a base class defined elsewhere in the project.
      class PCAModel(JSONParsingModel):
          """Input schema for the PCA preprocessing tool."""

          n_comps: Optional[int] = Field(
              default=None,
              description="Number of principal components to compute. Defaults to 50 or 1 - minimum dimension size.",
              gt=0
          )
          layer: Optional[str] = Field(
              default=None,
              description="If provided, which element of layers to use for PCA."
          )
          zero_center: Optional[bool] = Field(
              default=True,
              description="If True, compute standard PCA from covariance matrix."
          )
          svd_solver: Optional[Literal["arpack", "randomized", "auto", "lobpcg", "tsqr"]] = Field(
              default=None,
              description="SVD solver to use."
          )
          mask_var: Optional[Union[str, bool]] = Field(
              default=None,
              description="Boolean mask or string referring to var column for subsetting genes."
          )
          dtype: str = Field(
              default="float32",
              description="Numpy data type string for the result."
          )
          chunked: bool = Field(
              default=False,
              description="If True, perform an incremental PCA on segments."
          )
          chunk_size: Optional[int] = Field(
              default=None,
              description="Number of observations to include in each chunk.",
              gt=0
          )

          @field_validator('n_comps', 'chunk_size')
          @classmethod
          def validate_positive_integers(cls, v: Optional[int]) -> Optional[int]:
              """Validate positive integers"""
              if v is not None and v <= 0:
                  raise ValueError("must be a positive integer")
              return v

          @field_validator('dtype')
          @classmethod
          def validate_dtype(cls, v: str) -> str:
              """Validate numpy dtype"""
              if v not in ["float32", "float64"]:
                  raise ValueError("dtype must be either 'float32' or 'float64'")
              return v
- src/scmcp/tool/pp.py:45-49 (registration): Registration of the 'pca' tool using `mcp.types.Tool` with name, description, and `PCAModel` schema.

      pca = types.Tool(
          name="pca",
          description="Principal component analysis",
          inputSchema=PCAModel.model_json_schema(),
      )
- src/scmcp/tool/pp.py:88-101 (helper): Dictionary mapping tool names to Scanpy functions; 'pca' maps to `sc.pp.pca`, which `run_pp_func` calls.

      from functools import partial

      import scanpy as sc

      pp_func = {
          "filter_genes": sc.pp.filter_genes,
          "filter_cells": sc.pp.filter_cells,
          "calculate_qc_metrics": partial(sc.pp.calculate_qc_metrics, inplace=True),
          "log1p": sc.pp.log1p,
          "normalize_total": sc.pp.normalize_total,
          "pca": sc.pp.pca,
          "highly_variable_genes": sc.pp.highly_variable_genes,
          "regress_out": sc.pp.regress_out,
          "scale": sc.pp.scale,
          "combat": sc.pp.combat,
          "scrublet": sc.pp.scrublet,
          "neighbors": sc.pp.neighbors,
      }
- src/scmcp/tool/pp.py:104-117 (registration): `pp_tools` dictionary including the 'pca' tool object, imported and used for MCP tool registration.

      pp_tools = {
          "filter_genes": filter_genes,
          "filter_cells": filter_cells,
          "calculate_qc_metrics": calculate_qc_metrics,
          "log1p": log1p,
          "normalize_total": normalize_total,
          "pca": pca,
          "highly_variable_genes": highly_variable_genes,
          "regress_out": regress_out,
          "scale": scale,
          "combat": combat,
          "scrublet": scrublet,
          "neighbors": neighbors,
      }
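
The handler above forwards only those arguments that the target function declares in its signature. The sketch below isolates that filtering step so it can be run without Scanpy. `fake_pca` and `filter_kwargs` are hypothetical stand-ins introduced for illustration and are not part of the project.

```python
import inspect


def fake_pca(adata, n_comps=None, zero_center=True, svd_solver=None):
    """Hypothetical stand-in for sc.pp.pca; returns the options it received."""
    return {"n_comps": n_comps, "zero_center": zero_center, "svd_solver": svd_solver}


def filter_kwargs(func, arguments: dict) -> dict:
    """Keep only the arguments that `func` declares as parameters."""
    parameters = inspect.signature(func).parameters
    return {k: arguments[k] for k in parameters if k in arguments}


# "inplace" and "unknown" are not parameters of fake_pca, so they are dropped.
arguments = {"n_comps": 20, "svd_solver": "arpack", "inplace": True, "unknown": 1}
kwargs = filter_kwargs(fake_pca, arguments)
result = fake_pca(object(), **kwargs)
```

Filtering against the signature lets one handler serve every entry in `pp_func`, since each function receives only the keys it understands. The trade-off is that a misspelled argument name is silently ignored rather than reported.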