pca

Reduce dimensionality of single-cell RNA sequencing data to identify key patterns and simplify analysis for biological insights.

Instructions

Principal component analysis

Input Schema

  • n_comps (optional): Number of principal components to compute. Defaults to 50, or one less than the minimum dimension size if that is smaller.
  • layer (optional): If provided, which element of layers to use for PCA.
  • zero_center (optional): If True, compute standard PCA from the covariance matrix.
  • svd_solver (optional): SVD solver to use.
  • mask_var (optional): Boolean mask or string referring to a var column for subsetting genes.
  • dtype (optional, default float32): NumPy data type string for the result.
  • chunked (optional): If True, perform an incremental PCA on segments.
  • chunk_size (optional): Number of observations to include in each chunk.
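Conceptually, the n_comps and zero_center parameters map onto a plain truncated SVD. The sketch below is a minimal NumPy illustration of that idea, not scanpy's implementation (which adds solver selection, gene masking, and chunked/incremental modes):

```python
import numpy as np

def simple_pca(X, n_comps=2, zero_center=True):
    """Minimal PCA via SVD, mirroring the n_comps/zero_center semantics."""
    X = np.asarray(X, dtype="float32")
    if zero_center:
        # Standard PCA: center columns, i.e. work from the covariance matrix
        X = X - X.mean(axis=0)
    # Truncated SVD; rows of Vt are the principal axes
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    # Project observations onto the first n_comps axes
    return X @ Vt[:n_comps].T  # shape (n_obs, n_comps)

X = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0], [1.0, 0.0, 1.0]]
Y = simple_pca(X, n_comps=2)
print(Y.shape)  # (4, 2)
```

In scanpy the analogous call would be sc.pp.pca(adata, n_comps=2), with the projection stored in adata.obsm["X_pca"].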

Implementation Reference

  • Handler function that executes the 'pca' tool (and other pp tools). It retrieves the active AnnData object, maps 'pca' to sc.pp.pca via pp_func dict, calls it with parsed arguments (forcing inplace=True), logs the operation, and handles errors.
    def run_pp_func(ads, func, arguments):
        adata = ads.adata_dic[ads.active]
        if func not in pp_func:
            raise ValueError(f"Unsupported function: {func}")
        
        run_func = pp_func[func]
        parameters = inspect.signature(run_func).parameters
        arguments["inplace"] = True
        kwargs = {k: arguments.get(k) for k in parameters if k in arguments}
        try:
            res = run_func(adata, **kwargs)
            add_op_log(adata, run_func, kwargs)
        except KeyError as e:
            raise KeyError(f"Cannot find {e} column in adata.obs or adata.var")
        except Exception as e:
            raise e
        return res
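The key pattern in run_pp_func is filtering the incoming arguments against the target function's signature, so callers can pass extra keys without breaking the call. A standalone sketch of that pattern, using a hypothetical stand-in for a scanpy function:

```python
import inspect

# Hypothetical stand-in for a scanpy preprocessing function
def fake_scale(adata, zero_center=True, max_value=None, inplace=False):
    return {"zero_center": zero_center, "max_value": max_value, "inplace": inplace}

arguments = {"zero_center": False, "inplace": True, "not_a_param": 42}
parameters = inspect.signature(fake_scale).parameters
# Keep only keys that are actual parameters of the target function
kwargs = {k: arguments[k] for k in parameters if k in arguments}
print(kwargs)  # {'zero_center': False, 'inplace': True}
```

The stray not_a_param key is silently dropped rather than raising a TypeError when the function is called.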
  • Pydantic model defining the input schema for the 'pca' tool, including parameters like n_comps, layer, zero_center, svd_solver, etc., with validators.
    class PCAModel(JSONParsingModel):
        """Input schema for the PCA preprocessing tool."""
        
        n_comps: Optional[int] = Field(
            default=None,
            description="Number of principal components to compute. Defaults to 50 or 1 - minimum dimension size.",
            gt=0
        )
        
        layer: Optional[str] = Field(
            default=None,
            description="If provided, which element of layers to use for PCA."
        )
        
        zero_center: Optional[bool] = Field(
            default=True,
            description="If True, compute standard PCA from covariance matrix."
        )
        
        svd_solver: Optional[Literal["arpack", "randomized", "auto", "lobpcg", "tsqr"]] = Field(
            default=None,
            description="SVD solver to use."
        )
        mask_var: Optional[Union[str, bool]] = Field(
            default=None,
            description="Boolean mask or string referring to var column for subsetting genes."
        )
        dtype: str = Field(
            default="float32",
            description="Numpy data type string for the result."
        )
        chunked: bool = Field(
            default=False,
            description="If True, perform an incremental PCA on segments."
        )
        
        chunk_size: Optional[int] = Field(
            default=None,
            description="Number of observations to include in each chunk.",
            gt=0
        )
        
        @field_validator('n_comps', 'chunk_size')
        def validate_positive_integers(cls, v: Optional[int]) -> Optional[int]:
            """Validate positive integers"""
            if v is not None and v <= 0:
                raise ValueError("must be a positive integer")
            return v
        
        @field_validator('dtype')
        def validate_dtype(cls, v: str) -> str:
            """Validate numpy dtype"""
            if v not in ["float32", "float64"]:
                raise ValueError("dtype must be either 'float32' or 'float64'")
            return v
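The constraints the two field_validator methods enforce can be expressed without Pydantic. The following pure-Python helper is a hypothetical illustration of the same rules, not part of the server's code:

```python
def validate_pca_args(n_comps=None, chunk_size=None, dtype="float32"):
    """Sketch of the PCAModel validators: positive ints and a dtype whitelist."""
    for name, value in (("n_comps", n_comps), ("chunk_size", chunk_size)):
        if value is not None and value <= 0:
            raise ValueError(f"{name} must be a positive integer")
    if dtype not in ("float32", "float64"):
        raise ValueError("dtype must be either 'float32' or 'float64'")
    return {"n_comps": n_comps, "chunk_size": chunk_size, "dtype": dtype}

print(validate_pca_args(n_comps=10))
# {'n_comps': 10, 'chunk_size': None, 'dtype': 'float32'}
```

Pydantic additionally applies the gt=0 bound at parse time, so the explicit validators are a second line of defense.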
  • MCP Tool object registration for 'pca', specifying name, description, and input schema from PCAModel.
    pca = types.Tool(
        name="pca",
        description="Principal component analysis",
        inputSchema=PCAModel.model_json_schema(),
    )
  • Dictionary mapping tool names to Scanpy functions. 'pca' maps directly to sc.pp.pca, used by run_pp_func to execute the tool.
    pp_func = {
        "filter_genes": sc.pp.filter_genes,
        "filter_cells": sc.pp.filter_cells,
        "calculate_qc_metrics": partial(sc.pp.calculate_qc_metrics, inplace=True),
        "log1p": sc.pp.log1p,
        "normalize_total": sc.pp.normalize_total,
        "pca": sc.pp.pca,
        "highly_variable_genes": sc.pp.highly_variable_genes,
        "regress_out": sc.pp.regress_out,
        "scale": sc.pp.scale,
        "combat": sc.pp.combat,
        "scrublet": sc.pp.scrublet,
        "neighbors": sc.pp.neighbors,
    }
  • Server's list_tools method that returns the registered tools, including pp_tools which contains the 'pca' tool, based on MODULE environment variable.
    @server.list_tools()
    async def list_tools() -> list[types.Tool]:
        if MODULE == "io":
            tools = io_tools.values()
        elif MODULE == "pp":
            tools = pp_tools.values()
        elif MODULE == "tl":
            tools = tl_tools.values()
        elif MODULE == "pl":
            tools = pl_tools.values()
        elif MODULE == "util":
            tools = util_tools.values()
        else:
            tools = [
                *io_tools.values(),
                *pp_tools.values(),
                *tl_tools.values(),
                *pl_tools.values(),
                *util_tools.values(),
                *ccc_tools.values(),
            ]
        return list(tools)
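The MODULE branching above is a simple registry lookup with a fall-through to all tools. A minimal sketch of the same dispatch, with hypothetical registry contents standing in for the server's io_tools / pp_tools dicts:

```python
# Hypothetical registries mirroring the server's io_tools / pp_tools dicts
io_tools = {"read": "io.read"}
pp_tools = {"pca": "pp.pca", "scale": "pp.scale"}

def select_tools(module):
    """Return the tool subset for a MODULE value; fall back to everything."""
    registry = {"io": io_tools, "pp": pp_tools}
    if module in registry:
        return list(registry[module].values())
    return [*io_tools.values(), *pp_tools.values()]

print(select_tools("pp"))  # ['pp.pca', 'pp.scale']
```

Running the server with MODULE=pp would therefore expose only the preprocessing tools, including pca.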
