Glama

normalize_total

Normalizes single-cell RNA sequencing data by adjusting counts per cell to a consistent total, enabling accurate comparison across cells in analysis workflows.

Instructions

Normalize counts per cell to the same total count
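As a rough illustration of what "the same total count" means, the sketch below rescales each cell (row) of a small count matrix so that its counts sum to a common target. It uses plain Python lists rather than an AnnData object, and `normalize_rows` is a hypothetical helper, not part of scanpy or this server. When `target_sum` is None it falls back to the median of the per-cell totals, as the parameter description states.

```python
from statistics import median

def normalize_rows(counts, target_sum=None):
    """Scale each row (cell) so that its counts sum to target_sum.

    Hypothetical helper for illustration only; assumes every cell has a
    nonzero total.
    """
    totals = [sum(row) for row in counts]
    if target_sum is None:
        # Default: median of the per-cell totals before normalization.
        target_sum = median(totals)
    return [[value * target_sum / total for value in row]
            for row, total in zip(counts, totals)]

# Three cells (rows) by three genes (columns), with totals 10, 20 and 40.
counts = [[2, 3, 5], [4, 6, 10], [8, 12, 20]]
normalized = normalize_rows(counts)
row_sums = [sum(row) for row in normalized]  # every cell now sums to 20
```

Each cell is multiplied by its own scaling factor (`target_sum / total`), so relative expression within a cell is preserved while totals become comparable across cells.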

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| target_sum | No | If None, after normalization, each cell has a total count equal to the median of total counts before normalization. If a number is provided, each cell will have this total count after normalization. | None |
| exclude_highly_expressed | No | Exclude highly expressed genes for the computation of the normalization factor for each cell. | False |
| max_fraction | No | If exclude_highly_expressed=True, consider genes as highly expressed that have more counts than max_fraction of the original total counts in at least one cell. | 0.05 |
| key_added | No | Name of the field in adata.obs where the normalization factor is stored. | None |
| layer | No | Layer to normalize instead of X. If None, X is normalized. | None |
| layers | No | List of layers to normalize. If 'all', normalize all layers. | None |
| layer_norm | No | Specifies how to normalize layers. | None |
| inplace | No | Whether to update adata or return dictionary with normalized copies. | True |
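The interaction between `exclude_highly_expressed` and `max_fraction` can be sketched as follows. This is one reading of the parameter descriptions, written in plain Python, and not scanpy's actual implementation; `size_factors` is a hypothetical helper. A gene is left out of the per-cell totals if it exceeds `max_fraction` of the original total counts in at least one cell.

```python
def size_factors(counts, max_fraction=0.05, exclude_highly_expressed=False):
    """Return (per-cell totals used for scaling, indices of excluded genes).

    Hypothetical helper for illustration only.
    """
    totals = [sum(row) for row in counts]
    n_genes = len(counts[0])
    excluded = set()
    if exclude_highly_expressed:
        for g in range(n_genes):
            # Exclude a gene if it exceeds max_fraction of the original
            # total counts in at least one cell.
            if any(row[g] > max_fraction * total
                   for row, total in zip(counts, totals)):
                excluded.add(g)
        # Recompute the totals without the excluded genes.
        totals = [sum(v for g, v in enumerate(row) if g not in excluded)
                  for row in counts]
    return totals, excluded

# Gene 0 makes up 90% of cell 0, so it is excluded at max_fraction=0.5.
counts = [[90, 5, 5], [10, 45, 45]]
totals, excluded = size_factors(counts, max_fraction=0.5,
                                exclude_highly_expressed=True)
```

Under this reading, the excluded genes affect only the normalization factor; their counts are still rescaled along with every other gene.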

Implementation Reference

  • The handler function that executes the normalize_total tool (and other pp tools) by retrieving the corresponding scanpy function from pp_func and calling it on the active AnnData object with validated arguments and inplace=True.
    import inspect

    def run_pp_func(ads, func, arguments):
        # Look up the active AnnData object and the requested scanpy function.
        adata = ads.adata_dic[ads.active]
        if func not in pp_func:
            raise ValueError(f"Unsupported function: {func}")

        run_func = pp_func[func]
        parameters = inspect.signature(run_func).parameters
        # Always modify the active AnnData object in place.
        arguments["inplace"] = True
        # Forward only the arguments that the scanpy function accepts.
        kwargs = {k: arguments.get(k) for k in parameters if k in arguments}
        try:
            res = run_func(adata, **kwargs)
            add_op_log(adata, run_func, kwargs)
        except KeyError as e:
            raise KeyError(f"Cannot find {e} column in adata.obs or adata.var") from e
        return res
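The handler's argument filtering relies on `inspect.signature`. The following minimal sketch isolates that step, using a hypothetical stand-in function (`fake_normalize`) in place of a real scanpy call; `filter_kwargs` is likewise an illustrative name, not a function from the server.

```python
import inspect

def fake_normalize(adata, target_sum=None, inplace=True):
    """Hypothetical stand-in for a scanpy preprocessing function."""
    return {"target_sum": target_sum, "inplace": inplace}

def filter_kwargs(func, arguments):
    """Keep only the arguments that appear in func's signature."""
    parameters = inspect.signature(func).parameters
    return {k: arguments[k] for k in parameters if k in arguments}

# "unknown_option" is not a parameter of fake_normalize, so it is dropped.
arguments = {"target_sum": 1e4, "unknown_option": 1, "inplace": True}
kwargs = filter_kwargs(fake_normalize, arguments)
result = fake_normalize(None, **kwargs)
```

Filtering by signature lets one handler serve many functions with different parameters, at the cost of silently ignoring arguments that the target function does not accept.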
  • Registers the normalize_total tool with MCP types.Tool, specifying name, description, and input schema from NormalizeTotalModel.
    normalize_total = types.Tool(
        name="normalize_total",
        description="Normalize counts per cell to the same total count",
        inputSchema=NormalizeTotalModel.model_json_schema(),
    )
  • Dictionary mapping tool names to their corresponding scanpy preprocessing functions. The 'normalize_total' key maps to sc.pp.normalize_total, used by the handler.
    pp_func = {
        "filter_genes": sc.pp.filter_genes,
        "filter_cells": sc.pp.filter_cells,
        "calculate_qc_metrics": partial(sc.pp.calculate_qc_metrics, inplace=True),
        "log1p": sc.pp.log1p,
        "normalize_total": sc.pp.normalize_total,
        "pca": sc.pp.pca,
        "highly_variable_genes": sc.pp.highly_variable_genes,
        "regress_out": sc.pp.regress_out,
        "scale": sc.pp.scale,
        "combat": sc.pp.combat,
        "scrublet": sc.pp.scrublet,
        "neighbors": sc.pp.neighbors,
    }
  • Pydantic-based input schema model for the normalize_total tool, defining parameters and validators.
    class NormalizeTotalModel(JSONParsingModel):
        """Input schema for the normalize_total preprocessing tool."""
        
        target_sum: Optional[float] = Field(
            default=None,
            description="If None, after normalization, each cell has a total count equal to the median of total counts before normalization. If a number is provided, each cell will have this total count after normalization."
        )
        
        exclude_highly_expressed: bool = Field(
            default=False,
            description="Exclude highly expressed genes for the computation of the normalization factor for each cell."
        )
        
        max_fraction: float = Field(
            default=0.05,
            description="If exclude_highly_expressed=True, consider genes as highly expressed that have more counts than max_fraction of the original total counts in at least one cell.",
            gt=0,
            le=1
        )
        
        key_added: Optional[str] = Field(
            default=None,
            description="Name of the field in adata.obs where the normalization factor is stored."
        )
        
        layer: Optional[str] = Field(
            default=None,
            description="Layer to normalize instead of X. If None, X is normalized."
        )
        
        layers: Optional[Union[Literal['all'], List[str]]] = Field(
            default=None,
            description="List of layers to normalize. If 'all', normalize all layers."
        )
        
        layer_norm: Optional[str] = Field(
            default=None,
            description="Specifies how to normalize layers."
        )
        
        inplace: bool = Field(
            default=True,
            description="Whether to update adata or return dictionary with normalized copies."
        )
        
        @field_validator('target_sum')
        @classmethod
        def validate_target_sum(cls, v: Optional[float]) -> Optional[float]:
            """Validate target_sum is positive if provided"""
            if v is not None and v <= 0:
                raise ValueError("target_sum must be positive")
            return v

        @field_validator('max_fraction')
        @classmethod
        def validate_max_fraction(cls, v: float) -> float:
            """Validate max_fraction is greater than 0 and at most 1"""
            if v <= 0 or v > 1:
                raise ValueError("max_fraction must be greater than 0 and at most 1")
            return v
  • In the MCP server's call_tool handler, dispatches execution of pp tools (including normalize_total) to the run_pp_func handler.
    elif name in pp_tools.keys():
        res = run_pp_func(ads, name, arguments)
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden but offers minimal behavioral context. It mentions the outcome ('same total count') but doesn't disclose that this is a destructive transformation of count data, doesn't explain what 'normalize' means mathematically (scaling factors), and omits critical details like whether it handles sparse matrices or preserves data structure. The description fails to compensate for the lack of annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that states the core function without unnecessary words. It's appropriately sized for a tool with well-documented parameters in the schema, and every word earns its place by conveying the essential purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (8 parameters, no output schema, no annotations), the description is insufficient. It doesn't explain the mathematical operation, doesn't mention typical inputs/outputs (e.g., operates on AnnData objects), and provides no context about the transformation's impact on downstream analysis. For a preprocessing tool with significant parameters, more contextual information is needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 8 parameters thoroughly. The description adds no parameter-specific information beyond what's in the schema. According to scoring rules, when schema coverage is high (>80%), the baseline is 3 even with no param info in the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the purpose as 'Normalize counts per cell to the same total count', which is clear about the operation (normalize) and target (counts per cell). However, it lacks specificity about the domain (single-cell RNA-seq data) and doesn't differentiate from sibling tools like 'scale' or 'log1p' that also perform normalization-like operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., requires raw count data), typical workflow position (early preprocessing), or comparison to other normalization methods available in sibling tools. The agent must infer usage from parameter descriptions alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
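To make the workflow position concrete, the sketch below lists one plausible ordering of this server's preprocessing tools, with normalize_total applied to filtered raw counts before log1p. The ordering follows common scanpy practice; the argument values are illustrative assumptions, not defaults or recommendations taken from the server, and the per-tool schemas other than normalize_total are not shown in this page.

```python
# Hypothetical ordering of tool calls. Tool names come from the pp_func
# mapping; argument values are illustrative assumptions only.
workflow = [
    ("filter_cells", {"min_genes": 200}),
    ("filter_genes", {"min_cells": 3}),
    ("normalize_total", {"target_sum": 1e4}),
    ("log1p", {}),
    ("highly_variable_genes", {"n_top_genes": 2000}),
]

# Names only, in the order an agent would call them.
tool_order = [name for name, _ in workflow]
```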

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/huang-sh/scmcp'