Glama

ingest

Map labels and embeddings from single-cell RNA sequencing reference data to new datasets using k-nearest neighbors and embedding methods for biological analysis.

Instructions

Map labels and embeddings from reference data to new data

Input Schema

| Name | Required | Description | Default |
| obs | No | Labels' keys in adata_ref.obs which need to be mapped to adata.obs (inferred for observation of adata). | |
| embedding_method | No | Embeddings in adata_ref which need to be mapped to adata. The only supported values are 'umap' and 'pca'. | |
| labeling_method | No | The method to map labels in adata_ref.obs to adata.obs. The only supported value is 'knn'. | knn |
| neighbors_key | No | If specified, ingest looks adata_ref.uns[neighbors_key] for neighbors settings and uses the corresponding distances. | |
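
As a rough illustration of how a client might fill in this schema, the sketch below builds the `arguments` object and wraps it in an MCP `tools/call` request. The helper name `build_ingest_call`, the request `id`, and the `louvain` label column are assumptions made for the example; only the four parameter names and their allowed values come from the schema above.

```python
import json

VALID_EMBEDDINGS = {"umap", "pca"}  # the only values the schema accepts


def build_ingest_call(obs=None, embedding_method=("umap", "pca"),
                      labeling_method="knn", neighbors_key=None, request_id=1):
    """Hypothetical helper: build a JSON-RPC 'tools/call' request for ingest."""
    # Accept either a single string or a sequence of strings.
    if isinstance(embedding_method, str):
        methods = [embedding_method]
    else:
        methods = list(embedding_method)
    methods = [m.lower() for m in methods]
    unknown = [m for m in methods if m not in VALID_EMBEDDINGS]
    if unknown:
        raise ValueError(f"unsupported embedding_method values: {unknown}")
    if labeling_method.lower() != "knn":
        raise ValueError("labeling_method must be 'knn'")

    arguments = {"embedding_method": methods, "labeling_method": "knn"}
    # Optional fields are left out when unset so the server defaults apply.
    if obs is not None:
        arguments["obs"] = obs
    if neighbors_key is not None:
        arguments["neighbors_key"] = neighbors_key

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "ingest", "arguments": arguments},
    }


request = build_ingest_call(obs="louvain", embedding_method="UMAP")
print(json.dumps(request, indent=2))
```

The schema has no field for the reference dataset, and this page does not show how the server selects adata_ref, so the sketch leaves that out.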

Implementation Reference

  • Handler function that executes the 'ingest' tool by calling sc.tl.ingest with validated arguments on the active AnnData object.
    import inspect

    def run_tl_func(ads, func, arguments):
        # Operate on the currently active AnnData object.
        adata = ads.adata_dic[ads.active]
        if func not in tl_func:
            raise ValueError(f"Unsupported function: {func}")
        run_func = tl_func[func]
        # Pass only the arguments that the scanpy function actually accepts.
        parameters = inspect.signature(run_func).parameters
        kwargs = {k: arguments.get(k) for k in parameters if k in arguments}
        try:
            res = run_func(adata, **kwargs)
            add_op_log(adata, run_func, kwargs)
        except Exception as e:
            logger.error(f"Error running function {func}: {e}")
            raise
        # Return the result so the caller can use it.
        return res
  • Pydantic model defining the input schema and validation for the 'ingest' tool parameters.
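    from typing import List, Optional, Union

    from pydantic import Field, field_validator

    # JSONParsingModel is the project's own base model, defined elsewhere.
    class IngestModel(JSONParsingModel):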
        """Input schema for the ingest tool that maps labels and embeddings from reference data to new data."""
        
        obs: Optional[Union[str, List[str]]] = Field(
            default=None,
            description="Labels' keys in adata_ref.obs which need to be mapped to adata.obs (inferred for observation of adata)."
        )
        
        embedding_method: Union[str, List[str]] = Field(
            default=['umap', 'pca'],
            description="Embeddings in adata_ref which need to be mapped to adata. The only supported values are 'umap' and 'pca'."
        )
        
        labeling_method: str = Field(
            default='knn',
            description="The method to map labels in adata_ref.obs to adata.obs. The only supported value is 'knn'."
        )
        
        neighbors_key: Optional[str] = Field(
            default=None,
            description="If specified, ingest looks adata_ref.uns[neighbors_key] for neighbors settings and uses the corresponding distances."
        )
        
        @field_validator('embedding_method')
        @classmethod
        def validate_embedding_method(cls, v: Union[str, List[str]]) -> Union[str, List[str]]:
            """Validate embedding method is supported"""
            valid_methods = ['umap', 'pca']
            
            if isinstance(v, str):
                if v.lower() not in valid_methods:
                    raise ValueError(f"embedding_method must be one of {valid_methods}")
                return v.lower()
            
            elif isinstance(v, list):
                for method in v:
                    if method.lower() not in valid_methods:
                        raise ValueError(f"embedding_method must contain only values from {valid_methods}")
                return [method.lower() for method in v]
            
            return v
        
        @field_validator('labeling_method')
        @classmethod
        def validate_labeling_method(cls, v: str) -> str:
            """Validate labeling method is supported"""
            if v.lower() != 'knn':
                raise ValueError("labeling_method must be 'knn'")
            return v.lower()
  • Registers the 'ingest' tool object with MCP types.Tool, including schema reference.
    # Add ingest tool
    ingest_tool = types.Tool(
        name="ingest",
        description="Map labels and embeddings from reference data to new data",
        inputSchema=IngestModel.model_json_schema(),
    )
  • Maps 'ingest' tool name to the underlying scanpy function sc.tl.ingest for execution.
    tl_func = {
        "tsne": sc.tl.tsne,
        "umap": sc.tl.umap,
        "draw_graph": sc.tl.draw_graph,
        "diffmap": sc.tl.diffmap,
        "embedding_density": sc.tl.embedding_density,
        "leiden": sc.tl.leiden,
        "louvain": sc.tl.louvain,
        "dendrogram": sc.tl.dendrogram,
        "dpt": sc.tl.dpt,
        "paga": sc.tl.paga,
        "ingest": sc.tl.ingest,
        "rank_genes_groups": sc.tl.rank_genes_groups,
        "filter_rank_genes_groups": sc.tl.filter_rank_genes_groups,
        "marker_gene_overlap": sc.tl.marker_gene_overlap,
        "score_genes": sc.tl.score_genes,
        "score_genes_cell_cycle": sc.tl.score_genes_cell_cycle,
    }
  • In the MCP server call_tool handler, dispatches 'ingest' (as part of tl_tools) to run_tl_func.
    elif name in tl_tools.keys():
        res = run_tl_func(ads, name, arguments) 
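
The flow described in the list above (look up the tool by name, then forward only the arguments the target function declares) can be sketched without scanpy. This is a minimal illustration, not the server's code: `fake_ingest`, `registry`, and `run_tool` are hypothetical stand-ins for `sc.tl.ingest`, `tl_func`, and `run_tl_func`.

```python
import inspect


def fake_ingest(adata, obs=None, embedding_method=("umap", "pca"),
                labeling_method="knn", neighbors_key=None):
    """Stand-in for a scanpy tool function; it only echoes what it received."""
    return {
        "adata": adata,
        "obs": obs,
        "embedding_method": embedding_method,
        "labeling_method": labeling_method,
        "neighbors_key": neighbors_key,
    }


# Hypothetical registry playing the role of tl_func.
registry = {"ingest": fake_ingest}


def run_tool(adata, name, arguments):
    """Look up a tool by name and call it with only the arguments it accepts."""
    if name not in registry:
        raise ValueError(f"Unsupported function: {name}")
    func = registry[name]
    parameters = inspect.signature(func).parameters
    # Keep keys that are both declared by the function and supplied by the caller.
    kwargs = {k: arguments[k] for k in parameters if k in arguments}
    return func(adata, **kwargs)


result = run_tool(
    "active-dataset",
    "ingest",
    {"obs": "louvain", "embedding_method": "umap", "unknown_option": 1},
)
print(result)
```

Filtering against the signature means an unexpected key such as `unknown_option` is dropped silently instead of raising a `TypeError`. That is forgiving for clients, but it can also hide a misspelled parameter name.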

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/huang-sh/scmcp'
