ingest
Map labels and embeddings from single-cell RNA sequencing reference data to new datasets using k-nearest neighbors and embedding methods for biological analysis.
Instructions
Map labels and embeddings from reference data to new data
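
The idea behind kNN label mapping can be illustrated with a toy, standard-library sketch. This is not scanpy's implementation: the function `knn_transfer_labels`, the 2-D points, and the cell-type labels are all made up for illustration, and the real tool works on neighbor graphs and embeddings stored in AnnData objects.

```python
import math
from collections import Counter


def knn_transfer_labels(ref_points, ref_labels, new_points, k=3):
    """Give each new point the majority label of its k nearest reference points.

    Hypothetical helper for illustration only; distances are plain Euclidean.
    """
    predicted = []
    for point in new_points:
        # Rank reference indices by distance to the new point.
        order = sorted(
            range(len(ref_points)),
            key=lambda i: math.dist(point, ref_points[i]),
        )
        # Majority vote among the k closest reference labels.
        votes = Counter(ref_labels[i] for i in order[:k])
        predicted.append(votes.most_common(1)[0][0])
    return predicted


# Made-up reference "embedding" with two well-separated groups.
ref_points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
ref_labels = ["T cell", "T cell", "T cell", "B cell", "B cell", "B cell"]
new_points = [(0.05, 0.1), (5.0, 5.1)]

labels = knn_transfer_labels(ref_points, ref_labels, new_points)
print(labels)  # ['T cell', 'B cell']
```

Each new point inherits the label that dominates among its closest reference neighbors, which is the general principle the 'knn' labeling method refers to.
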
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| obs | No | Labels' keys in adata_ref.obs which need to be mapped to adata.obs (inferred for observation of adata). | |
| embedding_method | No | Embeddings in adata_ref which need to be mapped to adata. The only supported values are 'umap' and 'pca'. | ['umap', 'pca'] |
| labeling_method | No | The method to map labels in adata_ref.obs to adata.obs. The only supported value is 'knn'. | knn |
| neighbors_key | No | If specified, ingest looks in adata_ref.uns[neighbors_key] for neighbors settings and uses the corresponding distances. | |
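
To make the parameter rules concrete, here is a minimal plain-Python sketch of how a raw argument dictionary could be normalized. It assumes the defaults and checks shown in the Pydantic model under Implementation Reference; `normalize_ingest_args` is a hypothetical helper (the server itself relies on Pydantic), and the `obs` key `"louvain"` is only an example value.

```python
VALID_EMBEDDINGS = ("umap", "pca")


def normalize_ingest_args(arguments):
    """Hypothetical helper: apply defaults and checks to raw 'ingest' arguments."""
    args = {
        "obs": arguments.get("obs"),
        "embedding_method": arguments.get("embedding_method", ["umap", "pca"]),
        "labeling_method": arguments.get("labeling_method", "knn"),
        "neighbors_key": arguments.get("neighbors_key"),
    }

    # embedding_method may be one string or a list of strings; compare lowercased.
    methods = args["embedding_method"]
    as_list = [methods] if isinstance(methods, str) else list(methods)
    lowered = [m.lower() for m in as_list]
    for method in lowered:
        if method not in VALID_EMBEDDINGS:
            raise ValueError(f"embedding_method must be one of {list(VALID_EMBEDDINGS)}")
    args["embedding_method"] = lowered[0] if isinstance(methods, str) else lowered

    # 'knn' is the only accepted labeling method.
    if args["labeling_method"].lower() != "knn":
        raise ValueError("labeling_method must be 'knn'")
    args["labeling_method"] = "knn"
    return args


example = normalize_ingest_args({"obs": "louvain", "embedding_method": "UMAP"})
print(example)
# {'obs': 'louvain', 'embedding_method': 'umap', 'labeling_method': 'knn', 'neighbors_key': None}
```

Omitted keys fall back to their defaults, and method names are compared case-insensitively before being stored in lowercase.
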
Implementation Reference
- src/scmcp/tool/tl.py:164-177 (handler) Handler function that executes the 'ingest' tool by calling sc.tl.ingest with validated arguments on the active AnnData object.

  ```python
  def run_tl_func(ads, func, arguments):
      adata = ads.adata_dic[ads.active]
      if func not in tl_func:
          raise ValueError(f"Unsupported function: {func}")
      run_func = tl_func[func]
      parameters = inspect.signature(run_func).parameters
      kwargs = {k: arguments.get(k) for k in parameters if k in arguments}
      try:
          res = run_func(adata, **kwargs)
          add_op_log(adata, run_func, kwargs)
      except Exception as e:
          logger.error(f"Error running function {func}: {e}")
          raise
      return
  ```
- src/scmcp/schema/tl.py:558-605 (schema) Pydantic model defining the input schema and validation for the 'ingest' tool parameters.

  ```python
  class IngestModel(JSONParsingModel):
      """Input schema for the ingest tool that maps labels and embeddings from reference data to new data."""

      obs: Optional[Union[str, List[str]]] = Field(
          default=None,
          description="Labels' keys in adata_ref.obs which need to be mapped to adata.obs (inferred for observation of adata)."
      )

      embedding_method: Union[str, List[str]] = Field(
          default=['umap', 'pca'],
          description="Embeddings in adata_ref which need to be mapped to adata. The only supported values are 'umap' and 'pca'."
      )

      labeling_method: str = Field(
          default='knn',
          description="The method to map labels in adata_ref.obs to adata.obs. The only supported value is 'knn'."
      )

      neighbors_key: Optional[str] = Field(
          default=None,
          description="If specified, ingest looks adata_ref.uns[neighbors_key] for neighbors settings and uses the corresponding distances."
      )

      @field_validator('embedding_method')
      def validate_embedding_method(cls, v: Union[str, List[str]]) -> Union[str, List[str]]:
          """Validate embedding method is supported"""
          valid_methods = ['umap', 'pca']

          if isinstance(v, str):
              if v.lower() not in valid_methods:
                  raise ValueError(f"embedding_method must be one of {valid_methods}")
              return v.lower()

          elif isinstance(v, list):
              for method in v:
                  if method.lower() not in valid_methods:
                      raise ValueError(f"embedding_method must contain only values from {valid_methods}")
              return [method.lower() for method in v]

          return v

      @field_validator('labeling_method')
      def validate_labeling_method(cls, v: str) -> str:
          """Validate labeling method is supported"""
          if v.lower() != 'knn':
              raise ValueError("labeling_method must be 'knn'")
          return v.lower()
  ```
- src/scmcp/tool/tl.py:82-87 (registration) Registers the 'ingest' tool object with MCP types.Tool, including schema reference.

  ```python
  # Add ingest tool
  ingest_tool = types.Tool(
      name="ingest",
      description="Map labels and embeddings from reference data to new data",
      inputSchema=IngestModel.model_json_schema(),
  )
  ```
- src/scmcp/tool/tl.py:125-142 (registration) Maps 'ingest' tool name to the underlying scanpy function sc.tl.ingest for execution.

  ```python
  tl_func = {
      "tsne": sc.tl.tsne,
      "umap": sc.tl.umap,
      "draw_graph": sc.tl.draw_graph,
      "diffmap": sc.tl.diffmap,
      "embedding_density": sc.tl.embedding_density,
      "leiden": sc.tl.leiden,
      "louvain": sc.tl.louvain,
      "dendrogram": sc.tl.dendrogram,
      "dpt": sc.tl.dpt,
      "paga": sc.tl.paga,
      "ingest": sc.tl.ingest,
      "rank_genes_groups": sc.tl.rank_genes_groups,
      "filter_rank_genes_groups": sc.tl.filter_rank_genes_groups,
      "marker_gene_overlap": sc.tl.marker_gene_overlap,
      "score_genes": sc.tl.score_genes,
      "score_genes_cell_cycle": sc.tl.score_genes_cell_cycle,
  }
  ```
- src/scmcp/server.py:69-70 (registration) In the MCP server call_tool handler, dispatches 'ingest' (as part of tl_tools) to run_tl_func.

  ```python
  elif name in tl_tools.keys():
      res = run_tl_func(ads, name, arguments)
  ```
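
The dispatch pattern used by the handler above (look up a function by tool name, then forward only the arguments its signature accepts) can be tried in isolation. The sketch below is a simplified stand-in, not the project's code: `fake_ingest`, `registry`, and `dispatch` are hypothetical names, and `fake_ingest` merely echoes its inputs instead of calling scanpy.

```python
import inspect


def fake_ingest(adata, obs=None, embedding_method=("umap", "pca"), labeling_method="knn"):
    """Stand-in for a tool function; returns the keyword values it received."""
    return {
        "obs": obs,
        "embedding_method": embedding_method,
        "labeling_method": labeling_method,
    }


# Hypothetical registry mapping tool names to callables.
registry = {"ingest": fake_ingest}


def dispatch(adata, func, arguments):
    """Call the registered function with only the arguments its signature declares."""
    if func not in registry:
        raise ValueError(f"Unsupported function: {func}")
    run_func = registry[func]
    parameters = inspect.signature(run_func).parameters
    # Keys not named in the signature (e.g. 'unknown_flag') are silently dropped.
    kwargs = {k: arguments[k] for k in parameters if k in arguments}
    return run_func(adata, **kwargs)


result = dispatch(object(), "ingest", {"obs": "louvain", "unknown_flag": True})
print(result)
# {'obs': 'louvain', 'embedding_method': ('umap', 'pca'), 'labeling_method': 'knn'}
```

Filtering by signature lets a single handler serve many tool functions with different parameters, at the cost of ignoring unrecognized keys rather than reporting them.
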