marker_gene_overlap
Calculate overlap between data-derived marker genes and reference markers to identify shared biological signatures in single-cell RNA sequencing analysis.
Instructions
Calculate overlap between data-derived marker genes and reference markers
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| key | No | The key in adata.uns where the rank_genes_groups output is stored. | rank_genes_groups |
| method | No | Method to calculate marker gene overlap: 'overlap_count', 'overlap_coef', or 'jaccard'. | overlap_count |
| normalize | No | Normalization option for the marker gene overlap output: 'reference' or 'data'. Only applicable when method is 'overlap_count'. | None |
| top_n_markers | No | The number of top data-derived marker genes to use; must be a positive integer. By default the top 100 marker genes are used. | None |
| adj_pval_threshold | No | A significance threshold on the adjusted p-values to select marker genes; must be greater than 0 and at most 1. | None |
| key_added | No | Name of the .uns field that will contain the marker overlap scores. | marker_gene_overlap |
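The three `method` options can be illustrated with plain Python sets. The sketch below uses the standard set-based definitions (shared gene count, overlap coefficient, Jaccard index) and assumes these match what the tool reports; the helper name `overlap_scores` and the gene lists are hypothetical, and the `normalize` option is not covered.

```python
def overlap_scores(data_markers, reference_markers):
    """Return the three overlap measures for two collections of gene names."""
    data, ref = set(data_markers), set(reference_markers)
    shared = data & ref
    union = data | ref
    smaller = min(len(data), len(ref))
    return {
        # Number of genes present in both sets.
        "overlap_count": len(shared),
        # Shared genes divided by the size of the smaller set.
        "overlap_coef": len(shared) / smaller if smaller else 0.0,
        # Shared genes divided by the size of the union.
        "jaccard": len(shared) / len(union) if union else 0.0,
    }


# Hypothetical marker lists for one cluster and one reference cell type.
cluster_markers = ["CD3D", "CD3E", "IL7R", "CCR7"]
t_cell_reference = ["CD3D", "CD3E", "CD2"]

scores = overlap_scores(cluster_markers, t_cell_reference)
print(scores)
```

Here two genes are shared, so the count is 2, the overlap coefficient is 2/3 (the smaller set has three genes), and the Jaccard index is 2/5 (the union has five genes).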
Implementation Reference
- src/scmcp/tool/tl.py:164-177 (handler): Handler function that executes the marker_gene_overlap tool (and other tl tools) by retrieving the corresponding scanpy function from the tl_func mapping and calling it on the active AnnData object with validated arguments.

      def run_tl_func(ads, func, arguments):
          adata = ads.adata_dic[ads.active]
          if func not in tl_func:
              raise ValueError(f"Unsupported function: {func}")
          run_func = tl_func[func]
          parameters = inspect.signature(run_func).parameters
          kwargs = {k: arguments.get(k) for k in parameters if k in arguments}
          try:
              res = run_func(adata, **kwargs)
              add_op_log(adata, run_func, kwargs)
          except Exception as e:
              logger.error(f"Error running function {func}: {e}")
              raise
          return

- src/scmcp/schema/tl.py:752-821 (schema): Pydantic model defining the input schema and validators for the marker_gene_overlap tool parameters.

      class MarkerGeneOverlapModel(JSONParsingModel):
          """Input schema for the marker gene overlap tool."""

          key: str = Field(
              default='rank_genes_groups',
              description="The key in adata.uns where the rank_genes_groups output is stored."
          )
          method: str = Field(
              default='overlap_count',
              description="Method to calculate marker gene overlap: 'overlap_count', 'overlap_coef', or 'jaccard'."
          )
          normalize: Optional[Literal['reference', 'data']] = Field(
              default=None,
              description="Normalization option for the marker gene overlap output. Only applicable when method is 'overlap_count'."
          )
          top_n_markers: Optional[int] = Field(
              default=None,
              description="The number of top data-derived marker genes to use. By default the top 100 marker genes are used.",
              gt=0
          )
          adj_pval_threshold: Optional[float] = Field(
              default=None,
              description="A significance threshold on the adjusted p-values to select marker genes.",
              gt=0,
              le=1.0
          )
          key_added: str = Field(
              default='marker_gene_overlap',
              description="Name of the .uns field that will contain the marker overlap scores."
          )

          @field_validator('method')
          def validate_method(cls, v: str) -> str:
              """Validate method is supported"""
              valid_methods = ['overlap_count', 'overlap_coef', 'jaccard']
              if v not in valid_methods:
                  raise ValueError(f"method must be one of {valid_methods}")
              return v

          @field_validator('normalize')
          def validate_normalize(cls, v: Optional[str], info: ValidationInfo) -> Optional[str]:
              """Validate normalize is only used with overlap_count method"""
              if v is not None:
                  if v not in ['reference', 'data']:
                      raise ValueError("normalize must be either 'reference' or 'data'")
                  values = info.data
                  if 'method' in values and values['method'] != 'overlap_count':
                      raise ValueError("normalize can only be used when method is 'overlap_count'")
              return v

          @field_validator('top_n_markers')
          def validate_top_n_markers(cls, v: Optional[int]) -> Optional[int]:
              """Validate top_n_markers is positive"""
              if v is not None and v <= 0:
                  raise ValueError("top_n_markers must be a positive integer")
              return v

          @field_validator('adj_pval_threshold')
          def validate_adj_pval_threshold(cls, v: Optional[float]) -> Optional[float]:
              """Validate adj_pval_threshold is between 0 and 1"""
              if v is not None and (v <= 0 or v > 1):
                  raise ValueError("adj_pval_threshold must be between 0 and 1")
              return v

- src/scmcp/tool/tl.py:103-108 (registration): Registers the marker_gene_overlap tool as an MCP Tool object with name, description, and input schema from MarkerGeneOverlapModel.

      # Add marker_gene_overlap tool
      marker_gene_overlap_tool = types.Tool(
          name="marker_gene_overlap",
          description="Calculate overlap between data-derived marker genes and reference markers",
          inputSchema=MarkerGeneOverlapModel.model_json_schema(),
      )

- src/scmcp/tool/tl.py:124-142 (registration): Maps the tool name 'marker_gene_overlap' to the underlying scanpy.tl.marker_gene_overlap function for execution.

      # Dictionary mapping tool names to scanpy functions
      tl_func = {
          "tsne": sc.tl.tsne,
          "umap": sc.tl.umap,
          "draw_graph": sc.tl.draw_graph,
          "diffmap": sc.tl.diffmap,
          "embedding_density": sc.tl.embedding_density,
          "leiden": sc.tl.leiden,
          "louvain": sc.tl.louvain,
          "dendrogram": sc.tl.dendrogram,
          "dpt": sc.tl.dpt,
          "paga": sc.tl.paga,
          "ingest": sc.tl.ingest,
          "rank_genes_groups": sc.tl.rank_genes_groups,
          "filter_rank_genes_groups": sc.tl.filter_rank_genes_groups,
          "marker_gene_overlap": sc.tl.marker_gene_overlap,
          "score_genes": sc.tl.score_genes,
          "score_genes_cell_cycle": sc.tl.score_genes_cell_cycle,
      }

- src/scmcp/tool/tl.py:144-162 (registration): Adds the marker_gene_overlap_tool to the dictionary of tl tools, which is used by the server for listing and dispatching.

      # Dictionary mapping tool names to tool objects
      tl_tools = {
          "tsne": tsne_tool,
          "umap": umap_tool,
          "draw_graph": draw_graph_tool,
          "diffmap": diffmap_tool,
          "embedding_density": embedding_density_tool,
          "leiden": leiden_tool,
          "louvain": louvain_tool,
          "dendrogram": dendrogram_tool,
          "dpt": dpt_tool,
          "paga": paga_tool,
          "ingest": ingest_tool,
          "rank_genes_groups": rank_genes_groups_tool,
          "filter_rank_genes_groups": filter_rank_genes_groups_tool,
          "marker_gene_overlap": marker_gene_overlap_tool,
          "score_genes": score_genes_tool,
          "score_genes_cell_cycle": score_genes_cell_cycle_tool,
      }
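
The handler's dispatch pattern (look up a callable by tool name, then forward only the arguments its signature declares) can be tried in isolation. The following is a minimal sketch, not the project's code: `fake_marker_gene_overlap`, `demo_funcs`, and `run_demo_func` are hypothetical stand-ins, and no scanpy or AnnData objects are involved.

```python
import inspect


def fake_marker_gene_overlap(adata, key="rank_genes_groups",
                             method="overlap_count", top_n_markers=None):
    """Hypothetical stand-in for a scanpy tool function; echoes its arguments."""
    return {"key": key, "method": method, "top_n_markers": top_n_markers}


# Registry mapping tool names to callables, mirroring the shape of tl_func.
demo_funcs = {"marker_gene_overlap": fake_marker_gene_overlap}


def run_demo_func(adata, func, arguments):
    """Dispatch by name and pass only the arguments the target declares."""
    if func not in demo_funcs:
        raise ValueError(f"Unsupported function: {func}")
    run_func = demo_funcs[func]
    parameters = inspect.signature(run_func).parameters
    # Drop any argument that is not a parameter of the target function.
    kwargs = {k: arguments.get(k) for k in parameters if k in arguments}
    return run_func(adata, **kwargs)


result = run_demo_func(
    adata=None,  # placeholder; the real handler passes the active AnnData object
    func="marker_gene_overlap",
    arguments={"method": "jaccard", "unknown_option": 1},
)
print(result)
```

In this sketch `unknown_option` is dropped because the stand-in function does not declare it, while `method` is forwarded and the remaining parameters keep their defaults.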