advanced_query
Execute complex genomic queries in batches with parallel or sequential strategies to retrieve gene information and search results from the genome-mcp server.
Instructions
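Advanced batch query with support for complex query strategies.
Args:
  queries: list of queries; each element is an object of the form {"query": str, "type": str}
  strategy: execution strategy (parallel/sequential)
  delay: interval between queries, in seconds
Returns: batch query results
Examples:
  advanced_query([
    {"query": "TP53", "type": "info"},
    {"query": "BRCA1", "type": "info"},
    {"query": "cancer", "type": "search"}
  ])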
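The example call above can be assembled programmatically. The following is a minimal sketch, assuming the tool is invoked with a plain arguments dictionary; `build_queries` is a hypothetical helper introduced for illustration and is not part of genome-mcp.

```python
from typing import Any


def build_queries(gene_symbols: list[str], search_terms: list[str]) -> list[dict[str, Any]]:
    """Hypothetical helper: build the `queries` argument, one dict per query."""
    # Gene symbols are looked up with type "info", free-text terms with type "search".
    queries: list[dict[str, Any]] = [
        {"query": symbol, "type": "info"} for symbol in gene_symbols
    ]
    queries += [{"query": term, "type": "search"} for term in search_terms]
    return queries


# Arguments for one advanced_query call; strategy and delay are optional.
arguments = {
    "queries": build_queries(["TP53", "BRCA1"], ["cancer"]),
    "strategy": "sequential",
    "delay": 0.34,
}
```

Here `arguments["queries"]` matches the three-element list shown in the example call.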
Input Schema
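| Name | Required | Description | Default |
|---|---|---|---|
| queries | Yes | List of queries; each element is an object of the form {"query": str, "type": str} | |
| strategy | No | Execution strategy (parallel/sequential) | parallel |
| delay | No | Interval between queries, in seconds | 0.34 |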
Input Schema (JSON Schema)
{
  "properties": {
    "delay": {
      "default": 0.34,
      "type": "number"
    },
    "queries": {
      "items": {
        "additionalProperties": true,
        "type": "object"
      },
      "type": "array"
    },
    "strategy": {
      "default": "parallel",
      "type": "string"
    }
  },
  "required": [
    "queries"
  ],
  "type": "object"
}
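To make the schema's constraints concrete, here is a minimal sketch of checking an arguments dictionary and filling in the two defaults before a call. It is a hand-written check that uses only the standard library, not a general JSON Schema validator, and `normalize_arguments` is a hypothetical name. Note that the schema types `strategy` as a plain string with no enumeration, and the handler shown in the Implementation Reference runs any value other than "parallel" sequentially.

```python
from typing import Any

# Defaults copied from the JSON Schema above.
DEFAULTS: dict[str, Any] = {"strategy": "parallel", "delay": 0.34}


def normalize_arguments(arguments: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: check the schema's type constraints and apply defaults."""
    if "queries" not in arguments:
        raise ValueError("'queries' is required")
    queries = arguments["queries"]
    if not isinstance(queries, list) or not all(isinstance(q, dict) for q in queries):
        raise TypeError("'queries' must be an array of objects")

    merged = {**DEFAULTS, **arguments}
    if not isinstance(merged["strategy"], str):
        raise TypeError("'strategy' must be a string")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    delay = merged["delay"]
    if isinstance(delay, bool) or not isinstance(delay, (int, float)):
        raise TypeError("'delay' must be a number")
    return merged
```

Calling `normalize_arguments({"queries": [{"query": "TP53", "type": "info"}]})` returns the same dictionary with `strategy` set to "parallel" and `delay` set to 0.34.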
Implementation Reference
- src/genome_mcp/core/tools.py:352-426 (handler) The main execution handler for the 'advanced_query' MCP tool. Handles batch queries with parallel or sequential strategies using QueryExecutor. Registered via the @mcp.tool() decorator within the create_mcp_tools function.

      @mcp.tool()
      async def advanced_query(
          queries: list[dict[str, Any]],
          strategy: str = "parallel",
          delay: float = 0.34,  # NCBI API rate limit
      ) -> AdvancedQueryResult:
          """
          Advanced batch query with support for complex query strategies

          Args:
              queries: list of queries; each element contains {"query": str, "type": str}
              strategy: execution strategy (parallel/sequential)
              delay: interval between queries (seconds)

          Returns:
              Batch query results

          Examples:
              advanced_query([
                  {"query": "TP53", "type": "info"},
                  {"query": "BRCA1", "type": "info"},
                  {"query": "cancer", "type": "search"}
              ])
          """
          results = {}

          async def execute_single_query(index: int, query_dict: dict[str, Any]):
              try:
                  parsed = QueryParser.parse_by_type(
                      query_dict["query"], query_dict.get("type", "auto")
                  )
                  result = await _query_executor.execute(parsed, **query_dict)
                  results[index] = result
                  await asyncio.sleep(delay)  # respect the rate limit
              except ValidationError as e:
                  results[index] = format_simple_error(
                      e, query=query_dict.get("query", ""), operation="advanced_query"
                  )
              except Exception as e:
                  results[index] = format_simple_error(
                      e, query=query_dict.get("query", ""), operation="advanced_query"
                  )

          if strategy == "parallel":
              # concurrent queries
              await asyncio.gather(
                  *[execute_single_query(i, q) for i, q in enumerate(queries)]
              )
          else:
              # sequential queries (suited to dependent queries)
              for i, query_dict in enumerate(queries):
                  try:
                      parsed = QueryParser.parse_by_type(
                          query_dict["query"], query_dict.get("type", "auto")
                      )
                      result = await _query_executor.execute(parsed, **query_dict)
                      results[i] = result
                      await asyncio.sleep(delay)  # respect the rate limit
                  except ValidationError as e:
                      results[i] = format_simple_error(
                          e, query=query_dict.get("query", ""), operation="advanced_query"
                      )
                  except Exception as e:
                      results[i] = format_simple_error(
                          e, query=query_dict.get("query", ""), operation="advanced_query"
                      )

          return {
              "strategy": strategy,
              "total_queries": len(queries),
              "successful": len([r for r in results.values() if "error" not in r]),
              "results": results,
          }

- src/genome_mcp/core/types.py:59-66 (schema) TypedDict schema defining the structure of AdvancedQueryResult, the return type of the advanced_query tool.

      class AdvancedQueryResult(TypedDict):
          """Result type for advanced queries"""

          strategy: str
          total_queries: int
          successful: int
          results: dict[int, dict[str, Any]]

- src/genome_mcp/core/tools.py:250-694 (registration) The create_mcp_tools function registers all MCP tools, including advanced_query, via decorators.

      def create_mcp_tools(mcp: FastMCP) -> None:
          """Create and register all MCP tools"""

          @mcp.tool()
          async def get_data(
              query: str | list[str],
              query_type: str = "auto",
              data_type: str = "gene",
              format: str = "simple",
              species: str = "human",
              max_results: int = 20,
          ) -> ToolResult:
              """
              Smart data retrieval interface - handles all query types in one place

              Automatically recognized query types:
              - "TP53" → gene information query
              - "P04637" → detailed protein information query
              - "cancer" → gene search
              - "protein kinase" → protein function search
              - "chr17:7565097-7590856" → region search
              - "TP53, BRCA1" → batch gene information
              - "breast cancer genes" → smart search
              - "TP53 homologs" → homologous gene query
              - "evolutionary conservation" → evolutionary analysis query

              Args:
                  query: query content (gene ID, protein ID, search term, region, list of IDs, or evolution-related query)
                  query_type: query type (auto/info/search/region/protein/gene_protein/ortholog/evolution)
                  data_type: data type (gene/protein/gene_protein/ortholog/evolution)
                  format: return format (simple/detailed/raw)
                  species: species (default: human; supports 9606/human/mouse/rat, etc.)
                  max_results: maximum number of results (default: 20)

              Returns:
                  Dictionary of query results containing gene and/or protein information

              Examples:
                  # gene information query
                  get_data("TP53")
                  get_data("TP53", format="detailed")

                  # batch query
                  get_data(["TP53", "BRCA1", "BRCA2"])

                  # region search
                  get_data("chr17:7565097-7590856")

                  # protein query
                  get_data("P04637", data_type="protein")

                  # integrated gene-protein query
                  get_data("TP53", data_type="gene_protein")

                  # protein function search
                  get_data("tumor suppressor", data_type="protein")
              """
              try:
                  # validate common parameters
                  validated_max_results, validated_species, validated_query_type = (
                      validate_common_params(
                          max_results=max_results, species=species, query_type=query_type
                      )
                  )

                  # adjust the query type according to data_type
                  if data_type == "protein" and validated_query_type == "auto":
                      validated_query_type = "protein"
                  elif data_type == "gene_protein" and validated_query_type == "auto":
                      validated_query_type = "gene_protein"
                  elif data_type == "ortholog" and validated_query_type == "auto":
                      validated_query_type = "ortholog"
                  elif data_type == "evolution" and validated_query_type == "auto":
                      validated_query_type = "evolution"
                  elif data_type == "gene" and validated_query_type == "auto":
                      validated_query_type = "auto"  # keep the original auto-detection

                  # parse the query intent
                  parsed = QueryParser.parse(query, validated_query_type)

                  # use the validated species
                  if "organism" not in parsed.params:
                      parsed.params["organism"] = validated_species

                  # execute the query
                  result = await _query_executor.execute(
                      parsed, max_results=validated_max_results
                  )

                  # format the result
                  if format == "simple":
                      return _format_simple_result(result)
                  elif format == "detailed":
                      return result
                  else:
                      return result

              except ValidationError as e:
                  return format_simple_error(e, query=query, operation="get_data")
              except Exception as e:
                  return format_simple_error(e, query=query, operation="get_data")

          @mcp.tool()
          async def advanced_query(
              queries: list[dict[str, Any]],
              strategy: str = "parallel",
              delay: float = 0.34,  # NCBI API rate limit
          ) -> AdvancedQueryResult:
              """
              Advanced batch query with support for complex query strategies

              Args:
                  queries: list of queries; each element contains {"query": str, "type": str}
                  strategy: execution strategy (parallel/sequential)
                  delay: interval between queries (seconds)

              Returns:
                  Batch query results

              Examples:
                  advanced_query([
                      {"query": "TP53", "type": "info"},
                      {"query": "BRCA1", "type": "info"},
                      {"query": "cancer", "type": "search"}
                  ])
              """
              results = {}

              async def execute_single_query(index: int, query_dict: dict[str, Any]):
                  try:
                      parsed = QueryParser.parse_by_type(
                          query_dict["query"], query_dict.get("type", "auto")
                      )
                      result = await _query_executor.execute(parsed, **query_dict)
                      results[index] = result
                      await asyncio.sleep(delay)  # respect the rate limit
                  except ValidationError as e:
                      results[index] = format_simple_error(
                          e, query=query_dict.get("query", ""), operation="advanced_query"
                      )
                  except Exception as e:
                      results[index] = format_simple_error(
                          e, query=query_dict.get("query", ""), operation="advanced_query"
                      )

              if strategy == "parallel":
                  # concurrent queries
                  await asyncio.gather(
                      *[execute_single_query(i, q) for i, q in enumerate(queries)]
                  )
              else:
                  # sequential queries (suited to dependent queries)
                  for i, query_dict in enumerate(queries):
                      try:
                          parsed = QueryParser.parse_by_type(
                              query_dict["query"], query_dict.get("type", "auto")
                          )
                          result = await _query_executor.execute(parsed, **query_dict)
                          results[i] = result
                          await asyncio.sleep(delay)  # respect the rate limit
                      except ValidationError as e:
                          results[i] = format_simple_error(
                              e, query=query_dict.get("query", ""), operation="advanced_query"
                          )
                      except Exception as e:
                          results[i] = format_simple_error(
                              e, query=query_dict.get("query", ""), operation="advanced_query"
                          )

              return {
                  "strategy": strategy,
                  "total_queries": len(queries),
                  "successful": len([r for r in results.values() if "error" not in r]),
                  "results": results,
              }

          @mcp.tool()
          async def smart_search(
              description: str,
              context: str = "genomics",
              filters: dict[str, Any] = None,
              max_results: int = 20,
          ) -> SearchResult:
              """
              Smart semantic search - interprets a natural-language description and runs the corresponding query

              Examples of semantic interpretation:
              - "breast cancer genes on chromosome 17" → find breast cancer genes on chromosome 17
              - "TP53 protein interactions" → find TP53 protein interactions
              - "tumor suppressor genes" → find tumor suppressor genes
              - "genes related to DNA repair" → find genes related to DNA repair

              Args:
                  description: natural-language description
                  context: search context (genomics/proteomics/pathway)
                  filters: filter conditions
                  max_results: maximum number of results

              Returns:
                  Smart search results

              Examples:
                  smart_search("breast cancer genes on chromosome 17")
                  smart_search("TP53 protein interactions", context="proteomics")
                  smart_search("DNA repair genes", filters={"species": "human"})
              """
              try:
                  # validate search parameters
                  validated_description, validated_context, validated_max_results = (
                      validate_search_params(
                          description=description, context=context, max_results=max_results
                      )
                  )

                  # intelligently parse the query intent
                  query = _apply_filters(validated_description, filters)

                  # adjust the query according to the context
                  if validated_context == "proteomics":
                      query_type = "protein"
                  elif validated_context == "pathway":
                      query_type = "search"
                  else:
                      query_type = "auto"

                  # parse the query intent
                  parsed = QueryParser.parse(query, query_type)

                  # execute the query (use the query executor directly to avoid calls between MCP tools)
                  result = await _query_executor.execute(
                      parsed, max_results=validated_max_results
                  )

                  # attach smart-parsing information
                  result["smart_search_info"] = {
                      "description": validated_description,
                      "context": validated_context,
                      "parsed_query": query,
                      "filters_applied": filters is not None,
                  }

                  return result

              except ValidationError as e:
                  return format_simple_error(e, query=description, operation="smart_search")
              except Exception as e:
                  return format_simple_error(e, query=description, operation="smart_search")

          @mcp.tool()
          async def analyze_gene_evolution(
              gene_symbol: str,
              target_species: list[str] = None,
              analysis_level: str = "Eukaryota",
              include_sequence_info: bool = True,
          ) -> EvolutionResult:
              """
              Gene evolution analysis tool - MCP interface wrapper

              Args:
                  gene_symbol: gene symbol (e.g. TP53, BRCA1)
                  target_species: list of target species (e.g. ["mouse", "rat", "zebrafish"])
                  analysis_level: analysis level (e.g. Eukaryota, Metazoa, Vertebrata)
                  include_sequence_info: whether to include sequence information

              Returns:
                  Evolutionary analysis results

              Examples:
                  # analyze the evolution of TP53 in mammals
                  analyze_gene_evolution("TP53", ["human", "mouse", "rat", "dog"])
              """
              try:
                  # validate gene analysis parameters
                  (
                      validated_gene_symbol,
                      validated_target_species,
                      validated_analysis_level,
                  ) = validate_gene_params(
                      gene_symbol=gene_symbol,
                      target_species=target_species,
                      analysis_level=analysis_level,
                  )

                  return await _analyze_gene_evolution_internal(
                      validated_gene_symbol,
                      validated_target_species,
                      validated_analysis_level,
                      include_sequence_info,
                      _query_executor,
                  )
              except ValidationError as e:
                  return format_simple_error(
                      e, query=gene_symbol, operation="analyze_gene_evolution"
                  )
              except Exception as e:
                  return format_simple_error(
                      e, query=gene_symbol, operation="analyze_gene_evolution"
                  )

          @mcp.tool()
          async def build_phylogenetic_profile(
              gene_symbols: list[str],
              species_set: list[str] = None,
              include_domain_info: bool = True,
          ) -> PhylogeneticProfileResult:
              """
              Phylogenetic profile construction tool - MCP interface wrapper

              Args:
                  gene_symbols: list of gene symbols
                  species_set: set of species (defaults to common model organisms)
                  include_domain_info: whether to include domain information

              Returns:
                  Phylogenetic profile data

              Examples:
                  # analyze the distribution of the p53 family in vertebrates
                  build_phylogenetic_profile(["TP53", "TP63", "TP73"], ["human", "mouse", "zebrafish"])
              """
              try:
                  return await _build_phylogenetic_profile_internal(
                      gene_symbols, species_set, include_domain_info, _query_executor
                  )
              except ValidationError as e:
                  return format_simple_error(
                      e, query=str(gene_symbols), operation="build_phylogenetic_profile"
                  )
              except Exception as e:
                  return format_simple_error(
                      e, query=str(gene_symbols), operation="build_phylogenetic_profile"
                  )

          @mcp.tool()
          async def kegg_pathway_enrichment(
              gene_list: list[str],
              organism: str = "hsa",
              pvalue_threshold: float = 0.05,
              min_gene_count: int = 2,
          ) -> KEGGResult:
              """
              KEGG pathway enrichment analysis tool - MVP version

              Analyzes the enrichment of a gene list in KEGG pathways and identifies
              significantly associated biological pathways

              Args:
                  gene_list: gene list (e.g. ["TP53", "BRCA1", "BRCA2"])
                  organism: organism code (default "hsa", human)
                  pvalue_threshold: p-value significance threshold (default 0.05)
                  min_gene_count: minimum number of genes in a pathway (default 2)

              Returns:
                  Pathway enrichment analysis results, including:
                  - list of significantly enriched pathways
                  - p-values and FDR-corrected statistical significance
                  - fold enrichment and gene count information
                  - analysis parameters and metadata

              Examples:
                  # pathway enrichment for cancer-related genes
                  kegg_pathway_enrichment(["TP53", "BRCA1", "BRCA2", "EGFR"])

                  # pathway enrichment for mouse genes
                  kegg_pathway_enrichment(["Trp53", "Brca1"], organism="mmu")

                  # use a stricter significance threshold
                  kegg_pathway_enrichment(["TP53", "BRCA1"], pvalue_threshold=0.01)
              """
              try:
                  # validate KEGG analysis parameters
                  (
                      validated_gene_list,
                      validated_organism,
                      validated_pvalue_threshold,
                      validated_min_gene_count,
                  ) = validate_kegg_params(
                      gene_list=gene_list,
                      organism=organism,
                      pvalue_threshold=pvalue_threshold,
                      min_gene_count=min_gene_count,
                  )

                  # parse as a pathway enrichment query using QueryParser
                  parsed = QueryParser.parse(
                      validated_gene_list, query_type="pathway_enrichment"
                  )

                  # update parameters
                  parsed.params.update(
                      {
                          "gene_list": validated_gene_list,
                          "organism": validated_organism,
                          "pvalue_threshold": validated_pvalue_threshold,
                          "min_gene_count": validated_min_gene_count,
                      }
                  )

                  # execute the query
                  result = await _query_executor.execute(parsed)

                  # format the result
                  if "result" in result:
                      enrichment_data = result["result"]

                      # add query information
                      enrichment_data["query_info"] = {
                          "gene_list": validated_gene_list,
                          "analysis_date": "2025-10-24",
                          "organism": validated_organism,
                          "method": "KEGG Pathway Enrichment",
                          "parameters": {
                              "pvalue_threshold": validated_pvalue_threshold,
                              "min_gene_count": validated_min_gene_count,
                          },
                      }

                      return enrichment_data
                  elif "error" in result:
                      return {
                          "error": result["error"],
                          "query_genes": gene_list,
                          "organism": organism,
                          "suggestions": [
                              "检查基因ID格式是否正确",  # check that the gene ID format is correct
                              "确认生物体代码是否支持",  # confirm that the organism code is supported
                              "验证网络连接是否正常",  # verify that the network connection works
                          ],
                      }
                  else:
                      return {
                          "error": "Unknown error occurred during pathway enrichment analysis",
                          "query_genes": gene_list,
                          "organism": organism,
                      }

              except ValidationError as e:
                  return format_simple_error(
                      e, query=str(gene_list), operation="kegg_pathway_enrichment"
                  )
              except Exception as e:
                  return format_simple_error(
                      e, query=str(gene_list), operation="kegg_pathway_enrichment"
                  )
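
The control flow of the advanced_query handler listed above can be tried without network access. The following is a simplified sketch, not the project's code: `fake_execute` is a hypothetical stand-in for the real query executor, and `BatchSummary`, `run_batch` and `split_results` are illustrative names. `BatchSummary` carries the same four fields as AdvancedQueryResult. As in the handler, the `delay` sleep runs after each query inside its own task, so under the parallel strategy it does not stagger when the requests start.

```python
import asyncio
from typing import Any, TypedDict


class BatchSummary(TypedDict):
    """Illustrative copy of the four AdvancedQueryResult fields."""

    strategy: str
    total_queries: int
    successful: int
    results: dict[int, dict[str, Any]]


async def fake_execute(query: str) -> dict[str, Any]:
    """Hypothetical stand-in for the real executor; makes no network calls."""
    await asyncio.sleep(0.01)  # simulate request latency
    return {"query": query, "result": f"data for {query}"}


async def run_batch(
    queries: list[dict[str, Any]], strategy: str = "parallel", delay: float = 0.0
) -> BatchSummary:
    results: dict[int, dict[str, Any]] = {}

    async def run_one(index: int, query_dict: dict[str, Any]) -> None:
        # A failure is stored under the query's index instead of aborting the batch.
        try:
            results[index] = await fake_execute(query_dict["query"])
            await asyncio.sleep(delay)
        except Exception as exc:
            results[index] = {"error": f"{type(exc).__name__}: {exc}"}

    if strategy == "parallel":
        # All queries are scheduled at once and awaited together.
        await asyncio.gather(*(run_one(i, q) for i, q in enumerate(queries)))
    else:
        # Any other strategy value runs the queries one after another.
        for i, q in enumerate(queries):
            await run_one(i, q)

    return {
        "strategy": strategy,
        "total_queries": len(queries),
        "successful": sum(1 for r in results.values() if "error" not in r),
        "results": results,
    }


def split_results(
    summary: BatchSummary,
) -> tuple[dict[int, dict[str, Any]], dict[int, dict[str, Any]]]:
    """Separate successful entries from error entries, keyed by query index."""
    ok: dict[int, dict[str, Any]] = {}
    failed: dict[int, dict[str, Any]] = {}
    for index, item in summary["results"].items():
        (failed if "error" in item else ok)[index] = item
    return ok, failed


if __name__ == "__main__":
    # The second query dict lacks the "query" key, so it is recorded as an error.
    batch = [{"query": "TP53"}, {"type": "info"}]
    summary = asyncio.run(run_batch(batch, strategy="sequential"))
    ok, failed = split_results(summary)
    print(summary["successful"], sorted(ok), sorted(failed))  # prints: 1 [0] [1]
```

Because `results` is keyed by the position of each query in the input list, a caller can match every entry back to its query even when some fail. If the result is serialized to JSON on its way to a client, those integer keys will arrive as strings, since JSON object keys are always strings.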