build_phylogenetic_profile

Constructs phylogenetic profiles to analyze gene distribution across species, supporting evolutionary research and homologous gene analysis.

Instructions

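Phylogenetic profile construction tool (MCP interface wrapper).

Args:
    gene_symbols: List of gene symbols.
    species_set: Set of species (defaults to common model organisms).
    include_domain_info: Whether to include domain information.

Returns:
    Phylogenetic profile data.

Examples:
    # Analyze the distribution of the p53 family across vertebrates
    build_phylogenetic_profile(["TP53", "TP63", "TP73"], ["human", "mouse", "zebrafish"])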
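
As a rough illustration of what a caller might do with the returned data, the sketch below lists, for each gene, the species in which no ortholog was detected. It assumes the result carries the `presence_matrix` key that the core implementation in the Implementation Reference returns. The helper name `absent_species` and the sample data are hypothetical and are not output from the real tool.

```python
from typing import Any


def absent_species(profile: dict[str, Any]) -> dict[str, list[str]]:
    """Map each gene to the species where its presence flag is False."""
    matrix = profile.get("presence_matrix", {})
    return {
        gene: [species for species, present in row.items() if not present]
        for gene, row in matrix.items()
    }


# Hand-written placeholder shaped like the core implementation's return
# value. Gene names and flags are invented for illustration only.
sample_profile = {
    "gene_symbols": ["GENE_A", "GENE_B"],
    "species_set": ["human", "mouse", "zebrafish"],
    "presence_matrix": {
        "GENE_A": {"human": True, "mouse": True, "zebrafish": False},
        "GENE_B": {"human": True, "mouse": True, "zebrafish": True},
    },
}

print(absent_species(sample_profile))
```

An empty list for a gene means every species in the set was flagged as present for it.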

Input Schema

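Name                 Required  Description                                          Default
gene_symbols         Yes       List of gene symbols
species_set          No        Set of species (defaults to common model organisms)  None
include_domain_info  No        Whether to include domain information                True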
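
To make the schema concrete, here is a minimal sketch of assembling the JSON-RPC `tools/call` request that an MCP client would send for this tool. The helper `build_tool_call` is hypothetical, and rejecting an empty `gene_symbols` list is an assumption of this sketch rather than something the schema states. Optional arguments are omitted when not supplied so the server-side defaults apply.

```python
import json


def build_tool_call(gene_symbols, species_set=None, include_domain_info=None, request_id=1):
    """Assemble a JSON-RPC 2.0 request for the build_phylogenetic_profile tool."""
    if not gene_symbols:
        # Assumption of this sketch: an empty list is treated as a missing
        # required argument.
        raise ValueError("gene_symbols is required and must not be empty")

    arguments = {"gene_symbols": list(gene_symbols)}
    # Only send optional arguments that the caller actually provided.
    if species_set is not None:
        arguments["species_set"] = list(species_set)
    if include_domain_info is not None:
        arguments["include_domain_info"] = bool(include_domain_info)

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "build_phylogenetic_profile", "arguments": arguments},
    }


request = build_tool_call(["TP53", "TP63", "TP73"], ["human", "mouse", "zebrafish"])
print(json.dumps(request, indent=2))
```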

Implementation Reference

  • Core implementation of the build_phylogenetic_profile tool. Performs ortholog analysis for each gene using analyze_gene_evolution, constructs presence/absence matrix across species, computes family evolution analysis, and returns phylogenetic profile data.
    async def build_phylogenetic_profile(
        gene_symbols: list[str],
        species_set: list[str] | None = None,
        include_domain_info: bool = True,
        query_executor: QueryExecutor | None = None,
    ) -> dict[str, Any]:
        """
        Build a phylogenetic profile: analyze how multiple genes are
        distributed across a given set of species.

        Used to study:
        - Gene family evolution
        - Species-specific gene loss
        - Functional conservation
        - Comparative genomics

        Args:
            gene_symbols: List of gene symbols
            species_set: Set of species (defaults to common model organisms)
            include_domain_info: Whether to include domain information
            query_executor: Query executor instance

        Returns:
            Phylogenetic profile data, including the presence/absence
            matrix and the evolutionary analysis

        Examples:
            # Analyze the distribution of the p53 family across vertebrates
            build_phylogenetic_profile(["TP53", "TP63", "TP73"], ["human", "mouse", "zebrafish"])
        """
        if query_executor is None:
            query_executor = QueryExecutor()

        if species_set is None:
            species_set = [
                "human",
                "mouse",
                "rat",
                "zebrafish",
                "fruitfly",
                "worm",
                "yeast",
            ]

        try:
            # Analyze the genes one by one
            results = {}
            service_unavailable_count = 0

            for gene_symbol in gene_symbols:
                # Analyze the orthology relationships of each gene
                gene_result = await analyze_gene_evolution(
                    gene_symbol,
                    species_set,
                    include_sequence_info=include_domain_info,
                    query_executor=query_executor,
                )

                # Check the service status
                if (
                    gene_result.get("error")
                    and gene_result.get("status") == "service_unavailable"
                ):
                    service_unavailable_count += 1

                results[gene_symbol] = gene_result

            # If every gene query failed, report the service as unavailable
            if service_unavailable_count == len(gene_symbols):
                return {
                    "error": "Phylogenetic analysis service unavailable",
                    "gene_symbols": gene_symbols,
                    "status": "service_unavailable",
                    "message": "The Ensembl API is unavailable; the phylogenetic profile cannot be built",
                    "suggestions": [
                        "Retry later",
                        "Check the network connection",
                        "Confirm the gene symbols are correct",
                    ],
                    "alternative_resources": [
                        {
                            "name": "Ensembl web interface",
                            "url": "https://www.ensembl.org/Homo_sapiens/Search",
                            "description": "Search for the gene manually on the Ensembl website",
                        },
                        {
                            "name": "NCBI Gene",
                            "url": "https://www.ncbi.nlm.nih.gov/gene",
                            "description": "NCBI Gene database",
                        },
                    ],
                }

            # Build the presence/absence matrix
            presence_matrix = _build_presence_absence_matrix(results, species_set)

            # Analyze gene family evolution
            family_analysis = _analyze_gene_family_evolution(results, species_set)

            return {
                "gene_symbols": gene_symbols,
                "species_set": species_set,
                "presence_matrix": presence_matrix,
                "family_analysis": family_analysis,
                "individual_results": results,
                "summary": {
                    "total_genes": len(gene_symbols),
                    "total_species": len(species_set),
                    "successful_queries": len(gene_symbols) - service_unavailable_count,
                    "failed_queries": service_unavailable_count,
                    "conservation_patterns": _identify_conservation_patterns(
                        presence_matrix
                    ),
                },
            }

        except Exception as e:
            return {
                "error": str(e),
                "gene_symbols": gene_symbols,
                "error_type": "phylogenetic_profile_error",
                "suggestions": [
                    "Check that the gene symbol list is correct",
                    "Confirm the species list is formatted correctly",
                    "Retry with fewer genes",
                ],
                "troubleshooting": {
                    "gene_count": len(gene_symbols),
                    "species_count": len(species_set) if species_set else 0,
                    "possible_causes": [
                        "Some gene symbols do not exist",
                        "Network connectivity problems",
                        "Ensembl API limits",
                    ],
                },
            }
  • MCP tool registration and wrapper handler for build_phylogenetic_profile. Registers the tool with FastMCP, performs validation, and delegates to the internal implementation.
    async def build_phylogenetic_profile(
        gene_symbols: list[str],
        species_set: list[str] | None = None,
        include_domain_info: bool = True,
    ) -> PhylogeneticProfileResult:
        """
        Phylogenetic profile construction tool (MCP interface wrapper).

        Args:
            gene_symbols: List of gene symbols
            species_set: Set of species (defaults to common model organisms)
            include_domain_info: Whether to include domain information

        Returns:
            Phylogenetic profile data

        Examples:
            # Analyze the distribution of the p53 family across vertebrates
            build_phylogenetic_profile(["TP53", "TP63", "TP73"], ["human", "mouse", "zebrafish"])
        """
        try:
            # Delegate to the core implementation with the shared query executor
            return await _build_phylogenetic_profile_internal(
                gene_symbols, species_set, include_domain_info, _query_executor
            )
        except ValidationError as e:
            return format_simple_error(
                e, query=str(gene_symbols), operation="build_phylogenetic_profile"
            )
        except Exception as e:
            return format_simple_error(
                e, query=str(gene_symbols), operation="build_phylogenetic_profile"
            )
  • Type definition (TypedDict) for the return type of build_phylogenetic_profile, defining the structure of the phylogenetic profile result.
    class PhylogeneticProfileResult(TypedDict):
        """Result type for a phylogenetic profile."""

        query_genes: list[str]
        phylogenetic_data: dict[str, list[dict[str, Any]]]
        domain_info: dict[str, list[dict[str, Any]]] | None
        profile_metadata: dict[str, Any]
  • Helper function that builds the presence/absence matrix from ortholog query results across the specified species set.
    def _build_presence_absence_matrix(
        results: dict, species_set: list[str]
    ) -> dict[str, dict]:
        """Build the presence/absence matrix."""
        matrix = {}

        for gene_symbol, gene_result in results.items():
            gene_row = {}
            orthologs_data = gene_result.get("result", {}).get("orthologs", [])
            present_species = set()

            # Collect the species that have an ortholog
            for ortholog in orthologs_data:
                organism_name = ortholog.get("organism_name", "").lower()
                # Normalize the name: lowercase, underscores replaced by spaces
                normalized_name = organism_name.replace("_", " ")
                present_species.add(normalized_name)

            for species in species_set:
                species_lower = species.lower()
                # Try the possible spellings of the species name
                species_variants = [
                    species_lower,
                    species_lower.replace(" ", "_"),
                    species_lower.replace(" ", ""),
                ]
                # Present if a variant equals, or is a substring of, any collected name
                gene_row[species] = any(
                    variant in present_species
                    or any(variant in present for present in present_species)
                    for variant in species_variants
                )

            matrix[gene_symbol] = gene_row

        return matrix
  • Helper function that analyzes gene family evolution by calculating conservation scores and identifying most/least conserved genes.
    def _analyze_gene_family_evolution(
        results: dict, species_set: list[str]
    ) -> dict[str, Any]:
        """Analyze gene family evolution."""
        # Compute a conservation score for each gene
        conservation_scores = {}
        for gene_symbol, gene_result in results.items():
            conservation_scores[gene_symbol] = _calculate_conservation_score(gene_result)

        # Identify the most and least conserved genes
        most_conserved = max(conservation_scores.items(), key=lambda x: x[1])
        least_conserved = min(conservation_scores.items(), key=lambda x: x[1])

        return {
            "conservation_scores": conservation_scores,
            "most_conserved_gene": most_conserved[0],
            "least_conserved_gene": least_conserved[0],
            "conservation_range": most_conserved[1] - least_conserved[1],
        }
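
The presence/absence helper above decides whether a species is present with an exact or substring match against the normalized organism names. The sketch below isolates that test so its behavior can be checked on small inputs. The function name `species_matches` and the organism names are illustrative assumptions; what `analyze_gene_evolution` actually places in `organism_name` is not shown in this listing.

```python
def species_matches(species: str, present_species: set[str]) -> bool:
    """Replicate the variant/substring test used by the presence/absence helper."""
    species_lower = species.lower()
    variants = [
        species_lower,
        species_lower.replace(" ", "_"),
        species_lower.replace(" ", ""),
    ]
    return any(
        variant in present_species
        or any(variant in present for present in present_species)
        for variant in variants
    )


# Hypothetical normalized organism names, as if taken from ortholog records.
present = {"homo sapiens", "mustela putorius furo"}

print(species_matches("Homo sapiens", present))  # exact match after lowercasing
print(species_matches("human", present))         # common name matches nothing here
print(species_matches("mus", present))           # substring of "mustela putorius furo"
```

Under these assumed inputs, a common name such as "human" is reported absent when the records use scientific names, and a short query such as "mus" is reported present because it is a substring of an unrelated name. Passing full scientific names in `species_set`, or normalizing names upstream, may avoid both effects.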

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/gqy20/genome-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.