
download_structure_tool

Download protein structure files from the RCSB Protein Data Bank in multiple formats (PDB, mmCIF, CIF, MMTF) for bioinformatics research and analysis.

Instructions

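Structure file tool: download and manage protein structure files.

This tool handles all file-related operations, from downloading to explaining formats.

Args:
    pdb_id: PDB ID (for example, "5G53")
    file_format: File format
        - "pdb": standard PDB format (recommended, human-readable)
        - "mmcif": macromolecular Crystallographic Information File format (modern standard)
        - "cif": Crystallographic Information File format
        - "mmtf": Macromolecular Transmission Format (binary, fast)
    save_local: Whether to save to a local file (default False returns the content)
    ctx: FastMCP Context, used for progress feedback and logging

Returns:
    File content or download information, plus format notes and a usage guide

Examples:
    # Get the PDB file content
    download_structure("1A3N")

    # Download the mmCIF format and save it locally
    download_structure("2HHB", "mmcif", True)

    # Get the fast MMTF format
    download_structure("6VSB", "mmtf")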

Input Schema

| Name        | Required | Description | Default |
|-------------|----------|-------------|---------|
| pdb_id      | Yes      |             |         |
| file_format | No       |             | pdb     |
| save_local  | No       |             |         |
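
The schema above carries no parameter descriptions, so a caller has to assemble the arguments from the tool description. The following is a minimal sketch of that step, assuming a hypothetical helper named `build_arguments`. The four accepted `file_format` values come from the tool description, and the `False` default for `save_local` comes from the handler signature shown under Implementation Reference.

```python
# The four format names listed in the tool description.
SUPPORTED_FORMATS = ("pdb", "mmcif", "cif", "mmtf")


def build_arguments(
    pdb_id: str, file_format: str = "pdb", save_local: bool = False
) -> dict:
    """Hypothetical helper: build the arguments payload for download_structure_tool."""
    if file_format not in SUPPORTED_FORMATS:
        raise ValueError(
            f"Unsupported file_format {file_format!r}; "
            f"expected one of {', '.join(SUPPORTED_FORMATS)}"
        )
    return {"pdb_id": pdb_id, "file_format": file_format, "save_local": save_local}


if __name__ == "__main__":
    print(build_arguments("1A3N"))
    print(build_arguments("2HHB", "mmcif", True))
```

Rejecting an unknown format on the client side only saves a round trip; the server performs its own check against `get_supported_formats()`.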

Output Schema

No arguments

Implementation Reference

  • MCP tool handler for 'download_structure_tool'. Registers the tool with @mcp.tool() decorator and delegates execution to the download_structure helper function.
    @mcp.tool()
    async def download_structure_tool(
        pdb_id: str,
        file_format: str = "pdb",
        save_local: bool = False,
        ctx: Context | None = None,
    ) -> dict[str, Any]:
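        """
        Structure file tool: download and manage protein structure files.
    
        This tool handles all file-related operations, from downloading to explaining formats.
    
        Args:
            pdb_id: PDB ID (for example, "5G53")
            file_format: File format
                - "pdb": standard PDB format (recommended, human-readable)
                - "mmcif": macromolecular Crystallographic Information File format (modern standard)
                - "cif": Crystallographic Information File format
                - "mmtf": Macromolecular Transmission Format (binary, fast)
            save_local: Whether to save to a local file (default False returns the content)
            ctx: FastMCP Context, used for progress feedback and logging
    
        Returns:
            File content or download information, plus format notes and a usage guide
    
        Examples:
            # Get the PDB file content
            download_structure("1A3N")
    
            # Download the mmCIF format and save it locally
            download_structure("2HHB", "mmcif", True)
    
            # Get the fast MMTF format
            download_structure("6VSB", "mmtf")
        """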
        return await download_structure(pdb_id, file_format, save_local, ctx)
  • Core helper function implementing the download logic: validates PDB ID and format, downloads from RCSB, handles local save or content preview, provides format info, and formats MCP response.
    async def download_structure(
        pdb_id: str, file_format: str = "pdb", save_local: bool = False, ctx: Context | None = None
    ) -> dict[str, Any]:
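        """
        Structure file tool: download and manage protein structure files.
    
        This tool handles all file-related operations, from downloading to explaining formats.
    
        Args:
            pdb_id: PDB ID (for example, "5G53")
            file_format: File format
                - "pdb": standard PDB format (recommended, human-readable)
                - "mmcif": macromolecular Crystallographic Information File format (modern standard)
                - "cif": Crystallographic Information File format
                - "mmtf": Macromolecular Transmission Format (binary, fast)
            save_local: Whether to save to a local file (default False returns the content)
            ctx: FastMCP Context, used for progress feedback and logging
    
        Returns:
            File content or download information, plus format notes and a usage guide
        """
        try:
            if ctx:
                await ctx.info(f"📁 Starting structure file download: {pdb_id}.{file_format}")
                await ctx.report_progress(0, 100, "Initializing download...")
            if not validate_pdb_id(pdb_id):
                return format_error_response(
                    "Invalid PDB ID format",
                    f"Expected format: 4 characters (a digit first, then three digits or letters); got: {pdb_id}",
                )
    
            # Check that the PDB ID exists
            if not _validate_pdb_exists(pdb_id):
                if ctx:
                    await ctx.error(f"❌ PDB ID {pdb_id} does not exist")
                return format_error_response("PDB ID does not exist", f"PDB ID {pdb_id} was not found in the RCSB database")
    
            # Validate the file format
            supported_formats = get_supported_formats()
            if file_format not in supported_formats:
                if ctx:
                    await ctx.error(f"❌ Unsupported file format: {file_format}")
                return format_error_response(
                    "Unsupported file format", f"Supported formats: {', '.join(supported_formats)}"
                )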
    
            # Build the download URL
            download_url = f"{RCSB_DOWNLOAD_URL}/{pdb_id}.{file_format}"
            local_filename = f"{pdb_id}.{file_format}"
    
            if ctx:
                await ctx.report_progress(50, 100, f"Downloading {file_format.upper()} file...")
    
            # Download the file
            if save_local:
                if ctx:
                    await ctx.info(f"💾 Saving locally: {local_filename}")
    
                success = download_file(download_url, local_filename)
                if success:
                    result_data = {
                        "pdb_id": pdb_id,
                        "file_format": file_format,
                        "file_path": local_filename,
                        "download_method": "saved_local",
                        "file_size": None,  # File size could be added here
                    }
                    if ctx:
                        await ctx.info(f"✅ File saved: {local_filename}")
                else:
                    if ctx:
                        await ctx.error(f"❌ File download failed: {local_filename}")
                    return format_error_response(
                        "File download failed", f"Could not download {pdb_id}.{file_format}"
                    )
            else:
                # Return the file content (for small files)
                try:
                    if ctx:
                        await ctx.info("🌐 Fetching a file content preview from the remote server...")
    
                    import requests
    
                    response = requests.get(download_url, timeout=30)
                    if response.status_code == 200:
                        result_data = {
                            "pdb_id": pdb_id,
                            "file_format": file_format,
                            "file_path": download_url,
                            "download_method": "url_provided",
                            # Only the first 1,000 characters are returned as a preview
                            "file_content": (
                                response.text[:1000] + "..."
                                if len(response.text) > 1000
                                else response.text
                            ),
                            "content_preview": True,
                        }
                        if ctx:
                            await ctx.info(f"✅ File content fetched (preview: {len(response.text)} characters)")
                    else:
                        if ctx:
                            await ctx.error(f"❌ HTTP error: {response.status_code}")
                        return format_error_response(
                            "File download failed", f"HTTP {response.status_code}: could not access the file"
                        )
                except Exception as e:
                    if ctx:
                        await ctx.error(f"❌ Network error: {str(e)}")
                    return format_error_response("Network error", f"Download failed: {str(e)}")
    
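            # Add format information
            format_info = {
                "pdb": {
                    "name": "Protein Data Bank (PDB) format",
                    "description": "Classic text format, human-readable",
                    "recommended": True,
                    "use_case": "General use, chemistry demonstrations, small-molecule structures",
                    "advantages": ["Human-readable", "Widely supported", "Easy to edit"],
                },
                "mmcif": {
                    "name": "Macromolecular Crystallographic Information File format",
                    "description": "Modern tag-value text format, more flexible",
                    "recommended": True,
                    "use_case": "Complex structures, bulk data, modern software",
                    "advantages": ["More detailed", "Supports complex data", "Modern standard"],
                },
                "cif": {
                    "name": "Crystallographic Information File format",
                    "description": "Standardized format, the base format that mmCIF extends",
                    "recommended": False,
                    "use_case": "Basic crystallographic data",
                    "advantages": ["Standardized", "Concise"],
                },
                "mmtf": {
                    "name": "Macromolecular Transmission Format",
                    "description": "Binary format with efficient compression",
                    "recommended": True,
                    "use_case": "Bulk transfer, high-performance applications",
                    "advantages": ["Small files", "Fast loading", "High compression efficiency"],
                },
            }
    
            result_data["format_info"] = format_info.get(
                file_format,
                {
                    "name": f"{file_format.upper()} format",
                    "description": "Supported file format",
                    "recommended": False,
                    "use_case": "General-purpose format",
                    "advantages": ["Standard support"],
                },
            )
    
            if ctx:
                await ctx.report_progress(100, 100, "Done")
                await ctx.info(f"✅ Successfully retrieved the {file_format} file for {pdb_id}")
    
            return format_success_response(
                result_data,
                f"Successfully retrieved the {file_format} file for {pdb_id}. {format_info.get(file_format, {}).get('description', '')}",
            )
    
        except Exception as e:
            if ctx:
                await ctx.error(f"❌ File operation failed: {str(e)}")
            return format_error_response("File operation error", f"download_structure failed: {str(e)}")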
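
The helper calls `validate_pdb_id`, whose body is not shown on this page. Its invalid-ID error message describes the expected shape as four characters: a digit first, then three digits or letters. The sketch below implements only that stated rule. The regular expression and the acceptance of lowercase letters are assumptions, not the project's actual implementation.

```python
import re

# Assumption: mirrors the rule quoted in the error message
# (one digit, then three characters that are digits or letters).
_PDB_ID_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")


def validate_pdb_id(pdb_id: str) -> bool:
    """Return True if pdb_id has the four-character shape described above."""
    # fullmatch rejects extra leading or trailing characters, including newlines.
    return isinstance(pdb_id, str) and _PDB_ID_PATTERN.fullmatch(pdb_id) is not None


if __name__ == "__main__":
    for candidate in ("1A3N", "5G53", "A123", "1A3"):
        print(candidate, validate_pdb_id(candidate))
```

A shape check like this is cheap and runs before any network request; whether the entry actually exists is a separate question that the helper answers with `_validate_pdb_exists`.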
  • Invocation of register_all_tools(mcp) in server creation, which defines and registers the download_structure_tool using @mcp.tool().
    # Register all tools
    register_all_tools(mcp)
  • The register_all_tools function where the download_structure_tool is defined and registered via @mcp.tool() decorator.
    def register_all_tools(mcp) -> None:
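        """
        Register the 3 core consolidated tools with the FastMCP server.
    
        Optimized tool design:
        1. find_protein_structures - protein structure discovery tool
        2. get_protein_data - comprehensive protein data tool
        3. download_structure - structure file tool
    
        Args:
            mcp: FastMCP server instance
        """
    
        # Tool 1: protein structure discovery tool - combines search, examples, and validation
        @mcp.tool()
        async def find_protein_structures_tool(
            keywords: str | None = None,
            category: str | None = None,
            pdb_id: str | None = None,
            max_results: int = 10,
            ctx: Context | None = None,
        ) -> dict[str, Any]:
            """
            Protein structure discovery tool: a single entry point for search, examples, and validation.
    
            This is the starting point for protein research; it helps you discover and validate PDB structures.
    
            Args:
                keywords: Search keywords (e.g. "hemoglobin", "kinase", "DNA")
                category: Preset category; the accepted values are the literal strings
                    "癌症靶点" (cancer targets), "病毒蛋白" (viral proteins), "酶类" (enzymes),
                    "抗体" (antibodies), "膜蛋白" (membrane proteins), "核糖体" (ribosomes)
                pdb_id: Validate or view a specific PDB ID directly (e.g. "1A3N")
                max_results: Maximum number of search results (default 10, maximum 100)
                ctx: FastMCP Context, used for progress feedback and logging
    
            Returns:
                A combined response containing the PDB structure list, validation results, and example data
    
            Examples:
                # Search for hemoglobin-related structures
                find_protein_structures(keywords="hemoglobin")
    
                # Get cancer-target examples
                find_protein_structures(category="癌症靶点")
    
                # Validate a PDB ID
                find_protein_structures(pdb_id="1A3N")
            """
            return await find_protein_structures(keywords, category, pdb_id, max_results, ctx)
    
        # Tool 2: comprehensive protein data tool - fetch all protein information in one call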
        @mcp.tool()
        async def get_protein_data_tool(
            pdb_id: str,
            data_types: list[str] | None = None,
            chain_id: str | None = None,
            ctx: Context | None = None,
        ) -> dict[str, Any]:
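            """
            Comprehensive protein data tool: fetch a complete protein information package.
    
            This tool is the core of protein data retrieval; it fetches everything you need in one call.
    
            Args:
                pdb_id: PDB ID (for example, "5G53")
                data_types: List of data types to fetch
                    - "basic": basic information (title, method, resolution, etc.)
                    - "sequence": amino acid sequence information
                    - "structure": secondary structure analysis
                    - "all": fetch all data
                chain_id: Specific chain ID (for example, "A"; optional)
                ctx: FastMCP Context, used for progress feedback and logging
    
            Returns:
                A complete protein data package containing all requested data types
    
            Examples:
                # Get all data
                get_protein_data("5G53", ["all"])
    
                # Get only basic information and sequence
                get_protein_data("1A3N", ["basic", "sequence"])
    
                # Get data for a specific chain
                get_protein_data("2HHB", ["all"], "A")
            """
            # If no data types are given, default to basic, sequence, and structure data
            if data_types is None:
                data_types = ["basic", "sequence", "structure"]
            return await get_protein_data(pdb_id, data_types, chain_id, ctx)
    
        # Tool 3: structure file tool - download and manage protein structure files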
        @mcp.tool()
        async def download_structure_tool(
            pdb_id: str,
            file_format: str = "pdb",
            save_local: bool = False,
            ctx: Context | None = None,
        ) -> dict[str, Any]:
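            """
            Structure file tool: download and manage protein structure files.
    
            This tool handles all file-related operations, from downloading to explaining formats.
    
            Args:
                pdb_id: PDB ID (for example, "5G53")
                file_format: File format
                    - "pdb": standard PDB format (recommended, human-readable)
                    - "mmcif": macromolecular Crystallographic Information File format (modern standard)
                    - "cif": Crystallographic Information File format
                    - "mmtf": Macromolecular Transmission Format (binary, fast)
                save_local: Whether to save to a local file (default False returns the content)
                ctx: FastMCP Context, used for progress feedback and logging
    
            Returns:
                File content or download information, plus format notes and a usage guide
    
            Examples:
                # Get the PDB file content
                download_structure("1A3N")
    
                # Download the mmCIF format and save it locally
                download_structure("2HHB", "mmcif", True)
    
                # Get the fast MMTF format
                download_structure("6VSB", "mmtf")
            """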
            return await download_structure(pdb_id, file_format, save_local, ctx)
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses behavioral traits such as returning file content or download information with format explanations, and mentions progress feedback via ctx. However, it lacks details on error handling, rate limits, authentication needs, or what happens when save_local is true (e.g., file location, overwriting behavior). The description adds some context but is incomplete for a tool with mutation potential (saving files).
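
The overwrite question raised here can be made concrete. The sketch below assumes a hypothetical helper, `resolve_save_path`, that builds the same `{pdb_id}.{file_format}` filename the implementation uses and refuses to reuse an existing path unless asked to. How the project's own `download_file` treats existing files is not shown on this page, so this illustrates one possible policy rather than describing the server's behavior.

```python
from pathlib import Path


def resolve_save_path(
    pdb_id: str, file_format: str, directory: str = ".", overwrite: bool = False
) -> Path:
    """Hypothetical helper: choose where a downloaded structure file would be written."""
    # Same naming scheme as local_filename in download_structure.
    target = Path(directory) / f"{pdb_id}.{file_format}"
    if target.exists() and not overwrite:
        raise FileExistsError(f"{target} already exists; pass overwrite=True to replace it")
    return target


if __name__ == "__main__":
    print(resolve_save_path("1A3N", "pdb"))
```

Returning the resolved path also gives the tool something precise to report back, so an agent can tell where the file landed.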

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is moderately structured with sections (Args, Returns, Examples), but includes redundant elements like 'Structure file tool: download and manage protein structure files', which repeats the title concept. The examples are helpful but verbose, and some sentences could be more front-loaded (e.g., the purpose statement is clear but not maximally efficient). It earns its place with parameter details but has room for trimming.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (3 parameters, 0% schema coverage, no annotations, but has output schema), the description is fairly complete. It covers parameter meanings, return behavior, and usage examples. With an output schema present, it doesn't need to detail return values extensively. However, for a tool that can save files locally, it could better address potential side effects or error cases to be fully comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It provides meaningful semantics for all parameters: pdb_id is explained with an example ('PDB ID (for example, "5G53")'), file_format includes detailed options with recommendations ('recommended, human-readable' for pdb, 'modern standard' for mmcif, 'binary, fast' for mmtf), and save_local clarifies behavior ('Whether to save to a local file (default False returns the content)'). This adds significant value beyond the bare schema, though it could specify default values more explicitly.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as handling 'all file-related operations, from downloading to explaining formats', which is specific about downloading and managing protein structure files. It distinguishes from siblings by focusing on file operations rather than searching (find_protein_structures_tool) or general data retrieval (get_protein_data_tool). However, it could be more precise about being primarily a download tool with format management.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage through examples (e.g., 'Get the PDB file content' for getting content, 'Download the mmCIF format and save it locally' for saving locally), but lacks explicit guidance on when to use this tool versus alternatives like find_protein_structures_tool or get_protein_data_tool. It provides context for different scenarios (e.g., the save_local parameter) but doesn't state when not to use it or compare with siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/gqy20/protein-mcp'
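
The same request can be issued from Python using only the standard library. This is a minimal sketch: the helper names are hypothetical, and treating the response body as JSON is an assumption, since the response format is not described here.

```python
import json
import urllib.request

# URL taken from the curl example above.
SERVER_URL = "https://glama.ai/api/mcp/v1/servers/gqy20/protein-mcp"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Build the GET request without sending it."""
    return urllib.request.Request(url, method="GET", headers={"Accept": "application/json"})


def fetch_server_info(url: str = SERVER_URL, timeout: float = 30.0) -> dict:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(json.dumps(fetch_server_info(), indent=2))
```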

If you have feedback or need assistance with the MCP directory API, please join our Discord server.