aigo666

MCP Development Framework

parse_word

Parse Word documents to extract text, tables, and images from local file paths.

Instructions

Parse the content of a Word document and extract its text, tables, and image information.

Input Schema

Name       Required  Description                                                     Default
file_path  Yes       Local path to the Word document, e.g. '/path/to/document.docx'  -
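
The schema requires a single string argument. As a minimal sketch, a caller could run a check like the one below before invoking the tool. `validate_arguments` and `SUPPORTED_EXTENSIONS` are hypothetical names, not part of the framework; the extension check mirrors the one in the handler listed under Implementation Reference.

```python
from typing import Any, Dict, List

# Mirror of the tool's input schema: one required string field.
INPUT_SCHEMA = {
    "type": "object",
    "required": ["file_path"],
    "properties": {"file_path": {"type": "string"}},
}

# Extensions the handler accepts, per its own extension check.
SUPPORTED_EXTENSIONS = (".docx", ".doc")


def validate_arguments(arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    for key in INPUT_SCHEMA["required"]:
        if key not in arguments:
            problems.append(f"missing required argument '{key}'")
    path = arguments.get("file_path")
    if path is not None:
        if not isinstance(path, str):
            problems.append("'file_path' must be a string")
        elif not path.lower().endswith(SUPPORTED_EXTENSIONS):
            problems.append("'file_path' must end with .docx or .doc")
    return problems
```

Returning a list of problems rather than raising lets a caller report every issue at once.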

Implementation Reference

  • The execute method is the handler entry point for the parse_word tool. It validates the file_path argument, processes the path, and delegates to _parse_word_document.
    async def execute(self, arguments: Dict[str, Any]) -> List[types.TextContent | types.ImageContent | types.EmbeddedResource]:
        """
        Parse a Word document.
        
        Args:
            arguments: Argument dictionary; must contain the 'file_path' key.
            
        Returns:
            List of parse results.
        """
        if "file_path" not in arguments:
            return [types.TextContent(
                type="text",
                text="错误: 缺少必要参数 'file_path'"
            )]
        
        # Process the file path, including translation of mounted-directory paths
        file_path = self.process_file_path(arguments["file_path"])
        
        return await self._parse_word_document(file_path)
  • Core parsing logic: handles the .doc format (via LibreOffice conversion) and the .docx format, and extracts document properties, paragraphs, tables, and images.
    async def _parse_word_document(self, file_path: str) -> List[types.TextContent | types.ImageContent | types.EmbeddedResource]:
        """
        Parse the content of a Word document; supports the .docx and .doc formats.
        
        Args:
            file_path: Path to the Word document.
            
        Returns:
            List of Word document content items.
        """
        results = []
        temp_docx_path = None
        
        # Check that the file exists
        if not os.path.exists(file_path):
            return [types.TextContent(
                type="text",
                text=f"错误: 文件不存在: {file_path}\n请检查路径是否正确,并确保文件可访问。"
            )]
        
        # Check the file extension
        if not file_path.lower().endswith(('.docx', '.doc')):
            return [types.TextContent(
                type="text",
                text=f"错误: 不支持的文件格式: {file_path}\n仅支持.docx和.doc格式的Word文档。"
            )]
        
        try:
            # Add file information
            file_size_mb = os.path.getsize(file_path) / (1024 * 1024)
            
            # Handle the .doc format (Word 97-2003 documents)
            if file_path.lower().endswith('.doc'):
                results.append(types.TextContent(
                    type="text",
                    text=f"# Word文档解析 (Word 97-2003 格式)\n\n文件大小: {file_size_mb:.2f} MB"
                ))
                
                # Check whether LibreOffice is available
                if not self._is_libreoffice_installed():
                    return [types.TextContent(
                        type="text",
                        text="错误: 无法解析Word 97-2003 (.doc)格式。\n"
                             "系统未安装LibreOffice,无法进行格式转换。\n"
                             "请安装LibreOffice后重试,或将文档另存为.docx格式。"
                    )]
                
                try:
                    # Show a conversion notice
                    results.append(types.TextContent(
                        type="text",
                        text="正在使用LibreOffice转换文档格式,请稍候..."
                    ))
                    
                    # Convert .doc to .docx
                    temp_docx_path = self._convert_doc_to_docx(file_path)
                    
                    # Point the file path at the converted file
                    file_path = temp_docx_path
                    
                    results.append(types.TextContent(
                        type="text",
                        text="文档格式转换完成,继续解析...\n"
                    ))
                except Exception as e:
                    return results + [types.TextContent(
                        type="text",
                        text=f"错误: {str(e)}\n"
                             f"建议:\n"
                             f"1. 确保已正确安装LibreOffice且可通过命令行访问\n"
                             f"2. 尝试手动将文档转换为.docx格式后重试\n"
                             f"3. 检查文档是否加密或损坏"
                    )]
            else:
                results.append(types.TextContent(
                    type="text",
                    text=f"# Word文档解析\n\n文件大小: {file_size_mb:.2f} MB"
                ))
            
            # Open the Word document
            doc = docx.Document(file_path)
            
            # Extract the document properties
            properties = {}
            if hasattr(doc.core_properties, 'title') and doc.core_properties.title:
                properties['标题'] = doc.core_properties.title
            if hasattr(doc.core_properties, 'author') and doc.core_properties.author:
                properties['作者'] = doc.core_properties.author
            if hasattr(doc.core_properties, 'created') and doc.core_properties.created:
                properties['创建时间'] = str(doc.core_properties.created)
            if hasattr(doc.core_properties, 'modified') and doc.core_properties.modified:
                properties['修改时间'] = str(doc.core_properties.modified)
            if hasattr(doc.core_properties, 'comments') and doc.core_properties.comments:
                properties['备注'] = doc.core_properties.comments
            
            # Add the document properties to the results
            if properties:
                properties_text = "## 文档属性\n\n"
                for key, value in properties.items():
                    properties_text += f"- {key}: {value}\n"
                results.append(types.TextContent(
                    type="text",
                    text=properties_text
                ))
            
            # Extract the document content
            content_text = "## 文档内容\n\n"
            
            # Process paragraphs (the reported count includes empty paragraphs)
            paragraphs_count = len(doc.paragraphs)
            content_text += f"### 段落 (共{paragraphs_count}个)\n\n"
            
            for para in doc.paragraphs:
                if para.text.strip():  # Only emit non-empty paragraphs
                    content_text += f"{para.text}\n\n"
            
            # Process tables
            tables_count = len(doc.tables)
            if tables_count > 0:
                content_text += f"### 表格 (共{tables_count}个)\n\n"
                
                for i, table in enumerate(doc.tables):
                    content_text += f"#### 表格 {i+1}\n\n"
                    
                    # Build a Markdown table
                    rows = []
                    for row in table.rows:
                        # Flatten line breaks and escape pipes so cell text cannot break the table
                        cells = [cell.text.replace('\n', ' ').replace('|', '\\|').strip() for cell in row.cells]
                        rows.append(cells)
                    
                    if rows:
                        # Header row
                        content_text += "| " + " | ".join(rows[0]) + " |\n"
                        # Separator row
                        content_text += "| " + " | ".join(["---"] * len(rows[0])) + " |\n"
                        # Body rows
                        for row in rows[1:]:
                            content_text += "| " + " | ".join(row) + " |\n"
                        
                        content_text += "\n"
            
            # Add the document content to the results
            results.append(types.TextContent(
                type="text",
                text=content_text
            ))
            
            # Extract image information and content
            try:
                # Extract all images in the document, filtering out embedded external documents
                images = self._extract_images_from_word(doc)
                
                if images:
                    image_info = f"## 图片信息\n\n文档中包含 {len(images)} 张图片。\n\n"
                    results.append(types.TextContent(
                        type="text",
                        text=image_info
                    ))
                    
                    # Return the image content
                    for i, (image_id, image_bytes) in enumerate(images):
                        try:
                            # Determine the image MIME type
                            mime_type = self._get_image_mime_type(image_bytes)
                            
                            # Add the image to the results
                            image_base64 = self._encode_image_base64(image_bytes)
                            results.append(types.TextContent(
                                type="text",
                                text=f"### 图片 {i+1}\n\n"
                            ))
                            results.append(types.ImageContent(
                                type="image",
                                data=image_base64,
                                mimeType=mime_type
                            ))
                        except Exception as e:
                            # Record the image-processing error without aborting
                            results.append(types.TextContent(
                                type="text",
                                text=f"注意: 图片 {i+1} 处理失败: {str(e)}"
                            ))
                else:
                    results.append(types.TextContent(
                        type="text",
                        text="## 图片信息\n\n文档中未包含图片或嵌入对象均不是有效图片。"
                    ))
            except Exception as img_error:
                results.append(types.TextContent(
                    type="text",
                    text=f"警告: 提取图片信息时出错: {str(img_error)}"
                ))
            
            # Add a completion notice
            results.append(types.TextContent(
                type="text",
                text="Word文档处理完成!"
            ))
            
            return results
        except Exception as e:
            error_details = traceback.format_exc()
            return [types.TextContent(
                type="text",
                text=f"错误: 解析Word文档失败: {str(e)}\n"
                     f"可能的原因:\n"
                     f"1. 文件格式不兼容或已损坏\n"
                     f"2. 文件受密码保护\n"
                     f"3. 文件包含不支持的内容\n\n"
                     f"详细错误信息: {error_details}"
            )]
  • Input schema defining the required 'file_path' string parameter for the parse_word tool.
    input_schema = {
        "type": "object",
        "required": ["file_path"],
        "properties": {
            "file_path": {
                "type": "string",
                "description": "Word文档的本地路径,例如'/path/to/document.docx'",
            }
        },
    }
  • The ToolRegistry.register decorator is applied to the WordTool class (line 21 of word_tool.py), registering it under the name 'parse_word' via ToolRegistry._tools[tool_class.name] = tool_class.
    _tools: Dict[str, Type[BaseTool]] = {}
    
    @classmethod
    def register(cls, tool_class: Type[BaseTool]) -> Type[BaseTool]:
        """Register a tool class under its name."""
        cls._tools[tool_class.name] = tool_class
        return tool_class
  • get_tool_instances loads all registered tools (including parse_word) and creates instances for use by the server.
    def get_tool_instances() -> dict:
        """
        Create an instance of every registered tool class.
        
        Returns:
            dict: Mapping from tool name to tool instance.
        """
        tools = load_tools()
        tool_instances = {}
        
        for tool_class in tools:
            try:
                tool_instance = tool_class()
                tool_instances[tool_class.name] = tool_instance
            except Exception as e:
                print(f"Warning: Failed to instantiate tool {tool_class.name}: {e}")
        
        return tool_instances 
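
The parsing logic above calls `_is_libreoffice_installed` and `_convert_doc_to_docx` without showing them. The sketch below is one plausible way to implement that step, assuming LibreOffice exposes a `soffice` or `libreoffice` executable on PATH. It is not the framework's actual code; the function names and the temporary-directory layout are illustrative assumptions.

```python
import os
import shutil
import subprocess
import tempfile
from typing import List, Optional


def find_soffice() -> Optional[str]:
    """Return the path of a LibreOffice executable on PATH, or None."""
    for name in ("soffice", "libreoffice"):
        found = shutil.which(name)
        if found:
            return found
    return None


def build_convert_command(soffice: str, doc_path: str, out_dir: str) -> List[str]:
    """Build the headless LibreOffice command that converts a file to .docx."""
    return [soffice, "--headless", "--convert-to", "docx", "--outdir", out_dir, doc_path]


def convert_doc_to_docx(doc_path: str, timeout: int = 120) -> str:
    """Convert a .doc file and return the path of the new .docx file.

    The output lands in a fresh temporary directory that the caller must remove.
    """
    soffice = find_soffice()
    if soffice is None:
        raise RuntimeError("LibreOffice was not found on PATH")
    out_dir = tempfile.mkdtemp(prefix="doc2docx_")
    subprocess.run(
        build_convert_command(soffice, doc_path, out_dir),
        check=True,
        capture_output=True,
        timeout=timeout,
    )
    # LibreOffice names the output after the input file's stem.
    stem = os.path.splitext(os.path.basename(doc_path))[0]
    out_path = os.path.join(out_dir, stem + ".docx")
    if not os.path.exists(out_path):
        raise RuntimeError(f"conversion produced no output for {doc_path}")
    return out_path
```

The excerpt above does not show where `temp_docx_path` is cleaned up, so a caller of this sketch would need to delete the temporary directory itself, for example with `shutil.rmtree(os.path.dirname(path))` once parsing finishes.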
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description must fully disclose behavior. It states extraction of text, tables, and images but does not mention limitations (e.g., .doc vs .docx, formatting preservation, or whether images are saved or returned). This is minimal disclosure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single, clear sentence with no unnecessary words. It efficiently conveys the tool's purpose and capabilities.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter and no output schema, the description provides the core functionality. However, it lacks details about the return format or structure, which the agent needs to handle the output.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
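
Going by the handler listed under Implementation Reference, the tool returns a flat list in which text items carry Markdown and image items carry base64 data plus a MIME type. The sketch below shows how a client might split such a list. It uses plain dictionaries as stand-ins for the MCP content objects; that dictionary shape and the `split_results` helper are assumptions for illustration.

```python
import base64
from typing import Any, Dict, List, Tuple


def split_results(items: List[Dict[str, Any]]) -> Tuple[str, List[Tuple[str, bytes]]]:
    """Separate text items from image items.

    Returns the joined text and a list of (mime_type, raw_bytes) pairs.
    """
    texts: List[str] = []
    images: List[Tuple[str, bytes]] = []
    for item in items:
        if item.get("type") == "text":
            texts.append(item["text"])
        elif item.get("type") == "image":
            # Image payloads arrive base64-encoded; decode them to raw bytes.
            images.append((item["mimeType"], base64.b64decode(item["data"])))
    return "\n".join(texts), images


# Hypothetical sample shaped like the handler's output.
sample = [
    {"type": "text", "text": "# Heading"},
    {
        "type": "image",
        "data": base64.b64encode(b"\x89PNG").decode("ascii"),
        "mimeType": "image/png",
    },
]
text, images = split_results(sample)
```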

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a clear description and example for file_path. The description adds minimal extra meaning beyond the schema, meeting the baseline for high coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the verb 'parse' and the resource 'Word document', and lists the extracted elements (text, tables, images). This clearly distinguishes the tool from sibling tools such as parse_pdf or parse_csv.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like parse_pdf or parse_markdown. The description lacks context about the appropriate scenarios or prerequisites.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
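
One way to make that guidance concrete is to select the tool from the file extension. The sketch below is hypothetical: the tool names come from this page, but which extensions each sibling tool accepts is an assumption, as is the `choose_tool` helper.

```python
import os
from typing import Optional

# Assumed mapping from file extension to tool name. Only the tool names come
# from this page; the extensions accepted by the sibling tools are a guess.
TOOL_BY_EXTENSION = {
    ".docx": "parse_word",
    ".doc": "parse_word",
    ".pdf": "parse_pdf",
    ".csv": "parse_csv",
    ".md": "parse_markdown",
}


def choose_tool(file_path: str) -> Optional[str]:
    """Return the name of the tool to call for this file, or None if unknown."""
    extension = os.path.splitext(file_path)[1].lower()
    return TOOL_BY_EXTENSION.get(extension)
```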

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/aigo666/mcp-framework'
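
The same request can be sketched in Python using only the standard library. This assumes the endpoint returns a JSON body, which this page does not confirm; `server_url` and `fetch_server` are illustrative names.

```python
import json
import urllib.request
from typing import Any

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory API URL for one server."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> Any:
    """GET the server record and return the decoded body (assumed to be JSON)."""
    with urllib.request.urlopen(server_url(owner, name), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server("aigo666", "mcp-framework")` would issue the same GET request as the curl command.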

If you have feedback or need assistance with the MCP directory API, please join our Discord server.