Glama
aigo666

MCP Development Framework

parse_file

Extract text and data from PDF, Word, Excel, CSV, and Markdown files to process document content within the MCP Development Framework.

Instructions

Parses file content. Supports PDF, Word, Excel, CSV, and Markdown formats.

Input Schema

Name       Required  Description                                            Default
file_path  Yes       Local path to the file, e.g. '/path/to/document.pdf'
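
As a rough illustration of the table above, a call to parse_file carries a single required string argument. The sketch below checks an arguments dictionary against that shape by hand. `INPUT_SCHEMA` copies the structure of the tool's schema, while `validate_arguments` is a hypothetical helper introduced here for illustration and is not part of the framework.

```python
from typing import Any, Dict, List

# Mirrors the tool's input schema: one required string field.
INPUT_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "required": ["file_path"],
    "properties": {"file_path": {"type": "string"}},
}


def validate_arguments(arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments fit the schema.

    Hypothetical helper: only checks required keys and string types.
    """
    problems: List[str] = []
    for key in INPUT_SCHEMA["required"]:
        if key not in arguments:
            problems.append(f"missing required parameter '{key}'")
    for key, spec in INPUT_SCHEMA["properties"].items():
        if key in arguments and spec["type"] == "string" and not isinstance(arguments[key], str):
            problems.append(f"parameter '{key}' must be a string")
    return problems
```

A well-formed call would pass something like `{"file_path": "/path/to/document.pdf"}`, for which the helper returns an empty list.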

Implementation Reference

  • Registration of the 'parse_file' tool via @ToolRegistry.register decorator on FileTool class, including name assignment.
    @ToolRegistry.register
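    class FileTool(BaseTool):
        """
        General-purpose file processing tool that automatically picks the appropriate handler based on the file extension.
        Supported file types:
        - PDF files (.pdf)
        - Word documents (.doc, .docx)
        - Excel files (.xls, .xlsx, .xlsm)
        - CSV files (.csv)
        - Markdown files (.md)
        """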
        
        name = "parse_file"
  • Input schema defining the required 'file_path' parameter for the tool.
    input_schema = {
        "type": "object",
        "required": ["file_path"],
        "properties": {
            "file_path": {
                "type": "string",
                # Description in English: "Local path to the file, e.g. '/path/to/document.pdf'"
                "description": "文件的本地路径,例如'/path/to/document.pdf'",
            }
        },
    }
  • The handler function that executes the tool: validates input, processes file path, determines file type by extension, delegates to specialized sub-tools (PdfTool, WordTool, etc.), and handles errors.
    async def execute(self, arguments: Dict[str, Any]) -> List[types.TextContent | types.ImageContent | types.EmbeddedResource]:
        """
        Parse the content of a file.

        Args:
            arguments: Argument dictionary; must contain the 'file_path' key.

        Returns:
            A list of parsing results.
        """
        if "file_path" not in arguments:
            return [types.TextContent(
                type="text",
                text="错误: 缺少必要参数 'file_path'"  # "Error: missing required parameter 'file_path'"
            )]

        file_path = arguments["file_path"]
        # Process the file path; supports translating mounted-directory paths
        file_path = self.process_file_path(file_path)
        
        if not os.path.exists(file_path):
            return [types.TextContent(
                type="text",
                text=f"错误: 文件不存在: {file_path}"  # "Error: file does not exist: {file_path}"
            )]

        # Get the file extension (converted to lowercase)
        file_ext = os.path.splitext(file_path)[1].lower()
        
        try:
            # Choose the handler tool based on the file extension
            if file_ext == '.pdf':
                return await self.pdf_tool.execute(arguments)
            elif file_ext in ['.doc', '.docx']:
                return await self.word_tool.execute(arguments)
            elif file_ext in ['.xls', '.xlsx', '.xlsm']:
                return await self.excel_tool.execute(arguments)
            elif file_ext == '.csv':
                return await self.csv_tool.execute(arguments)
            elif file_ext == '.md':
                return await self.markdown_tool.execute(arguments)
            else:
                return [types.TextContent(
                    type="text",
                    text=f"错误: 不支持的文件类型: {file_ext}"  # "Error: unsupported file type: {file_ext}"
                )]
        except Exception as e:
            error_details = traceback.format_exc()
            return [types.TextContent(
                type="text",
                text=f"错误: 处理文件时发生错误: {str(e)}\n{error_details}"  # "Error: an error occurred while processing the file: ..."
            )]
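
The extension-based dispatch in the handler above can be tried in isolation. The following is a minimal sketch, assuming a lookup table in place of the if/elif chain; `HANDLERS` and `pick_handler` are hypothetical names, and the string labels merely stand in for the real sub-tool instances (`pdf_tool`, `word_tool`, and so on).

```python
import os
from typing import Optional

# Extension-to-handler mapping taken from the branches of execute();
# the labels are placeholders for the real sub-tools.
HANDLERS = {
    ".pdf": "pdf_tool",
    ".doc": "word_tool",
    ".docx": "word_tool",
    ".xls": "excel_tool",
    ".xlsx": "excel_tool",
    ".xlsm": "excel_tool",
    ".csv": "csv_tool",
    ".md": "markdown_tool",
}


def pick_handler(file_path: str) -> Optional[str]:
    """Return the handler label for a path, or None if the type is unsupported."""
    # Same normalization as the handler: last extension only, lowercased.
    file_ext = os.path.splitext(file_path)[1].lower()
    return HANDLERS.get(file_ext)
```

Because only the final extension is inspected and it is lowercased first, `REPORT.XLSX` resolves to the Excel handler, while a path such as `archive.tar.gz` or an extensionless `README` yields `None`, which corresponds to the "unsupported file type" branch.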

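The registration step in the first item above relies on a class decorator. The internals of the framework's `ToolRegistry` are not shown on this page, so the sketch below is only one plausible shape: a stand-in registry that stores each decorated class under its `name` attribute. The `_tools` dictionary and the `get` method are assumptions made for illustration, and the stand-in `FileTool` omits `BaseTool` and everything else from the real class.

```python
from typing import Dict


class ToolRegistry:
    """Hypothetical stand-in: a registry keyed by each tool class's `name`."""

    _tools: Dict[str, type] = {}

    @classmethod
    def register(cls, tool_class: type) -> type:
        # Store the class under its declared name and return it unchanged,
        # so decorating a class does not alter the class itself.
        cls._tools[tool_class.name] = tool_class
        return tool_class

    @classmethod
    def get(cls, name: str) -> type:
        # Look up a previously registered tool class by name.
        return cls._tools[name]


@ToolRegistry.register
class FileTool:
    # Stand-in for the real FileTool; only the name assignment is reproduced.
    name = "parse_file"
```

With this shape, a server could resolve an incoming tool call by name, for example `ToolRegistry.get("parse_file")`, and then instantiate the class it gets back.
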
MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/aigo666/mcp-framework'
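
The same request can be made from Python using only the standard library. This is a sketch under two assumptions not stated on this page: that the two trailing path segments are an owner and a server name, and that the endpoint returns a JSON object. `server_url` and `fetch_server` are hypothetical helper names.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory API URL for one server entry."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> dict:
    """GET the server entry; assumes the endpoint returns a JSON object."""
    with urllib.request.urlopen(server_url(owner, name), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_server("aigo666", "mcp-framework")` issues the same GET request as the curl command above.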

If you have feedback or need assistance with the MCP directory API, please join our Discord server.