parse_file

Extract text and data from PDF, Word, Excel, CSV, and Markdown files to process document content within the MCP Development Framework.

Instructions

Parses file content. Supports PDF, Word, Excel, CSV, and Markdown formats.

Input Schema

Name	Required	Description	Default
`file_path`	Yes	Local path to the file, e.g. '/path/to/document.pdf'
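As a rough illustration of how a client might supply this parameter, the sketch below builds an MCP `tools/call` request in Python. The request id and the file path are placeholder values, and `build_parse_file_request` is a hypothetical helper for this example, not part of the framework.

```python
import json


def build_parse_file_request(file_path: str, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for the parse_file tool."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,  # placeholder id chosen by the client
        "method": "tools/call",
        "params": {
            "name": "parse_file",
            # The only required argument according to the input schema.
            "arguments": {"file_path": file_path},
        },
    }
    return json.dumps(payload, ensure_ascii=False)


request_text = build_parse_file_request("/path/to/document.pdf")
print(request_text)
```

How the request is transported (stdio, HTTP, and so on) depends on the client and server setup and is not shown here.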

Implementation Reference

mcp_tool/tools/file_tool.py:24-36 (registration)
Registration of the 'parse_file' tool via @ToolRegistry.register decorator on FileTool class, including name assignment.
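@ToolRegistry.register
class FileTool(BaseTool):
    """
    General-purpose file processing tool that automatically selects the
    appropriate handler based on the file extension.

    Supported file types:
    - PDF files (.pdf)
    - Word documents (.doc, .docx)
    - Excel files (.xls, .xlsx, .xlsm)
    - CSV files (.csv)
    - Markdown files (.md)
    """

    name = "parse_file"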
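The source does not show how `ToolRegistry.register` is implemented. One common way to build a class decorator like this is sketched below, assuming the registry simply indexes each class under its `name` attribute. `SketchRegistry` and `DemoFileTool` are hypothetical names used only for this illustration.

```python
from typing import Dict


class SketchRegistry:
    """Hypothetical stand-in for a tool registry: maps tool names to classes."""

    _tools: Dict[str, type] = {}

    @classmethod
    def register(cls, tool_cls: type) -> type:
        # Index the class under its class-level `name` attribute.
        cls._tools[tool_cls.name] = tool_cls
        # Return the class unchanged so the decorated name still refers to it.
        return tool_cls

    @classmethod
    def get(cls, name: str) -> type:
        return cls._tools[name]


@SketchRegistry.register
class DemoFileTool:
    name = "parse_file"


print(SketchRegistry.get("parse_file").__name__)
```

Under this assumption, the `name = "parse_file"` assignment is what determines the identifier clients use to call the tool.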
mcp_tool/tools/file_tool.py:38-47 (schema)
Input schema defining the required 'file_path' parameter for the tool.
    input_schema = {
        "type": "object",
        "required": ["file_path"],
        "properties": {
            "file_path": {
                "type": "string",
                # Description: "Local path to the file, e.g. '/path/to/document.pdf'"
                "description": "文件的本地路径，例如'/path/to/document.pdf'",
            }
        },
    }
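To make the schema's effect concrete, here is a minimal sketch of checking an arguments dictionary against it using only the standard library. It covers just the `required` list and the `string` type, not full JSON Schema validation. `check_arguments` is a hypothetical helper, and the schema copy below uses an English description for readability.

```python
from typing import Any, Dict, List

# Copy of the tool's schema, with the description translated for this example.
INPUT_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "required": ["file_path"],
    "properties": {
        "file_path": {
            "type": "string",
            "description": "Local path to the file, e.g. '/path/to/document.pdf'",
        }
    },
}


def check_arguments(schema: Dict[str, Any], arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    problems: List[str] = []
    for key in schema.get("required", []):
        if key not in arguments:
            problems.append(f"missing required parameter '{key}'")
    for key, spec in schema.get("properties", {}).items():
        # Only the "string" type is checked in this simplified sketch.
        if key in arguments and spec.get("type") == "string":
            if not isinstance(arguments[key], str):
                problems.append(f"parameter '{key}' must be a string")
    return problems


print(check_arguments(INPUT_SCHEMA, {}))
print(check_arguments(INPUT_SCHEMA, {"file_path": "/path/to/document.pdf"}))
```

The handler shown in this reference performs only the presence check for `file_path`; the type check here is an addition for illustration.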
mcp_tool/tools/file_tool.py:58-109 (handler)
The handler function that executes the tool: validates input, processes file path, determines file type by extension, delegates to specialized sub-tools (PdfTool, WordTool, etc.), and handles errors.
    async def execute(
        self, arguments: Dict[str, Any]
    ) -> List[types.TextContent | types.ImageContent | types.EmbeddedResource]:
        """
        Parse the contents of a file.

        Args:
            arguments: Argument dictionary; must contain the 'file_path' key.

        Returns:
            A list of parse results.
        """
        if "file_path" not in arguments:
            # Message: "Error: missing required parameter 'file_path'"
            return [types.TextContent(
                type="text",
                text="错误: 缺少必要参数 'file_path'"
            )]

        file_path = arguments["file_path"]
        # Normalize the file path, including conversion of mounted directories
        file_path = self.process_file_path(file_path)

        if not os.path.exists(file_path):
            # Message: "Error: file does not exist: <path>"
            return [types.TextContent(
                type="text",
                text=f"错误: 文件不存在: {file_path}"
            )]

        # Get the file extension (converted to lowercase)
        file_ext = os.path.splitext(file_path)[1].lower()

        try:
            # Choose the sub-tool based on the file extension
            if file_ext == '.pdf':
                return await self.pdf_tool.execute(arguments)
            elif file_ext in ['.doc', '.docx']:
                return await self.word_tool.execute(arguments)
            elif file_ext in ['.xls', '.xlsx', '.xlsm']:
                return await self.excel_tool.execute(arguments)
            elif file_ext == '.csv':
                return await self.csv_tool.execute(arguments)
            elif file_ext == '.md':
                return await self.markdown_tool.execute(arguments)
            else:
                # Message: "Error: unsupported file type: <extension>"
                return [types.TextContent(
                    type="text",
                    text=f"错误: 不支持的文件类型: {file_ext}"
                )]
        except Exception as e:
            error_details = traceback.format_exc()
            # Message: "Error: an error occurred while processing the file: <error>"
            return [types.TextContent(
                type="text",
                text=f"错误: 处理文件时发生错误: {str(e)}\n{error_details}"
            )]
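The extension-based routing in the handler can also be expressed as a lookup table. The sketch below is an alternative formulation for illustration, not the framework's code. It maps each supported extension to the attribute name of the sub-tool used in the handler, and `pick_handler` is a hypothetical helper.

```python
import os
from typing import Optional

# Extension -> name of the sub-tool attribute used in the handler above.
EXTENSION_TO_HANDLER = {
    ".pdf": "pdf_tool",
    ".doc": "word_tool",
    ".docx": "word_tool",
    ".xls": "excel_tool",
    ".xlsx": "excel_tool",
    ".xlsm": "excel_tool",
    ".csv": "csv_tool",
    ".md": "markdown_tool",
}


def pick_handler(file_path: str) -> Optional[str]:
    """Return the sub-tool name for a path, or None if the type is unsupported."""
    # splitext returns ("name", ".ext"); lowercasing makes ".PDF" match ".pdf".
    file_ext = os.path.splitext(file_path)[1].lower()
    return EXTENSION_TO_HANDLER.get(file_ext)


for sample in ["/tmp/Report.PDF", "notes.md", "archive.zip"]:
    print(sample, "->", pick_handler(sample))
```

A table like this keeps the list of supported types in one place. The `if`/`elif` chain in the original keeps each delegation call explicit.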
