Glama

parse_file

Extract and parse content from PDF, Word, Excel, CSV, and Markdown files for efficient data retrieval and processing in development workflows.

Instructions

Parses file content; supports PDF, Word, Excel, CSV, and Markdown formats.

Input Schema

Name        Required   Description                                              Default
file_path   Yes        Local path to the file, e.g. '/path/to/document.pdf'
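
To illustrate the schema above: a call to parse_file carries a single required string argument, file_path. The sketch below checks an arguments dictionary against those two constraints using only the standard library. The helper name validate_arguments and the INPUT_SCHEMA constant are hypothetical and are not part of the server's code.

```python
from typing import Any, Dict, List

# Mirrors the documented input schema (the description text is omitted).
INPUT_SCHEMA: Dict[str, Any] = {
    "type": "object",
    "required": ["file_path"],
    "properties": {"file_path": {"type": "string"}},
}


def validate_arguments(arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments fit the schema."""
    problems: List[str] = []
    # Every required key must be present.
    for key in INPUT_SCHEMA["required"]:
        if key not in arguments:
            problems.append(f"missing required parameter '{key}'")
    # Keys declared as strings must actually hold strings.
    for key, spec in INPUT_SCHEMA["properties"].items():
        if key in arguments and spec["type"] == "string":
            if not isinstance(arguments[key], str):
                problems.append(f"parameter '{key}' must be a string")
    return problems


if __name__ == "__main__":
    print(validate_arguments({"file_path": "/path/to/document.pdf"}))
    print(validate_arguments({}))
```

This hand-rolled check only covers the two constraints the schema states; a full JSON Schema validator would be the more general choice.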

Implementation Reference

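  • The async execute method implements the core logic of the 'parse_file' tool. It checks that the file_path argument is present, normalizes the path via process_file_path (which supports mounted-directory conversion), verifies that the file exists, and reads the lowercased file extension. It then delegates to a specialized tool (pdf_tool, word_tool, excel_tool, csv_tool, or markdown_tool) based on that extension. An unsupported extension returns an error message, and any exception raised during processing is caught and returned as text together with its traceback.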
    async def execute(self, arguments: Dict[str, Any]) -> List[types.TextContent | types.ImageContent | types.EmbeddedResource]:
        """
        Parse file content.

        Args:
            arguments: Argument dictionary; must contain the 'file_path' key

        Returns:
            List of parse results
        """
        if "file_path" not in arguments:
            return [types.TextContent(
                type="text",
                text="Error: missing required parameter 'file_path'"
            )]

        file_path = arguments["file_path"]

        # Process the file path; supports converting mounted-directory paths
        file_path = self.process_file_path(file_path)

        if not os.path.exists(file_path):
            return [types.TextContent(
                type="text",
                text=f"Error: file does not exist: {file_path}"
            )]

        # Get the file extension (converted to lowercase)
        file_ext = os.path.splitext(file_path)[1].lower()

        try:
            # Select the processing tool based on the file extension
            if file_ext == '.pdf':
                return await self.pdf_tool.execute(arguments)
            elif file_ext in ['.doc', '.docx']:
                return await self.word_tool.execute(arguments)
            elif file_ext in ['.xls', '.xlsx', '.xlsm']:
                return await self.excel_tool.execute(arguments)
            elif file_ext == '.csv':
                return await self.csv_tool.execute(arguments)
            elif file_ext == '.md':
                return await self.markdown_tool.execute(arguments)
            else:
                return [types.TextContent(
                    type="text",
                    text=f"Error: unsupported file type: {file_ext}"
                )]
        except Exception as e:
            error_details = traceback.format_exc()
            return [types.TextContent(
                type="text",
                text=f"Error: an error occurred while processing the file: {str(e)}\n{error_details}"
            )]
  • The input_schema defines the expected input for the 'parse_file' tool, requiring a 'file_path' string parameter.
    input_schema = {
        "type": "object",
        "required": ["file_path"],
        "properties": {
            "file_path": {
                "type": "string",
                "description": "Local path to the file, e.g. '/path/to/document.pdf'",
            }
        },
    }
  • The FileTool class is registered using the @ToolRegistry.register decorator. It sets name='parse_file' and provides a description and input_schema.
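    @ToolRegistry.register
    class FileTool(BaseTool):
        """
        General-purpose file processing tool that automatically selects the
        appropriate handler based on the file extension.

        Supported file types:
        - PDF files (.pdf)
        - Word documents (.doc, .docx)
        - Excel files (.xls, .xlsx, .xlsm)
        - CSV files (.csv)
        - Markdown files (.md)
        """

        name = "parse_file"
        description = "Parses file content; supports PDF, Word, Excel, CSV, and Markdown formats"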
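
The extension-based routing described in the first item above can be sketched in isolation. This is a minimal illustration, not the server's code: it swaps the if/elif chain for a dictionary lookup, and the EXTENSION_HANDLERS table and select_handler function are hypothetical names. The extension groups themselves are the ones listed in the excerpt.

```python
import os
from typing import Dict, Optional

# Extension groups as listed in the FileTool excerpt. The string labels are
# illustrative stand-ins for the delegated tool objects (an assumption).
EXTENSION_HANDLERS: Dict[str, str] = {
    ".pdf": "pdf_tool",
    ".doc": "word_tool",
    ".docx": "word_tool",
    ".xls": "excel_tool",
    ".xlsx": "excel_tool",
    ".xlsm": "excel_tool",
    ".csv": "csv_tool",
    ".md": "markdown_tool",
}


def select_handler(file_path: str) -> Optional[str]:
    """Return the handler label for a path, or None if the type is unsupported."""
    # Same normalization as the excerpt: take the extension and lowercase it.
    file_ext = os.path.splitext(file_path)[1].lower()
    return EXTENSION_HANDLERS.get(file_ext)


if __name__ == "__main__":
    for path in ["/path/to/document.pdf", "Report.XLSX", "archive.zip"]:
        print(path, "->", select_handler(path))
```

Because the extension is lowercased before the lookup, a file named Report.XLSX routes the same way as report.xlsx, and a path with no extension falls through to the unsupported case.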

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/aigo666/mcp-framework'
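
The same GET request can be issued from Python with the standard library. This is a sketch under assumptions: the helper names server_url and fetch_server are hypothetical, and the code assumes the endpoint returns a JSON body, which the page does not state.

```python
import json
import urllib.request
from typing import Any

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the server endpoint URL used in the curl example."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> Any:
    """GET the server record and decode it (assumes the response is JSON)."""
    with urllib.request.urlopen(server_url(owner, name), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(server_url("aigo666", "mcp-framework"))
```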

If you have feedback or need assistance with the MCP directory API, please join our Discord server.