parse_documents

Convert PDF, Word, PPT, and image files to Markdown format from local paths or URLs with optional OCR and language support.

Instructions

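Unified interface for converting files to Markdown. Supports local files and URLs, and automatically selects the appropriate processing method based on the USE_LOCAL_API setting.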

Input Schema

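| Name | Required | Description | Default |
| --- | --- | --- | --- |
| file_sources | Yes | File path or URL. Supported forms: a single path or URL ("/path/to/file.pdf" or "https://example.com/document.pdf"); multiple comma-separated paths or URLs ("/path/to/file1.pdf, /path/to/file2.pdf" or "https://example.com/doc1.pdf, https://example.com/doc2.pdf"); a mix of paths and URLs ("/path/to/file.pdf, https://example.com/document.pdf"). Supports pdf, ppt, pptx, doc, docx, and the image formats jpg, jpeg, png. | |
| enable_ocr | No | Enable OCR recognition. Defaults to False. | |
| language | No | Document language. Defaults to "ch" (Chinese); "en" (English) and others are also available. | ch |
| page_ranges | No | Page ranges to process, as a comma-separated string. For example, "2,4-6" selects page 2 and pages 4 through 6; "2--2" selects from page 2 through the second-to-last page. (Remote API.) Defaults to None. | |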
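
The page_ranges syntax is easier to see with a worked example. The sketch below is not part of the server; it is an illustrative interpretation of the documented format, assuming 1-based page numbers and that a negative end value counts from the last page (so -2 is the second-to-last page). The function name `expand_page_ranges` and the `total_pages` argument are hypothetical, and the remote API may validate or interpret edge cases differently.

```python
import re

# One token is either "N" or "N-M", where M may be negative (e.g. "2--2").
_RANGE = re.compile(r"^(\d+)(?:-(-?\d+))?$")


def expand_page_ranges(page_ranges: str, total_pages: int) -> list[int]:
    """Expand a page_ranges string into a sorted list of 1-based page numbers."""
    pages: set[int] = set()
    for token in page_ranges.split(","):
        token = token.strip()
        if not token:
            continue
        match = _RANGE.match(token)
        if match is None:
            raise ValueError(f"invalid page range: {token!r}")
        start = int(match.group(1))
        end_text = match.group(2)
        end = start if end_text is None else int(end_text)
        if end < 0:
            # Assumption: a negative end counts from the back, -1 being the last page.
            end = total_pages + end + 1
        if start < 1 or end < start or end > total_pages:
            raise ValueError(f"page range out of bounds: {token!r}")
        pages.update(range(start, end + 1))
    return sorted(pages)


print(expand_page_ranges("2,4-6", 10))  # [2, 4, 5, 6]
print(expand_page_ranges("2--2", 5))    # [2, 3, 4]
```

Under these assumptions, "2--2" on a five-page document selects pages 2 through 4.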

Output Schema

No arguments

Implementation Reference

  • The 'parse_documents' tool handler implemented as a FastMCP tool. It handles file parsing by routing between local and remote processing based on configuration.
    @mcp.tool()
    async def parse_documents(
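        # file_sources description (English): file path or URL; a single item,
        # a comma-separated list, or a mix of paths and URLs
        # (pdf, ppt, pptx, doc, docx, jpg, jpeg, png).
        file_sources: Annotated[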
            str,
            Field(
                description="""文件路径或URL,支持以下格式:
                - 单个路径或URL: "/path/to/file.pdf" 或 "https://example.com/document.pdf"
                - 多个路径或URL(逗号分隔): "/path/to/file1.pdf, /path/to/file2.pdf" 或
                  "https://example.com/doc1.pdf, https://example.com/doc2.pdf"
                - 混合路径和URL: "/path/to/file.pdf, https://example.com/document.pdf"
                (支持pdf、ppt、pptx、doc、docx以及图片格式jpg、jpeg、png)"""
            ),
        ],
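        # enable_ocr description (English): "Enable OCR recognition, default False".
        enable_ocr: Annotated[bool, Field(description="启用OCR识别,默认False")] = False,
        # language description (English): 'Document language, default "ch" (Chinese); "en" (English) etc. are available'.
        language: Annotated[
            str, Field(description='文档语言,默认"ch"中文,可选"en"英文等')
        ] = "ch",
        # page_ranges description (English): comma-separated page ranges, e.g. "2,4-6" or
        # "2--2" (page 2 through the second-to-last page); remote API; default None.
        page_ranges: Annotated[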
            str | None,
            Field(
                description='指定页码范围,格式为逗号分隔的字符串。例如:"2,4-6":表示选取第2页、第4页至第6页;"2--2":表示从第2页一直选取到倒数第二页。(远程API),默认None'
            ),
        ] = None,
    ) -> Dict[str, Any]:
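        """
        统一接口,将文件转换为Markdown格式。支持本地文件和URL,会根据USE_LOCAL_API配置自动选择合适的处理方式。
        """
        # Docstring (English): unified interface for converting files to Markdown;
        # supports local files and URLs and picks the processing method from USE_LOCAL_API.
        sources = parse_list_input(file_sources)
        if not sources:
            # Error message (English): "No valid file path or URL provided"
            return {"status": "error", "error": "未提供有效的文件路径或URL"}
    
        # Remove duplicate sources while preserving their original order.
        sources = list(dict.fromkeys(sources))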
    
        url_paths = []
        file_paths = []
    
        for source in sources:
            if source.lower().startswith(("http://", "https://")):
                url_paths.append(source)
            else:
                file_paths.append(source)
    
        results = []
        client = state.get_client()
        output_dir = state.output_dir
    
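        if config.USE_LOCAL_API:
            # Local mode: only local file paths and enable_ocr are forwarded;
            # url_paths, language, and page_ranges are not used on this branch.
            results = await _handle_local_api(file_paths, enable_ocr)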
        else:
            if url_paths:
                results.extend(
                    await _handle_remote_urls(client, url_paths, enable_ocr, language, page_ranges, output_dir)
                )
            if file_paths:
                results.extend(
                    await _handle_remote_files(client, file_paths, enable_ocr, language, page_ranges, output_dir)
                )
    
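        if not results:
            # Error message (English): "No files were processed"
            return {"status": "error", "error": "未处理任何文件"}
    
        # A single result is returned directly, without the per-file bookkeeping keys.
        if len(results) == 1:
            result = results[0].copy()
            for key in ("filename", "source_path", "source_url"):
                result.pop(key, None)
            return result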
    
        success_count = len([r for r in results if r.get("status") == "success"])
        error_count = len([r for r in results if r.get("status") == "error"])
    
        overall_status = "success"
        if success_count == 0:
            overall_status = "error"
        elif error_count > 0:
            overall_status = "partial_success"
    
        return {
            "status": overall_status,
            "results": results,
            "summary": {
                "total_files": len(results),
                "success_count": success_count,
                "error_count": error_count,
            },
        }
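
The first half of the handler (split the input, de-duplicate, separate URLs from paths) can be tried in isolation. The sketch below assumes that `parse_list_input` simply splits on commas and trims whitespace; the real helper is not shown on this page, so `split_sources` is a hypothetical stand-in, and `classify_sources` is an illustrative name rather than anything in the server.

```python
def split_sources(file_sources: str) -> list[str]:
    """Hypothetical stand-in for parse_list_input: split on commas, drop blanks."""
    return [part.strip() for part in file_sources.split(",") if part.strip()]


def classify_sources(file_sources: str) -> tuple[list[str], list[str]]:
    """Return (url_paths, file_paths) using the same rules as the handler above."""
    # dict.fromkeys keeps the first occurrence of each source, in input order.
    sources = list(dict.fromkeys(split_sources(file_sources)))
    url_paths: list[str] = []
    file_paths: list[str] = []
    for source in sources:
        # Anything that does not start with http:// or https:// is treated as a local path.
        if source.lower().startswith(("http://", "https://")):
            url_paths.append(source)
        else:
            file_paths.append(source)
    return url_paths, file_paths


urls, paths = classify_sources("/a.pdf, https://example.com/b.pdf, /a.pdf")
print(urls)   # ['https://example.com/b.pdf']
print(paths)  # ['/a.pdf']
```

One consequence of comma-separated input, under this assumption, is that a URL or path that itself contains a comma would be split into two sources.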

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It explains the tool's automatic processing behavior based on configuration and mentions supported file formats, but doesn't cover important aspects like error handling, rate limits, authentication requirements, or what happens with large files. The description adds some context but leaves significant behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
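
Because error handling is not described, a caller has to infer the return shape from the implementation reference: one source yields a single result dict, while several sources yield an aggregate with "status", "results", and "summary". A minimal sketch of client-side handling follows; `normalize_results` and `failed_results` are hypothetical helper names, and the sketch assumes a single-file result never carries a top-level "results" key.

```python
from typing import Any


def normalize_results(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Return per-file results as a list, whichever shape the tool returned."""
    if "results" in response:
        # Aggregate shape: {"status": ..., "results": [...], "summary": {...}}
        return list(response["results"])
    # Single-result shape (also used for top-level errors): wrap it in a list.
    return [response]


def failed_results(response: dict[str, Any]) -> list[dict[str, Any]]:
    """Collect every entry whose status is "error"."""
    return [r for r in normalize_results(response) if r.get("status") == "error"]


aggregate = {
    "status": "partial_success",
    "results": [{"status": "success"}, {"status": "error", "error": "x"}],
    "summary": {"total_files": 2, "success_count": 1, "error_count": 1},
}
print(failed_results(aggregate))  # [{'status': 'error', 'error': 'x'}]
```

Treating a top-level error such as "no files were processed" as a one-element list keeps the calling code free of shape checks.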

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately concise with two sentences that each serve a purpose: stating the core function and explaining the processing approach. It's front-loaded with the main purpose, though the second sentence could be slightly more streamlined.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has an output schema (which handles return values), 100% schema description coverage, and no annotations, the description provides adequate context about what the tool does and how it processes files. However, for a document parsing tool with multiple parameters and no annotations, more behavioral context about limitations or edge cases would be beneficial.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description mentions support for local files and URLs but doesn't add meaningful parameter semantics beyond what's in the schema. This meets the baseline expectation when schema coverage is complete.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: '将文件转换为Markdown格式' (convert files to Markdown format). It specifies the unified interface approach and distinguishes itself from the sibling tool get_ocr_languages by focusing on document parsing rather than language retrieval.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context about when to use this tool: for converting files to Markdown, supporting both local files and URLs, with automatic processing based on USE_LOCAL_API configuration. However, it doesn't explicitly state when NOT to use it or mention alternatives beyond the sibling tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Tongzhao9417/mineru_mcp'