Skip to main content
Glama

Parse documents to Markdown

parse_documents

Convert PDF, Office documents, images, and web pages to Markdown with OCR support and page range extraction.

Instructions

Convert PDF, Office (DOCX, PPTX), spreadsheets (XLSX in Flash mode), images, and http(s) URLs to Markdown using the MinerU cloud API (content is uploaded to mineru.net; do not use for data that must stay on-device). Does not modify source files; may write Markdown under output_dir when saving results. Auth: without MINERU_API_TOKEN, Flash mode applies (Markdown-only, about 20 pages and 10 MB per file; service rate limits). With MINERU_API_TOKEN, higher limits and optional formats per plan. Use for extraction and conversion. Use get_ocr_languages only to list OCR language codes, not to parse files. Not for fully offline parsing. Parameters: file_sources is paths/URLs or objects with source and pages for PDF ranges; language is an OCR code (default ch); enable_ocr defaults to auto (null); set model to html only if every source is a web page URL.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
file_sourcesYesFiles to parse. Each entry is either: - a plain string: a local file path or URL - a dict {"source": "...", "pages": "N-M"}: with an optional page range Page range: "N" (single page) or "N-M" (for example "1-10"). PDF only. Duplicate sources are allowed, for example the same PDF with different ranges. Examples: ["report.pdf"] [{"source": "report.pdf", "pages": "1-5"}] [{"source": "a.pdf", "pages": "1-3"}, {"source": "a.pdf", "pages": "10-15"}] ["https://example.com/doc.pdf", "local.docx"]
enable_ocrNoOCR mode: null (default) - auto-detect: the server decides whether OCR is needed. true - force OCR on when the user mentions poor scan quality. false - disable OCR. Omit this parameter unless the user explicitly mentions scan quality issues.
languageNoOCR language code. Omit if unknown; the server defaults to "ch" (Chinese + English). Infer from the document filename when possible, for example "manual_en.pdf" -> "en". Common codes: "ch", "en", "japan", "korean", "latin", "arabic", "cyrillic", "devanagari". Full list: call get_ocr_languages.
modelNoParsing model. Set to "html" only when all file_sources are web page URLs. Otherwise omit it and let MinerU auto-select the appropriate model. Ignored in Flash mode.
output_dirNoDirectory used when parsed results need to be saved locally, such as batch parsing or oversized inline content. Defaults to the server-configured directory.

Output Schema

TableJSON Schema
NameRequiredDescriptionDefault

No arguments

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses data uploaded to mineru.net (beyond annotations), non-destructive nature, and potential file writes. Annotations (readOnlyHint=false, destructiveHint=false, openWorldHint=true) are consistent with description; no contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise multi-sentence description with logical flow: function, caveats, usage guidance, parameter tips. No redundancy; each sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers complex tool with 5 params, auth modes, output behavior, and offline limitation. Output schema exists, so return value details are not needed. Complete for agent decision-making.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline 3. Description adds value by summarizing key usage hints for parameters (e.g., file_sources examples, enable_ocr auto-detection, model='html' condition). Minor improvement over schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Convert' and the resource 'PDF, Office, spreadsheets, images, and URLs to Markdown'. It distinguishes from sibling tool get_ocr_languages by noting its exclusive use for listing OCR codes, not parsing.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use ('Use for extraction and conversion'), when not ('Not for fully offline parsing'), and mentions alternative (get_ocr_languages). Also covers auth-dependent behaviors and rate limits.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/opendatalab/MinerU-Ecosystem'

If you have feedback or need assistance with the MCP directory API, please join our Discord server