Parse PDF
crw_parse_file
Convert a base64-encoded PDF to markdown text. Scanned PDFs without OCR return empty markdown with a warning.
Instructions
Parse a local PDF (base64 in contentBase64) to markdown. No OCR: scanned PDFs return empty markdown with a warning.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| formats | No | Output formats; json and summary need a server LLM | ["markdown"] |
| parsers | No | Parsers to apply | ["pdf"] |
| filename | No | Original filename | |
| maxLength | No | Max chars per content field; 0 = unbounded | ~15000 |
| jsonSchema | No | A JSON Schema (draft 2020-12) describing fields to extract when formats includes "json", e.g. {"type":"object","properties":{"title":{"type":"string"}}}. Free-form object. | |
| contentBase64 | Yes | Base64-encoded PDF bytes | |
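
To illustrate how the arguments in the table fit together, here is a minimal Python sketch that base64-encodes PDF bytes and assembles an argument object for crw_parse_file. The helper name `build_parse_arguments` and its client-side checks are assumptions made for illustration; only the field names come from the schema above, and how the arguments are actually sent to the tool depends on your client.

```python
import base64
import json
from typing import Any, Optional


def build_parse_arguments(
    pdf_bytes: bytes,
    filename: Optional[str] = None,
    formats: Optional[list] = None,
    max_length: Optional[int] = None,
    json_schema: Optional[dict] = None,
) -> dict:
    """Assemble an argument object for crw_parse_file (hypothetical helper).

    Optional fields are omitted when not supplied, so the server-side
    defaults listed in the schema apply.
    """
    if not pdf_bytes:
        raise ValueError("pdf_bytes must not be empty")

    # contentBase64 is the only required field.
    args: dict[str, Any] = {
        "contentBase64": base64.b64encode(pdf_bytes).decode("ascii"),
    }

    if filename is not None:
        args["filename"] = filename
    if formats is not None:
        args["formats"] = list(formats)
    if max_length is not None:
        if max_length < 0:
            raise ValueError("max_length must be >= 0 (0 means unbounded)")
        args["maxLength"] = max_length
    if json_schema is not None:
        # Client-side guard (an assumption of this sketch, not documented
        # server behavior): jsonSchema only applies when "json" is requested.
        if "json" not in (formats or []):
            raise ValueError('json_schema requires "json" in formats')
        args["jsonSchema"] = json_schema

    return args


if __name__ == "__main__":
    # Placeholder bytes for demonstration; in practice, read a real PDF:
    # pdf_bytes = open("report.pdf", "rb").read()
    sample = b"%PDF-1.4\n% placeholder bytes, not a complete PDF\n"
    arguments = build_parse_arguments(
        sample,
        filename="report.pdf",
        formats=["markdown", "json"],
        json_schema={
            "type": "object",
            "properties": {"title": {"type": "string"}},
        },
    )
    print(json.dumps(arguments, indent=2))
```

Leaving out formats, parsers, and maxLength lets the defaults apply. Because the tool does no OCR, a scanned PDF will still come back with empty markdown and a warning regardless of how the arguments are built.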