recognize_pdf
Extract text from PDF documents (single or multi-page) using Yandex Vision OCR with asynchronous recognition. Supports models for printed pages, multi-column text, tables, handwriting, and Markdown output.
Instructions
Recognize text in a PDF document (single or multi-page) using Yandex Vision OCR via asynchronous recognition (recognizeTextAsync + getRecognition polling). Use this for PDFs and large files. Provide a local file 'path' or 'base64' content, and pick a 'model' (default 'page').
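
The start-then-poll flow described above can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: `poll_until_done`, `start`, and `fetch` are hypothetical names, and the real calls to recognizeTextAsync and getRecognition are assumed to be supplied by the caller as plain functions.

```python
import time
from typing import Any, Callable, Optional


def poll_until_done(
    start: Callable[[], str],
    fetch: Callable[[str], Optional[Any]],
    interval_s: float = 1.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Start an asynchronous job, then poll until its result is available.

    `start` is assumed to submit the document and return an operation id
    (the role recognizeTextAsync plays). `fetch` is assumed to return None
    while the job is still running and the result once it is ready (the
    role getRecognition plays). Both are hypothetical stand-ins.
    """
    operation_id = start()
    for _ in range(max_attempts):
        result = fetch(operation_id)
        if result is not None:
            return result
        # Not ready yet: wait before asking again.
        sleep(interval_s)
    raise TimeoutError(
        f"operation {operation_id} not ready after {max_attempts} attempts"
    )
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays; the interval and attempt limit are arbitrary defaults chosen for the sketch, not values taken from the service.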
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| path | No | Absolute or relative path to a local file to read and OCR. Provide this OR 'base64'. | |
| base64 | No | Base64-encoded file content. May be a data URI ('data:<mime>;base64,...'). Provide this OR 'path'. | |
| mimeType | No | Explicit MIME type override (e.g. 'application/pdf' or 'image/jpeg'). Inferred from the file when omitted. | |
| languages | No | Recognition languages. Use ['ru'], ['en'], or ['ru','en'] for mixed text. This Yandex OCR endpoint does not support auto-detect, so the languages must be chosen explicitly. | ['ru'] |
| model | No | Recognition model. 'page' (default): single-column printed text. 'page-column-sort': multi-column text. 'handwritten': mixed handwritten + printed (ru/en). 'table': tables (ru/en). 'markdown': printed text with Markdown output. 'math-markdown': math formulas (LaTeX in Markdown). | page |
| format | No | Output format. 'text' (default) returns recognized plain text. 'markdown' returns the model's Markdown output when available (markdown/math-markdown models), otherwise plain text. 'json' returns the full textAnnotation (blocks/lines/words/tables/markdown). | text |
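
As an illustration of how the parameters in the table fit together, the sketch below assembles an arguments object for this tool and applies the documented defaults. `build_recognize_pdf_args` is a hypothetical helper, not part of the tool, and treating 'path' and 'base64' as mutually exclusive is an assumption drawn from the "Provide this OR" wording.

```python
from typing import Any, Dict, List, Optional

# Values listed in the Input Schema table.
MODELS = {"page", "page-column-sort", "handwritten", "table", "markdown", "math-markdown"}
FORMATS = {"text", "markdown", "json"}


def build_recognize_pdf_args(
    path: Optional[str] = None,
    base64_content: Optional[str] = None,
    model: str = "page",
    output_format: str = "text",
    languages: Optional[List[str]] = None,
    mime_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the argument dict for a recognize_pdf call (hypothetical helper)."""
    # Assumption: exactly one of 'path' or 'base64' should be supplied.
    if (path is None) == (base64_content is None):
        raise ValueError("provide exactly one of 'path' or 'base64'")
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if output_format not in FORMATS:
        raise ValueError(f"unknown format: {output_format!r}")

    args: Dict[str, Any] = {
        "model": model,
        "format": output_format,
        # No auto-detect on this endpoint, so fall back to the documented default.
        "languages": languages if languages else ["ru"],
    }
    if path is not None:
        args["path"] = path
    else:
        args["base64"] = base64_content
    if mime_type is not None:
        args["mimeType"] = mime_type
    return args
```

For example, `build_recognize_pdf_args(path="scan.pdf", model="table", output_format="json", languages=["ru", "en"])` would request the full textAnnotation for a mixed-language table document; the file name here is made up.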