recognize_text
Extract text from images (JPEG/PNG) or single-page PDFs using Yandex Vision OCR. Supports printed, handwritten, table, and markdown models with language selection.
Instructions
Recognize text in a JPEG or PNG image or a single-page PDF using the synchronous Yandex Vision OCR endpoint. Select the recognition model with 'model' (default 'page' for printed text; other options include 'handwritten', 'table', and 'markdown'). Provide either a local file 'path' or 'base64' content. For multi-page or large PDFs, use recognize_pdf instead.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| path | No | Absolute or relative path to a local file to read and OCR. Provide this OR 'base64'. | |
| base64 | No | Base64-encoded file content. May be a data URI ('data:<mime>;base64,...'). Provide this OR 'path'. | |
| mimeType | No | Explicit MIME type override (e.g. 'image/jpeg'). Inferred from the file when omitted. | |
| languages | No | Recognition languages. Use ['ru'], ['en'], or ['ru','en'] for mixed text. This Yandex OCR endpoint does not support language auto-detection. | ['ru'] |
| model | No | Recognition model. 'page' (default): single-column printed text. 'page-column-sort': multi-column text. 'handwritten': mixed handwritten + printed (ru/en). 'table': tables (ru/en). 'markdown': printed text with Markdown output. 'math-markdown': math formulas (LaTeX in Markdown). | page |
| format | No | Output format. 'text' (default) returns recognized plain text. 'markdown' returns the model's Markdown output when available (markdown/math-markdown models), otherwise plain text. 'json' returns the full textAnnotation (blocks/lines/words/tables/markdown). | text |
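
As an illustration of how the parameters above fit together, here is a minimal Python sketch that assembles an arguments object for recognize_text. The helper name `build_recognize_text_args` is hypothetical and not part of the tool. It assumes that exactly one of 'path' or 'base64' should be sent, and it validates 'model' and 'format' only against the values listed in the table, so a server that accepts additional models would need the sets extended. How the arguments are actually sent depends on your MCP client and is not shown.

```python
import base64
from typing import Any, Optional

# Values taken from the table above; the server may accept others (assumption).
MODELS = {"page", "page-column-sort", "handwritten", "table", "markdown", "math-markdown"}
FORMATS = {"text", "markdown", "json"}


def build_recognize_text_args(
    path: Optional[str] = None,
    data: Optional[bytes] = None,
    mime_type: Optional[str] = None,
    languages: Optional[list[str]] = None,
    model: str = "page",
    fmt: str = "text",
) -> dict[str, Any]:
    """Hypothetical helper: build the arguments dict for a recognize_text call."""
    # Assumption: the tool expects exactly one of 'path' or 'base64'.
    if (path is None) == (data is None):
        raise ValueError("provide exactly one of 'path' or 'data'")
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    if fmt not in FORMATS:
        raise ValueError(f"unknown format: {fmt!r}")

    args: dict[str, Any] = {
        # No auto-detect on this endpoint, so always send languages explicitly.
        "languages": languages or ["ru"],
        "model": model,
        "format": fmt,
    }
    if path is not None:
        args["path"] = path
    else:
        # Raw bytes are sent as plain Base64 (no data URI prefix).
        args["base64"] = base64.b64encode(data).decode("ascii")
    if mime_type:
        # Optional override; the tool infers the type when this is omitted.
        args["mimeType"] = mime_type
    return args


if __name__ == "__main__":
    # Mixed Russian/English handwriting from a local file, full JSON output.
    print(build_recognize_text_args(
        path="scan.png", languages=["ru", "en"], model="handwritten", fmt="json"
    ))
```

Sending 'languages' explicitly even when it matches the default keeps the request self-describing, which helps when the same code later handles non-Russian text.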