Which integrations are available for this server?

Provides OCR capabilities using Yandex Vision API, allowing text extraction from images and PDFs with support for multiple recognition models and languages.

How do I use yandex-vision-ocr-mcp?

1. Click "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it shows a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g. "@yandex-vision-ocr-mcp extract text from receipt.png".

That's it! The server will respond to your query, and you can continue using it as needed.

yandex-vision-ocr-mcp

by chupre

TypeScript

Hybrid

Yandex Vision OCR MCP

A Model Context Protocol (MCP) server that exposes Yandex Vision OCR as tools, so any MCP-compatible client — opencode, Claude Desktop, Cursor, Cline — can extract text from images and PDFs.

Features

recognize_text — synchronous OCR for images (JPEG/PNG/WEBP/HEIC/HEIF) and single-page PDFs.
recognize_pdf — asynchronous OCR for PDFs (single- or multi-page) and large files, via recognizeTextAsync + getRecognition polling.
Recognition models — printed text, multi-column, handwritten, tables, Markdown, and math formulas (LaTeX), selectable per call.
Accepts a local file path or raw base64 content.
Recognition languages selectable between ru and en (default ru; combine as ["ru","en"] for mixed text). Auto-detect is not supported by this endpoint.
Three output formats: text (default), markdown, or full json (the raw textAnnotation with blocks/lines/words/tables/entities).
Zero-touch error handling — API failures are returned as isError tool results, never crashes.
Lazy credentials — the server boots and lists tools even before YANDEX_* env vars are set, surfacing a clear error only on the first call.
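The error-handling behaviour described in the list above can be illustrated with a small wrapper. This is a minimal sketch, not the server's actual code: `safeTool` and the `OCR failed:` prefix are hypothetical, and only the `content` / `isError` result shape comes from the MCP tool-result convention.

```typescript
// Shape of an MCP tool result: content parts plus an optional error flag.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Hypothetical helper: wraps a tool handler so that any thrown error is
// converted into an isError result instead of propagating to the caller.
function safeTool<A>(
  handler: (args: A) => Promise<string>,
): (args: A) => Promise<ToolResult> {
  return async (args: A): Promise<ToolResult> => {
    try {
      const text = await handler(args);
      return { content: [{ type: "text", text }] };
    } catch (err) {
      // Normalise anything that was thrown into a readable message.
      const message = err instanceof Error ? err.message : String(err);
      return {
        content: [{ type: "text", text: `OCR failed: ${message}` }],
        isError: true,
      };
    }
  };
}
```

With a wrapper like this, a missing credential or an HTTP failure reaches the client as an ordinary tool result that the model can read and react to, and the server process keeps running.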

Prerequisites

Node.js ≥ 20.
A Yandex Cloud account with the Vision/OCR API enabled.
Either an API key (recommended) or an IAM token, plus an optional folder ID (see Configuration). See the authentication docs.

Quick start

# Run directly with npx (no install needed)
npx -y yandex-vision-ocr-mcp

Then wire it into your MCP client (see Configuration).

Configuration

The server reads credentials from environment variables:

Variable	Required	Description
`YANDEX_FOLDER_ID`	optional	Yandex Cloud folder ID. Only sent as `x-folder-id` when set — the API key already scopes requests, so you can usually leave this unset. If set, it must match the key's folder.
`YANDEX_API_KEY`	one of	API key (recommended for long-lived usage).
`YANDEX_IAM_TOKEN`	one of	Short-lived IAM token (~12h). Use instead of an API key.

See .env.example for a template.
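As an illustration of how these variables could translate into request headers, here is a minimal sketch. The helper name `buildAuthHeaders` is hypothetical, and preferring the API key when both credentials are set is an assumption. The `Api-Key` and `Bearer` authorization schemes follow the usual Yandex Cloud convention, and `x-folder-id` is the header named in the table above.

```typescript
// Environment shape: a plain string map, e.g. process.env in Node.js.
type Env = Record<string, string | undefined>;

// Hypothetical helper: resolves credentials lazily, at call time.
function buildAuthHeaders(env: Env): Record<string, string> {
  const apiKey = env["YANDEX_API_KEY"];
  const iamToken = env["YANDEX_IAM_TOKEN"];
  const folderId = env["YANDEX_FOLDER_ID"];

  const headers: Record<string, string> = {};
  if (apiKey) {
    // Assumption: the API key wins when both credentials are present.
    headers["Authorization"] = `Api-Key ${apiKey}`;
  } else if (iamToken) {
    headers["Authorization"] = `Bearer ${iamToken}`;
  } else {
    throw new Error("Set YANDEX_API_KEY or YANDEX_IAM_TOKEN before calling a tool.");
  }
  // The folder ID is optional and only sent when configured.
  if (folderId) {
    headers["x-folder-id"] = folderId;
  }
  return headers;
}
```

Because the check runs inside the helper and not at startup, a server built this way can still start and list its tools without credentials, which matches the lazy-credentials behaviour described under Features.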

Models

Pass model to any tool to pick the recognition behaviour:

Model	Best for
`page` (default)	Single-column printed text.
`page-column-sort`	Multi-column printed text.
`handwritten`	Mixed handwritten + printed text (Russian, English).
`table`	Tables (Russian, English).
`markdown`	Printed text, also returned as Markdown.
`math-markdown`	Math formulas, returned as Markdown with LaTeX (e.g. $a^2 + b^2$).

Tip: use format: "markdown" together with the markdown / math-markdown models to receive the model's Markdown output directly.

Tools

Both tools accept the same input shape:

Argument	Type	Default	Description
`path`	string	—	Local file to OCR. Provide this or `base64`.
`base64`	string	—	Base64 content (`data:` URIs accepted). Provide this or `path`.
`mimeType`	string	inferred	Explicit MIME type override.
`languages`	string[]	`["ru"]`	Recognition languages, selectable: `ru`, `en` (e.g. `["ru","en"]` for mixed).
`model`	string	`page`	Recognition model — see Models.
`format`	`text` \| `markdown` \| `json`	`text`	Output format.
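The argument table can be read as the following TypeScript shape. The sketch below also applies the documented defaults and strips a `data:` URI prefix. The names `OcrArgs`, `NormalizedArgs` and `normalizeArgs` are hypothetical, and two behaviours are assumptions not stated in this README: rejecting a call that supplies both `path` and `base64`, and falling back to the MIME type embedded in a `data:` URI.

```typescript
type OutputFormat = "text" | "markdown" | "json";

// Input shape shared by both tools, per the table above.
interface OcrArgs {
  path?: string;
  base64?: string;
  mimeType?: string;
  languages?: string[];
  model?: string;
  format?: OutputFormat;
}

interface NormalizedArgs {
  source: { kind: "path"; path: string } | { kind: "base64"; data: string };
  mimeType: string | undefined;
  languages: string[];
  model: string;
  format: OutputFormat;
}

function normalizeArgs(args: OcrArgs): NormalizedArgs {
  const hasPath = args.path !== undefined && args.path !== "";
  const hasBase64 = args.base64 !== undefined && args.base64 !== "";
  // Assumption: exactly one input source is allowed per call.
  if (hasPath === hasBase64) {
    throw new Error("Provide exactly one of `path` or `base64`.");
  }

  let source: NormalizedArgs["source"];
  let mimeType: string | undefined = args.mimeType;
  if (args.path) {
    source = { kind: "path", path: args.path };
  } else {
    const raw = args.base64 ?? "";
    const comma = raw.indexOf(",");
    if (raw.startsWith("data:") && comma !== -1) {
      // "data:image/png;base64,AAAA" -> MIME "image/png", payload "AAAA".
      const embedded = raw.slice(5, comma).split(";")[0];
      source = { kind: "base64", data: raw.slice(comma + 1) };
      mimeType = mimeType ?? (embedded || undefined);
    } else {
      source = { kind: "base64", data: raw };
    }
  }

  return {
    source,
    mimeType,
    languages: args.languages ?? ["ru"], // documented default
    model: args.model ?? "page", // documented default
    format: args.format ?? "text", // documented default
  };
}
```

For example, `normalizeArgs({ path: "./receipt.png", languages: ["ru", "en"] })` keeps the two languages and fills in `model: "page"` and `format: "text"`.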

Supported formats: JPEG, PNG, WEBP, HEIC, HEIF (images) and PDF. The mimeType sent to the API is derived automatically (you can pass a standard MIME type via mimeType if needed). BMP/TIFF are not supported by the service.
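MIME inference from a file name could look roughly like the sketch below. The function name `inferMimeType` and the extension table are illustrative assumptions. The server's real lookup, and the exact value it finally sends to the API, may differ.

```typescript
// Extension -> standard MIME type, limited to the formats listed above.
const MIME_BY_EXTENSION = new Map<string, string>([
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["png", "image/png"],
  ["webp", "image/webp"],
  ["heic", "image/heic"],
  ["heif", "image/heif"],
  ["pdf", "application/pdf"],
]);

// Hypothetical helper: an explicit override wins, otherwise the
// file extension decides. Unsupported types raise a clear error.
function inferMimeType(filePath: string, override?: string): string {
  if (override) {
    return override;
  }
  const dot = filePath.lastIndexOf(".");
  const ext = dot === -1 ? "" : filePath.slice(dot + 1).toLowerCase();
  const mime = MIME_BY_EXTENSION.get(ext);
  if (mime === undefined) {
    throw new Error(`Unsupported file type "${ext}": use JPEG, PNG, WEBP, HEIC, HEIF or PDF.`);
  }
  return mime;
}
```

Under these assumptions `inferMimeType("scan.JPG")` yields `image/jpeg`, while a `.tiff` or `.bmp` path is rejected before any request is made.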

recognize_text — synchronous. Best for images and single-page PDFs.
recognize_pdf — asynchronous (submit + poll). Best for multi-page PDFs and large files. Requires the input to be a PDF.

Example result (`text` format)

Hello World
Yandex OCR
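To show how a result like the one above might be derived from the raw annotation, here is a minimal sketch. It assumes the `textAnnotation` contains `blocks`, each with `lines` that carry a `text` field. The interfaces are trimmed to those fields, and `formatResult` is a hypothetical name. The `markdown` format is left out because this README does not say which response field carries the Markdown.

```typescript
// Trimmed-down view of the annotation: only what the text format needs.
interface OcrLine {
  text: string;
}
interface OcrBlock {
  lines: OcrLine[];
}
interface TextAnnotation {
  blocks: OcrBlock[];
}

// Hypothetical formatter: "json" returns the annotation as-is,
// "text" joins every recognised line with a newline.
function formatResult(annotation: TextAnnotation, format: "text" | "json"): string {
  if (format === "json") {
    return JSON.stringify(annotation, null, 2);
  }
  const lines: string[] = [];
  for (const block of annotation.blocks) {
    for (const line of block.lines) {
      lines.push(line.text);
    }
  }
  return lines.join("\n");
}
```

Fed an annotation with two one-line blocks, "Hello World" and "Yandex OCR", the `text` branch produces the two-line output shown above.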

Connect to opencode

Add the server to your opencode.json under mcp:

{
  "mcp": {
    "yandex-vision-ocr": {
      "type": "local",
      "command": ["npx", "-y", "yandex-vision-ocr-mcp@latest"],
      "enabled": true,
      "environment": {
        "YANDEX_FOLDER_ID": "b1g...",
        "YANDEX_API_KEY": "your-api-key"
      }
    }
  }
}

If you cloned the repo instead, replace the command with ["node", "/absolute/path/to/yandex-vision-ocr-mcp/build/index.js"].

Connect to Claude Desktop / Cursor / Cline

{
  "mcpServers": {
    "yandex-vision-ocr": {
      "command": "npx",
      "args": ["-y", "yandex-vision-ocr-mcp@latest"],
      "env": {
        "YANDEX_FOLDER_ID": "b1g...",
        "YANDEX_API_KEY": "your-api-key"
      }
    }
  }
}

Local development

git clone https://github.com/chupre/yandex-vision-ocr-mcp.git
cd yandex-vision-ocr-mcp
npm install
npm run build      # type-check + compile to build/
npm test           # run the vitest suite
npm run dev        # run the server from source via tsx
npm run inspector  # open the MCP Inspector UI against the build

Useful scripts:

Script	Description
`npm run build`	Compile TypeScript to `build/`.
`npm run typecheck`	Type-check without emitting.
`npm test`	Run the offline test suite.
`npm run dev`	Run the server from source (tsx).
`npm run inspector`	Launch the MCP Inspector for manual testing.

Testing

The offline suite covers input handling, MIME inference, response formatting, the HTTP client (via a fake transport, no network), tool wiring, and a full MCP round-trip over an in-memory transport.

Live integration tests hit the real Yandex OCR API and are skipped unless credentials and sample files are provided:

YANDEX_FOLDER_ID=... YANDEX_API_KEY=... \
YOCR_LIVE_IMAGE=./sample.png \
YOCR_LIVE_PDF=./sample.pdf \
npx vitest run tests/live.test.ts

Docker

docker build -t yandex-vision-ocr-mcp .
docker run --rm -i \
  -e YANDEX_FOLDER_ID=b1g... \
  -e YANDEX_API_KEY=... \
  yandex-vision-ocr-mcp

API coverage

This server targets the Yandex Cloud Vision OCR REST API (ocr.api.cloud.yandex.net/ocr/v1):

Route	Method	Used for
`/recognizeText`	POST	Synchronous recognition (`recognize_text`).
`/recognizeTextAsync`	POST	Start async recognition (`recognize_pdf`).
`/getRecognition`	GET	Poll for the async result.
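The submit-and-poll flow behind `recognize_pdf` can be sketched independently of HTTP. In the sketch below the `AsyncOcrApi` interface is a hypothetical stand-in for the two async routes in the table: `start` would wrap `/recognizeTextAsync` and `getResult` would wrap `/getRecognition`. How the real server detects a "not ready yet" response, and which interval and attempt limit it uses, are not stated here, so those are assumptions.

```typescript
// Hypothetical abstraction over the two async routes.
interface AsyncOcrApi {
  start(): Promise<string>; // resolves to an operation ID
  getResult(operationId: string): Promise<string | null>; // null while pending
}

interface PollOptions {
  intervalMs: number; // delay between polls
  maxAttempts: number; // give up after this many polls
}

function sleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Submits the job once, then polls until a result arrives or attempts run out.
async function recognizeWithPolling(api: AsyncOcrApi, options: PollOptions): Promise<string> {
  const operationId = await api.start();
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    const result = await api.getResult(operationId);
    if (result !== null) {
      return result;
    }
    await sleep(options.intervalMs);
  }
  throw new Error(
    `OCR operation ${operationId} did not finish after ${options.maxAttempts} polls.`,
  );
}
```

Keeping the transport behind an interface also makes the loop easy to exercise with a fake, in the spirit of the offline suite's fake transport described under Testing.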

Concepts: OCR overview · image · PDF · handwritten.

License

MIT
