OCR-MCP

Complete AI OCR web app and MCP server in one repo. The web app is for people (drag‑and‑drop OCR, scanner, batch); the FastMCP 3.1 MCP server is for agentic IDEs (Claude, Cursor, Windsurf), so agents can run OCR, preprocessing, and workflows as tools. Both share the same 10+ engines, WIA scanner support (Windows), and pipelines.

Topics: ocr, mcp, fastmcp, document-processing, scanner, wia, pdf, computer-vision, model-context-protocol, llm

What it does

  • Web app — React + FastAPI: upload or scan, pick engine, get text/PDF/JSON. Ports 10858 (frontend) and 10859 (backend).

  • MCP server — Tools for OCR, preprocessing, scanner, workflows. Sampling and agentic workflow (SEP-1577) supported.

Features: 10+ backends (PaddleOCR-VL-1.5, DeepSeek-OCR-2, Mistral OCR, …) · Auto backend selection · Preprocessing (deskew, enhance, crop) · Layout & table extraction · Quality assessment · WIA scanner · Batch & pipelines · Multi-format export
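
As a rough illustration of what auto backend selection could look like, the sketch below picks the first available engine from a preference list. The engine names come from the feature list, but the preference order, the `available` argument, and the function name `pick_backend` are all assumptions made for this example; the project's actual selection logic is not described here and may differ.

```python
from typing import Iterable, Optional

# Hypothetical preference order: the names appear in the feature list,
# but this ordering is an assumption made purely for illustration.
PREFERRED = ["PaddleOCR-VL-1.5", "DeepSeek-OCR-2", "Mistral OCR"]


def pick_backend(available: Iterable[str], requested: Optional[str] = None) -> str:
    """Return the requested backend if usable, else the first preferred one."""
    usable = set(available)
    if requested is not None:
        # An explicit choice always wins, but only if it is installed.
        if requested in usable:
            return requested
        raise ValueError(f"backend not available: {requested}")
    for name in PREFERRED:
        if name in usable:
            return name
    raise RuntimeError("no OCR backend available")


if __name__ == "__main__":
    print(pick_backend({"Mistral OCR", "DeepSeek-OCR-2"}))
```

Keeping the explicit `requested` path separate from the automatic path means a caller who names an engine gets a clear error instead of a silent fallback.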

Docs

  • Install: Install, run MCP, Web UI (ports 10858/10859), client config

  • Technical: Architecture, tools, config, development, packaging

  • OCR models: Engines, capabilities, hardware (see also AI_MODELS.md)

  • AI features: Sampling, SEP-1577, agentic workflows, prompts

Also: JUSTFILE.md (just recipes) · OCR-MCP_MASTER_PLAN.md (roadmap) · tests/README.md (testing)

Quick start

uv sync
just run

Web UI: just webapp → http://localhost:10858
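
To confirm that the web app came up, a small standard-library check against the two documented ports (10858 for the frontend, 10859 for the backend) can help. This is a minimal sketch, not part of the project: the helper name `port_is_open` and the `WEB_PORTS` mapping are made up for this example, and it only tests that something accepts TCP connections, not that the service is healthy.

```python
import socket

# Ports documented for the web app: 10858 (frontend) and 10859 (backend).
WEB_PORTS = {"frontend": 10858, "backend": 10859}


def port_is_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for name, port in WEB_PORTS.items():
        state = "up" if port_is_open("localhost", port) else "not reachable"
        print(f"{name} (port {port}): {state}")
```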

License

MIT — see LICENSE.
