
OCR-MCP

Complete AI OCR webapp and MCP server. A web app for people (drag-and-drop OCR, scanner, batch) and a FastMCP 3.1 MCP server for agentic IDEs (Claude, Cursor, Windsurf), so agents can run OCR, preprocessing, and workflows as tools. Both share the same 13 engines, WIA scanner (Windows), and pipelines in one repo.

Topics: ocr, mcp, fastmcp, document-processing, scanner, wia, pdf, computer-vision, model-context-protocol, llm


What it does

  • Web app: React (web_sota/) + FastAPI (backend/app.py). Upload or scan, pick an engine, get text/PDF/JSON. Ports 10858 (Vite) and 10859 (API). In-app Help (/help) documents the web UI, the MCP server, and OCR backends.

  • MCP server: FastMCP 3.1 over stdio, with tools for OCR, preprocessing, scanner, and workflows. Sampling defaults to local Ollama (http://127.0.0.1:11434/v1, model llama3.2), so no cloud API key is needed. Set OCR_SAMPLING_USE_CLIENT_LLM=1 to use the host IDE's LLM instead. Mistral OCR uses MISTRAL_API_KEY when you call that backend. See AI_FEATURES.md.
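The sampling defaults described above can be pictured as a small environment lookup. The following is only a sketch: `SamplingConfig`, `resolve_sampling_config`, and `mistral_backend_available` are hypothetical names chosen for illustration and are not taken from the ocr-mcp codebase. Only the environment variable names, the Ollama URL, and the model name come from the text above, and treating exactly `1` as the enabling value is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Defaults quoted from the description above.
DEFAULT_OLLAMA_URL = "http://127.0.0.1:11434/v1"
DEFAULT_MODEL = "llama3.2"


@dataclass(frozen=True)
class SamplingConfig:
    """Hypothetical container; not a type from the ocr-mcp codebase."""
    use_client_llm: bool
    base_url: Optional[str]
    model: Optional[str]


def resolve_sampling_config(env: Optional[Mapping[str, str]] = None) -> SamplingConfig:
    """Choose between the host IDE's LLM and the local Ollama default."""
    env = os.environ if env is None else env
    # Assumption: only the literal value "1" switches to the client LLM.
    if env.get("OCR_SAMPLING_USE_CLIENT_LLM", "").strip() == "1":
        return SamplingConfig(use_client_llm=True, base_url=None, model=None)
    return SamplingConfig(
        use_client_llm=False, base_url=DEFAULT_OLLAMA_URL, model=DEFAULT_MODEL
    )


def mistral_backend_available(env: Optional[Mapping[str, str]] = None) -> bool:
    """The Mistral OCR backend needs MISTRAL_API_KEY only when it is called."""
    env = os.environ if env is None else env
    return bool(env.get("MISTRAL_API_KEY"))


if __name__ == "__main__":
    print(resolve_sampling_config())
    print("Mistral key present:", mistral_backend_available())
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the lookup easy to test without touching the real environment.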

Features: 13 backends (PaddleOCR-VL-1.5, Nemotron VL 8B, DeepSeek-OCR-2, Mistral OCR, …) · Auto backend selection · Preprocessing (deskew, enhance, crop) · Layout & table extraction · Quality assessment · WIA scanner · Batch & pipelines · Multi-format export


Docs

  • Install: Install, run MCP, Web UI (start.ps1, ports 10858/10859), PyYAML notes, client config

  • Backend deps: Web FastAPI backend: same venv as ocr-mcp, pyproject.toml, PyTorch, OCR_AUTO_INSTALL_DEPS

  • Technical: Architecture, tools, config, development, packaging

  • OCR models: Engines, capabilities, hardware (see also AI_MODELS.md)

  • Backend requirements: Per-model pip packages, system deps, env/config

  • MCP toolset matrix: Portmanteau tools, operation status, corpus v0

  • AI features: Sampling, SEP-1577, agentic workflows, prompts

  • In-app Help: Source for /help: webapp vs MCP vs backends (mirrors INSTALL / TECHNICAL)

  • SOTA Compliance: Verified SOTA v12.0 Architecture

Also: JUSTFILE.md (just recipes), OCR-MCP_MASTER_PLAN.md (roadmap), tests/README.md (testing).

Quick Start

git clone https://github.com/sandraschi/ocr-mcp
cd ocr-mcp
just

This opens an interactive dashboard showing all available commands. Run just bootstrap to install dependencies, then just serve or just dev to start.

Manual Setup

If you don't have just installed:

🛡️ Industrial Quality Stack

This project adheres to SOTA 14.1 industrial standards for high-fidelity agentic orchestration:

  • Python (Core): Ruff for linting and formatting. Zero tolerance for print statements in core handlers (T201).

  • Webapp (UI): Biome for sub-millisecond linting. Strict noConsoleLog enforcement.

  • Protocol Compliance: Hardened stdout/stderr isolation to ensure crash-resistant JSON-RPC communication.

  • Automation: Justfile recipes for all fleet operations (just lint, just fix, just dev).

  • Security: Automated audits via bandit and safety.
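The stdout/stderr isolation mentioned under Protocol Compliance matters because a stdio MCP server exchanges newline-delimited JSON-RPC messages on stdout, so any stray print there can corrupt the stream. A minimal sketch of the general technique follows, using only the Python standard library. `make_stderr_logger` and `send_message` are hypothetical helper names for illustration, not the project's actual code (FastMCP handles the transport itself).

```python
import json
import logging
import sys
from typing import Any, Dict, Optional, TextIO


def make_stderr_logger(stream: Optional[TextIO] = None) -> logging.Logger:
    """Return a logger that writes diagnostics to stderr, never to stdout."""
    logger = logging.getLogger("ocr_mcp_sketch")  # hypothetical logger name
    logger.handlers.clear()
    handler = logging.StreamHandler(stream if stream is not None else sys.stderr)
    handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    # Do not hand records to the root logger, which may be configured elsewhere.
    logger.propagate = False
    return logger


def send_message(message: Dict[str, Any], out: Optional[TextIO] = None) -> None:
    """Write one JSON-RPC message as a single line on the protocol channel."""
    target = out if out is not None else sys.stdout
    target.write(json.dumps(message) + "\n")
    target.flush()


if __name__ == "__main__":
    log = make_stderr_logger()
    log.info("engine loaded")  # diagnostics go to stderr
    send_message({"jsonrpc": "2.0", "id": 1, "result": {"ok": True}})  # stdout
```

This is also why the Ruff T201 rule is useful here: banning print in core handlers removes the most common way for non-protocol text to reach stdout.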

License

MIT; see LICENSE.

