by jcm4TX

ocrmypdf-mcp

A minimal MCP server that exposes ocrmypdf as a single tool, ocr_pdf, so Claude can OCR scanned PDFs and then hand them to markitdown (or any text tool) for downstream work.

Why this exists: the obvious "just call ocrmypdf" approach falls over on Windows with the Microsoft Store (MSIX) build of Claude Desktop, because MSIX launches MCP servers with a stripped-down PATH that doesn't include Tesseract or Ghostscript. This server auto-detects the standard Windows install locations and prepends them to PATH at startup, so OCR Just Works without futzing with system environment variables.

Works on Linux and macOS too — the PATH augmentation is a no-op outside Windows.
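As a rough illustration of the PATH augmentation described above, the following sketch prepends any tool directories that exist on disk and leaves PATH untouched on other platforms. The function name `augmented_path` and the pattern list are hypothetical. The Ghostscript `bin` subfolder is an assumption, and none of this is taken from the server's actual source.

```python
import glob
import os
import sys

# Assumed locations: the default install directories named in this README,
# plus a "bin" subfolder for Ghostscript (an assumption, not confirmed here).
DEFAULT_PATTERNS = [
    r"C:\Program Files\Tesseract-OCR",
    r"C:\Program Files\gs\gs*\bin",
]


def augmented_path(current_path, patterns=DEFAULT_PATTERNS, platform=sys.platform):
    """Return current_path with any existing tool directories prepended."""
    if not platform.startswith("win"):
        return current_path  # no-op outside Windows

    existing = [p for p in current_path.split(os.pathsep) if p]
    found = []
    for pattern in patterns:
        # Sorted for a deterministic order (lexicographic, not version-aware).
        for candidate in sorted(glob.glob(pattern)):
            if os.path.isdir(candidate) and candidate not in existing + found:
                found.append(candidate)
    return os.pathsep.join(found + existing)
```

Assigning `os.environ["PATH"] = augmented_path(os.environ.get("PATH", ""))` before the OCR tools are first invoked would apply the result to the current process.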

Prerequisites (Windows)

Two system installers, then pip install.

1. Tesseract OCR

UB-Mannheim build (the standard Windows distribution): https://github.com/UB-Mannheim/tesseract/wiki

Accept the default install location (C:\Program Files\Tesseract-OCR). Add language packs during install if you need anything beyond English.

2. Ghostscript

AGPL release for Windows (free): https://www.ghostscript.com/releases/gsdnld.html

Accept the default install location (C:\Program Files\gs\gs<version>\).

3. Verify (optional)

tesseract --version
gswin64c --version

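If either says "not recognized," reopen PowerShell so it picks up the updated PATH, then retry. A failure here does not necessarily block the server, which looks in the standard install locations itself at startup.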
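The same check can be scripted if you prefer. This is a minimal sketch using only the standard library; `missing_tools` is a hypothetical helper name, and the executable names are the Windows ones from the commands above (on Linux and macOS, Ghostscript is usually called `gs` instead).

```python
import shutil


def missing_tools(names, which=shutil.which):
    """Return the executables from names that do not resolve on PATH."""
    return [name for name in names if which(name) is None]


# Windows executable names, as used in the verify commands above.
REQUIRED = ["tesseract", "gswin64c"]

if __name__ == "__main__":
    missing = missing_tools(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```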

Install the server

git clone https://github.com/jcm4TX/ocrmypdf-mcp
cd ocrmypdf-mcp
pip install --user .

This installs ocrmypdf, the mcp SDK, and the ocrmypdf-mcp executable. On Windows the executable lands in your per-user Scripts directory; with Python 3.13 that is:

C:\Users\<you>\AppData\Roaming\Python\Python313\Scripts\ocrmypdf-mcp.exe

Wire it up in Claude Desktop

Edit claude_desktop_config.json. On the MSIX (Microsoft Store) build of Claude Desktop, the path is:

%LOCALAPPDATA%\Packages\Claude_pzs8sxrjxfjjc\LocalCache\Roaming\Claude\claude_desktop_config.json

On the regular non-MSIX installer it's:

%APPDATA%\Claude\claude_desktop_config.json

Add an ocrmypdf-mcp entry under mcpServers:

{
  "mcpServers": {
    "ocrmypdf-mcp": {
      "command": "C:\\Users\\<you>\\AppData\\Roaming\\Python\\Python313\\Scripts\\ocrmypdf-mcp.exe",
      "args": []
    }
  }
}

Then fully quit Claude Desktop — right-click the tray icon and pick Quit, not just close the window — and relaunch.
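If you would rather script the mcpServers edit than hand-edit the file, a sketch like the following merges the entry while keeping any servers already configured. `add_server` is a hypothetical helper, not part of this project, and the path in the usage comment is a placeholder you would replace with your own.

```python
import copy
import json


def add_server(config, name, command, args=None):
    """Return a copy of config with one entry added under mcpServers."""
    updated = copy.deepcopy(config)
    updated.setdefault("mcpServers", {})[name] = {
        "command": command,
        "args": list(args or []),
    }
    return updated


# Usage sketch (config_path and the .exe path are placeholders):
#
#   with open(config_path, encoding="utf-8") as f:
#       config = json.load(f)
#   config = add_server(config, "ocrmypdf-mcp", r"C:\path\to\ocrmypdf-mcp.exe")
#   with open(config_path, "w", encoding="utf-8") as f:
#       json.dump(config, f, indent=2)
```

Writing the file through `json.dump` also takes care of doubling the backslashes in Windows paths, which is easy to get wrong by hand.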

Verify it loaded

In a new chat, ask "what MCP tools do you have for OCR?" — Claude should report ocr_pdf. If not, check the server log:

%LOCALAPPDATA%\Packages\Claude_pzs8sxrjxfjjc\LocalCache\Roaming\Claude\logs\mcp-server-ocrmypdf-mcp.log

Tool API

ocr_pdf(input_path, output_path?, language?, force_ocr?, deskew?)

| Arg | Type | Default | Meaning |
| --- | --- | --- | --- |
| input_path | str | required | Absolute path to input PDF |
| output_path | str | <stem>-ocr.pdf next to input | Where to write the OCR'd PDF |
| language | str | "eng" | Tesseract language code; join multiple with +, e.g. "eng+spa" |
| force_ocr | bool | false | Re-OCR pages that already have a text layer |
| deskew | bool | true | Straighten skewed pages before OCR |

Default behavior: pages without an existing text layer get OCR'd, pages that already have text pass through unchanged. Safe to run on mixed PDFs.
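To make the defaults concrete, here is a sketch of how the tool arguments could map onto an ocrmypdf command line. The helper name `build_ocrmypdf_command` is hypothetical. Choosing `--skip-text` when force_ocr is false is an assumption inferred from the default behavior described above, not a statement about how the server is actually implemented.

```python
from pathlib import Path


def build_ocrmypdf_command(input_path, output_path=None, language="eng",
                           force_ocr=False, deskew=True):
    """Build an ocrmypdf argument list from ocr_pdf-style parameters."""
    src = Path(input_path)
    # Default output: <stem>-ocr.pdf next to the input file.
    dst = Path(output_path) if output_path else src.with_name(f"{src.stem}-ocr.pdf")

    cmd = ["ocrmypdf", "-l", language]  # e.g. "eng" or "eng+spa"
    if deskew:
        cmd.append("--deskew")
    # --force-ocr re-OCRs every page; --skip-text leaves pages that
    # already have a text layer untouched. The two are mutually exclusive.
    cmd.append("--force-ocr" if force_ocr else "--skip-text")
    cmd += [str(src), str(dst)]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where ocrmypdf is installed.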

Typical workflow

  1. You hand Claude a scanned PDF path.

  2. Claude calls ocr_pdf(input_path="...").

  3. Claude calls markitdown.convert_to_markdown on the resulting -ocr.pdf.

  4. Claude reads the markdown and answers your question.

Known limitations

  • Claude Desktop enforces a per-request timeout on MCP tool calls (about 4 minutes in current builds). Large multi-page documents may exceed it and surface as a client-side timeout even though the underlying ocrmypdf process completes successfully; the output PDF will still be on disk. If you hit this regularly, split the input into smaller page ranges first.

  • Complex multi-column scanned layouts (legal, probate, ledgers) can produce messy markdown when piped to markitdown afterward, because Tesseract interprets visual alignment as table structure. Post-processing the markdown to drop empty table-pipe rows recovers most of it.
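Both workarounds above can be sketched with small standard-library helpers. The names `page_ranges` and `drop_empty_table_rows` are hypothetical, and the definition of an "empty" row (only pipes and whitespace) is an assumption about what the messy markdown looks like. The actual PDF splitting is left to whatever PDF tool you prefer.

```python
import re


def page_ranges(total_pages, chunk_size):
    """Split 1..total_pages into inclusive (start, end) ranges."""
    if total_pages < 0 or chunk_size < 1:
        raise ValueError("total_pages must be >= 0 and chunk_size >= 1")
    return [
        (start, min(start + chunk_size - 1, total_pages))
        for start in range(1, total_pages + 1, chunk_size)
    ]


# A row counts as empty if it holds nothing but pipes and whitespace.
# Header separator rows such as "| --- | --- |" contain dashes and are kept.
_EMPTY_ROW = re.compile(r"\s*\|[\s|]*")


def drop_empty_table_rows(markdown):
    """Remove table rows with no cell content; keep all other lines."""
    kept = [line for line in markdown.splitlines()
            if not _EMPTY_ROW.fullmatch(line)]
    return "\n".join(kept)
```

For example, a 10-page scan with a chunk size of 4 would be processed as pages 1-4, 5-8, and 9-10, each in its own ocr_pdf call.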

License

MIT
