Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@OwlOCR MCP extract the text from /Users/username/Desktop/invoice.pdf".
That's it! The server will respond to your query, and you can continue using it as needed.
OwlOCR MCP
MCP (Model Context Protocol) server for PDF and image OCR on macOS. Supports two backends:
OwlOCR CLI - Higher accuracy (recommended)
Vision Framework - No external dependencies
Features
📄 PDF OCR - Extract text from PDF files page by page with separators
🖼️ Image OCR - Extract text from PNG, JPEG, and other image formats
🌏 Multi-language - Korean + English by default (configurable)
🔄 Dual Backend - Auto-selects OwlOCR if available, falls back to Vision Framework
⚡ Async - Non-blocking execution for MCP clients
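This README does not say how the non-blocking execution is implemented. One common approach, shown here as a minimal sketch, is to push a blocking OCR call onto a worker thread with `asyncio.to_thread`. The `blocking_ocr` function is a hypothetical stand-in, not part of this project.

```python
import asyncio
import time


def blocking_ocr(path: str) -> str:
    """Hypothetical stand-in for a slow, synchronous OCR call."""
    time.sleep(0.05)  # simulate OCR work
    return f"text from {path}"


async def ocr_async(path: str) -> str:
    """Run the blocking call in a worker thread so the event loop stays free."""
    return await asyncio.to_thread(blocking_ocr, path)


async def main() -> list[str]:
    # Both calls run concurrently; gather returns results in argument order.
    return await asyncio.gather(ocr_async("a.png"), ocr_async("b.png"))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

With this pattern an MCP client can keep sending requests while a long OCR job is still running.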
Benchmark Results
Tested on a 4-page Korean theological document with Hebrew text:
| Metric | Vision Framework | OwlOCR CLI |
| Time | 9.87s | 9.30s |
| Time/Page | 2.47s | 2.33s |
| Word Accuracy | 85.62% | 91.79% |
| Character Accuracy | 94.46% | 95.07% |
Winner: OwlOCR CLI - Faster and more accurate.
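The exact metric definitions used by benchmark.py are not given in this README. As a rough sketch of how figures like these could be computed against a ground-truth transcript, the helpers below use `difflib.SequenceMatcher` from the standard library. All function names are hypothetical, and the similarity ratio is an assumed metric, not necessarily the one behind the numbers above.

```python
from difflib import SequenceMatcher


def time_per_page(total_seconds: float, pages: int) -> float:
    """Average processing time per page."""
    return total_seconds / pages


def similarity(reference, hypothesis) -> float:
    """Similarity between two sequences as a percentage (0 to 100)."""
    matcher = SequenceMatcher(None, reference, hypothesis, autojunk=False)
    return 100.0 * matcher.ratio()


def word_accuracy(reference: str, ocr_text: str) -> float:
    # Compare whitespace-separated words.
    return similarity(reference.split(), ocr_text.split())


def char_accuracy(reference: str, ocr_text: str) -> float:
    # Compare individual characters, including spaces.
    return similarity(reference, ocr_text)


if __name__ == "__main__":
    print(time_per_page(9.87, 4))
    print(word_accuracy("a b c d", "a b x d"))
```

Character-level scores tend to be higher than word-level scores because a single wrong character invalidates the whole word, which matches the pattern in the table.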
Requirements
macOS (uses Apple Vision Framework / OwlOCR.app)
Python 3.11+
OwlOCR.app (optional, for better accuracy)
Installation
Using uv (recommended)
git clone https://github.com/yourusername/owlocr-mcp.git
cd owlocr-mcp
uv sync
Using pip
git clone https://github.com/yourusername/owlocr-mcp.git
cd owlocr-mcp
pip install -e .
MCP Client Configuration
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "owlocr": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/owlocr-mcp", "owlocr-mcp"]
    }
  }
}
Generic MCP Client
{
  "mcpServers": {
    "owlocr": {
      "command": "/path/to/owlocr-mcp/.venv/bin/python",
      "args": ["-m", "owlocr_mcp.server"]
    }
  }
}
Available Tools
ocr_pdf_to_text
Extract text from a PDF file.
Parameters:
| Parameter | Type | Default | Description |
| | string | required | Absolute path to the PDF file |
| | list[int] | null | Page numbers to process (1-based). If null, all pages are processed |
| | int | 200 | Rendering resolution. Higher values give better quality but run slower |
| backend | string | "auto" | OCR backend: "auto", "owlocr", or "vision" |
| languages | list[string] | null | Language codes (Vision only). Default: Korean + English |
Example:
Extract text from /Users/me/document.pdf using OwlOCR
Output:
첫 번째 페이지 내용...
===== Page 2 =====
두 번째 페이지 내용...
--- OCR Complete: 2 page(s) processed using OwlOCR CLI ---
ocr_image_to_text
Extract text from an image file.
Parameters:
| Parameter | Type | Default | Description |
| | string | required | Absolute path to the image file |
| backend | string | "auto" | OCR backend: "auto", "owlocr", or "vision" |
| languages | list[string] | null | Language codes (Vision only) |
check_ocr_backends
Check available OCR backends on the system.
Output:
OCR Backend Status:
✅ Vision Framework: Available (macOS built-in)
✅ OwlOCR CLI: Available (/Applications/OwlOCR.app)
Recommendation: Use backend='owlocr' for best accuracy
Backend Selection
| Backend | Accuracy | Speed | Requirements |
| owlocr | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | OwlOCR.app installed |
| vision | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | None (macOS built-in) |
| auto | Best available | - | Uses OwlOCR if available |
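The selection rule in the table can be sketched as a small function. This is an illustration, not the server's actual code: `resolve_backend` is a hypothetical name, and checking for the CLI binary at the path quoted under "How It Works" is an assumed way of detecting whether OwlOCR is installed.

```python
from pathlib import Path

# Binary path as quoted in the "How It Works" section of this README.
OWLOCR_BINARY = Path("/Applications/OwlOCR.app/Contents/MacOS/OwlOCR")


def resolve_backend(requested: str = "auto", owlocr_binary: Path = OWLOCR_BINARY) -> str:
    """Map a requested backend name to the backend that would actually run."""
    if requested not in {"auto", "owlocr", "vision"}:
        raise ValueError(f"unknown backend: {requested!r}")
    if requested == "vision":
        return "vision"
    available = owlocr_binary.is_file()
    if requested == "owlocr":
        if not available:
            raise FileNotFoundError("OwlOCR.app not found")
        return "owlocr"
    # "auto": prefer OwlOCR, fall back to the built-in Vision Framework.
    return "owlocr" if available else "vision"


if __name__ == "__main__":
    print(resolve_backend("auto"))
```

Passing the binary path as a parameter keeps the function easy to test on machines without OwlOCR.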
Running the Benchmark
Compare backends on your own PDF:
# Both backends
uv run python benchmark.py /path/to/your.pdf
# With accuracy comparison (requires ground truth)
uv run python benchmark.py /path/to/your.pdf --show-text
# Specific backend only
uv run python benchmark.py /path/to/your.pdf --method owlocr
uv run python benchmark.py /path/to/your.pdf --method vision
Project Structure
owlocr-mcp/
├── src/owlocr_mcp/
│   ├── __init__.py
│   ├── server.py        # MCP server with tools
│   ├── ocr.py           # Vision Framework backend
│   ├── ocr_owlocr.py    # OwlOCR CLI backend
│   └── pdf.py           # PDF processing utilities
├── benchmark.py         # Performance comparison script
├── pyproject.toml
└── README.md
How It Works
OwlOCR Backend
Render PDF pages to PNG using pypdfium2
Copy images to OwlOCR sandbox: ~/Library/Containers/JonLuca-DeCaro.OwlOCR/Data/tmp/
Run CLI: /Applications/OwlOCR.app/Contents/MacOS/OwlOCR --cli --input <file>
Combine results with page separators
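The four steps above could look roughly like the sketch below. It is not the contents of ocr_owlocr.py. All function names are hypothetical. The sketch assumes the CLI prints recognized text to stdout. It also writes a "===== Page N =====" header for every page, including page 1, which may differ from the server's real output.

```python
import shutil
import subprocess
from pathlib import Path

# Both paths are quoted from the steps above.
OWLOCR_BINARY = "/Applications/OwlOCR.app/Contents/MacOS/OwlOCR"
SANDBOX_TMP = Path.home() / "Library/Containers/JonLuca-DeCaro.OwlOCR/Data/tmp"


def render_pages(pdf_path: str, out_dir: Path, dpi: int = 200) -> list[Path]:
    """Step 1: render every PDF page to PNG (needs pypdfium2 and Pillow)."""
    import pypdfium2 as pdfium  # lazy import so the other helpers work without it

    pdf = pdfium.PdfDocument(pdf_path)
    paths = []
    for index in range(len(pdf)):
        # PDF user space has 72 units per inch, so scale = dpi / 72.
        image = pdf[index].render(scale=dpi / 72).to_pil()
        target = out_dir / f"page-{index + 1}.png"
        image.save(target)
        paths.append(target)
    return paths


def stage_in_sandbox(image: Path, sandbox: Path = SANDBOX_TMP) -> Path:
    """Step 2: copy an image into the directory OwlOCR is allowed to read."""
    sandbox.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(image, sandbox / image.name))


def cli_command(image: Path, binary: str = OWLOCR_BINARY) -> list[str]:
    """Step 3: build the command line shown above."""
    return [binary, "--cli", "--input", str(image)]


def ocr_image(image: Path) -> str:
    """Steps 2 and 3 together. Assumes the CLI writes text to stdout."""
    staged = stage_in_sandbox(image)
    try:
        result = subprocess.run(
            cli_command(staged), capture_output=True, text=True, check=True
        )
        return result.stdout.strip()
    finally:
        staged.unlink(missing_ok=True)  # clean up the sandbox copy


def combine_pages(texts: list[str]) -> str:
    """Step 4: join page texts with separator headers."""
    return "\n".join(
        f"===== Page {n} =====\n{text}" for n, text in enumerate(texts, start=1)
    )
```

A full run would then be `combine_pages([ocr_image(p) for p in render_pages(pdf, tmp_dir)])`, which only works on macOS with OwlOCR.app installed.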
Vision Framework Backend
Render PDF pages to PNG using pypdfium2
Load as CIImage via PyObjC
Create VNRecognizeTextRequest with accurate recognition level
Process with VNImageRequestHandler
Sort results by position and combine
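The last step, sorting by position, depends on Vision's coordinate system: bounding boxes are normalized to the range 0 to 1 with the origin at the bottom-left of the image, so text higher on the page has a larger y value. The sketch below illustrates only that ordering step on plain Python objects. The `Observation` class, the function name, and the line-grouping tolerance are assumptions and do not come from ocr.py.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical container for one recognized text fragment.

    x and y are the lower-left corner of the bounding box in normalized
    coordinates (origin at the bottom-left of the image).
    """
    text: str
    x: float
    y: float
    height: float


def combine_in_reading_order(
    observations: list[Observation], line_tolerance: float = 0.5
) -> str:
    # Larger top edge means higher on the page, so sort descending.
    ordered = sorted(observations, key=lambda o: -(o.y + o.height))
    lines: list[list[Observation]] = []
    for obs in ordered:
        if lines:
            anchor = lines[-1][0]
            anchor_mid = anchor.y + anchor.height / 2
            obs_mid = obs.y + obs.height / 2
            # Fragments whose vertical centers are close share a line.
            if abs(anchor_mid - obs_mid) <= line_tolerance * anchor.height:
                lines[-1].append(obs)
                continue
        lines.append([obs])
    # Within each line, read left to right.
    return "\n".join(
        " ".join(o.text for o in sorted(line, key=lambda o: o.x)) for line in lines
    )


if __name__ == "__main__":
    sample = [
        Observation("world", 0.5, 0.80, 0.05),
        Observation("Hello", 0.1, 0.81, 0.05),
        Observation("Second", 0.1, 0.60, 0.05),
    ]
    print(combine_in_reading_order(sample))
```

Grouping by line before sorting by x keeps slightly skewed scans from interleaving words of neighboring lines.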
Troubleshooting
"OwlOCR.app not found"
Install OwlOCR from owlocr.com or use backend="vision".
File picker dialog appears
This happens when OwlOCR can't access files outside its sandbox. The MCP server handles this by copying files to the sandbox temp directory automatically.
Poor accuracy on specific languages
For Vision Framework, specify languages explicitly:
ocr_pdf_to_text(pdf_path, languages=["ja-JP", "en-US"])
Supported language codes: ko-KR, en-US, ja-JP, zh-Hans, zh-Hant, etc.
License
MIT License - see LICENSE file.
Acknowledgments
OwlOCR by JonLuca DeCaro
Apple Vision Framework