🪪 Document OCR MCP Server
An AI-powered MCP (Model Context Protocol) server that extracts structured data from Indian identity documents using OCR. Connect it to Claude Desktop and let Claude read your documents!
📋 Supported Documents
| Document | Extracted Fields |
|---|---|
| 🪪 Aadhaar Card | Name, DOB, Gender, Aadhaar Number*, Address, Pincode |
| 🛂 Passport | Name, Passport Number*, Nationality, DOB, Expiry, Sex, MRZ |
| 📄 PAN Card | Name, Father's Name, DOB, PAN Number* |
| 🚗 Driving License | Name, DOB, DL Number*, Validity, Address, Vehicle Classes |
| 📃 Any Image | Raw text + auto-detected document type + key-value pairs |
* Sensitive fields are masked by default for privacy.
🚀 Quick Start
Step 1: Install Tesseract OCR (Windows)
Tesseract must be installed separately — it's the OCR engine under the hood.
Download from: https://github.com/UB-Mannheim/tesseract/wiki
Run the installer (choose Additional script data → Hindi if needed)
Default install path:
C:\Program Files\Tesseract-OCR\tesseract.exe
Add to PATH, or set in your environment:
$env:TESSDATA_PREFIX = "C:\Program Files\Tesseract-OCR\tessdata"
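Before moving on, it can help to confirm that the Tesseract executable is actually reachable. The following is a minimal sketch, not part of the project: the helper name `find_tesseract` is hypothetical, and it only checks PATH and the default install location quoted above.

```python
import os
import shutil
from pathlib import Path
from typing import Optional

# Default install location quoted in Step 1.
DEFAULT_TESSERACT = Path(r"C:\Program Files\Tesseract-OCR\tesseract.exe")


def find_tesseract(default: Path = DEFAULT_TESSERACT) -> Optional[str]:
    """Return a usable tesseract executable path, or None if none is found.

    Hypothetical helper: looks on PATH first, then at the default location.
    """
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    if default.is_file():
        return str(default)
    return None


if __name__ == "__main__":
    exe = find_tesseract()
    print("tesseract executable:", exe or "not found")
    print("TESSDATA_PREFIX:", os.environ.get("TESSDATA_PREFIX", "(not set)"))
```

If the script prints "not found", revisit the PATH setting before installing the Python dependencies.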
Step 2: Install Python Dependencies
cd d:\mcp
pip install -r requirements.txt
Step 3: Test the Server
# Run the MCP Inspector (opens browser UI to test tools)
fastmcp dev server.py
Or test directly:
python server.py
🤝 Connect to Claude Desktop
Find your Claude Desktop config file:
C:\Users\<YourName>\AppData\Roaming\Claude\claude_desktop_config.json
Add this to the config:
{
  "mcpServers": {
    "document-ocr": {
      "command": "python",
      "args": ["d:\\mcp\\server.py"]
    }
  }
}
Restart Claude Desktop
You'll see the 🔌 tools icon — your OCR tools are ready!
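If the config file already lists other MCP servers, pasting the snippet by hand risks overwriting them. The sketch below shows one way to merge the "document-ocr" entry into an existing file. The function name `add_ocr_server` is hypothetical and not part of this project. It assumes the file is plain JSON, as in the example above.

```python
import json
from pathlib import Path


def add_ocr_server(config_path: Path, server_script: str) -> dict:
    """Add the "document-ocr" entry to a Claude Desktop config file.

    Existing entries under "mcpServers" are preserved. Hypothetical helper.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    # Create the "mcpServers" section only if it is missing.
    servers = config.setdefault("mcpServers", {})
    servers["document-ocr"] = {"command": "python", "args": [server_script]}
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

A call such as `add_ocr_server(Path(r"C:\Users\<YourName>\AppData\Roaming\Claude\claude_desktop_config.json"), r"d:\mcp\server.py")` would apply it to the file named above, with `<YourName>` replaced by your Windows user name. Back up the file first.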
💬 Example Claude Prompts
Once connected, you can ask Claude:
Extract all information from my Aadhaar card at C:/Users/me/aadhaar.jpg
What's the expiry date on my passport? Image is at D:/docs/passport.png
Read the PAN card image at C:/scans/pan.jpg and tell me the PAN number
Auto-detect what type of document this is and extract all fields: C:/Downloads/document.jpg
Get raw text from this image: C:/photos/certificate.png
🛠️ MCP Tools Reference
extract_aadhaar(image_path, show_full=False)
Extract data from Aadhaar card front or back.
extract_passport(image_path, show_full=False)
Extract data from passport bio-data page. Uses MRZ parsing for high accuracy.
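MRZ parsing can be more reliable than free-text OCR partly because the machine-readable zone carries check digits that let a parser reject misread fields. The sketch below implements the standard ICAO 9303 check-digit rule (weights 7, 3, 1 repeating, sum modulo 10). It illustrates the general technique and is not taken from this project's passport.py. The function names are hypothetical.

```python
def mrz_char_value(ch: str) -> int:
    """Map one MRZ character to its numeric value under ICAO 9303."""
    if ch in "0123456789":
        return int(ch)
    if "A" <= ch <= "Z":
        return ord(ch) - ord("A") + 10  # A=10 ... Z=35
    if ch == "<":
        return 0  # filler character
    raise ValueError(f"invalid MRZ character: {ch!r}")


def mrz_check_digit(field: str) -> int:
    """Compute the check digit for an MRZ field (weights 7, 3, 1 repeating)."""
    weights = (7, 3, 1)
    total = sum(mrz_char_value(c) * weights[i % 3] for i, c in enumerate(field))
    return total % 10


# Fields from the ICAO 9303 specimen document.
print(mrz_check_digit("L898902C3"))  # document number -> 6
print(mrz_check_digit("740812"))     # date of birth   -> 2
```

A parser can compare the computed digit with the digit printed in the MRZ and flag the field when they differ.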
extract_pan_card(image_path, show_full=False)
Extract data from PAN card.
extract_driving_license(image_path, show_full=False)
Extract data from Driving License (front side recommended).
extract_any_document(image_path, document_type="auto", show_full=False)
Auto-detect document type and extract accordingly.
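The README does not describe how auto-detection works. One plausible approach is to look for format signatures in the raw OCR text, sketched below. The function name, the returned labels, and the keyword choices are all assumptions made for illustration. The patterns rely only on well-known formats: a PAN is five letters, four digits, and one letter; an Aadhaar number has 12 digits; a passport MRZ line starts with "P<".

```python
import re

PAN_RE = re.compile(r"\b[A-Z]{5}[0-9]{4}[A-Z]\b")
AADHAAR_RE = re.compile(r"\b\d{4}\s?\d{4}\s?\d{4}\b")
MRZ_RE = re.compile(r"^P<[A-Z<]{3}", re.MULTILINE)


def detect_document_type(text: str) -> str:
    """Guess a document type from OCR text (hypothetical labels)."""
    upper = text.upper()
    if MRZ_RE.search(upper) or "PASSPORT" in upper:
        return "passport"
    if PAN_RE.search(upper) or "INCOME TAX" in upper:
        return "pan_card"
    if "DRIVING LICEN" in upper:  # matches both LICENCE and LICENSE
        return "driving_license"
    if AADHAAR_RE.search(upper) or "AADHAAR" in upper:
        return "aadhaar"
    return "unknown"


print(detect_document_type("INCOME TAX DEPARTMENT\nABCDE1234F"))  # pan_card
```

Checks run from the most distinctive signature to the least, so a stray 12-digit number on another document is less likely to be mistaken for an Aadhaar card.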
ocr_raw_text(image_path, language="eng")
Get raw OCR text from any image. Supports multi-language:
"eng": English
"hin": Hindi
"eng+hin": English + Hindi
"eng+tam": English + Tamil
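Tesseract combines languages by joining their codes with "+". The sketch below shows how such a string could be built and passed to Tesseract. It assumes the pytesseract and Pillow packages, which the README does not confirm the server uses, and both function names are hypothetical.

```python
from typing import Iterable


def build_lang(codes: Iterable[str]) -> str:
    """Join Tesseract language codes into one string, e.g. "eng+hin"."""
    cleaned = [c.strip().lower() for c in codes if c.strip()]
    if not cleaned:
        raise ValueError("at least one language code is required")
    return "+".join(cleaned)


def ocr_text(image_path: str, language: str = "eng") -> str:
    """Run OCR on an image. Assumes pytesseract and Pillow are installed."""
    # Imported here so build_lang works even without the OCR packages.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(img, lang=language)


print(build_lang(["eng", "hin"]))  # eng+hin
```

Each code needs a matching language data file in Tesseract's tessdata directory. Otherwise Tesseract reports an error when it tries to load that language.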
🔒 Privacy & Security
Aadhaar numbers are masked to XXXX XXXX 1234 by default
PAN numbers are partially masked to AB*****4F by default
Passport numbers are partially masked by default
MRZ lines are redacted by default
Pass show_full=True to any tool to disable masking
All processing is 100% local: no data is sent to any cloud service
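The masking rules above can be illustrated with a short sketch. It is not the code in utils/privacy.py, and the function names are hypothetical. The generic helper masks every character between the kept prefix and suffix, so the number of asterisks depends on the input length and may differ from the PAN example shown above.

```python
import re


def mask_aadhaar(number: str) -> str:
    """Mask all but the last four digits of a 12-digit Aadhaar number."""
    digits = re.sub(r"\D", "", number)
    if len(digits) != 12:
        raise ValueError("expected 12 digits")
    return f"XXXX XXXX {digits[-4:]}"


def mask_middle(value: str, keep_start: int = 2, keep_end: int = 2) -> str:
    """Keep the first and last few characters, replace the rest with '*'."""
    if len(value) <= keep_start + keep_end:
        return "*" * len(value)  # too short to reveal anything safely
    hidden = len(value) - keep_start - keep_end
    return value[:keep_start] + "*" * hidden + value[len(value) - keep_end:]


print(mask_aadhaar("1234 5678 9012"))  # XXXX XXXX 9012
print(mask_middle("ABCDE1234F"))       # AB******4F
```

A tool that accepts show_full could return the raw value when the flag is True and the masked value otherwise.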
📁 Project Structure
d:\mcp\
├── server.py # FastMCP server (entry point)
├── requirements.txt # Python dependencies
├── pyproject.toml # Project config
│
├── tools/
│ ├── aadhaar.py # Aadhaar OCR
│ ├── passport.py # Passport OCR + MRZ parser
│ ├── pan_card.py # PAN Card OCR
│ ├── driving_license.py # Driving License OCR
│ └── generic_ocr.py # Generic + auto-detect OCR
│
├── utils/
│ ├── image_preprocess.py # OpenCV preprocessing pipeline
│ ├── validators.py # Pydantic output models
│ └── privacy.py # PII masking utilities
│
└── samples/ # Place test images here
⚠️ Troubleshooting
| Issue | Fix |
|---|---|
| | Tesseract not in PATH; see Step 1 above |
| Low accuracy on Hindi text | Install Hindi language pack for Tesseract |
| | Run |
| Image not readable | Check file path is absolute and image is not corrupted |
| Missing fields in output | Image quality too low; try a higher resolution scan |
📜 License
MIT License — free to use and modify.