Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@ReadPDFx - OCR PDF MCP Server extract text from the scanned contract.pdf in English"
That's it! The server will respond to your query, and you can continue using it as needed.
ReadPDFx - OCR PDF MCP Server
Official MCP SDK STDIO Server - MCP Protocol 2025-06-18 Compliant
Quick Start (STDIO Server)
1. Install Dependencies
2. Validate Installation
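The original validation command is not reproduced in this copy. As a minimal sketch, assuming only the two prerequisites this README names (Python 3.8+ and a Tesseract binary on PATH), a check might look like the following. The function name `check_prerequisites` is hypothetical and not part of ReadPDFx.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 8), ocr_binary="tesseract"):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info < min_python:
        problems.append(
            "Python %d.%d+ required, found %d.%d"
            % (min_python + tuple(sys.version_info[:2]))
        )
    # shutil.which returns None when the executable is not on PATH.
    if shutil.which(ocr_binary) is None:
        problems.append("'%s' not found on PATH" % ocr_binary)
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    print("OK" if not issues else "\n".join(issues))
```

This only confirms that the interpreter and the OCR binary are discoverable; it does not verify Python package dependencies or installed Tesseract language data.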
3. Client Integration
The server communicates over the STDIO transport, so configure your MCP client to launch it:
Claude Desktop:
Features
Official MCP SDK: Built with official FastMCP framework
STDIO Transport: Standard MCP protocol over STDIO
Smart PDF Processing: Automatically detects digital vs scanned content
5 OCR Tools: Text extraction, OCR processing, combined operations
Universal Client Support: Claude Desktop, LM Studio, Continue.dev, Cursor
Lightweight: ~200 lines vs 800+ in HTTP implementation
Production Ready: Comprehensive error handling and logging
Auto Tool Registration: Decorators handle tool discovery
Installation
Prerequisites
Python 3.8+
Tesseract OCR
Windows
macOS
Linux
Available Tools
1. Smart PDF Processing
Intelligent processing with automatic OCR detection:
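How the server decides between direct extraction and OCR is not shown in this README. One common heuristic, sketched here as an assumption rather than ReadPDFx's actual logic, treats a page as scanned when its embedded text layer yields very few characters. The function names and the `min_chars` threshold are illustrative.

```python
def pages_needing_ocr(page_texts, min_chars=20):
    """Return indices of pages whose text layer looks empty or near-empty.

    page_texts: one string (or None) per page, as returned by any
    PDF text extractor. min_chars is an arbitrary illustrative cutoff.
    """
    needs_ocr = []
    for index, text in enumerate(page_texts):
        visible = "".join((text or "").split())  # ignore whitespace
        if len(visible) < min_chars:
            needs_ocr.append(index)
    return needs_ocr


def classify_document(page_texts, min_chars=20):
    """Label a document 'digital', 'scanned', or 'mixed'."""
    scanned = pages_needing_ocr(page_texts, min_chars)
    if not scanned:
        return "digital"
    if len(scanned) == len(page_texts):
        return "scanned"
    return "mixed"
```

With a per-page result, only the pages flagged by `pages_needing_ocr` would need to go through Tesseract, which keeps mixed documents fast.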
2. PDF Text Extraction
Direct text extraction from digital PDFs:
3. OCR Processing
OCR on image files:
4. PDF Structure Analysis
Analyze document structure and metadata:
5. Batch Processing
Process multiple files:
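The example invocations for these tools are missing from this copy. MCP clients call tools with a JSON-RPC 2.0 `tools/call` request, so a sketch of the message shape follows. The tool name `batch_process_pdfs` and the `file_paths` argument are placeholders, not confirmed ReadPDFx identifiers; the real names come from the server's `tools/list` response.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Placeholder tool and argument names, for illustration only.
message = build_tool_call(
    1,
    "batch_process_pdfs",
    {"file_paths": ["contract.pdf", "invoice.pdf"]},
)
print(json.dumps(message, indent=2))
```

In normal use the MCP client builds and sends this message itself; the sketch is mainly useful for manual testing or for reading server logs.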
Client Integration
Claude Desktop
Add to claude_desktop_config.json:
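The configuration snippet itself is missing here. Claude Desktop reads MCP servers from an `mcpServers` object keyed by server name, each entry giving a `command` and its `args`. The sketch below generates such an entry; the key `readpdfx` is an arbitrary label, and `path/to/readpdfx/run.py` is the placeholder path this README uses elsewhere, to be replaced with the real location.

```python
import json


def claude_desktop_entry(server_key, script_path, python_cmd="python"):
    """Return a dict shaped like a claude_desktop_config.json fragment."""
    return {
        "mcpServers": {
            server_key: {
                "command": python_cmd,
                "args": [script_path],
            }
        }
    }


config = claude_desktop_entry("readpdfx", "path/to/readpdfx/run.py")
print(json.dumps(config, indent=2))
```

If the config file already lists other servers, merge the new key into the existing `mcpServers` object instead of overwriting the file.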
LM Studio
Configure MCP server with:
Command: python
Args: path/to/readpdfx/run.py
URL: http://localhost:8000 (HTTP mode)
Continue.dev
Add to config.json:
Cursor
Configure in settings.json:
See
API Endpoints
MCP Protocol Endpoints
POST /mcp/initialize - Initialize MCP session
POST /mcp/tools/list - List available tools
POST /mcp/tools/call - Call MCP tools
GET /mcp/manifest - Get MCP manifest
HTTP Endpoints
GET /health - Health check
POST /jsonrpc - JSON-RPC 2.0 endpoint
GET /docs - API documentation
GET /tools - Tools discovery
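In HTTP mode, the `POST /jsonrpc` endpoint accepts JSON-RPC 2.0 messages. A minimal sketch of preparing such a request with the standard library follows. It assumes the server listens on `http://localhost:8000` (the address used elsewhere in this README) and that the endpoint accepts MCP method names such as `tools/list`, which is not confirmed here. The function only constructs the request and leaves sending to the caller.

```python
import json
import urllib.request


def jsonrpc_request(base_url, method, params=None, request_id=1):
    """Prepare (but do not send) a JSON-RPC 2.0 POST to <base_url>/jsonrpc."""
    payload = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        payload["params"] = params
    return urllib.request.Request(
        base_url.rstrip("/") + "/jsonrpc",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = jsonrpc_request("http://localhost:8000", "tools/list")
# To send it: urllib.request.urlopen(req, timeout=10)
```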
Configuration
Environment Variables
Config Files
mcp.json - MCP Protocol configuration
mcp-config.yaml - YAML configuration
pyproject.toml - Python project config
package.json - Node.js compatibility
Docker & Kubernetes
Docker Deployment
Quick Start with Docker
Automated Docker Deployment
Available Docker commands:
build - Build Docker image only
run - Build and run container (default)
start - Start container (assumes image exists)
stop - Stop running container
logs - Show container logs
clean - Stop container and remove image
status - Show container status
Kubernetes Deployment
Deploy to Kubernetes
Kubernetes Resources
Deployment: k8s/deployment.yaml - Main application deployment
Service: k8s/deployment.yaml - Service exposure
Ingress: k8s/ingress.yaml - External access
ConfigMap: k8s/configmap.yaml - Configuration management
HPA: k8s/hpa.yaml - Horizontal Pod Autoscaler
Kubernetes Commands
Production Considerations
Multi-stage Build
Use Dockerfile.prod for optimized production builds:
Environment Variables
Persistent Storage
Testing
Run Tests
Manual Testing
Performance
Startup Time: < 2 seconds
Memory Usage: ~50MB base
Throughput: 10+ PDFs/minute
Concurrent Requests: Up to 100
File Size Limit: 100MB per file
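A client can avoid a rejected request by checking the stated 100MB limit before handing a file to the server. The sketch below assumes 1 MB means 1024 × 1024 bytes, which this README does not specify; `within_size_limit` is a hypothetical helper, not part of ReadPDFx.

```python
import os

MAX_FILE_MB = 100  # limit stated in this README


def within_size_limit(path, limit_mb=MAX_FILE_MB):
    """Return True if the file at `path` is no larger than limit_mb."""
    # Assumption: binary megabytes (1024 * 1024 bytes).
    return os.path.getsize(path) <= limit_mb * 1024 * 1024
```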
Development
Development Mode
Project Structure
Adding New Tools
Define tool schema in mcp_tools.py
Implement tool handler method
Register tool in MCPToolsRegistry
Update tests and documentation
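The schema, handler, and registration steps, together with the decorator-based registration mentioned under Features, can be pictured with a small stand-in registry. This is a generic sketch in plain Python. It is not the FastMCP API or the project's `MCPToolsRegistry`, and `SketchRegistry`, `count_words`, and the schema layout are all hypothetical.

```python
class SketchRegistry:
    """Toy tool registry: maps a tool name to its schema and handler."""

    def __init__(self):
        self.tools = {}

    def tool(self, name, schema):
        """Decorator that registers the wrapped function as a tool."""
        def register(handler):
            self.tools[name] = {"schema": schema, "handler": handler}
            return handler
        return register

    def call(self, name, arguments):
        """Look up a tool and invoke it, checking required arguments."""
        entry = self.tools[name]
        missing = [key for key in entry["schema"].get("required", [])
                   if key not in arguments]
        if missing:
            raise ValueError("missing arguments: %s" % ", ".join(missing))
        return entry["handler"](**arguments)


registry = SketchRegistry()


@registry.tool(
    "count_words",
    {"type": "object",
     "properties": {"text": {"type": "string"}},
     "required": ["text"]},
)
def count_words(text):
    """Example handler: number of whitespace-separated words."""
    return len(text.split())
```

Calling `registry.call("count_words", {"text": "..."})` then dispatches to the handler, which mirrors in miniature what a tool registry does when a `tools/call` request arrives.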
Troubleshooting
Common Issues
Server won't start
OCR not working
Permission errors
Ensure read access to PDF files
Check write permissions for output directory
Run with appropriate user privileges
Connection timeout
Verify server is running: curl http://localhost:8000/health
Check firewall settings
Try HTTP instead of direct MCP connection
Debug Mode
Monitoring
Health Check
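The health-check command is not included in this copy. Assuming the server runs in HTTP mode and `GET /health` returns HTTP 200 when healthy (the response body format is not documented here), a programmatic check might look like this. The `fetch` parameter is there so the function can be exercised without a live server.

```python
import urllib.error
import urllib.request


def is_healthy(base_url="http://localhost:8000", timeout=5.0,
               fetch=urllib.request.urlopen):
    """Return True if GET <base_url>/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with fetch(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

A monitoring job could call `is_healthy()` on an interval and alert after several consecutive `False` results.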
Metrics (Future)
Request count and latency
Tool usage statistics
Error rates and types
Resource utilization
Contributing
Fork the repository
Create feature branch: git checkout -b feature/new-tool
Make changes and add tests
Submit pull request
Development Setup
License
MIT License - see LICENSE file.
Links
Repository: https://github.com/irev/mcp-readpdfx
Issues: https://github.com/irev/mcp-readpdfx/issues
Documentation: https://github.com/irev/mcp-readpdfx#readme
MCP Protocol: Model Context Protocol Specification
Acknowledgments
MCP Protocol Team for the specification
FastAPI for the web framework
Tesseract OCR for text recognition
PyPDF2 and pdfplumber for PDF processing
Made with ❤️ for the MCP community