fast-paddleocr-mcp

by trotsky1997

Python

Local

PaddleOCR-MCP

PaddleOCR MCP (Model Context Protocol) server and CLI tool that extracts text from images and outputs results in markdown format. Optimized for fast inference with GPU auto-detection.

MCP Server Configuration

The MCP (Model Context Protocol) server integrates with MCP clients such as Cursor and Claude Desktop.

Use uvx directly (no installation required, automatically downloads from PyPI):

{
  "mcpServers": {
    "fast-paddleocr-mcp": {
      "command": "uvx",
      "args": ["fast-paddleocr-mcp"]
    }
  }
}
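
If a client already has other servers configured, the entry above has to be merged into the existing file rather than pasted over it. The following is a minimal sketch, assuming the client stores its settings as JSON with a top-level mcpServers object. The helper name add_server is hypothetical, and the config file location depends on the client, so it is not hard-coded here.

```python
import copy
import json

# Server entry exactly as shown in the configuration above.
FAST_PADDLEOCR_ENTRY = {"command": "uvx", "args": ["fast-paddleocr-mcp"]}


def add_server(config: dict, name: str = "fast-paddleocr-mcp") -> dict:
    """Return a copy of config with the server entry added under mcpServers.

    Hypothetical helper: existing servers are kept, and the input dict is
    left unmodified.
    """
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers[name] = copy.deepcopy(FAST_PADDLEOCR_ENTRY)
    return merged


if __name__ == "__main__":
    # Illustrative existing config; "other-server" is a made-up entry.
    existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
    print(json.dumps(add_server(existing), indent=2))
```

Load the client's config with json.load, pass it through add_server, and write the result back with json.dump.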

MCP Tool: `ocr_image`

The server provides a single tool, ocr_image:

Input: image_path (string) - Path to the input image file
Output: Returns the path to the generated markdown file containing OCR results
Automatic optimizations: All performance optimizations are applied automatically with intelligent fallback
Default language: Uses 'ch' (Chinese and English) by default for maximum compatibility

Example: When called with image_path: "photo.png", it returns "photo.png.md" containing the recognized text.

Note: The server automatically applies all optimizations (HPI, GPU acceleration, image preprocessing, etc.) and falls back to simpler configurations if needed. No configuration required from the caller.
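
At the protocol level, an MCP client invokes the tool with a JSON-RPC tools/call request. The sketch below builds that request body and the output path implied by the example above. Both helper names are hypothetical, and the "append .md" rule is inferred from the single photo.png example, not from the server's source.

```python
import json


def build_ocr_request(image_path: str, request_id: int = 1) -> str:
    """Serialize an MCP tools/call request for the ocr_image tool."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "ocr_image",
            "arguments": {"image_path": image_path},
        },
    }
    return json.dumps(payload)


def expected_markdown_path(image_path: str) -> str:
    """Assumed naming rule: the markdown file is the image path plus '.md'."""
    return image_path + ".md"


if __name__ == "__main__":
    print(build_ocr_request("photo.png"))
    print(expected_markdown_path("photo.png"))
```

In practice an MCP client library sends this message for you. The sketch only shows that image_path is the one argument the caller supplies.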

See MCP_README.md for detailed MCP server documentation.

Usage

Basic Usage

The tool is optimized for speed by default with the following settings:

Fast mode enabled (skips textline orientation classification for maximum speed)
PP-OCRv4 (faster mobile models)
640px image size limit (faster processing)
Auto GPU detection (uses GPU if available, falls back to CPU)

# Output will be saved as <image_name>.png.md
# Uses: fast mode + PP-OCRv4 + 640px + auto GPU detection
uvx --from . paddleocr-md image.png

# Specify custom output path
uvx --from . paddleocr-md image.png -o result.md

# Force CPU mode
uvx --from . paddleocr-md image.png --cpu

# Disable fast mode for better accuracy on rotated text
uvx --from . paddleocr-md image.png --no-fast

# Use PP-OCRv5 for better accuracy (slower)
uvx --from . paddleocr-md image.png --ocr-version PP-OCRv5

Default Optimization Settings

The MCP server is optimized for low latency by default with these settings:

✅ Fast mode enabled: Disables textline orientation classification (skips one model)
✅ PP-OCRv4: Uses faster mobile models (PP-OCRv4_mobile_det, PP-OCRv4_mobile_rec)
✅ High-Performance Inference (HPI): Automatically selects optimal inference backend
- Can reduce latency by 40-73% (e.g., 73.1% reduction on PP-OCRv5_mobile_rec)
- Supports Paddle Inference, OpenVINO, ONNX Runtime, TensorRT
✅ Multi-threaded CPU: Uses all available CPU cores for parallel processing
✅ MKL-DNN enabled: Intel CPU optimization for faster inference
✅ Single image batch: rec_batch_num=1 for lowest latency per image
✅ Auto GPU detection: Automatically uses GPU if available, falls back to CPU
- GPU device selection: Uses first available GPU (gpu_id=0)
- TensorRT support: Automatically enabled via HPI if TensorRT is installed
- GPU memory: Uses default allocation (can be customized if needed)
✅ Automatic image preprocessing: Optimizes images before OCR for better performance
- Automatic downsampling: Resizes large images to maximum 1920px (maintains aspect ratio)
  - Reduces processing time for large images significantly
  - Uses high-quality LANCZOS resampling to preserve text quality
- Image sharpening: Enhances text edges for improved OCR accuracy
  - Uses unsharp mask filter (radius=1, percent=150, threshold=3)
  - Additional sharpening enhancement (factor=1.2)
  - Makes text characters more distinct and easier to recognize
- Format conversion: Automatically converts RGBA, LA, P modes to RGB with white background
- Temporary file management: Automatically cleans up preprocessed images after OCR
✅ Logging disabled: Reduces overhead by disabling verbose logging
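
The preprocessing steps listed above can be sketched with Pillow. This is an illustration under stated assumptions, not the server's actual code. The function names fit_within and preprocess are hypothetical, and the order of operations (flatten, resize, sharpen) is a guess. The numeric parameters are the ones quoted in the list.

```python
MAX_SIDE = 1920  # maximum edge length quoted in the list above


def fit_within(width: int, height: int, max_side: int = MAX_SIDE) -> tuple[int, int]:
    """Return a size whose longer edge is at most max_side, keeping aspect ratio."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # small images are left untouched
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))


def preprocess(src: str, dst: str) -> None:
    """Flatten transparency, downsample, and sharpen an image (hypothetical helper)."""
    # Pillow is imported lazily so fit_within stays usable without it installed.
    from PIL import Image, ImageEnhance, ImageFilter

    img = Image.open(src)
    if img.mode in ("RGBA", "LA", "P"):
        # Composite onto a white background using the alpha channel as mask.
        rgba = img.convert("RGBA")
        background = Image.new("RGB", rgba.size, (255, 255, 255))
        background.paste(rgba, mask=rgba.split()[3])
        img = background
    elif img.mode != "RGB":
        img = img.convert("RGB")

    img = img.resize(fit_within(*img.size), Image.LANCZOS)
    img = img.filter(ImageFilter.UnsharpMask(radius=1, percent=150, threshold=3))
    img = ImageEnhance.Sharpness(img).enhance(1.2)
    img.save(dst)
```

For example, a 3840x2160 screenshot would be resized to 1920x1080, while an 800x600 image keeps its size.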

GPU Performance:

When a GPU is available and TensorRT is installed, HPI automatically selects the TensorRT backend for maximum performance
TensorRT can provide 2-3x speedup compared to standard GPU inference
First run with HPI may take longer to build the inference engine, but subsequent runs will be much faster

Requirements:

PaddleOCR >= 2.7.0 with all latest features supported (HPI, MKL-DNN, etc.)
No backward compatibility - requires latest PaddleOCR version
For maximum GPU performance: NVIDIA GPU with CUDA support and TensorRT (optional)
Sufficient GPU memory (typically 1-2GB for mobile models)

Customization Options

--no-fast: Disable fast mode for better accuracy
- Enables textline orientation classification
- Better accuracy on rotated text, but slower
--cpu: Force CPU mode
- Overrides auto GPU detection
- Explicitly use CPU
--gpu: Force GPU mode
- Will fail if GPU not available
- Use when you want to ensure GPU usage
--ocr-version PP-OCRv5: Use better accuracy version
- PP-OCRv5 is more accurate but slower than PP-OCRv4 (default)
- Uses server models
--max-size <pixels>: Adjust image processing size
- Default: 640px
- Larger values (e.g., 960, 1280) = better accuracy, slower
- Smaller values (e.g., 480) = faster, may reduce accuracy
--hpi: High-Performance Inference
- Automatically selects best inference backend (Paddle Inference, OpenVINO, ONNX Runtime, TensorRT)
- Requires HPI dependencies, installed with paddleocr install_hpi_deps cpu or paddleocr install_hpi_deps gpu
- Best performance but requires additional setup
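
The way --cpu, --gpu, and auto detection interact can be summarized as a small decision function. This is a sketch of the behavior described above, not the tool's implementation. The name resolve_device and the returned device labels are illustrative, and rejecting --cpu together with --gpu is an assumption the README does not state.

```python
def resolve_device(gpu_available: bool, force_cpu: bool = False, force_gpu: bool = False) -> str:
    """Pick a device label following the flag semantics described above."""
    if force_cpu and force_gpu:
        # Assumption: the two flags are treated as mutually exclusive.
        raise ValueError("--cpu and --gpu cannot be combined")
    if force_cpu:
        return "cpu"  # overrides auto detection
    if force_gpu:
        if not gpu_available:
            # --gpu is documented to fail when no GPU is present.
            raise RuntimeError("--gpu was requested but no GPU is available")
        return "gpu:0"  # first available GPU (gpu_id=0)
    # Default: use the GPU if there is one, otherwise fall back to CPU.
    return "gpu:0" if gpu_available else "cpu"


if __name__ == "__main__":
    print(resolve_device(gpu_available=True))
    print(resolve_device(gpu_available=True, force_cpu=True))
    print(resolve_device(gpu_available=False))
```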

Examples

# Basic usage (uses all optimizations by default: fast + PP-OCRv4 + 640px + auto GPU)
uvx --from . paddleocr-md photo.jpg

# Process with custom output
uvx --from . paddleocr-md document.png -o extracted_text.md

# Better accuracy (slower) - disable fast mode and use PP-OCRv5
uvx --from . paddleocr-md image.png --no-fast --ocr-version PP-OCRv5 --max-size 960

# Force CPU mode
uvx --from . paddleocr-md image.png --cpu

# Use High-Performance Inference (requires HPI dependencies)
uvx --from . paddleocr-md image.png --hpi
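
To drive the CLI from a script, for instance to process a folder of images, the commands above can be assembled programmatically. The sketch below uses only the flags documented in this README. The helper names build_command and run_ocr are hypothetical, and like the examples above it assumes it is run from the repository root so that uvx --from . resolves.

```python
import subprocess
from typing import Optional


def build_command(
    image: str,
    output: Optional[str] = None,
    fast: bool = True,
    device: Optional[str] = None,  # "cpu", "gpu", or None for auto detection
    ocr_version: Optional[str] = None,
    max_size: Optional[int] = None,
    hpi: bool = False,
) -> list[str]:
    """Build the paddleocr-md argument list from the documented flags."""
    cmd = ["uvx", "--from", ".", "paddleocr-md", image]
    if output is not None:
        cmd += ["-o", output]
    if not fast:
        cmd.append("--no-fast")
    if device == "cpu":
        cmd.append("--cpu")
    elif device == "gpu":
        cmd.append("--gpu")
    elif device is not None:
        raise ValueError(f"unknown device: {device!r}")
    if ocr_version is not None:
        cmd += ["--ocr-version", ocr_version]
    if max_size is not None:
        cmd += ["--max-size", str(max_size)]
    if hpi:
        cmd.append("--hpi")
    return cmd


def run_ocr(image: str, **options) -> None:
    """Run the CLI and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_command(image, **options), check=True)
```

For example, build_command("image.png", fast=False, ocr_version="PP-OCRv5", max_size=960) reproduces the "better accuracy" command shown above.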

Output Format

The tool generates a markdown file containing:

Source image path
List of detected text (one per line)

Example output (test_image.png.md):

# OCR Result

**Source Image:** `test_image.png`

---

- HelloPaddleOcR
- 10000C
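
Because the output layout is this regular, downstream code can read the results back without a markdown library. The following is a minimal sketch, assuming every output file follows the example above: one "Source Image" line and one "- " bullet per detected text. The function name parse_ocr_markdown is hypothetical.

```python
def parse_ocr_markdown(text: str) -> dict:
    """Extract the source image path and detected text lines from an OCR result file."""
    marker = "**Source Image:**"
    source = None
    lines = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith(marker):
            # Drop the label and the surrounding backticks.
            source = line[len(marker):].strip().strip("`")
        elif line.startswith("- "):
            # The "---" separator is not matched because it lacks the space.
            lines.append(line[2:])
    return {"source": source, "lines": lines}


if __name__ == "__main__":
    with open("test_image.png.md", encoding="utf-8") as handle:
        print(parse_ocr_markdown(handle.read()))
```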

Testing

Run tests using pytest:

# Install development dependencies
pip install -e ".[dev]"

# Run all tests
pytest

# Run tests with coverage
pytest --cov=paddleocr_cli --cov-report=html

# Run specific test file
pytest tests/test_mcp_server.py

# Run specific test class or function
pytest tests/test_mcp_server.py::TestGetOCR
pytest tests/test_mcp_server.py::TestGetOCR::test_get_ocr_default_language

The test suite includes:

OCR instance initialization and caching
Tool listing and definition
OCR tool calls with various parameters
Language parameter handling
File validation and error handling
Markdown output generation
Edge cases and error scenarios

License

MIT
