Reads and processes PDF files, with tools for text extraction, OCR, and image extraction.
Based on the FastMCP framework (hosted on GitHub), so clients can reach the PDF tools over the MCP protocol.
Requires Python 3.9+ to run the server and uses Python libraries such as PyMuPDF for PDF processing.
📄 MCP PDF Server
A PDF file reading server based on FastMCP.
Supports PDF text extraction, OCR, and image extraction via the MCP protocol, with a built-in web debugger for easy testing.
🚀 Features
read_pdf_text
Extracts normal text from a PDF (page by page).
read_by_ocr
Uses OCR to recognize text from scanned or image-based PDFs.
read_pdf_images
Extracts all images from a specified PDF page (Base64 encoded output).
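To illustrate what page-by-page extraction involves, here is a minimal sketch using PyMuPDF. It is not the server's actual implementation: the function names `clamp_page_range` and `extract_page_texts`, the 1-based inclusive page arguments, and the clamping behavior are all assumptions made for illustration.

```python
from typing import List, Tuple


def clamp_page_range(start_page: int, end_page: int, page_count: int) -> Tuple[int, int]:
    """Clamp a 1-based inclusive page range to the pages that exist (hypothetical helper)."""
    if page_count < 1:
        raise ValueError("document has no pages")
    start = max(1, start_page)
    end = min(page_count, end_page)
    if start > end:
        raise ValueError(f"empty page range: {start_page}-{end_page}")
    return start, end


def extract_page_texts(pdf_path: str, start_page: int, end_page: int) -> List[str]:
    """Return one text string per page in the requested range (illustrative sketch)."""
    import fitz  # PyMuPDF; imported lazily so the helper above works without it

    with fitz.open(pdf_path) as doc:
        start, end = clamp_page_range(start_page, end_page, doc.page_count)
        # PyMuPDF pages are 0-based, so shift the 1-based range down by one.
        return [doc[i].get_text() for i in range(start - 1, end)]
```

Returning one string per page mirrors the "list of page texts" result described for the text tool, though the real return shape may differ.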
📂 Project Structure
⚙️ Installation
Recommended Python version: 3.9+
Note: To use OCR features, you may need a MuPDF build with OCR support or external OCR libraries.
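One way to check the environment before calling the OCR tool is sketched below. It assumes Tesseract is the OCR backend, which is what PyMuPDF's built-in OCR relies on; this README does not say which backend the server uses, and `ocr_environment_report` is a hypothetical helper name.

```python
import os
import shutil
from typing import Dict, Mapping, Optional


def ocr_environment_report(env: Optional[Mapping[str, str]] = None) -> Dict[str, bool]:
    """Report whether common Tesseract prerequisites appear to be present."""
    env = os.environ if env is None else env
    tessdata = env.get("TESSDATA_PREFIX", "")
    return {
        # Is a `tesseract` executable reachable through PATH?
        "tesseract_on_path": shutil.which("tesseract", path=env.get("PATH")) is not None,
        # Does TESSDATA_PREFIX point at an existing directory (language data)?
        "tessdata_dir_exists": bool(tessdata) and os.path.isdir(tessdata),
    }
```

Calling `ocr_environment_report()` with no arguments inspects the current process environment. A `False` value is only a hint, since a server could also use a different OCR library.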
🔦 Start the Server
Run the following command:
You should see logs like:
🌐 Web Debugging Interface
Open your browser and visit:
Select a tool from the left panel
Fill in parameters on the right panel
Click "Run" to test the tool
No coding required: debug and test directly in the web UI.
🛠️ API Tool List
| Tool | Description | Input Parameters | Returns |
|---|---|---|---|
| read_pdf_text | Extracts normal text from PDF pages | …, …, … | List of page texts |
| read_by_ocr | Recognizes text via OCR | …, …, …, …, … | OCR extracted text |
| read_pdf_images | Extracts images from a PDF page | …, … | List of images (Base64 encoded) |
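Because the image tool returns Base64-encoded data, a client has to decode it before saving or displaying the images. The sketch below assumes the result is a plain list of Base64 strings and that the images are PNGs; the actual response shape and image format are not specified here, and `save_images` is a hypothetical helper.

```python
import base64
from pathlib import Path
from typing import List


def save_images(encoded_images: List[str], out_dir: str, ext: str = "png") -> List[Path]:
    """Decode Base64 strings and write each one to out_dir as image_<n>.<ext>."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for index, encoded in enumerate(encoded_images, start=1):
        # validate=True rejects characters outside the Base64 alphabet.
        data = base64.b64decode(encoded, validate=True)
        path = target / f"image_{index}.{ext}"
        path.write_bytes(data)
        written.append(path)
    return written
```

If the server wraps each image in a richer structure (for example with a format field), the decoding step stays the same and only the lookup of the encoded string changes.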
📝 Example Usage
Extract text from pages 1 to 5:
Perform OCR recognition on page 1:
Extract all images from page 3:
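Under the MCP protocol, a tool invocation is a JSON-RPC 2.0 `tools/call` request. The sketch below builds request payloads for the three examples above. The tool names come from the feature list, but the argument names (`file_path`, `start_page`, `end_page`, `page`) and the file name `sample.pdf` are hypothetical placeholders; check the parameters shown in the web debugger for the real ones.

```python
import json
from typing import Any, Dict


def tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Argument names below are assumptions, not the server's documented schema.
requests = [
    tool_call(1, "read_pdf_text", {"file_path": "sample.pdf", "start_page": 1, "end_page": 5}),
    tool_call(2, "read_by_ocr", {"file_path": "sample.pdf", "start_page": 1, "end_page": 1}),
    tool_call(3, "read_pdf_images", {"file_path": "sample.pdf", "page": 3}),
]
```

In practice an MCP client library usually builds these messages for you; the raw payloads are shown only to make the request structure visible.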
📢 Notes
Files must be placed inside the pdf_resources/ directory, or an absolute path must be provided.
OCR functionality requires appropriate OCR support in the environment.
When processing large files, adjust memory and timeout settings as needed.
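The path rule from the notes above can be sketched as follows. This illustrates the stated behavior and is not the server's code; `resolve_pdf_path` and its `base_dir` default are hypothetical.

```python
from pathlib import Path


def resolve_pdf_path(file_path: str, base_dir: str = "pdf_resources") -> Path:
    """Use absolute paths as given; resolve relative ones inside base_dir."""
    candidate = Path(file_path)
    if candidate.is_absolute():
        return candidate
    return Path(base_dir) / candidate
```

With this rule, `resolve_pdf_path("report.pdf")` points inside `pdf_resources/`, while an absolute path is returned unchanged.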
📜 License
This project is licensed under the MIT License.
For commercial use, please credit the original source.