How do I use PDF MCP Server?

1. Click "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it shows a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@PDF MCP Server summarize pages 1 to 5 of /Users/work/documents/proposal.pdf".

That's it! The server will respond to your query, and you can continue using it as needed.

PDF MCP Server

An MCP server that enables reading PDF file contents, allowing PDF documents to be used as a knowledge base for LLMs.

Features

High-Quality Extraction: Uses marker-pdf (via a Python backend) to extract text with layout awareness and high-fidelity LaTeX equation recognition.
Robust Fallback: Automatically switches to a Node.js-based parser (pdf-parse) if the Python environment is unavailable or fails, so text can still be extracted (albeit with lower formatting quality).
Smart Filtering: Supports page range extraction to process only relevant sections of large documents.

Installation

Prerequisites

Node.js (v18+)
Python (v3.10+) and pip (for high-quality extraction)

Setup

Install Node.js dependencies:
npm install
Install Python dependencies (Recommended): To enable high-quality extraction (especially for scientific papers with math), install the Python dependencies.
# Create or activate a virtual environment if desired
python3 -m pip install -r python/requirements.txt
Note: The first time you run the tool with the Python backend, it will download necessary AI models (OCR, layout analysis, etc.) to a local cache. This download is approximately 3.3GB. Ensure you have a stable internet connection.
Build the server:
npm run build

Usage

Configuration for Claude/MCP Clients

Add this to your MCP settings configuration:

{
  "mcpServers": {
    "pdf-reader": {
      "command": "node",
      "args": ["/absolute/path/to/mcpPdf/dist/index.js"],
      "env": {
        // Optional: Override where python is found if not in venv or path
        // "PYTHON_PATH": "/path/to/python"
      }
    }
  }
}
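
Strict JSON does not allow `//` comments, so a client that parses its settings file strictly may reject the commented example above. As a minimal sketch under that assumption, the same entry can be generated programmatically. `buildPdfReaderConfig` is a hypothetical helper, not part of this project; the `pdf-reader` key and `PYTHON_PATH` variable are taken from the configuration above.

```typescript
// Hypothetical helper (not part of this project): builds the settings entry
// as comment-free JSON for clients that parse their config strictly.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

function buildPdfReaderConfig(
  indexJsPath: string,
  pythonPath?: string,
): { mcpServers: Record<string, McpServerEntry> } {
  const entry: McpServerEntry = { command: "node", args: [indexJsPath] };
  // Only set PYTHON_PATH when an override is actually needed.
  if (pythonPath !== undefined) {
    entry.env = { PYTHON_PATH: pythonPath };
  }
  return { mcpServers: { "pdf-reader": entry } };
}

const config = buildPdfReaderConfig("/absolute/path/to/mcpPdf/dist/index.js");
console.log(JSON.stringify(config, null, 2));
```

Paste the printed output into your MCP settings file, replacing the placeholder path with the real absolute path to `dist/index.js`.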

Tool: `read_pdf`

Reads and extracts text content from a PDF file.

Inputs:

path (string): Absolute path to the PDF file.
start_page (number, optional): Starting page number (1-based).
end_page (number, optional): Ending page number (1-based).
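
To illustrate the constraints these inputs imply, here is a rough validation sketch. `validateReadPdfArgs` is a hypothetical function and the server's actual checks may differ; in particular, rejecting a `start_page` greater than `end_page` is an assumption, not documented behavior.

```typescript
// Hypothetical validation sketch; the server's real checks may differ.
interface ReadPdfArgs {
  path: string;
  start_page?: number;
  end_page?: number;
}

function validateReadPdfArgs(args: ReadPdfArgs): string[] {
  const errors: string[] = [];
  // Treat POSIX ("/...") and Windows drive ("C:\...") paths as absolute.
  if (!/^(\/|[A-Za-z]:[\\\/])/.test(args.path)) {
    errors.push("path must be absolute");
  }
  // Page numbers are 1-based, so 0 and negative values are invalid.
  for (const key of ["start_page", "end_page"] as const) {
    const value = args[key];
    if (value !== undefined && (!Number.isInteger(value) || value < 1)) {
      errors.push(`${key} must be an integer >= 1`);
    }
  }
  // Assumption: a reversed range is treated as an error.
  if (
    args.start_page !== undefined &&
    args.end_page !== undefined &&
    args.start_page > args.end_page
  ) {
    errors.push("start_page must not exceed end_page");
  }
  return errors;
}

console.log(
  validateReadPdfArgs({
    path: "/Users/work/documents/proposal.pdf",
    start_page: 1,
    end_page: 5,
  }),
);
console.log(validateReadPdfArgs({ path: "proposal.pdf", start_page: 0 }));
```

The first call returns an empty list; the second reports both the relative path and the out-of-range page number.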

How it works:

Attempt 1 (Python/Marker): The server tries to run the internal convert.py script.
- If successfully configured, this loads the marker models from the local cache (.cache directory in the project).
- It accurately converts equations to LaTeX and preserves document structure.
Attempt 2 (Fallback): If the Python script fails (e.g., missing dependencies, runtime error), the server catches the error and uses pdf-parse (a native Node.js library).
- This extracts raw text. Equations may appear as linearized text, and layout may be less preserved.
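
The two-attempt flow above can be sketched roughly as follows. This is not the server's actual code: `extractWithFallback` and the `Extractor` type are hypothetical names, and both backends are passed in as functions so the example runs without marker-pdf or pdf-parse installed.

```typescript
// Sketch of the try-primary-then-fallback flow. Both extractors are injected
// so the example runs without marker-pdf or pdf-parse installed.
type Extractor = (path: string) => Promise<string>;

interface ExtractionResult {
  text: string;
  backend: "python-marker" | "pdf-parse";
}

async function extractWithFallback(
  path: string,
  primary: Extractor,
  fallback: Extractor,
): Promise<ExtractionResult> {
  try {
    return { text: await primary(path), backend: "python-marker" };
  } catch (err) {
    // Any failure (missing dependencies, runtime error) triggers the fallback.
    console.error(`Primary extraction failed, falling back: ${String(err)}`);
    return { text: await fallback(path), backend: "pdf-parse" };
  }
}

// Stand-ins for the real backends (hypothetical).
const failingMarker: Extractor = async () => {
  throw new Error("python not found");
};
const plainText: Extractor = async (p) => `raw text of ${p}`;

const demo = extractWithFallback("/tmp/paper.pdf", failingMarker, plainText);
demo.then((r) => console.log(r.backend, r.text));
```

Returning the backend name alongside the text lets a caller tell which path produced the result, which helps when equations come back linearized rather than as LaTeX.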

Troubleshooting

Permission Errors: The project is configured to use a local .cache directory for models to avoid system permission issues. If you encounter errors, ensure the project directory is writable.
Slow Performance: The high-quality extraction uses deep learning models. It can be slow on large documents without a GPU. Use the start_page and end_page arguments to extract only what you need.
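
For large documents, one way to apply this advice is to read the file in fixed-size page windows. The sketch below assumes you already know the total page count and that `end_page` is inclusive; neither is confirmed by this README, and `pageRanges` is a hypothetical helper.

```typescript
// Hypothetical helper: split a document into 1-based page windows that can be
// passed as start_page / end_page, one read_pdf call per window.
interface PageRange {
  start_page: number;
  end_page: number;
}

function pageRanges(totalPages: number, chunkSize: number): PageRange[] {
  if (!Number.isInteger(totalPages) || totalPages < 0) {
    throw new RangeError("totalPages must be a non-negative integer");
  }
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be an integer >= 1");
  }
  const ranges: PageRange[] = [];
  for (let start = 1; start <= totalPages; start += chunkSize) {
    // Assumes end_page is inclusive; the last window is clamped to totalPages.
    ranges.push({
      start_page: start,
      end_page: Math.min(start + chunkSize - 1, totalPages),
    });
  }
  return ranges;
}

console.log(pageRanges(23, 10));
```

For a 23-page document with a window of 10, this yields pages 1 to 10, 11 to 20, and 21 to 23.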
