PDFtotext MCP Server
A reliable Model Context Protocol (MCP) server for PDF text extraction using the proven pdftotext utility from poppler-utils.
Why This Server?
Unlike other PDF MCP servers that suffer from logging interference, complex dependencies, and reliability issues, pdftotext-mcp is:
Actually works - Clean JSON-RPC communication without stdout pollution
Reliable - Built on mature pdftotext from poppler-utils (used by millions)
Lightweight - Minimal dependencies, maximum compatibility
Production tested - Successfully tested with Claude Desktop and other MCP clients
Feature complete - Page-specific extraction, layout preservation, encoding options
Error handling - Comprehensive validation and helpful error messages
Features
Extract text from entire PDF documents or specific pages
Preserve original layout formatting (optional)
Multiple text encoding support (UTF-8, Latin1, ASCII)
Comprehensive metadata in responses (word count, file info, etc.)
File validation and security checks
Fast processing with configurable timeouts
Detailed error reporting with troubleshooting hints
Prerequisites
You must have pdftotext installed on your system:
Ubuntu/Debian
sudo apt update
sudo apt install poppler-utils
macOS
brew install poppler
Windows
# Using Chocolatey
choco install poppler
# Using Scoop
scoop install poppler
Verify Installation
pdftotext -v
Installation
Option 1: Global Installation (Recommended)
npm install -g pdftotext-mcp
Option 2: Use with npx (No Installation)
npx pdftotext-mcp
Option 3: Local Development
git clone https://github.com/jpwebb/pdftotext-mcp.git
cd pdftotext-mcp
npm install
npm start
Configuration
Add to your MCP client configuration:
Claude Desktop
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "pdftotext": {
      "command": "pdftotext-mcp"
    }
  }
}
Or with npx:
{
  "mcpServers": {
    "pdftotext": {
      "command": "npx",
      "args": ["pdftotext-mcp"]
    }
  }
}
Other MCP Clients
The server works with any MCP-compatible client. Use pdftotext-mcp as the command.
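If you manage client configuration from a script, the entries shown in this section can be merged into an existing config without touching other servers. The following is a minimal sketch; the function name `add_pdftotext_server` and the `use_npx` flag are hypothetical helpers for illustration, not part of pdftotext-mcp.

```python
import json


def add_pdftotext_server(config: dict, use_npx: bool = False) -> dict:
    """Return a copy of an MCP client config with a "pdftotext" entry added.

    Hypothetical helper: it only mirrors the two JSON snippets from this README.
    """
    # Deep-copy through JSON so the caller's dict is left untouched.
    updated = json.loads(json.dumps(config))
    servers = updated.setdefault("mcpServers", {})
    if use_npx:
        servers["pdftotext"] = {"command": "npx", "args": ["pdftotext-mcp"]}
    else:
        servers["pdftotext"] = {"command": "pdftotext-mcp"}
    return updated


# Example: keep an existing (made-up) server entry and add pdftotext next to it.
existing = {"mcpServers": {"other": {"command": "other-server"}}}
print(json.dumps(add_pdftotext_server(existing, use_npx=True), indent=2))
```

Writing the result back to claude_desktop_config.json is left out on purpose, since the file's location depends on your platform and client.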
Usage
The server provides a single, powerful tool: read_pdf_text
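MCP clients normally build the underlying request for you. For readers curious about what goes over the wire, here is a rough sketch of a JSON-RPC 2.0 `tools/call` message for this tool, assuming the standard MCP request shape. The helper name `build_tool_call` and the ID counter are illustrative only.

```python
import json
from itertools import count

# Simple incrementing request IDs; any unique value per request would do.
_request_ids = count(1)


def build_tool_call(arguments: dict) -> str:
    """Serialize a tools/call request for read_pdf_text (illustrative helper)."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": "read_pdf_text", "arguments": arguments},
    }
    return json.dumps(request)


print(build_tool_call({"path": "./document.pdf"}))
```

The `arguments` object is where the tool's parameters (`path`, `page`, `layout`, `encoding`) go.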
Basic Usage
Extract entire document
{
  "tool": "read_pdf_text",
  "arguments": {
    "path": "./document.pdf"
  }
}
Extract specific page
{
  "tool": "read_pdf_text",
  "arguments": {
    "path": "./document.pdf",
    "page": 2
  }
}
Preserve layout formatting
{
  "tool": "read_pdf_text",
  "arguments": {
    "path": "./document.pdf",
    "layout": true
  }
}
Custom encoding
{
  "tool": "read_pdf_text",
  "arguments": {
    "path": "./document.pdf",
    "encoding": "Latin1"
  }
}
Response Format
Success Response
{
  "success": true,
  "file": "document.pdf",
  "path": "/absolute/path/to/document.pdf",
  "extractedText": "Full text content...",
  "pageSpecific": "all",
  "layoutPreserved": false,
  "encoding": "UTF-8",
  "fileSize": 1048576,
  "lastModified": "2024-01-15T10:30:00.000Z",
  "extractedAt": "2024-01-15T10:35:00.000Z",
  "textLength": 5234,
  "wordCount": 892
}
Error Response
{
  "success": false,
  "error": "File not found: ./nonexistent.pdf",
  "errorType": "FILE_NOT_FOUND",
  "file": "./nonexistent.pdf",
  "timestamp": "2024-01-15T10:35:00.000Z"
}
API Reference
Tool: read_pdf_text
Extracts text content from PDF files using pdftotext.
Parameters
| Parameter | Type | Required | Default | Description |
| path | string | Yes | - | Path to PDF file (relative or absolute) |
| page | number | No | all pages | Specific page to extract (1-based) |
| layout | boolean | No | false | Preserve original text layout |
| encoding | string | No | UTF-8 | Output text encoding |
Supported Encodings
UTF-8 (default)
Latin1
ASCII
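To make the parameter rules concrete, the sketch below validates the four arguments and turns them into a pdftotext command line. It is an illustration of how such a mapping could look, not the server's actual implementation. The function name is hypothetical, and the ASCII to ASCII7 mapping is an assumption based on the encoding names poppler usually reports via `pdftotext -listenc`.

```python
# Assumed mapping from this README's encoding names to pdftotext -enc names.
ENCODING_NAMES = {"UTF-8": "UTF-8", "Latin1": "Latin1", "ASCII": "ASCII7"}


def build_pdftotext_command(path, page=None, layout=False, encoding="UTF-8"):
    """Validate read_pdf_text-style arguments and return a pdftotext argv list."""
    if not isinstance(path, str) or not path:
        raise ValueError("path must be a non-empty string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if page is not None and (
        isinstance(page, bool) or not isinstance(page, int) or page < 1
    ):
        raise ValueError("page must be an integer >= 1 (pages are 1-based)")
    if encoding not in ENCODING_NAMES:
        raise ValueError(f"encoding must be one of {sorted(ENCODING_NAMES)}")

    command = ["pdftotext", "-enc", ENCODING_NAMES[encoding]]
    if page is not None:
        # -f and -l set the first and last page; equal values select one page.
        command += ["-f", str(page), "-l", str(page)]
    if layout:
        command.append("-layout")
    # A trailing "-" asks pdftotext to write the text to stdout.
    command += [path, "-"]
    return command


print(build_pdftotext_command("./document.pdf", page=2, layout=True))
```

Returning a list rather than a shell string means the path never passes through a shell, which sidesteps quoting problems with spaces or special characters in file names.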
Error Types
FILE_NOT_FOUND - PDF file doesn't exist
PERMISSION_DENIED - Cannot read the file
INVALID_PDF - File is not a valid PDF
PDFTOTEXT_ERROR - pdftotext utility error
UNKNOWN_ERROR - Unexpected error
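On the client side, the `success` flag and `errorType` field make it easy to branch on the outcome. Here is a small sketch, assuming you already hold the tool's JSON payload as a string. The exception class, the hint texts, and the function name are all illustrative and not part of the server.

```python
import json

# Illustrative hints keyed by the error types listed in this README.
HINTS = {
    "FILE_NOT_FOUND": "Check the path, or pass an absolute path.",
    "PERMISSION_DENIED": "Check read permissions on the file and its directory.",
    "INVALID_PDF": "Confirm the file is a real, uncorrupted PDF.",
    "PDFTOTEXT_ERROR": "Confirm that pdftotext runs from a terminal.",
    "UNKNOWN_ERROR": "Check the MCP client logs for details.",
}


class PdfExtractionError(Exception):
    """Raised when the response reports success == false."""

    def __init__(self, error_type, message):
        super().__init__(f"{error_type}: {message}")
        self.error_type = error_type
        self.hint = HINTS.get(error_type, HINTS["UNKNOWN_ERROR"])


def extract_text(response_json: str) -> str:
    """Return extractedText from a success response, or raise on an error one."""
    response = json.loads(response_json)
    if response.get("success"):
        return response["extractedText"]
    raise PdfExtractionError(
        response.get("errorType", "UNKNOWN_ERROR"),
        response.get("error", "no error message provided"),
    )


error_payload = json.dumps({
    "success": False,
    "error": "File not found: ./nonexistent.pdf",
    "errorType": "FILE_NOT_FOUND",
})
try:
    extract_text(error_payload)
except PdfExtractionError as exc:
    print(exc, "|", exc.hint)
```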
Troubleshooting
"pdftotext is not available"
Solution: Install poppler-utils (see Prerequisites)
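To check availability from a script rather than a terminal, something like the following sketch may help. It looks for pdftotext on PATH and, if found, runs `pdftotext -v`. The function names are hypothetical, and the code reads both output streams because the version banner may be printed on stderr.

```python
import shutil
import subprocess


def find_pdftotext():
    """Return the full path to pdftotext, or None if it is not on PATH."""
    return shutil.which("pdftotext")


def pdftotext_version():
    """Return the text printed by `pdftotext -v`, or None if it is not installed."""
    executable = find_pdftotext()
    if executable is None:
        return None
    result = subprocess.run(
        [executable, "-v"], capture_output=True, text=True, timeout=10
    )
    # Fall back to stdout in case a build prints the banner there instead.
    return (result.stderr or result.stdout).strip()


if find_pdftotext() is None:
    print("pdftotext not found on PATH; install poppler-utils first")
else:
    print(pdftotext_version())
```

Keep in mind that the MCP client may launch the server with a different PATH than your shell, so a check that passes in a terminal does not guarantee the server sees the same binary.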
"File not found"
Solutions:
Use absolute paths: /home/user/document.pdf
Check file exists: ls -la /path/to/file.pdf
Verify MCP server working directory
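Since relative paths are resolved against the server's working directory, which may differ from yours, one option is to resolve the path on the client side before sending it. This is a minimal sketch; `resolve_pdf_path` and its `working_dir` parameter are hypothetical names used for illustration.

```python
from pathlib import Path


def resolve_pdf_path(path: str, working_dir: str = ".") -> Path:
    """Turn a possibly relative path into an absolute one and check it exists."""
    candidate = Path(path).expanduser()
    if not candidate.is_absolute():
        # Interpret relative paths against the directory the caller means.
        candidate = Path(working_dir) / candidate
    candidate = candidate.resolve()
    if not candidate.is_file():
        raise FileNotFoundError(f"File not found: {candidate}")
    return candidate


try:
    print(resolve_pdf_path("document.pdf"))
except FileNotFoundError as exc:
    print(exc)
```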
"Permission denied"
Solutions:
Check file permissions: chmod 644 document.pdf
Ensure directory is readable: chmod 755 /path/to/directory/
"File is not a valid PDF"
Solutions:
Verify file is actually a PDF: file document.pdf
Check for file corruption
Try with a different PDF file
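If the `file` command is not available (for example on Windows), a quick header check gives a similar signal. PDF files start with the marker `%PDF-`, so the sketch below reads the first bytes and looks for it. The function name is hypothetical, and this is only a cheap sanity check, not the validation the server itself performs.

```python
def looks_like_pdf(path: str) -> bool:
    """Return True if the file starts with the PDF header marker."""
    with open(path, "rb") as handle:
        header = handle.read(5)
    # A matching header does not prove the rest of the file is intact.
    return header == b"%PDF-"
```

A file that passes this check can still be truncated or corrupted further in, so trying a different PDF remains a useful step.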
MCP Connection Issues
Solutions:
Restart your MCP client completely
Check configuration syntax in config file
Verify pdftotext-mcp is accessible in PATH
Check MCP client logs for detailed errors
Testing
# Run tests
npm test
# Run tests with watch mode
npm run test:watch
# Run linter
npm run lint
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Development Setup
git clone https://github.com/jpwebb/pdftotext-mcp.git
cd pdftotext-mcp
npm install
Running Locally
npm start
Code Style
This project uses ESLint. Run npm run lint to check code style.
License
MIT - see LICENSE file for details.
Acknowledgments
Built for the Model Context Protocol ecosystem
Uses poppler-utils pdftotext utility
Inspired by the need for reliable PDF processing in MCP environments
Related
Made for the MCP community