Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@PDFtotext MCP Server extract text from quarterly_report.pdf"
That's it! The server will respond to your query, and you can continue using it as needed.
PDFtotext MCP Server
A reliable Model Context Protocol (MCP) server for PDF text extraction using the proven pdftotext utility from poppler-utils.
Why This Server?
Unlike other PDF MCP servers that suffer from logging interference, complex dependencies, and reliability issues, pdftotext-mcp is:
Actually works - Clean JSON-RPC communication without stdout pollution
Reliable - Built on mature pdftotext from poppler-utils (used by millions)
Lightweight - Minimal dependencies, maximum compatibility
Production tested - Successfully tested with Claude Desktop and other MCP clients
Feature complete - Page-specific extraction, layout preservation, encoding options
Error handling - Comprehensive validation and helpful error messages
Features
Extract text from entire PDF documents or specific pages
Preserve original layout formatting (optional)
Multiple text encoding support (UTF-8, Latin1, ASCII)
Comprehensive metadata in responses (word count, file info, etc.)
File validation and security checks
Fast processing with configurable timeouts
Detailed error reporting with troubleshooting hints
Prerequisites
You must have pdftotext installed on your system:
Ubuntu/Debian
sudo apt update
sudo apt install poppler-utils
macOS
brew install poppler
Windows
# Using Chocolatey
choco install poppler
# Using Scoop
scoop install poppler
Verify Installation
pdftotext -v
Installation
Option 1: Global Installation (Recommended)
npm install -g pdftotext-mcp
Option 2: Use with npx (No Installation)
npx pdftotext-mcp
Option 3: Local Development
git clone https://github.com/jpwebb/pdftotext-mcp.git
cd pdftotext-mcp
npm install
npm start
Configuration
Add to your MCP client configuration:
Claude Desktop
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "pdftotext": {
      "command": "pdftotext-mcp"
    }
  }
}
Or with npx:
{
  "mcpServers": {
    "pdftotext": {
      "command": "npx",
      "args": ["pdftotext-mcp"]
    }
  }
}
Other MCP Clients
The server works with any MCP-compatible client. Use pdftotext-mcp as the command.
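If the client's configuration file already lists other servers, the entry has to be merged rather than pasted over the whole file. The following is a minimal Python sketch of that merge, assuming the configuration has already been loaded into a dictionary; the helper name `add_pdftotext_server` is hypothetical and not part of pdftotext-mcp.

```python
import json


def add_pdftotext_server(config: dict, use_npx: bool = False) -> dict:
    """Return a copy of config with a "pdftotext" entry under "mcpServers".

    Hypothetical helper for illustration; it mirrors the two JSON snippets above.
    """
    # Round-trip through JSON to get a deep copy, so the caller's dict is untouched.
    updated = json.loads(json.dumps(config))
    servers = updated.setdefault("mcpServers", {})
    if use_npx:
        servers["pdftotext"] = {"command": "npx", "args": ["pdftotext-mcp"]}
    else:
        servers["pdftotext"] = {"command": "pdftotext-mcp"}
    return updated


# Example: an existing config with one unrelated (made-up) server entry.
existing = {"mcpServers": {"other": {"command": "other-server"}}}
print(json.dumps(add_pdftotext_server(existing), indent=2))
```

Existing entries are preserved; only the `pdftotext` key is added or overwritten.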
Usage
The server provides a single, powerful tool: read_pdf_text
Basic Usage
Extract entire document
{
  "tool": "read_pdf_text",
  "arguments": {
    "path": "./document.pdf"
  }
}
Extract specific page
{
  "tool": "read_pdf_text",
  "arguments": {
    "path": "./document.pdf",
    "page": 2
  }
}
Preserve layout formatting
{
  "tool": "read_pdf_text",
  "arguments": {
    "path": "./document.pdf",
    "layout": true
  }
}
Custom encoding
{
  "tool": "read_pdf_text",
  "arguments": {
    "path": "./document.pdf",
    "encoding": "Latin1"
  }
}
Response Format
Success Response
{
  "success": true,
  "file": "document.pdf",
  "path": "/absolute/path/to/document.pdf",
  "extractedText": "Full text content...",
  "pageSpecific": "all",
  "layoutPreserved": false,
  "encoding": "UTF-8",
  "fileSize": 1048576,
  "lastModified": "2024-01-15T10:30:00.000Z",
  "extractedAt": "2024-01-15T10:35:00.000Z",
  "textLength": 5234,
  "wordCount": 892
}
Error Response
{
  "success": false,
  "error": "File not found: ./nonexistent.pdf",
  "errorType": "FILE_NOT_FOUND",
  "file": "./nonexistent.pdf",
  "timestamp": "2024-01-15T10:35:00.000Z"
}
API Reference
Tool: read_pdf_text
Extracts text content from PDF files using pdftotext.
Parameters
| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| path | string | Yes | - | Path to PDF file (relative or absolute) |
| page | number | No | all pages | Specific page to extract (1-based) |
| layout | boolean | No | false | Preserve original text layout |
| encoding | string | No | UTF-8 | Output text encoding |
Supported Encodings
UTF-8 (default)
Latin1
ASCII
Error Types
FILE_NOT_FOUND - PDF file doesn't exist
PERMISSION_DENIED - Cannot read the file
INVALID_PDF - File is not a valid PDF
PDFTOTEXT_ERROR - pdftotext utility error
UNKNOWN_ERROR - Unexpected error
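A caller can branch on `success` and `errorType` to turn a response into a short, actionable message. This is a rough sketch, assuming the response has already been parsed into a dictionary with the fields shown under Response Format; how your MCP client delivers that payload is not covered here, and `summarize_response` and the hint texts are illustrative only.

```python
# Hints paraphrased from the Troubleshooting section; wording is illustrative.
HINTS = {
    "FILE_NOT_FOUND": "Use an absolute path and check that the file exists.",
    "PERMISSION_DENIED": "Check read permissions on the file and its directory.",
    "INVALID_PDF": "Verify the file is really a PDF and is not corrupted.",
    "PDFTOTEXT_ERROR": "Confirm that poppler-utils is installed and pdftotext runs.",
}


def summarize_response(response: dict) -> str:
    """Return a one-line summary of a read_pdf_text response dict."""
    if response.get("success"):
        return (
            f'{response.get("file")}: {response.get("wordCount")} words, '
            f'encoding {response.get("encoding")}'
        )
    # Fall back to UNKNOWN_ERROR when the server gives no errorType.
    error_type = response.get("errorType", "UNKNOWN_ERROR")
    hint = HINTS.get(error_type, "Check the MCP client logs for details.")
    return f'{error_type}: {response.get("error")} ({hint})'
```

Applied to the sample error response, this yields a line starting with `FILE_NOT_FOUND: File not found: ./nonexistent.pdf`.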
Troubleshooting
"pdftotext is not available"
Solution: Install poppler-utils (see Prerequisites)
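To check from a script whether pdftotext is reachable, something like the following can be used. It is only a sketch: `pdftotext_version` is a hypothetical helper, and it inspects the PATH of the current process, which may differ from the PATH the MCP client gives the server.

```python
import shutil
import subprocess


def pdftotext_version():
    """Return the first line printed by `pdftotext -v`, or None if it is not on PATH."""
    exe = shutil.which("pdftotext")
    if exe is None:
        return None
    result = subprocess.run([exe, "-v"], capture_output=True, text=True)
    # The version banner may land on stderr or stdout depending on the build,
    # so accept either stream.
    output = (result.stderr or result.stdout).strip()
    return output.splitlines()[0] if output else exe
```

A return value of `None` corresponds to the "pdftotext is not available" case.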
"File not found"
Solutions:
Use absolute paths: /home/user/document.pdf
Check file exists: ls -la /path/to/file.pdf
Verify MCP server working directory
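Relative paths are resolved against the server's working directory, which is often not the directory you expect. One way to sidestep this is to resolve the path on the caller's side before passing it as `path`. A minimal sketch, with the hypothetical helper `resolve_pdf_path`:

```python
from pathlib import Path
from typing import Optional


def resolve_pdf_path(path: str, base_dir: Optional[str] = None) -> str:
    """Resolve path to an absolute path and confirm that the file exists.

    Relative paths are interpreted against base_dir, or the current working
    directory when base_dir is not given.
    """
    base = Path(base_dir) if base_dir else Path.cwd()
    candidate = Path(path).expanduser()
    if not candidate.is_absolute():
        candidate = base / candidate
    resolved = candidate.resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"No such file: {resolved}")
    return str(resolved)
```

The returned string can be used directly as the `path` argument of `read_pdf_text`.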
"Permission denied"
Solutions:
Check file permissions: chmod 644 document.pdf
Ensure directory is readable: chmod 755 /path/to/directory/
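Before changing permissions, it can be useful to see which check actually fails. The sketch below reports readability problems for a file and its containing directory. It tests access for the user running the script, which is an assumption worth noting: the MCP server may run as a different user. `check_readable` is a hypothetical name.

```python
import os


def check_readable(path: str) -> list:
    """Return a list of human-readable access problems (empty if none found)."""
    problems = []
    directory = os.path.dirname(os.path.abspath(path))
    if not os.path.exists(path):
        problems.append(f"{path} does not exist")
    elif not os.access(path, os.R_OK):
        problems.append(f"{path} is not readable by the current user")
    # A directory needs both read and execute bits to be listed and entered.
    if os.path.isdir(directory) and not os.access(directory, os.R_OK | os.X_OK):
        problems.append(f"{directory} cannot be listed or entered by the current user")
    return problems
```

An empty list means neither the file nor its directory blocked access for the current user.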
"File is not a valid PDF"
Solutions:
Verify file is actually a PDF: file document.pdf
Check for file corruption
Try with a different PDF file
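Where the `file` command is unavailable (for example on Windows), a rough equivalent is to look for the `%PDF-` signature at the start of the file. This sketch only inspects the header; it says nothing about corruption further into the file, and `looks_like_pdf` is a hypothetical helper.

```python
def looks_like_pdf(path: str) -> bool:
    """Return True if the file starts with the PDF signature "%PDF-"."""
    with open(path, "rb") as handle:
        header = handle.read(5)
    return header == b"%PDF-"
```

A `False` result points to a file that was renamed to `.pdf` or truncated at the start.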
MCP Connection Issues
Solutions:
Restart your MCP client completely
Check configuration syntax in config file
Verify pdftotext-mcp is accessible in PATH
Check MCP client logs for detailed errors
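Configuration syntax errors such as a trailing comma are easy to miss by eye. The following sketch parses the configuration text and reports the first problem it finds. It assumes the server is registered under the key `pdftotext`, as in the Configuration examples; any other key name is equally valid, so adjust the check accordingly. `check_mcp_config` is a hypothetical helper.

```python
import json


def check_mcp_config(text: str) -> list:
    """Return a list of problems found in an MCP client configuration string."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"Invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]

    servers = config.get("mcpServers") if isinstance(config, dict) else None
    if not isinstance(servers, dict):
        return ['Top-level "mcpServers" object is missing']

    problems = []
    # Assumption: the entry is named "pdftotext", as in the examples above.
    entry = servers.get("pdftotext")
    if not isinstance(entry, dict):
        problems.append('No "pdftotext" entry under "mcpServers"')
    elif not isinstance(entry.get("command"), str):
        problems.append('"pdftotext" entry has no "command" string')
    return problems
```

Pass it the contents of the configuration file as a string; an empty list means the structure matches the examples above.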
Testing
# Run tests
npm test
# Run tests with watch mode
npm run test:watch
# Run linter
npm run lint
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Development Setup
git clone https://github.com/jpwebb/pdftotext-mcp.git
cd pdftotext-mcp
npm install
Running Locally
npm start
Code Style
This project uses ESLint. Run npm run lint to check code style.
License
MIT - see LICENSE file for details.
Acknowledgments
Built for the Model Context Protocol ecosystem
Uses the poppler-utils pdftotext utility
Inspired by the need for reliable PDF processing in MCP environments
Related
Made for the MCP community