PDF Reader MCP Server
The PDF Reader MCP Server allows AI agents to securely read and extract data from PDF files.
Capabilities:
Extract full text content from PDFs
Extract text from specific pages or page ranges
Retrieve PDF metadata (author, title, creation date, etc.)
Get the total page count of a PDF
Process multiple PDF sources (local paths or URLs) in a single request
Operate securely within the defined project root directory
Provide structured JSON output for easy parsing by AI agents
Install and run via npm (npx) or Docker
Integrates with Codecov for code coverage reporting, as indicated by the badge showing coverage statistics for the project.
Provides Docker container deployment option, allowing users to run the PDF reader MCP server in an isolated environment with project directory mounting.
Integrates with GitHub for CI/CD pipeline execution, issue tracking, and repository management for the PDF reader MCP server.
Publishes to npm registry allowing installation via npm, with version tracking displayed through npm badge.
Future plans include PWA support for the documentation site, enabling offline access and mobile optimization.
Uses Vitest for performance benchmarking, measuring operations per second for various PDF processing scenarios.
Leverages Zod for input validation, ensuring that requests to the PDF reader MCP server are properly formatted and validated.
📄 @sylphx/pdf-reader-mcp
The PDF intelligence layer for AI agents that need source evidence, not just extracted text.
V3 smart tool surface · Agent Document Twin · Evidence-first extraction · Visual crops · OCR adapters · Tables, charts, formulas, figures · Trust & accessibility reports · Benchmark-gated releases
PDFs are not plain text files. They are layout, pixels, tables, hidden text, permissions, annotations, scanned pages, and ambiguous reading order.
PDF Reader MCP turns that mess into an Agent Document Twin: a linked, source-backed representation of the PDF that agents can inspect, search, verify, crop, OCR, enrich, cite, and read with confidence.
If your agent has ever hallucinated from a PDF, lost a table, trusted hidden text, missed a scanned page, or needed to cite the exact region that proves an answer, this is the MCP server for that workflow.
Why Agents Use It
Need | What PDF Reader MCP gives you |
Read the document | Markdown, JSON, HTML, page text, metadata, chunks, and semantic AST. |
Prove the answer | Page numbers, bounding boxes, evidence IDs, region crops, and source renders. |
Handle scanned PDFs | Rendered pages routed through configured OCR providers with word boxes and provenance. |
Recover tables | Selectable-text and OCR-derived tables with cells, geometry, confidence, warnings, and continuation hints. |
See what text extraction misses | Visual page evidence, focused crops, and configured visual-region provider adapters. |
Protect the agent | Trust reports for hidden text, prompt-injection-like content, visual spoofing, unsafe links, and redaction. |
Route accessibility work | Tagged-PDF coverage, tag-visible coverage, headings, images, forms, links, permissions, and page grades. |
Ship with proof | CI, package smoke, deterministic quality benchmarks, provider artifacts, and release gates. |
Quick Start
Claude Code
claude mcp add pdf-reader -- npx @sylphx/pdf-reader-mcp
Claude Desktop
Add this to claude_desktop_config.json:
{
"mcpServers": {
"pdf-reader": {
"command": "npx",
"args": ["@sylphx/pdf-reader-mcp"]
}
}
}
Any MCP Client
npx @sylphx/pdf-reader-mcp
Node.js >=22.13 is required. The default package works without downloading
OCR models, vision models, Ollama, LM Studio, llama.cpp, or cloud credentials.
Need Cursor, VS Code, Windsurf, Cline, Warp, HTTP transport, Docker, or filesystem sandboxing? See the installation guide.
One Smart Tool First
The default V3 agent path is one tool call:
{
"sources": [{ "path": "/absolute/path/to/report.pdf" }]
}
With no manual include_* flags, read_pdf profiles each PDF, chooses the
extraction route, and returns the Agent Document Twin in one response. Digital
text PDFs get Markdown, chunks, tables, layout routing, and source evidence.
Mixed or scanned PDFs are routed toward configured OCR and visual providers
when those providers are ready. Metadata, page geometry, warnings, provider
readiness, and the selected read_pdf arguments are included so the agent can
see what happened.
Agents can still force auto: false and use explicit include_* options for a
precise manual extraction. Use auto_detail: "fast", "balanced", or
"full" when the agent wants to control output depth without learning dozens
of switches.
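As a rough illustration of the default path, the sketch below assembles read_pdf arguments in TypeScript. Only the field names sources, path, auto, and auto_detail come from the text above; the helper buildReadPdfArgs and its types are hypothetical and not part of the package.

```typescript
// Hypothetical helper, not part of @sylphx/pdf-reader-mcp.
// Only `sources`, `path`, `auto`, and `auto_detail` are taken from the README.
type AutoDetail = "fast" | "balanced" | "full";

interface ReadPdfSource {
  path: string;
}

interface ReadPdfArgs {
  sources: ReadPdfSource[];
  auto?: boolean;
  auto_detail?: AutoDetail;
}

function buildReadPdfArgs(paths: string[], detail?: AutoDetail): ReadPdfArgs {
  if (paths.length === 0) {
    throw new Error("at least one PDF path is required");
  }
  const args: ReadPdfArgs = { sources: paths.map((path) => ({ path })) };
  // Leave auto_detail unset unless the caller asks for a specific depth,
  // so whatever default the server applies stays in effect.
  if (detail !== undefined) {
    args.auto_detail = detail;
  }
  return args;
}

const request = buildReadPdfArgs(["/absolute/path/to/report.pdf"], "fast");
console.log(JSON.stringify(request, null, 2));
```

Keeping the optional fields out of the payload unless they are needed mirrors the one-call default described above.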
MCP Tool Surface
Tool | Use it when the agent needs to... |
| Use first. With only |
| Search selectable text and optional OCR text with snippets, offsets, boxes, and provenance. |
| One focused evidence tool for |
Full request and response details live in the API reference.
Agent Document Twin
The Agent Document Twin is the main reason to use this project instead of a plain text extractor. It keeps the document readable by agents while preserving the evidence needed to verify the answer.
Layer | Output |
Lossless PDF layer | Text runs, lines, words, characters, fonts, transforms, page geometry, metadata coverage, outlines, forms, attachments, annotations, permissions, and structure signals where available. |
Visual layer | Page renders, region crops, crop provenance, visual candidates, OCR source renders, and provider-normalized visual evidence. |
Semantic layer | Page, section, paragraph, list, caption, header, footer, table, image, chart, formula, figure, and diagram nodes where available. |
Evidence layer | Stable IDs, page ranges, bounding boxes, crop IDs, confidence, warnings, and extraction method provenance. |
Agent layer | Markdown, JSON, HTML, citation chunks, routing plans, trust report, accessibility report, and document map indexes. |
Example: Read With Evidence
{
"sources": [{ "path": "/absolute/path/to/report.pdf" }],
"include_markdown": true,
"include_chunks": true,
"include_tables": true,
"include_text_layer": true,
"include_document_map": true,
"include_document_ast": true,
"include_trust_report": true,
"include_accessibility_report": true
}
Example: Search, Then Verify The Source Region
{
"sources": [{ "path": "/absolute/path/to/report.pdf" }],
"query": "revenue recognition",
"max_matches_per_source": 10
}
Use the returned page and bounding box with the pdf_evidence operations
render_page or extract_regions when the agent needs visual proof before
citing or summarizing.
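The search-then-verify handoff can be sketched as follows. Every field name here (page, bbox, operation, regions) and the helper names are assumptions made for illustration; the actual request and response shapes are defined in the API reference. The sketch only shows the general idea of padding a matched box slightly before asking for a crop, so the surrounding context stays visible.

```typescript
// Illustrative shapes only: these field names are assumptions,
// not the server's documented schema.
interface BoundingBox {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface SearchMatch {
  page: number;
  bbox: BoundingBox;
}

interface EvidenceRequest {
  sources: { path: string }[];
  operation: "render_page" | "extract_regions";
  page: number;
  regions: BoundingBox[];
}

// Grow the box by `margin` on every side, clamping the origin at zero.
function padBox(box: BoundingBox, margin: number): BoundingBox {
  const x = Math.max(0, box.x - margin);
  const y = Math.max(0, box.y - margin);
  return {
    x,
    y,
    width: box.width + (box.x - x) + margin,
    height: box.height + (box.y - y) + margin,
  };
}

function toEvidenceRequest(path: string, match: SearchMatch, margin = 0): EvidenceRequest {
  return {
    sources: [{ path }],
    operation: "extract_regions",
    page: match.page,
    regions: [padBox(match.bbox, margin)],
  };
}

const match: SearchMatch = { page: 3, bbox: { x: 5, y: 20, width: 100, height: 10 } };
console.log(JSON.stringify(toEvidenceRequest("/absolute/path/to/report.pdf", match, 10)));
```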
Provider-Enabled Intelligence
The default package stays TypeScript-first and local-first. Heavy engines are optional, deployment-controlled adapters.
Capability | Default behavior | Enable with |
Selectable-text PDFs | Works out of the box | No extra dependency |
Rendering and crops | Works out of the box | No extra dependency |
Trust and accessibility reports | Works out of the box | No extra dependency |
OCR for scanned pages | Provider-ready |
|
Visual table/chart/formula/figure/image enrichment | Provider-ready |
|
Supported visual provider paths include local commands, local HTTP servers, Ollama, OpenAI-compatible endpoints, LM Studio, and llama.cpp. Request payloads cannot choose arbitrary executables or arbitrary provider URLs; providers are configured by the deployment environment.
# Example shape only. Point these at your own local OCR command.
export MCP_PDF_OCR_COMMAND="tesseract"
export MCP_PDF_OCR_ARGS_JSON='["{input}", "stdout", "tsv"]'
See the guide and API reference for provider configuration details.
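To make the shape of MCP_PDF_OCR_ARGS_JSON concrete, here is a minimal sketch of how a JSON argument template like the one above could be validated and expanded. That the server substitutes the {input} placeholder in exactly this way is an assumption, and expandOcrArgs is a hypothetical name rather than a function exported by the package.

```typescript
// Hypothetical helper: parses a JSON array of argument strings and replaces
// every "{input}" placeholder with the path of the rendered page image.
function expandOcrArgs(argsJson: string, inputPath: string): string[] {
  const parsed: unknown = JSON.parse(argsJson);
  if (!Array.isArray(parsed) || !parsed.every((arg) => typeof arg === "string")) {
    throw new Error("OCR args must be a JSON array of strings");
  }
  // split/join replaces all occurrences without relying on newer runtimes.
  return (parsed as string[]).map((arg) => arg.split("{input}").join(inputPath));
}

const ocrArgs = expandOcrArgs('["{input}", "stdout", "tsv"]', "/tmp/page-1.png");
console.log(ocrArgs);
```

Passing arguments as an array, rather than building one shell string, avoids shell quoting problems when a path contains spaces.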
Release Proof
Strong README claims should be backed by shipped evidence. This repo publishes machine-readable artifacts and gates releases on them.
Artifact | Current proof |
|
|
| score |
| strict provider evidence enabled, 4/4 final-bar provider profiles certified |
| corpus-style PDF intelligence assertions with capability summaries |
| deterministic crop-substrate proof for provider-manifest regions |
| deterministic scoring proof for table, formula, chart, figure, and image regions |
Run the same proof locally:
bun run benchmark:release-artifacts
bun run benchmark:release-gate
bun run package:smoke
See performance and release evidence for the full benchmark contract.
Output Formats
read_pdf can return the same PDF in several agent-friendly forms:
Plain text and page text
Markdown for RAG and summarization
HTML for rendering or downstream transformation
Structured elements with page and geometry provenance
Document AST for semantic navigation
Citation chunks with page, element, table, and bbox references
Tables with rows, cells, geometry, warnings, and confidence
Trust and accessibility reports
Agent Document Twin indexes linking text, visual, OCR, table, trust, and accessibility evidence
Security Model
PDFs can contain hostile or misleading content. The server treats extraction as an evidence workflow, not as a trusted text dump.
Local-first by default.
URL loading is guarded by host, private-IP, size, and HTTP policy controls.
OCR and visual providers are configured by environment, not by request body.
Trust reports surface hidden text, near-invisible geometry, off-page text, overlapping text, unsafe links, redaction signals, and prompt-injection-like content.
Rendering, crops, OCR, and visual enrichment preserve provenance so agents can route weak evidence to verification instead of silently trusting it.
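As an illustration of the private-IP part of the URL policy, the sketch below checks whether a host string is a literal IPv4 address in a private, loopback, link-local, or "this network" range. It is a generic example of the technique, not the server's implementation; the function names are hypothetical.

```typescript
// Generic sketch of a private-address check, not the server's own code.
function parseIPv4(host: string): number[] | null {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  const octets: number[] = [];
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const value = Number(part);
    if (value > 255) return null;
    octets.push(value);
  }
  return octets;
}

function isPrivateOrLocalIPv4(host: string): boolean {
  const octets = parseIPv4(host);
  if (octets === null) return false; // not a literal IPv4 address
  const [a = -1, b = -1] = octets;
  if (a === 10) return true; // 10.0.0.0/8
  if (a === 172 && b >= 16 && b <= 31) return true; // 172.16.0.0/12
  if (a === 192 && b === 168) return true; // 192.168.0.0/16
  if (a === 127) return true; // loopback
  if (a === 169 && b === 254) return true; // link-local
  if (a === 0) return true; // "this network"
  return false;
}

console.log(isPrivateOrLocalIPv4("192.168.1.10"));
console.log(isPrivateOrLocalIPv4("8.8.8.8"));
```

A production guard would also need to resolve hostnames before checking, handle IPv6, and re-check after redirects; this sketch covers only literal IPv4 addresses.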
Documentation
Topic | Link |
Getting started | |
Installation and clients | |
API reference | |
Capability overview | |
Architecture and design | |
Performance and release proof |
Development
git clone https://github.com/SylphxAI/pdf-reader-mcp.git
cd pdf-reader-mcp
bun install
bun run build
bun test
Useful checks:
bun run check
bun run typecheck
bun run docs:build
bun run package:smoke
bun run benchmark:release-gate
Support
If you want local-first, evidence-backed PDF intelligence to keep improving for AI agents, star the repo. It helps the project reach more builders who need PDFs to be verifiable, not just readable.
License
MIT © SylphxAI