mcp-pdf-tools

MCP server for extracting text, searching, and analyzing PDF files

Give Claude (or any MCP client) the ability to read, search, and analyze PDF documents.


What is this?

mcp-pdf-tools is a Model Context Protocol server that gives AI assistants the ability to work with PDF files. Point it at any text-based PDF and your assistant can extract content, search for specific text, pull metadata, and analyze word usage — all without leaving the conversation.

Quick Start

Claude Desktop

Add the following to your Claude Desktop config (on macOS, ~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "pdf-tools": {
      "command": "npx",
      "args": ["-y", "mcp-pdf-tools"]
    }
  }
}

Claude Code

claude mcp add pdf-tools npx mcp-pdf-tools

Other MCP Clients

npx -y mcp-pdf-tools

The server communicates over stdio using the MCP protocol.
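
For clients without a built-in MCP integration, it can help to see what a tool invocation looks like on the wire. The sketch below builds a JSON-RPC 2.0 tools/call request for pdf_info and frames it as a single newline-terminated line, which is how the MCP stdio transport delimits messages. The helper names buildToolCall and frame are illustrative and not part of this package, and a real client must complete the MCP initialize handshake before sending tool calls.

```typescript
// JSON-RPC 2.0 request shape that MCP uses for invoking a tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: build a tools/call request for a named tool.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical helper: the stdio transport carries one JSON message per line,
// so serialize compactly (no embedded newlines) and terminate with "\n".
function frame(message: ToolCallRequest): string {
  return JSON.stringify(message) + "\n";
}

// Example: ask the server for metadata about a PDF.
const request = buildToolCall(1, "pdf_info", {
  file_path: "/documents/quarterly-report.pdf",
});
const wire = frame(request); // write this string to the server's stdin
```

In practice most users will never write this by hand, since Claude Desktop, Claude Code, and the official MCP SDKs handle framing and the handshake for you.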

Tools

pdf_info

Get metadata and statistics about a PDF file.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| file_path | string | Yes | Absolute path to the PDF file |

Returns: Title, author, page count, text length, creator, and producer information.


pdf_extract_text

Extract all text content from a PDF file.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| file_path | string | Yes | | Absolute path to the PDF file |
| max_chars | number | No | 50000 | Maximum characters to return (truncates with notice) |

Returns: Full text content of the PDF, prefixed with page count.
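
To make the max_chars behavior concrete, here is a minimal sketch of truncation with a notice. It is not the server's actual implementation: the function name truncateWithNotice and the wording of the notice are assumptions, and only the 50000 default comes from the table above.

```typescript
// Hypothetical helper: cap extracted text at maxChars and append a notice
// when something was cut. Length is measured in UTF-16 code units, which is
// what String.prototype.length reports.
function truncateWithNotice(text: string, maxChars: number = 50000): string {
  if (text.length <= maxChars) {
    return text; // short enough, return unchanged
  }
  const omitted = text.length - maxChars;
  // The notice wording below is an assumption, not the server's real output.
  return text.slice(0, maxChars) + `\n\n[Truncated: ${omitted} more characters not shown]`;
}
```

If a document is longer than the limit, raising max_chars or switching to pdf_extract_pages for a specific range are the two obvious ways to reach the remaining text.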


pdf_extract_pages

Extract text from a specific page range.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| file_path | string | Yes | Absolute path to the PDF file |
| start_page | number | Yes | Start page (1-indexed) |
| end_page | number | Yes | End page (inclusive) |

Returns: Text content from the specified page range.
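
Because start_page is 1-indexed and end_page is inclusive, off-by-one mistakes are easy to make. The sketch below shows one way to map such a range onto a zero-based array of per-page text. The function selectPages is hypothetical, and clamping an out-of-range end_page to the last page is an assumption rather than documented server behavior.

```typescript
// Hypothetical helper: pick pages startPage..endPage (1-indexed, inclusive)
// from an array holding the text of each page.
function selectPages(pages: string[], startPage: number, endPage: number): string[] {
  if (Math.floor(startPage) !== startPage || Math.floor(endPage) !== endPage) {
    throw new RangeError("page numbers must be integers");
  }
  if (startPage < 1 || endPage < startPage) {
    throw new RangeError("expected 1 <= start_page <= end_page");
  }
  // Assumption: an end_page past the last page is clamped, not rejected.
  const end = Math.min(endPage, pages.length);
  // slice() takes a zero-based start and an exclusive end, so the 1-indexed
  // inclusive range [startPage, end] becomes [startPage - 1, end).
  return pages.slice(startPage - 1, end);
}
```

For example, calling selectPages with pages 2 through 3 of a four-page document returns the second and third entries of the array.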


Search for text within a PDF file, returning matches with surrounding context.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| file_path | string | Yes | | Absolute path to the PDF file |
| query | string | Yes | | Text to search for (case-insensitive) |
| max_results | number | No | 20 | Maximum number of matches to return |

Returns: List of matches with line numbers and surrounding context lines.
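
The search behavior described above (case-insensitive matching, line numbers, surrounding context) can be sketched as a simple line scan over the extracted text. This is an illustration under stated assumptions, not the server's code: the names searchText and SearchMatch are hypothetical, and the number of context lines (one on each side here) is a guess, since the README does not specify it.

```typescript
// Hypothetical result shape for one match.
interface SearchMatch {
  lineNumber: number; // 1-indexed line in the extracted text
  line: string;       // the matching line
  context: string[];  // the matching line plus its neighbors
}

// Hypothetical helper: case-insensitive substring search over lines.
function searchText(
  text: string,
  query: string,
  maxResults: number = 20,
  contextLines: number = 1 // assumption: the real context size is not documented
): SearchMatch[] {
  const matches: SearchMatch[] = [];
  const needle = query.toLowerCase();
  if (needle.length === 0) {
    return matches; // an empty query matches nothing
  }
  const lines = text.split(/\r?\n/);
  lines.forEach((line, index) => {
    if (matches.length >= maxResults) {
      return; // result cap reached, skip the remaining lines
    }
    if (line.toLowerCase().indexOf(needle) !== -1) {
      const from = Math.max(0, index - contextLines);
      const to = Math.min(lines.length, index + contextLines + 1);
      matches.push({ lineNumber: index + 1, line, context: lines.slice(from, to) });
    }
  });
  return matches;
}
```

Note that line numbers here refer to lines in the extracted text, which will generally not correspond to page numbers or to line numbers visible in a PDF viewer.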


pdf_word_stats

Get word count and top word frequencies from a PDF.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| file_path | string | Yes | | Absolute path to the PDF file |
| top_n | number | No | 20 | Number of top words to include |

Returns: Total word count, page count, and a ranked list of the most frequent words (3+ characters).
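
As a rough illustration of how such statistics can be computed, the sketch below counts words and ranks those with three or more characters by frequency. It is a simplified stand-in, not the server's implementation: the names wordStats and WordStats are hypothetical, tokenization is ASCII-only (runs of a to z after lowercasing), and ties are broken alphabetically, all of which are assumptions.

```typescript
// Hypothetical result shape.
interface WordStats {
  totalWords: number;
  topWords: Array<{ word: string; count: number }>;
}

// Hypothetical helper: total word count plus the topN most frequent
// words of 3+ characters.
function wordStats(text: string, topN: number = 20): WordStats {
  // Assumption: a "word" is a run of ASCII letters; accented characters
  // and digits are treated as separators in this sketch.
  const tokens: string[] = text.toLowerCase().match(/[a-z]+/g) || [];

  // A prototype-free object keeps words like "constructor" from colliding
  // with inherited properties.
  const counts: Record<string, number> = Object.create(null);
  tokens.forEach((word) => {
    if (word.length >= 3) {
      counts[word] = (counts[word] || 0) + 1;
    }
  });

  const topWords = Object.keys(counts)
    .map((word) => ({ word, count: counts[word] || 0 }))
    // Highest count first; alphabetical order breaks ties deterministically.
    .sort((a, b) => b.count - a.count || (a.word < b.word ? -1 : 1))
    .slice(0, topN);

  return { totalWords: tokens.length, topWords };
}
```

A production version would likely also filter common stop words, otherwise terms such as "the" and "and" tend to dominate the ranking.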

Example Conversations

Summarizing a report

You: Summarize the key points in /documents/quarterly-report.pdf

Claude: (uses

This is a 24-page quarterly report covering Q4 2025. The key points are...

Searching a contract

You: Does the NDA in /legal/nda-acme.pdf mention anything about a non-compete?

Claude: (uses

Yes — I found 3 mentions of "non-compete" in the document. On line 47, there's a clause stating...

Analyzing word usage

You: What are the most discussed topics in /research/paper.pdf?

Claude: (uses

The paper is 8,400 words across 12 pages. The most frequent terms are "neural" (47 occurrences), "training" (38), and "optimization" (29), suggesting the paper focuses heavily on...

Limitations

Be aware of these current constraints:

  • Text-based PDFs only — Scanned or image-based PDFs will return empty text. No OCR support (yet).

  • Page extraction is approximate — Page boundaries are detected heuristically. Extracted page ranges may not align perfectly with the visual pages in your PDF viewer.

  • No table extraction — Tabular data in PDFs may not preserve its structure in the extracted text.

  • Full file loaded into memory — Very large PDFs may be slow to process.

  • No merge or split — This tool reads PDFs; it does not modify, merge, or split them.

Development

git clone https://github.com/seraphinederenouard/mcp-pdf-tools.git
cd mcp-pdf-tools
npm install
npm run build
npm test

License

MIT

