MCP PDF Reader

SUMMARY.md•4.72 KiB

# MCP PDF Reader - Project Summary ## Project Completion A fully functional Model Context Protocol (MCP) server has been successfully developed to read and extract information from PDF files. ## 📁 Project Structure ```text mcp-pdf-reader/ ├── src/ │ ├── __tests__/ │ │ └── pdf-tools.test.ts # Unit tests │ ├── index.ts # Main MCP server │ ├── pdf-tools.ts # PDF extraction functions │ └── types.ts # TypeScript type definitions ├── test-files/ # Test PDF files ├── dist/ # Compiled JavaScript ├── LICENSE ├── CHANGELOG.md ├── package.json # Project configuration ├── tsconfig.json # TypeScript configuration ├── jest.config.js # Jest test configuration ├── README.md # Main documentation ├── CLAUDE_CONFIG.md # Claude Desktop setup guide ├── COPILOT_CONFIG.md # GitHub Copilot setup guide ├── example.ts # Usage example └── Makefile ``` ## 🚀 Implemented Features ### 7 MCP Tools: 1. **read_pdf** - Extract all text from a PDF with optional metadata 2. **get_pdf_metadata** - Retrieve document metadata (title, author, dates, etc.) 3. **read_pdf_pages** - Extract text from specific pages or page ranges 4. **search_pdf** - Search for text in PDF with case-sensitivity support 5. **get_pdf_page_count** - Get total number of pages 6. **list_pdf_images** - List all embedded images with metadata 7. **extract_pdf_image** - Extract specific images as Base64-encoded data ### Capabilities: - Full text extraction - Metadata extraction (title, author, creator, dates) - Page-specific reading - Text search with context - Case-sensitive/insensitive search support - Robust error handling - Complete unit tests - Full TypeScript typing - ES Module compilation ## 🧪 Tests - **27 unit tests** covering all main functions - **All tests passing** successfully - **PDF-specific validations** (test-files/sample.pdf) - **Concrete assertions**: page count, metadata, specific text, search results - **Table extraction**: Tests extracting and displaying tables in ASCII format - **Image processing**: Tests listing and extracting images from PDFs - Run: `npm test` - Result: **Test Suites: 1 passed, Tests: 27 passed (3.173 s)** ### Test Details: - **getPageCount** (2 tests): Validation of exact 6-page count - **extractMetadata** (2 tests): Verification of author, creator, producer, and dates - **extractTextFromPDF** (3 tests): Full extraction and specific content validation - **extractPages** (6 tests): Individual pages, page ranges, and error validation - **searchInPDF** (8 tests): Specific term searches, case-sensitivity, context - **Table Extraction** (1 test): Table extraction and ASCII format display - **PDF Images** (5 tests): Listing and extraction of embedded images ## 📦 Technology Stack - **@modelcontextprotocol/sdk** (v1.25.2) - MCP SDK - **pdf-parse** (v2.4.5) - PDF parsing library (text extraction) - **pdf-lib** (v1.17.1) - PDF manipulation library (image extraction) - **TypeScript** (v5.9.3) - Statically-typed language - **Jest** (v29.7.0) - Testing framework - **ts-jest** (v29.2.5) - Jest TypeScript preset ## Scripts ```bash npm run build # Compile TypeScript to JavaScript npm run dev # Development mode (watch) npm start # Start MCP server npm test # Run tests npm run test:watch # Tests in watch mode ``` ## 📝 Project Status ### Successfully Completed: 1. **MCP server implemented** with 7 fully functional tools 2. **Complete test suite** executed with user-provided test PDF 3. **Successful compilation** with no errors or warnings 4. **Complete documentation** in README.md and configuration guides 5. **Fully typed code** with TypeScript ### 🎯 Ready to Use: The server is fully functional and ready to integrate with Claude Desktop or any MCP-compatible client. ## 🎯 Usage with Claude Desktop Once configured, you can ask Claude: - "Read the PDF at C:/path/to/file.pdf" - "Extract metadata from the PDF at..." - "Search for the word 'example' in the PDF..." - "Read pages 1-5 from the PDF..." - "How many pages does the PDF have...?" ## ⚡ Build and Execution The project compiles successfully without errors and is ready for use. ```bash # Install dependencies (already done) npm install # Build (already done) npm run build # Start server npm start ``` ## Important Notes - Does not include OCR (optical character recognition) - Works best with well-formed PDFs - Requires Node.js to run - Compatible with Windows, macOS, and Linux

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/rturv/mcp-pdf-reader'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

SUMMARY.md•4.72 KiB