Skip to main content
Glama

PDF Reader MCP Server

by sylphxltd
CHANGELOG.md12.2 kB
# Changelog All notable changes to this project will be documented in this file. See [standard-version](https://github.com/conventional-changelog/standard-version) for commit guidelines. ## [1.2.0](https://github.com/sylphxltd/pdf-reader-mcp/compare/v1.1.0...v1.2.0) (2025-10-31) ### Features * **Content Ordering**: Preserve exact text and image order based on Y-coordinates - Content items within each page are now sorted by their vertical position - Enables AI to see content in the same order as it appears in the PDF - Text and images are interleaved based on document layout - Example: page 1 [text, image, text, image, image, text] - Uses PDF.js transform matrices to extract Y-coordinates - Automatically groups text items on the same line - Returns ordered content parts for optimal AI consumption ### Internal Changes * New `extractPageContent()` function combines text and image extraction with positioning * New `PageContentItem` interface tracks content type, position, and data * Handler updated to generate content parts in document-reading order * Improved error handling to return descriptive error messages as text content ### Code Quality * All tests passing (91 tests) * Coverage maintained at 97.76% statements, 90.95% branches * TypeScript strict mode compliance * Zero linting errors ## [1.1.0](https://github.com/sylphxltd/pdf-reader-mcp/compare/v1.0.0...v1.1.0) (2025-10-31) ### Features * **Image Extraction**: Extract embedded images from PDF pages as base64-encoded data ([bd637f3](https://github.com/sylphxltd/pdf-reader-mcp/commit/bd637f3)) - Support for RGB, RGBA, and Grayscale formats - Works with JPEG, PNG, and other embedded image types - Includes image metadata (width, height, format, page number) - Optional parameter `include_images` (default: false) - Uses PDF.js operator list API for reliable extraction ### Performance Improvements * **Parallel Page Processing**: Process multiple pages concurrently for 5-10x speedup ([e5f85e1](https://github.com/sylphxltd/pdf-reader-mcp/commit/e5f85e1)) - Refactored extractPageTexts to use Promise.all - 10-page PDF: ~5-8x faster - 50-page PDF: ~10x faster - Maintains error isolation per page ### Code Quality * **Deep Architectural Refactoring**: Break down monolithic handler into focused modules ([1519fe0](https://github.com/sylphxltd/pdf-reader-mcp/commit/1519fe0)) - handlers/readPdf.ts: 454 → 143 lines (-68% reduction) - NEW src/types/pdf.ts: Type definitions (44 lines) - NEW src/schemas/readPdf.ts: Zod schemas (61 lines) - NEW src/pdf/parser.ts: Page range parsing (124 lines) - NEW src/pdf/loader.ts: Document loading (74 lines) - NEW src/pdf/extractor.ts: Text & metadata extraction (96 lines → 224 lines with images) - Single Responsibility Principle applied throughout - Functional composition for better testability * **Comprehensive Test Coverage**: 90 tests with 98.94% coverage ([85cf712](https://github.com/sylphxltd/pdf-reader-mcp/commit/85cf712)) - NEW test/pdf/extractor.test.ts (22 tests) - NEW test/pdf/loader.test.ts (9 tests) - NEW test/pdf/parser.test.ts (26 tests) - Tests: 31 → 90 (+158% increase) - Coverage: 90.26% → 98.94% statements - Coverage: 78.64% → 93.33% branches ### Documentation * Enhanced README with image extraction examples and usage guide * Added dedicated Image Extraction section with format details * Updated roadmap to reflect completed features * Clarified image format support and considerations ## [1.0.0](https://github.com/sylphxltd/pdf-reader-mcp/compare/v0.3.24...v1.0.0) (2025-10-31) ### ⚠ BREAKING CHANGES * **Package renamed from @sylphlab/pdf-reader-mcp to @sylphx/pdf-reader-mcp** * Docker images renamed from sylphlab/pdf-reader-mcp to sylphx/pdf-reader-mcp ### Features * Migrate from ESLint/Prettier to Biome for 50x faster linting ([bde79bf](https://github.com/sylphxltd/pdf-reader-mcp/commit/bde79bf)) * Add Docker and Smithery deployment support ([11dc08f](https://github.com/sylphxltd/pdf-reader-mcp/commit/11dc08f)) ### Bug Fixes * Fix Buffer to Uint8Array conversion for PDF.js v5.x compatibility ([1c7710d](https://github.com/sylphxltd/pdf-reader-mcp/commit/1c7710d)) * Fix schema validation with exclusiveMinimum for Mistral/Windsurf compatibility ([1c7710d](https://github.com/sylphxltd/pdf-reader-mcp/commit/1c7710d)) * Fix metadata extraction with robust .getAll() fallback ([1c7710d](https://github.com/sylphxltd/pdf-reader-mcp/commit/1c7710d)) * Fix nested test case that was not running ([2c8e1a5](https://github.com/sylphxltd/pdf-reader-mcp/commit/2c8e1a5)) * Update PdfSourceResult type for exactOptionalPropertyTypes compatibility ([4e0d81d](https://github.com/sylphxltd/pdf-reader-mcp/commit/4e0d81d)) ### Improvements * Upgrade all dependencies to latest versions ([dab3f13](https://github.com/sylphxltd/pdf-reader-mcp/commit/dab3f13)) - @modelcontextprotocol/sdk: 1.8.0 → 1.20.2 - pdfjs-dist: 5.1.91 → 5.4.296 - All GitHub Actions updated to latest versions * Rebrand from Sylphlab to Sylphx ([1b6e4d3](https://github.com/sylphxltd/pdf-reader-mcp/commit/1b6e4d3)) * Revise README for better clarity and modern structure ([b770b27](https://github.com/sylphxltd/pdf-reader-mcp/commit/b770b27)) ### Migration Guide To migrate from @sylphlab/pdf-reader-mcp to @sylphx/pdf-reader-mcp: 1. Uninstall old package: ```bash npm uninstall @sylphlab/pdf-reader-mcp ``` 2. Install new package: ```bash npm install @sylphx/pdf-reader-mcp ``` 3. Update your MCP configuration to use @sylphx/pdf-reader-mcp 4. If using Docker, update image name to sylphx/pdf-reader-mcp All functionality remains the same. No code changes required. ### [0.3.24](https://github.com/sylphlab/pdf-reader-mcp/compare/v0.3.23...v0.3.24) (2025-04-07) ### Bug Fixes - enable rootDir and adjust include for correct build structure ([a9985a7](https://github.com/sylphlab/pdf-reader-mcp/commit/a9985a7eed16ed0a189dd1bda7a66feb13aee889)) ### [0.3.23](https://github.com/sylphlab/pdf-reader-mcp/compare/v0.3.22...v0.3.23) (2025-04-07) ### Bug Fixes - correct executable paths due to missing rootDir ([ed5c150](https://github.com/sylphlab/pdf-reader-mcp/commit/ed5c15012b849211422fbb22fb15d8a2c9415b0b)) ### [0.3.22](https://github.com/sylphlab/pdf-reader-mcp/compare/v0.3.21...v0.3.22) (2025-04-07) ### [0.3.21](https://github.com/sylphlab/pdf-reader-mcp/compare/v0.3.20...v0.3.21) (2025-04-07) ### [0.3.20](https://github.com/sylphlab/pdf-reader-mcp/compare/v0.3.19...v0.3.20) (2025-04-07) ### [0.3.19](https://github.com/sylphlab/pdf-reader-mcp/compare/v0.3.18...v0.3.19) (2025-04-07) ### [0.3.18](https://github.com/sylphlab/pdf-reader-mcp/compare/v0.3.17...v0.3.18) (2025-04-07) ### Bug Fixes - **publish:** remove dist from gitignore and fix clean script ([305e259](https://github.com/sylphlab/pdf-reader-mcp/commit/305e259d6492fbc1732607ee8f8344f6e07aa073)) ### [0.3.17](https://github.com/sylphlab/pdf-reader-mcp/compare/v0.3.16...v0.3.17) (2025-04-07) ### Bug Fixes - **config:** align package.json paths with build output (dist/) ([ab1100d](https://github.com/sylphlab/pdf-reader-mcp/commit/ab1100d771e277705ef99cb745f89687c74a7e13)) ### [0.3.16](https://github.com/sylphlab/pdf-reader-mcp/compare/v0.3.15...v0.3.16) (2025-04-07) ### [0.3.15](https://github.com/sylphlab/pdf-reader-mcp/compare/v0.3.14...v0.3.15) (2025-04-07) ### Bug Fixes - Run lint-staged in pre-commit hook ([e96680c](https://github.com/sylphlab/pdf-reader-mcp/commit/e96680c771eb99ba303fdf7ad51da880261e11c1)) ### [0.3.14](https://github.com/sylphlab/pdf-reader-mcp/compare/v0.3.13...v0.3.14) (2025-04-07) ### [0.3.13](https://github.com/sylphlab/pdf-reader-mcp/compare/v0.3.12...v0.3.13) (2025-04-07) ### Bug Fixes - **docker:** Install pnpm globally in builder stage ([651d7ae](https://github.com/sylphlab/pdf-reader-mcp/commit/651d7ae06660b97af91c348bc8cc786613232c06)) ### [0.3.11](https://github.com/sylphlab/pdf-reader-mcp/compare/v0.3.10...v0.3.11) (2025-04-07) ### [0.3.10](https://github.com/sylphlab/pdf-reader-mcp/compare/v1.0.0...v0.3.10) (2025-04-07) ### Bug Fixes - address remaining eslint warnings ([a91d313](https://github.com/sylphlab/pdf-reader-mcp/commit/a91d313bec2b843724e62ea6a556d99d5389d6cc)) - resolve eslint errors in tests and scripts ([ffc1bdd](https://github.com/sylphlab/pdf-reader-mcp/commit/ffc1bdd18b972f58e90e12ed2394d2968c5639d9)) ## [1.0.0] - 2025-04-07 ### Added - **Project Alignment:** Aligned project structure, configuration (TypeScript, ESLint, Prettier, Vitest), CI/CD (`.github/workflows/ci.yml`), Git Hooks (Husky, lint-staged, commitlint), and dependency management (Dependabot) with Sylph Lab Playbook guidelines. - **Testing:** Achieved ~95% test coverage using Vitest. - **Benchmarking:** Implemented initial performance benchmarks using Vitest `bench`. - **Documentation:** - Set up documentation website using VitePress. - Created initial content for Guide, Design, Performance, Comparison sections. - Updated `README.md` to follow standard structure. - Added `CONTRIBUTING.md`. - Updated Performance page with initial benchmark results. - Added community links and call-to-action in VitePress config footer. - **Package Manager:** Switched from npm to pnpm. ### Changed - **Dependencies:** Updated various dependencies to align with guidelines and ensure compatibility. - **Configuration:** Refined `tsconfig.json`, `eslint.config.js`, `vitest.config.ts`, `package.json` based on guidelines. - **Project Identity:** Updated scope to `@sylphlab`. ### Fixed - Resolved various configuration issues identified during guideline alignment. - Corrected Markdown parsing errors in initial documentation. - Addressed peer dependency warnings where possible. - **Note:** TypeDoc API generation is currently blocked due to unresolved initialization errors with TypeDoc v0.28.1. ### Removed - Sponsorship related files and badges (`.github/FUNDING.yml`). ## [0.3.9] - 2025-04-05 ### Fixed - Removed artifact download/extract steps from `publish-docker` job in workflow, as Docker build needs the full source context provided by checkout. ## [0.3.8] - 2025-04-05 ### Fixed - Removed duplicate `context: .` entry in `docker/build-push-action` step in `.github/workflows/publish.yml`. ## [0.3.7] - 2025-04-05 ### Fixed - Removed explicit `COPY tsconfig.json ./` from Dockerfile (rely on `COPY . .`). - Explicitly set `context: .` in docker build-push action. ## [0.3.6] - 2025-04-05 ### Fixed - Explicitly added `COPY tsconfig.json ./` before `COPY . .` in Dockerfile to ensure it exists before build step. ## [0.3.5] - 2025-04-05 ### Fixed - Added `RUN ls -la` before build step in Dockerfile to debug `tsconfig.json` not found error. ## [0.3.4] - 2025-04-05 ### Fixed - Explicitly specify `tsconfig.json` path in Dockerfile build step (`RUN ./node_modules/.bin/tsc -p tsconfig.json`) to debug build failure. ## [0.3.3] - 2025-04-05 ### Fixed - Changed Dockerfile build step from `RUN npm run build` to `RUN ./node_modules/.bin/tsc` to debug build failure. ## [0.3.2] - 2025-04-05 ### Fixed - Simplified `build` script in `package.json` to only run `tsc` (removed `chmod`) to debug Docker build failure. ## [0.3.1] - 2025-04-05 ### Fixed - Attempted various fixes for GitHub Actions workflow artifact upload issue (`Error: Provided artifact name input during validation is empty`). Final attempt uses fixed artifact filename in upload/download steps. ## [0.3.0] - 2025-04-05 ### Added - `CHANGELOG.md` file based on Keep a Changelog format. - `LICENSE` file (MIT License). - Improved GitHub Actions workflow (`.github/workflows/publish.yml`): - Triggers on push to `main` branch and version tags (`v*.*.*`). - Conditionally archives build artifacts only on tag pushes. - Conditionally runs `publish-npm` and `publish-docker` jobs only on tag pushes. - Added `create-release` job to automatically create GitHub Releases from tags, using `CHANGELOG.md` for the body. - Added version headers to Memory Bank files (`activeContext.md`, `progress.md`). ### Changed - Bumped version from 0.2.2 to 0.3.0. <!-- Note: Removed [0.4.0-dev] entry as changes are now part of 1.0.0 -->

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/sylphxltd/pdf-reader-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server