Allows development and contributions through GitHub, with the repository available at github.com/shuminghuang/pdf2md-mcp.
Converts PDF files to Markdown format, extracting content using AI sampling. Supports both local file paths and URLs with incremental conversion capabilities.
Supports testing through Pytest, enabling quality assurance for the PDF to Markdown conversion functionality.
PDF2MD MCP Server
An MCP (Model Context Protocol) server that converts PDF files to Markdown format using AI sampling capabilities.
Features
- Convert PDF files to Markdown using AI content extraction
- Support for both local file paths and URLs
- Incremental conversion - resume from where you left off
- Configurable output directory
- Built with FastMCP for high performance
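The README does not say how incremental conversion keeps track of progress. As a minimal sketch of one possible approach, the example below assumes a hypothetical `<!-- page N -->` marker is written before each converted page, so that a later run can read the existing output and continue from the next page. The marker format and both helper names are illustrative assumptions, not the project's actual implementation.

```python
import re
from pathlib import Path

# Hypothetical marker written before each converted page (an assumption,
# not something taken from the project itself).
PAGE_MARKER = re.compile(r"^<!-- page (\d+) -->$", re.MULTILINE)


def next_page_to_convert(output_file: Path) -> int:
    """Return the 1-based page number a resumed run should start from."""
    if not output_file.exists():
        return 1  # nothing converted yet
    text = output_file.read_text(encoding="utf-8")
    pages = [int(n) for n in PAGE_MARKER.findall(text)]
    return max(pages, default=0) + 1


def append_page(output_file: Path, page_number: int, markdown: str) -> None:
    """Append one converted page, preceded by its marker."""
    with output_file.open("a", encoding="utf-8") as fh:
        fh.write(f"<!-- page {page_number} -->\n{markdown.rstrip()}\n\n")
```

Appending page by page means an interrupted run leaves a valid partial file, and the next run only needs to call `next_page_to_convert` to find its starting point.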
Installation
Usage
As an MCP Server
Start the server:
The server will expose MCP tools for PDF to Markdown conversion.
Available Tools
convert_pdf_to_markdown
Converts a PDF file to Markdown format using AI sampling.
Parameters:
- file_path (string): Local file path or URL to the PDF file
- output_dir (string, optional): Output directory for the Markdown file. Defaults to the same directory as the input file (for local files) or the current working directory (for URLs)
Returns:
- output_file: Path to the generated Markdown file
- summary: Summary of the conversion task
- pages_processed: Number of pages processed
Requirements
- Python 3.10+
- An MCP-compatible client with AI sampling capabilities
- Network access for URL-based PDF files
Development
Setup
Running Tests
Code Formatting
License
MIT License - see LICENSE file for details.
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.