Integrates with Google Gemini API to power the intelligent agent functionality in the client application for PDF to Markdown conversion
Uses LangChain to create the reactive agent that orchestrates PDF uploading and conversion operations between MCP servers
Leverages LangGraph alongside LangChain to build the intelligent agent in the client application that interacts with the MCP servers
Converts PDF documents to Markdown format through the dedicated conversion MCP server
MCP PDF to Markdown Converter and Crawler
This project converts PDF documents to Markdown and crawls web content using a Model Context Protocol (MCP) architecture. It comprises two main modules, convert_pdf for PDF upload and conversion and crawl_mcp for web crawling, along with a client application that orchestrates operations using a reactive agent.
Project Structure
The core components of this project are:
convert_pdf: A FastMCP server (running on http://127.0.0.1:8001) responsible for handling PDF file uploads and converting them to Markdown. It includes two endpoints:
/upload/mcp/upload_pdf_tool: Handles PDF file uploads via multipart form data.
/mcp: Converts uploaded PDFs to Markdown using the convert_pdf_to_markdown_tool.
crawl_mcp: A server module for crawling web content. For details on running this module, see src/crawl_mcp/README.md.
client: A client application that acts as an intelligent agent. It uses LangChain and LangGraph to interact with the MCP servers, upload PDFs, and trigger conversions or crawling tasks.
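The upload endpoint above accepts multipart form data. As a rough illustration of what such a request looks like on the wire, the sketch below builds (but does not send) a multipart POST using only the standard library. The URL is copied from the endpoint list and may need adjusting; the form field name "file" and the helper name build_upload_request are assumptions, not taken from the project.

```python
import uuid
from pathlib import Path
from urllib import request

# Endpoint as listed in this README; adjust if your server differs.
UPLOAD_URL = "http://127.0.0.1:8001/upload/mcp/upload_pdf_tool"


def build_upload_request(pdf_path, field_name="file", url=UPLOAD_URL):
    """Build (but do not send) a multipart/form-data POST carrying one PDF.

    field_name="file" is an assumption; check the server code for the
    field name it actually expects.
    """
    path = Path(pdf_path)
    boundary = uuid.uuid4().hex
    # One form part: headers, blank line, then the raw file bytes.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{path.name}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    # Closing boundary marks the end of the multipart body.
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    body = head + path.read_bytes() + tail
    return request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
```

With the convert_pdf server running, the request could be sent with request.urlopen(build_upload_request("input/sample.pdf")). The shape of the response depends on the server and is not shown here.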
Getting Started
Follow these steps to set up and run the project:
1. Prerequisites
Python 3.9+
uv: A fast Python package installer and resolver. Install it via pip if not already present:
pip install uv
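Both prerequisites can be checked from Python before going further. This is a minimal sketch, not part of the project; the function names python_ok and uv_available are made up for illustration.

```python
import shutil
import sys

MIN_PYTHON = (3, 9)  # minimum version from the prerequisites above


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    return tuple(version_info[:2]) >= minimum


def uv_available():
    """Return True if a `uv` executable can be found on PATH."""
    return shutil.which("uv") is not None


if __name__ == "__main__":
    print("Python 3.9+:", python_ok())
    print("uv on PATH:", uv_available())
```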
2. Project Setup
Clone the repository (if applicable) or navigate to your project root:
cd /path/to/your/MCP
Create and Sync Virtual Environment: uv will create a .venv directory and install all necessary dependencies based on your pyproject.toml.
uv sync
Activate the Virtual Environment: This ensures all commands run within your isolated environment.
macOS/Linux:
source .venv/bin/activate
Windows (Command Prompt):
.venv\Scripts\activate.bat
Windows (PowerShell):
.venv\Scripts\Activate.ps1
Create .env file: Create a file named .env in the project root (MCP/) and add your Google Gemini API key:
GEMINI_API_KEY_2="YOUR_GEMINI_API_KEY_HERE"
Replace "YOUR_GEMINI_API_KEY_HERE" with your actual API key.
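To confirm the setup steps took effect, a small script can check that a virtual environment is active and that GEMINI_API_KEY_2 no longer holds the placeholder. The sketch below uses a deliberately simple KEY=VALUE parser and is not how the project itself loads .env (that is not shown in this README); all function names are hypothetical.

```python
import sys
from pathlib import Path

PLACEHOLDER = "YOUR_GEMINI_API_KEY_HERE"


def in_virtualenv():
    """Return True when running inside a venv-style virtual environment."""
    return sys.prefix != sys.base_prefix


def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def gemini_key_configured(env_path=".env"):
    """Return True if GEMINI_API_KEY_2 is set and is not the placeholder."""
    path = Path(env_path)
    if not path.is_file():
        return False
    key = parse_env(path.read_text(encoding="utf-8")).get("GEMINI_API_KEY_2", "")
    return bool(key) and key != PLACEHOLDER
```

Run from the project root, gemini_key_configured() returning False means the file is missing, the key is absent, or the placeholder was never replaced.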
3. Running the Modules
Each module has its own setup and running instructions. Refer to the module-specific READMEs for details:
Convert PDF Module: See src/convert_pdf/README.md for instructions on running the convert_pdf server.
Crawl MCP Module: See src/crawl_mcp/README.md for instructions on running the crawl_mcp server.
4. Docker
The convert_pdf module can be run using Docker Compose with a single service:
Service: mcp-convert-server (port 8001)
Functionality: Handles PDF uploads and conversion to Markdown.
To run:
cd src/convert_pdf
docker-compose up --build -d
For crawl_mcp Docker instructions, refer to src/crawl_mcp/README.md.
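Because the container starts in detached mode, it can help to wait until port 8001 accepts connections before running a client. The following is one possible way to poll for that with the standard library; the helper names are illustrative, and an open port only shows that something is listening, not that the MCP server is healthy.

```python
import socket
import time


def port_open(host="127.0.0.1", port=8001, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host="127.0.0.1", port=8001, attempts=30, delay=1.0):
    """Poll until the port accepts connections or the attempts run out."""
    for _ in range(attempts):
        if port_open(host, port):
            return True
        time.sleep(delay)
    return False
```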
5. Testing with Client
To test the modules, use the client application located in src/client/. Ensure the relevant servers are running, then execute:
uv run python src/client/*
For example, to test the convert_pdf module, ensure a PDF file (e.g., input/sample.pdf) exists in the project's input directory and run:
uv run python src/client/test_client.py
For testing crawl_mcp, refer to its README for specific client instructions.
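Before running the test client, a quick pre-flight check can confirm that the input file exists and looks like a PDF. The sketch below relies on the fact that PDF files normally begin with the bytes %PDF-; the function name looks_like_pdf is hypothetical and not part of the client.

```python
from pathlib import Path


def looks_like_pdf(path):
    """Return True if the file exists and starts with the PDF magic bytes."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as fh:
        return fh.read(5) == b"%PDF-"


if __name__ == "__main__":
    # input/sample.pdf is the example path used in this README.
    print("input/sample.pdf looks like a PDF:", looks_like_pdf("input/sample.pdf"))
```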
6. Directory Structure
MCP/
├── src/
│   ├── convert_pdf/
│   │   ├── README.md
│   │   ├── src/
│   │   │   ├── __init__.py
│   │   │   ├── convert_mcp.py
│   │   │   ├── pdf2md.py
│   │   │   └── upload_api.py
│   │   ├── uploaded/
│   │   ├── output/
│   │   ├── processed_files.json
│   │   ├── docker-compose.yml
│   │   ├── Dockerfile
│   │   ├── pyproject.toml
│   │   └── uv.lock
│   ├── crawl_mcp/
│   │   ├── README.md
│   │   └── (other module files)
│   └── client/
│       ├── test_client.py
│       └── (other client scripts)
├── .env
└── README.md
Notes
Ensure the .env file is correctly configured with your API key.
The convert_pdf module handles both upload and conversion on port 8001, consolidating functionality for efficiency.
For detailed module configurations, refer to the respective READMEs.
If encountering issues (e.g., ClientDisconnect or import errors), check logs with:
docker-compose logs mcp-convert-server