convert_pdf_to_markdown

Convert PDF files to Markdown format using AI sampling for easier editing and content reuse. Supports local files and URLs with configurable output locations.

Instructions

Convert a PDF file to Markdown format using AI sampling.

Args:
    file_path: Local file path or URL to the PDF file
    output_dir: Optional output directory. Defaults to same directory as input file (for local files) or current working directory (for URLs)

Returns:
    Dictionary containing:
    - output_file: Path to the generated markdown file
    - summary: Summary of the conversion task
    - pages_processed: Number of pages processed
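The default-location rule for `output_dir` can be sketched as a small pure function. This is an illustrative sketch rather than the server's own code: `resolve_output_path` is a hypothetical name, and it assumes the Markdown file takes the PDF's base name with an `.md` extension and that a URL with no file name falls back to `downloaded.pdf`, as the implementation reference on this page suggests.

```python
import os
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse


def resolve_output_path(file_path: str, output_dir: Optional[str] = None) -> Path:
    """Hypothetical helper: predict where the Markdown output would land."""
    is_url = file_path.startswith(("http://", "https://"))
    if is_url:
        # URLs: the file name comes from the last path segment of the URL.
        name = os.path.basename(urlparse(file_path).path) or "downloaded.pdf"
        if not name.endswith(".pdf"):
            name += ".pdf"
        default_dir = Path.cwd()  # default for URLs: current working directory
    else:
        name = Path(file_path).name
        default_dir = Path(file_path).parent  # default for local files: same directory
    directory = Path(output_dir) if output_dir else default_dir
    return directory / f"{Path(name).stem}.md"


print(resolve_output_path("docs/report.pdf"))
print(resolve_output_path("docs/report.pdf", "out"))
print(resolve_output_path("https://example.com/papers/report.pdf"))
```

An explicit `output_dir` always wins; the two defaults differ only in which directory is used when it is omitted.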

Input Schema

Name         Required   Description   Default
file_path    Yes
output_dir   No

Output Schema

No arguments

Implementation Reference

  • The main handler function for the 'convert_pdf_to_markdown' tool, decorated with @mcp.tool for registration. It handles PDF input (local or URL), downloads if needed, extracts markdown using pymupdf4llm via PDFToMarkdownConverter, appends to output MD file, and returns status.
    @mcp.tool
    async def convert_pdf_to_markdown(
        file_path: str,
        output_dir: Optional[str] = None,
    ) -> Dict[str, Any]:
        """
        Convert a PDF file to Markdown format using AI sampling.
        
        Args:
            file_path: Local file path or URL to the PDF file
            output_dir: Optional output directory. Defaults to same directory as input file
                       (for local files) or current working directory (for URLs)
        Returns:
            Dictionary containing:
            - output_file: Path to the generated markdown file
            - summary: Summary of the conversion task
            - pages_processed: Number of pages processed
        """
        try:
            # Determine if input is URL or local path
            is_url = file_path.startswith(('http://', 'https://'))
            
            if is_url:
                # Download the PDF first
                download_dir = output_dir or os.getcwd()
                os.makedirs(download_dir, exist_ok=True)
                local_pdf_path = await converter.download_pdf(file_path, download_dir)
                source_description = f"URL: {file_path}"
            else:
                # Check if local file exists
                if not os.path.exists(file_path):
                    return {
                        "error": f"File not found: {file_path}",
                        "output_file": None,
                        "summary": "Failed - file not found",
                        "pages_processed": 0
                    }
                local_pdf_path = file_path
                source_description = f"Local file: {file_path}"
            
            # Generate output path
            output_path = converter.get_output_path(local_pdf_path, output_dir)
            
            # Check for existing content
            last_page = await converter.check_existing_content(output_path)
            start_page = last_page + 1 if last_page > 0 else 1
            
            # Extract content using pymupdf4llm
            extracted_content, pages_processed = await converter.extract_pdf_content(
                local_pdf_path, start_page
            )
            
            # Write or append content
            mode = 'a' if last_page > 0 else 'w'
            async with aiofiles.open(output_path, mode, encoding='utf-8') as f:
                if last_page > 0:
                    await f.write('\n\n' + extracted_content)
                else:
                    await f.write(extracted_content)
            
            # Generate summary
            action = "Continued" if last_page > 0 else "Started"
            summary = f"{action} PDF conversion from {source_description}. " \
                     f"Processed {pages_processed} pages starting from page {start_page}. " \
                     f"Output saved to: {output_path}"
            
            return {
                "output_file": output_path,
                "summary": summary,
                "pages_processed": pages_processed,
                "start_page": start_page,
                "source": source_description,
            }
            
        except Exception as e:
            return {
                "error": f"Conversion failed: {str(e)}",
                "output_file": None,
                "summary": f"Failed to convert PDF: {str(e)}",
                "pages_processed": 0
            }
  • Supporting class with utility methods for downloading PDFs, generating output paths, checking existing content, and extracting markdown from PDF pages using pymupdf4llm.to_markdown.
    class PDFToMarkdownConverter:
        """Handles PDF to Markdown conversion using MCP sampling."""
        
        def __init__(self):
            self.session_cache: Dict[str, Any] = {}
        
        async def download_pdf(self, url: str, output_dir: str) -> str:
            """Download PDF from URL to local file."""
            parsed_url = urlparse(url)
            filename = os.path.basename(parsed_url.path) or "downloaded.pdf"
            if not filename.endswith('.pdf'):
                filename += '.pdf'
            
            local_path = os.path.join(output_dir, filename)
            
            async with httpx.AsyncClient() as client:
                response = await client.get(url)
                response.raise_for_status()
                
                async with aiofiles.open(local_path, 'wb') as f:
                    await f.write(response.content)
            
            return local_path
        
        def get_output_path(self, input_path: str, output_dir: Optional[str] = None) -> str:
            """Generate output markdown file path."""
            input_path_obj = Path(input_path)
            base_name = input_path_obj.stem
            
            if output_dir:
                output_directory = Path(output_dir)
            else:
                output_directory = input_path_obj.parent
            
            output_directory.mkdir(parents=True, exist_ok=True)
            return str(output_directory / f"{base_name}.md")
        
        async def check_existing_content(self, output_path: str) -> int:
            """Check existing markdown content and determine last processed page."""
            if not os.path.exists(output_path):
                return 0
            
            try:
                async with aiofiles.open(output_path, 'r', encoding='utf-8') as f:
                    content = await f.read()
                
                # Look for page markers like "## Page X" or "<!-- Page X -->"
                page_matches = re.findall(r'(?:##\s*Page\s*(\d+)|<!--\s*Page\s*(\d+)\s*-->)', content, re.IGNORECASE)
                if page_matches:
                    # Get the highest page number
                    pages = [int(match[0] or match[1]) for match in page_matches]
                    return max(pages)
                
                return 0
            except Exception:
                return 0
        
        async def extract_pdf_content(self, pdf_path: str, start_page: int = 1) -> Tuple[str, int]:
            """
            Extract PDF content using pymupdf4llm (Python package) instead of MCP sampling.
            """
            try:
                # Use pymupdf4llm to extract markdown from the PDF.
                # Note: the `pages` argument is 0-indexed, while start_page is 1-indexed.
                import asyncio
                import pymupdf  # PyMuPDF, the library pymupdf4llm builds on
                loop = asyncio.get_running_loop()
                def extract_md():
                    if start_page > 1:
                        # Extract only the remaining pages. The page count is read
                        # via PyMuPDF, since pymupdf4llm does not appear to provide
                        # a get_page_count helper.
                        with pymupdf.open(pdf_path) as doc:
                            total_pages = doc.page_count
                        pages = list(range(start_page - 1, total_pages))
                        md = pymupdf4llm.to_markdown(pdf_path, pages=pages)
                    else:
                        md = pymupdf4llm.to_markdown(pdf_path)
                    return md
                # Run the blocking extraction in a worker thread
                extracted_content = await loop.run_in_executor(None, extract_md)
    
                # Count the number of pages processed by looking for page markers
                page_matches = re.findall(r'(?:##\s*Page\s*(\d+)|<!--\s*Page\s*(\d+)\s*-->)', extracted_content, re.IGNORECASE)
                if page_matches:
                    pages_processed = len(set(int(match[0] or match[1]) for match in page_matches))
                else:
                    # Fallback: no page markers matched, so report at least one page
                    pages_processed = extracted_content.count('## Page') or 1
    
                return extracted_content, pages_processed
            except Exception as e:
                import traceback
                # print_exc() works on every Python 3 version; print_exception(e) needs 3.10+
                traceback.print_exc()
                fallback_content = f"""# PDF Content Extraction Error\n\nFailed to extract content from: {pdf_path}\nError: {str(e)}\n\n<!-- Page {start_page} -->\n## Page {start_page}\n\n*Content extraction failed. Please check the PDF file and try again.*\n\n---\n*PDF2MD MCP Server - Extraction failed, using fallback*\n"""
                return fallback_content, 1
  • Instantiation of the converter and @mcp.tool decorator that registers the convert_pdf_to_markdown function with the FastMCP server.
    converter = PDFToMarkdownConverter()
    
    @mcp.tool
  • Docstring defining the input parameters, their descriptions, and output format for the tool schema.
    """
    Convert a PDF file to Markdown format using AI sampling.
    
    Args:
        file_path: Local file path or URL to the PDF file
        output_dir: Optional output directory. Defaults to same directory as input file
                   (for local files) or current working directory (for URLs)
    Returns:
        Dictionary containing:
        - output_file: Path to the generated markdown file
        - summary: Summary of the conversion task
        - pages_processed: Number of pages processed
    """
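The resume behavior in the handler (find the highest page marker in the existing Markdown, then continue from the next page in append mode) can be isolated as follows. This is a minimal sketch using the same marker pattern as `check_existing_content`; the names `last_processed_page` and `plan_run` are hypothetical, and whether the extractor's output actually contains such markers is an assumption the source does not confirm.

```python
import re
from typing import Tuple

# Same two marker styles the implementation looks for: "## Page N" and "<!-- Page N -->".
PAGE_MARKER = re.compile(
    r"(?:##\s*Page\s*(\d+)|<!--\s*Page\s*(\d+)\s*-->)", re.IGNORECASE
)


def last_processed_page(markdown: str) -> int:
    """Return the highest page number found in the markers, or 0 if none."""
    matches = PAGE_MARKER.findall(markdown)  # list of (heading_num, comment_num) tuples
    return max((int(a or b) for a, b in matches), default=0)


def plan_run(markdown: str) -> Tuple[int, str]:
    """Return (start_page, file_mode) for the next conversion run."""
    last_page = last_processed_page(markdown)
    start_page = last_page + 1 if last_page > 0 else 1
    mode = "a" if last_page > 0 else "w"
    return start_page, mode


existing = "<!-- Page 1 -->\n## Page 1\n\nIntro\n\n## Page 2\n\nMore"
print(plan_run(existing))        # (3, 'a')
print(plan_run("# Title only"))  # (1, 'w')
```

One consequence of this rule: an existing output file with no recognizable page markers is treated as a fresh run and opened in write mode, so its previous content is replaced.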
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and does well by disclosing key behavioral traits: it specifies the conversion uses 'AI sampling', describes default output directory behavior for local files vs. URLs, and outlines the return structure. However, it misses details like error handling, performance limits, or authentication needs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
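On the error-handling gap noted above: in the implementation reference, failures are reported inside the returned dictionary (an `error` key, with `output_file` set to `None`) rather than raised. A caller might normalize that as in the sketch below; `ConversionError` and `unwrap_result` are hypothetical names, not part of the server.

```python
from typing import Any, Dict


class ConversionError(RuntimeError):
    """Hypothetical exception for a failed conversion result."""


def unwrap_result(result: Dict[str, Any]) -> str:
    """Return the output path, or raise if the result dictionary signals failure."""
    if result.get("error") or result.get("output_file") is None:
        message = result.get("error") or result.get("summary") or "unknown failure"
        raise ConversionError(message)
    return result["output_file"]


ok = {"output_file": "/tmp/report.md", "summary": "Started PDF conversion", "pages_processed": 3}
print(unwrap_result(ok))  # /tmp/report.md
```

Note that extraction failures inside `extract_pdf_content` are caught there and written into the Markdown as a fallback notice, so they would not surface through the `error` key; checking the file content may still be necessary.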

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, followed by structured sections for Args and Returns. Every sentence adds value without waste, making it easy to scan and understand quickly. The formatting enhances readability without verbosity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (2 params, no annotations, but with output schema), the description is largely complete: it covers purpose, parameters, and return values. The output schema handles return structure, so the description doesn't need to duplicate that. It could improve by mentioning potential errors or limitations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It adds meaningful context for both parameters: 'file_path' clarifies it accepts local paths or URLs, and 'output_dir' explains default behaviors based on input type. This goes beyond the bare schema, though it could detail format constraints (e.g., URL protocols).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
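On the URL-protocol point: the handler in the implementation reference decides between URL and local path with a plain prefix check, `file_path.startswith(('http://', 'https://'))`. The sketch below reproduces that check in a hypothetical `classify_input` helper to show which inputs would be downloaded and which would be treated as local paths.

```python
def classify_input(file_path: str) -> str:
    """Hypothetical helper mirroring the handler's prefix check."""
    # str.startswith is case-sensitive, so only lowercase http:// and https:// match.
    if file_path.startswith(("http://", "https://")):
        return "url"
    return "local"


for candidate in (
    "https://example.com/a.pdf",
    "HTTP://EXAMPLE.COM/a.pdf",
    "ftp://example.com/a.pdf",
    "/tmp/a.pdf",
):
    print(candidate, "->", classify_input(candidate))
```

Under this check, anything that is not a lowercase `http://` or `https://` URL, including `ftp://` and `file://` URIs, falls through to the local-file branch and would most likely end in a "File not found" result.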

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action (convert), resource (PDF file), and target format (Markdown) using 'AI sampling'. It distinguishes the tool's purpose with technical detail about the conversion method, making it immediately understandable without redundancy.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for PDF-to-Markdown conversion but provides no explicit guidance on when to use this tool versus alternatives (e.g., other conversion methods or tools). Since there are no sibling tools, the lack of comparative guidance is less critical, but it still doesn't offer context like prerequisites or typical scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/gavinHuang/pdf2md-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.