Skip to main content
Glama
cordlesssteve

Document Organizer MCP Server

convert_pdf

Convert PDF files to Markdown format for better organization and processing using marker or pymupdf4llm libraries.

Instructions

Convert PDF files to Markdown format using marker (recommended) or pymupdf4llm

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
output_pathNoOptional path to write markdown output
pdf_pathYesAbsolute path to the PDF file to convert

Implementation Reference

  • Core handler function that orchestrates PDF to Markdown conversion, selecting between marker or pymupdf4llm engines based on options, with input validation and comprehensive error handling.
    async function convertPdfToMarkdown(
      pdfPath: string,
      outputPath?: string,
      options: {
        engine?: "pymupdf4llm" | "marker";
        page_chunks?: boolean;
        write_images?: boolean;
        image_path?: string;
        table_strategy?: "fast" | "accurate";
        extract_content?: "text" | "figures" | "both";
        auto_clean?: boolean;
      } = {}
    ): Promise<ConversionResult> {
      const startTime = Date.now();
      
      try {
        // Validate PDF exists
        const validatedPdfPath = validatePath(pdfPath);
        await fs.access(validatedPdfPath);
        
        // Determine which engine to use
        const engine = options.engine || "marker";
        
        if (engine === "marker") {
          return await convertWithMarker(validatedPdfPath, outputPath, options);
        } else {
          return await convertWithPymupdf4llm(validatedPdfPath, outputPath, options);
        }
      } catch (error) {
        return {
          success: false,
          error: error instanceof Error ? error.message : String(error),
          page_count: 0,
          char_count: 0,
          images_extracted: 0,
          processing_time: Date.now() - startTime,
          memory_used: 0,
          warnings: []
        };
      }
    }
  • Zod schema defining the input parameters for the convert_pdf tool, including pdf_path, optional output_path, and detailed conversion options.
    const ConvertPdfArgsSchema = z.object({
      pdf_path: z.string().describe("Absolute path to the PDF file to convert"),
      output_path: z.string().optional().describe("Optional path to write markdown output. If not provided, returns content directly"),
      options: z.object({
        engine: z.enum(["pymupdf4llm", "marker"]).optional().default("marker").describe("PDF conversion engine (marker recommended for complex documents)"),
        page_chunks: z.boolean().optional().default(false).describe("Process as individual pages for memory efficiency (pymupdf4llm only)"),
        write_images: z.boolean().optional().default(false).describe("Extract embedded images to files"),
        image_path: z.string().optional().describe("Directory for extracted images (requires write_images: true)"),
        table_strategy: z.enum(["fast", "accurate"]).optional().default("accurate").describe("Table extraction strategy (pymupdf4llm only)"),
        extract_content: z.enum(["text", "figures", "both"]).optional().default("both").describe("Content to extract from PDF (pymupdf4llm only)"),
        auto_clean: z.boolean().optional().default(true).describe("Automatically clean marker formatting artifacts")
      }).optional().default({})
    });
  • src/index.ts:1291-1295 (registration)
    Tool registration object defining the 'convert_pdf' tool's name, description, and input schema in the tools array.
      name: "convert_pdf",
      description: "Convert PDF files to Markdown format using marker (recommended) or pymupdf4llm. Marker provides superior quality for complex documents with tables and structured content, with automatic cleaning of formatting artifacts. Returns detailed conversion statistics including processing time and content metrics.",
      inputSchema: zodToJsonSchema(ConvertPdfArgsSchema) as ToolInput,
    },
    {
  • Helper function implementing PDF conversion specifically using the 'marker' engine, including subprocess execution, temporary directory management, automatic cleaning, and result processing.
    async function convertWithMarker(
      pdfPath: string,
      outputPath?: string,
      options: { auto_clean?: boolean } = {}
    ): Promise<ConversionResult> {
      const startTime = Date.now();
      
      try {
        // Check if marker is available
        const markerCheck = await checkMarker();
        if (!markerCheck.available) {
          throw new Error(`Marker not available: ${markerCheck.error}`);
        }
        
        // Create temporary output directory for marker
        const tempDir = await fs.mkdtemp(path.join(os.tmpdir(), 'marker-'));
        
        // Run marker conversion with correct syntax
        const result = await new Promise<{ success: boolean; content?: string; error?: string }>((resolve) => {
          const markerProcess = spawn('marker_single', [pdfPath, '--output_dir', tempDir], {
            stdio: ['pipe', 'pipe', 'pipe']
          });
    
          let stdout = '';
          let stderr = '';
    
          markerProcess.stdout.on('data', (data: Buffer) => {
            stdout += data.toString();
          });
    
          markerProcess.stderr.on('data', (data: Buffer) => {
            stderr += data.toString();
          });
    
          markerProcess.on('close', async (code: number | null) => {
            try {
              if (code === 0) {
                // Find the markdown file that marker created
                // Marker creates a directory with PDF basename, then puts .md file inside
                const pdfBaseName = path.basename(pdfPath, '.pdf');
                const markerDir = path.join(tempDir, pdfBaseName);
                const expectedOutput = path.join(markerDir, `${pdfBaseName}.md`);
                
                // Check if the expected structure exists
                let outputFile = expectedOutput;
                try {
                  await fs.access(expectedOutput);
                } catch {
                  // Look for marker directory and .md file inside it
                  try {
                    const tempFiles = await fs.readdir(tempDir, { withFileTypes: true });
                    const markerDirEntry = tempFiles.find(entry => entry.isDirectory());
                    
                    if (markerDirEntry) {
                      const dirPath = path.join(tempDir, markerDirEntry.name);
                      const dirFiles = await fs.readdir(dirPath);
                      const mdFile = dirFiles.find(file => file.endsWith('.md'));
                      if (mdFile) {
                        outputFile = path.join(dirPath, mdFile);
                      } else {
                        resolve({ success: false, error: 'No markdown file found in marker directory' });
                        return;
                      }
                    } else {
                      resolve({ success: false, error: 'No marker output directory found' });
                      return;
                    }
                  } catch {
                    resolve({ success: false, error: 'Failed to find marker output' });
                    return;
                  }
                }
                
                // Read the converted content
                const content = await fs.readFile(outputFile, 'utf-8');
                resolve({ success: true, content });
              } else {
                resolve({ success: false, error: stderr || `Marker exited with code ${code}` });
              }
            } catch (readError) {
              resolve({ success: false, error: `Failed to read marker output: ${readError}` });
            } finally {
              // Clean up temp directory
              try {
                await fs.rm(tempDir, { recursive: true });
              } catch {}
            }
          });
    
          markerProcess.on('error', (error: Error) => {
            resolve({ success: false, error: error.message });
          });
        });
    
        if (!result.success || !result.content) {
          throw new Error(result.error || 'Marker conversion failed');
        }
        
        // Apply cleaning if enabled (default: true)
        let finalContent = result.content;
        if (options.auto_clean !== false) {
          finalContent = cleanMarkerOutput(result.content);
        }
        
        // Write to output file if specified
        if (outputPath) {
          await fs.writeFile(outputPath, finalContent, 'utf-8');
        }
        
        // Calculate statistics
        const charCount = finalContent.length;
        const pageCount = Math.max(1, Math.floor(charCount / 3000)); // Rough estimate
        
        return {
          success: true,
          markdown_content: finalContent,
          output_file: outputPath,
          page_count: pageCount,
          char_count: charCount,
          images_extracted: 0, // Marker doesn't extract images to separate files
          processing_time: Date.now() - startTime,
          memory_used: 0, // Would need process monitoring
          warnings: options.auto_clean !== false ? ['Content automatically cleaned (table-aware)'] : []
        };
        
      } catch (error) {
        return {
          success: false,
          error: error instanceof Error ? error.message : String(error),
          page_count: 0,
          char_count: 0,
          images_extracted: 0,
          processing_time: Date.now() - startTime,
          memory_used: 0,
          warnings: []
        };
      }
    }
  • Helper function implementing PDF conversion using pymupdf4llm via dynamically generated Python script execution, supporting advanced options like image extraction and page chunking.
    async function convertWithPymupdf4llm(
      pdfPath: string,
      outputPath?: string,
      options: {
        page_chunks?: boolean;
        write_images?: boolean;
        image_path?: string;
        table_strategy?: "fast" | "accurate";
        extract_content?: "text" | "figures" | "both";
      } = {}
    ): Promise<ConversionResult> {
      const startTime = Date.now();
      
      try {
        // Validate PDF exists
        const validatedPdfPath = validatePath(pdfPath);
        await fs.access(validatedPdfPath);
        
        // Build Python conversion script for pymupdf4llm
        const pythonScript = `
    import pymupdf4llm
    import json
    import sys
    import os
    import gc
    import psutil
    
    def get_memory_usage():
        process = psutil.Process(os.getpid())
        return process.memory_info().rss / 1024 / 1024  # MB
    
    # Get initial memory
    initial_memory = get_memory_usage()
    
    try:
        # Set up conversion options
        kwargs = {}
        ${options.page_chunks ? "kwargs['page_chunks'] = True" : ""}
        ${options.write_images ? "kwargs['write_images'] = True" : ""}
        ${options.image_path ? `kwargs['image_path'] = "${options.image_path}"` : ""}
        
        # Convert PDF to markdown
        pdf_path = "${validatedPdfPath.replace(/\\/g, '\\\\\\\\')}"
        md_content = pymupdf4llm.to_markdown(pdf_path, **kwargs)
        
        # Get memory usage after conversion
        peak_memory = get_memory_usage()
        
        # Calculate statistics
        char_count = len(md_content)
        page_count = 0  # pymupdf4llm doesn't directly expose page count
        images_extracted = 0  # Would need to count files in image_path if provided
        
        # Estimate page count from content (rough heuristic)
        if isinstance(md_content, list):
            page_count = len(md_content)
            md_content = "\\n\\n---\\n\\n".join(md_content)
        else:
            # Estimate based on typical PDF page length
            page_count = max(1, char_count // 3000)
        
        # Count extracted images if image_path was provided
        ${options.write_images && options.image_path ? `
        if os.path.exists("${options.image_path}"):
            images_extracted = len([f for f in os.listdir("${options.image_path}") if f.lower().endswith(('.png', '.jpg', '.jpeg', '.gif'))])
        ` : ""}
        
        result = {
            "success": True,
            "markdown_content": md_content,
            "page_count": page_count,
            "char_count": char_count,
            "images_extracted": images_extracted,
            "memory_used": peak_memory - initial_memory,
            "warnings": []
        }
        
        # Write to file if output_path provided
        ${outputPath ? `
        output_path = "${validatePath(outputPath).replace(/\\/g, '\\\\\\\\')}"
        with open(output_path, 'w', encoding='utf-8') as f:
            f.write(md_content)
        result["output_file"] = output_path
        ` : ""}
        
        print(json.dumps(result))
        
    except Exception as e:
        result = {
            "success": False,
            "error": str(e),
            "page_count": 0,
            "char_count": 0,
            "images_extracted": 0,
            "memory_used": get_memory_usage() - initial_memory,
            "warnings": []
        }
        print(json.dumps(result))
        sys.exit(1)
    `;
    
        // Execute conversion
        const result = await new Promise<ConversionResult>((resolve, reject) => {
          const pythonProcess = spawn('python3', ['-c', pythonScript], {
            stdio: ['pipe', 'pipe', 'pipe']
          });
    
          let stdout = '';
          let stderr = '';
    
          pythonProcess.stdout.on('data', (data: Buffer) => {
            stdout += data.toString();
          });
    
          pythonProcess.stderr.on('data', (data: Buffer) => {
            stderr += data.toString();
          });
    
          pythonProcess.on('close', (code: number | null) => {
            try {
              const result = JSON.parse(stdout.trim()) as ConversionResult;
              result.processing_time = Date.now() - startTime;
              
              if (stderr.trim()) {
                result.warnings.push(stderr.trim());
              }
              
              resolve(result);
            } catch (parseError) {
              reject(new Error(`Failed to parse conversion result: ${parseError}, stdout: ${stdout}, stderr: ${stderr}`));
            }
          });
    
          pythonProcess.on('error', (error: Error) => {
            reject(error);
          });
        });
    
        return result;
    
      } catch (error) {
        return {
          success: false,
          error: error instanceof Error ? error.message : String(error),
          page_count: 0,
          char_count: 0,
          images_extracted: 0,
          processing_time: Date.now() - startTime,
          memory_used: 0,
          warnings: []
        };
      }
    }
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. While it states the conversion action and mentions two libraries, it doesn't describe what happens during conversion (e.g., whether it preserves formatting, handles images/tables, or may fail on certain PDFs), what permissions are needed, whether it creates files or returns content, or any rate limits. The description is functional but lacks important behavioral context for a tool that performs file conversion.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that gets straight to the point. It front-loads the core functionality ('Convert PDF files to Markdown format') and adds implementation detail without unnecessary elaboration. While very concise, it could potentially benefit from slightly more context about the tool's behavior.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a file conversion tool with no annotations and no output schema, the description is insufficiently complete. It doesn't explain what the tool returns (file path? markdown content? success status?), doesn't describe error conditions or limitations, and provides minimal behavioral context. Given the complexity of PDF conversion and the lack of structured metadata, the description should provide more complete operational guidance.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description adds no additional parameter semantics beyond what's in the schema - it doesn't explain format requirements for paths, default behavior when output_path is omitted, or how the libraries handle different PDF types. The baseline score of 3 reflects adequate coverage through the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Convert PDF files'), target resource ('PDF files'), and output format ('to Markdown format'). It distinguishes from siblings by specifying the conversion purpose and mentioning implementation methods ('using marker (recommended) or pymupdf4llm'), making it easy to differentiate from tools like 'analyze_content' or 'organize_structure'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. While it mentions two implementation methods, it doesn't specify when to choose one over the other, nor does it indicate when to use this tool versus sibling tools like 'convert_missing' or 'full_workflow'. There's no mention of prerequisites, constraints, or typical use cases.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/cordlesssteve/document-organizer-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server