convert_missing

Convert PDF files to Markdown format when companion Markdown files are missing in the specified directory.

Instructions

Convert only PDFs that lack companion Markdown files

Input Schema

Name            Required  Description                                   Default
directory_path  Yes       Path to directory containing PDFs to convert  -
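
As a rough illustration of the schema above, the following sketch builds the JSON-RPC `tools/call` request an MCP client might send to invoke this tool. The helper `buildConvertMissingRequest`, the `ToolCallRequest` interface, and the example path are hypothetical and not part of the server; the tool name is the one registered in the implementation reference.

```typescript
// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps the single required argument, directory_path,
// in a tools/call request for the convert_missing tool.
function buildConvertMissingRequest(directoryPath: string, id = 1): ToolCallRequest {
  if (directoryPath.trim() === "") {
    throw new Error("directory_path is required");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "document_organizer__convert_missing",
      arguments: { directory_path: directoryPath },
    },
  };
}

// Placeholder path; substitute a real directory containing PDFs.
const request = buildConvertMissingRequest("/path/to/pdfs");
console.log(JSON.stringify(request, null, 2));
```

How the request is actually transported (stdio, HTTP) depends on the MCP client in use, so that part is left out here.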

Implementation Reference

  • Main handler for 'document_organizer__convert_missing' tool. Discovers PDF files, checks for existing Markdown companions, selectively converts missing ones, and returns detailed conversion statistics.
    case "document_organizer__convert_missing": {
      const { directory_path, pdf_files } = ConvertMissingArgsSchema.parse(args);
      const pdfsToCheck = pdf_files || await findPdfFiles(directory_path);
      const conversions = [];
      for (const pdfPath of pdfsToCheck) {
        const hasMarkdown = await checkMdExists(pdfPath);
        if (!hasMarkdown) {
          const result = await convertPdfToMd(pdfPath);
          conversions.push({ pdf_path: pdfPath, ...result });
        }
      }
      return {
        content: [
          {
            type: "text",
            text: JSON.stringify({
              conversions_attempted: conversions.length,
              successful_conversions: conversions.filter(c => c.success).length,
              failed_conversions: conversions.filter(c => !c.success).length,
              results: conversions
            }, null, 2)
          }
        ]
      };
    }
  • Zod schema defining input parameters for the convert_missing tool: directory_path (required) and optional pdf_files array.
    const ConvertMissingArgsSchema = z.object({
      directory_path: z.string().describe("Path to directory containing PDFs to convert"),
      pdf_files: z.array(z.string()).optional().describe("Specific PDF files to convert (if not provided, converts all missing)")
    });
  • src/index.ts:1311-1314 (registration)
    Tool registration in the tools array, defining name, description, and input schema reference.
    name: "document_organizer__convert_missing",
    description: "🔄 SELECTIVE PDF CONVERSION - Convert only PDFs that lack companion Markdown files using pymupdf4llm. Intelligently skips already-converted documents and provides detailed conversion reports with success/failure counts, processing statistics, and error diagnostics. Memory-efficient processing for large document collections.",
    inputSchema: zodToJsonSchema(ConvertMissingArgsSchema) as ToolInput,
    },
  • Core helper function that performs individual PDF to Markdown conversion, called by the main handler for each missing conversion.
    async function convertPdfToMd(pdfPath: string): Promise<{ success: boolean; mdPath?: string; error?: string }> {
      const dir = path.dirname(pdfPath);
      const basename = path.basename(pdfPath, '.pdf');
      const mdPath = path.join(dir, `${basename.replace(/[^a-zA-Z0-9]/g, '_')}.md`);
      try {
        // Use marker by default for better quality
        const result = await convertPdfToMarkdown(pdfPath, mdPath, { engine: "marker", auto_clean: true });
        if (result.success) {
          return { success: true, mdPath: result.output_file || mdPath };
        } else {
          return { success: false, error: result.error };
        }
      } catch (error) {
        return { success: false, error: error instanceof Error ? error.message : String(error) };
      }
    }
  • Helper function to check if a corresponding Markdown file already exists for a given PDF, used to skip already converted files.
    async function checkMdExists(pdfPath: string): Promise<boolean> {
      const dir = path.dirname(pdfPath);
      const basename = path.basename(pdfPath, '.pdf');
      // Check various possible MD naming patterns
      const possibleMdPaths = [
        path.join(dir, `${basename}.md`),
        path.join(dir, `${basename.replace(/\s+/g, '_')}.md`),
        path.join(dir, `${basename.replace(/[^a-zA-Z0-9]/g, '_')}.md`)
      ];
      for (const mdPath of possibleMdPaths) {
        try {
          await fs.access(mdPath);
          return true;
        } catch {
          // File doesn't exist, continue checking
        }
      }
      return false;
    }
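
To make the companion-file check above more concrete, here is a minimal sketch of the three naming patterns it probes, written as pure functions so the result can be inspected without touching the filesystem. The names `candidateMdPaths` and `needsConversion`, the in-memory set of existing files, and the example path are hypothetical; `path.posix` is used only so the output is the same on every platform, whereas the original code uses the platform-default `path`.

```typescript
import * as path from "node:path";

// Returns the Markdown paths that would count as a companion for the PDF,
// in the same order the helper above checks them.
function candidateMdPaths(pdfPath: string): string[] {
  const dir = path.posix.dirname(pdfPath);
  const basename = path.posix.basename(pdfPath, ".pdf");
  return [
    path.posix.join(dir, `${basename}.md`),                               // name unchanged
    path.posix.join(dir, `${basename.replace(/\s+/g, "_")}.md`),          // whitespace runs -> "_"
    path.posix.join(dir, `${basename.replace(/[^a-zA-Z0-9]/g, "_")}.md`), // every non-alphanumeric -> "_"
  ];
}

// A PDF needs conversion only if none of its candidate names already exists.
// `existing` stands in for the filesystem lookup done with fs.access.
function needsConversion(pdfPath: string, existing: ReadonlySet<string>): boolean {
  return !candidateMdPaths(pdfPath).some((p) => existing.has(p));
}

const candidates = candidateMdPaths("/docs/Q3 Report (final).pdf");
console.log(candidates);
// [
//   "/docs/Q3 Report (final).md",
//   "/docs/Q3_Report_(final).md",
//   "/docs/Q3_Report__final_.md"
// ]
```

The third pattern is the one `convertPdfToMd` writes to, so a file produced by an earlier run is recognized on the next run and skipped.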

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/cordlesssteve/document-organizer-mcp'
