convert_missing
Convert PDF files to Markdown format when companion Markdown files are missing in the specified directory.
Instructions
Convert only PDFs that lack companion Markdown files
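As a rough sketch of this behaviour, the snippet below skips every PDF that already has a companion and tallies the rest in the same shape the handler reports (`conversions_attempted`, `successful_conversions`, `failed_conversions`, `results`). The `hasCompanion` and `convert` callbacks are hypothetical stand-ins for the server's file-system check and converter, and they are synchronous here for brevity, whereas the real handler awaits both.

```typescript
// Illustrative only: hasCompanion and convert are hypothetical stand-ins,
// not the server's real helpers, and are synchronous to keep the sketch short.
type ConversionResult = { success: boolean; mdPath?: string; error?: string };
type ReportEntry = { pdf_path: string } & ConversionResult;
type Report = {
  conversions_attempted: number;
  successful_conversions: number;
  failed_conversions: number;
  results: ReportEntry[];
};

function convertMissing(
  pdfPaths: string[],
  hasCompanion: (pdfPath: string) => boolean,
  convert: (pdfPath: string) => ConversionResult,
): Report {
  const results: ReportEntry[] = [];
  for (const pdfPath of pdfPaths) {
    // PDFs that already have a Markdown companion are skipped, not reported.
    if (hasCompanion(pdfPath)) continue;
    results.push({ pdf_path: pdfPath, ...convert(pdfPath) });
  }
  const succeeded = results.filter((r) => r.success).length;
  return {
    conversions_attempted: results.length,
    successful_conversions: succeeded,
    failed_conversions: results.length - succeeded,
    results,
  };
}

// In-memory demo: a.pdf already has a companion, c.pdf fails to convert.
const alreadyConverted = new Set(["a.pdf"]);
const demoReport = convertMissing(
  ["a.pdf", "b.pdf", "c.pdf"],
  (p) => alreadyConverted.has(p),
  (p) =>
    p === "c.pdf"
      ? { success: false, error: "unreadable" }
      : { success: true, mdPath: p.replace(/\.pdf$/, ".md") },
);
```

Because skipped files never enter `results`, `conversions_attempted` counts only the PDFs that actually needed conversion, not every PDF in the directory.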
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| directory_path | Yes | Path to directory containing PDFs to convert | |
| pdf_files | No | Specific PDF files to convert (if not provided, converts all missing) | |
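For illustration, the two argument shapes the tool accepts might look like the following. The optional `pdf_files` field comes from the Zod schema quoted under Implementation Reference; the paths are made up, and `isConvertMissingArgs` is a hypothetical dependency-free stand-in for that Zod validation, not part of the server.

```typescript
// Shape of the tool arguments, per the schema in Implementation Reference.
interface ConvertMissingArgs {
  directory_path: string;
  pdf_files?: string[];
}

// Hypothetical guard standing in for ConvertMissingArgsSchema.parse(args).
function isConvertMissingArgs(value: unknown): value is ConvertMissingArgs {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  if (typeof v.directory_path !== "string") return false;
  if (v.pdf_files === undefined) return true;
  return Array.isArray(v.pdf_files) && v.pdf_files.every((f) => typeof f === "string");
}

// Scan the whole directory for PDFs without companions (example path).
const scanWholeDirectory: ConvertMissingArgs = { directory_path: "/data/papers" };

// Restrict the check to specific files (example paths).
const onlyTheseFiles: ConvertMissingArgs = {
  directory_path: "/data/papers",
  pdf_files: ["/data/papers/intro.pdf", "/data/papers/appendix.pdf"],
};
```

When `pdf_files` is supplied, the handler checks only those files instead of scanning `directory_path`.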
Implementation Reference
- src/index.ts:1532-1561 (handler) Main handler for the 'document_organizer__convert_missing' tool. Discovers PDF files, checks for existing Markdown companions, selectively converts missing ones, and returns detailed conversion statistics.

  ```typescript
  case "document_organizer__convert_missing": {
    const { directory_path, pdf_files } = ConvertMissingArgsSchema.parse(args);
    const pdfsToCheck = pdf_files || await findPdfFiles(directory_path);
    const conversions = [];

    for (const pdfPath of pdfsToCheck) {
      const hasMarkdown = await checkMdExists(pdfPath);
      if (!hasMarkdown) {
        const result = await convertPdfToMd(pdfPath);
        conversions.push({ pdf_path: pdfPath, ...result });
      }
    }

    return {
      content: [
        {
          type: "text",
          text: JSON.stringify({
            conversions_attempted: conversions.length,
            successful_conversions: conversions.filter(c => c.success).length,
            failed_conversions: conversions.filter(c => !c.success).length,
            results: conversions
          }, null, 2)
        }
      ]
    };
  }
  ```
- src/index.ts:49-52 (schema) Zod schema defining input parameters for the convert_missing tool: directory_path (required) and an optional pdf_files array.

  ```typescript
  const ConvertMissingArgsSchema = z.object({
    directory_path: z.string().describe("Path to directory containing PDFs to convert"),
    pdf_files: z.array(z.string()).optional().describe("Specific PDF files to convert (if not provided, converts all missing)")
  });
  ```
- src/index.ts:1311-1314 (registration) Tool registration in the tools array, defining name, description, and input schema reference.

  ```typescript
  name: "document_organizer__convert_missing",
  description: "🔄 SELECTIVE PDF CONVERSION - Convert only PDFs that lack companion Markdown files using pymupdf4llm. Intelligently skips already-converted documents and provides detailed conversion reports with success/failure counts, processing statistics, and error diagnostics. Memory-efficient processing for large document collections.",
  inputSchema: zodToJsonSchema(ConvertMissingArgsSchema) as ToolInput,
  },
  ```
- src/index.ts:877-896 (helper) Core helper function that performs an individual PDF to Markdown conversion, called by the main handler for each missing conversion.

  ```typescript
  async function convertPdfToMd(pdfPath: string): Promise<{ success: boolean; mdPath?: string; error?: string }> {
    const dir = path.dirname(pdfPath);
    const basename = path.basename(pdfPath, '.pdf');
    const mdPath = path.join(dir, `${basename.replace(/[^a-zA-Z0-9]/g, '_')}.md`);

    try {
      // Use marker by default for better quality
      const result = await convertPdfToMarkdown(pdfPath, mdPath, { engine: "marker", auto_clean: true });
      if (result.success) {
        return { success: true, mdPath: result.output_file || mdPath };
      } else {
        return { success: false, error: result.error };
      }
    } catch (error) {
      return { success: false, error: error instanceof Error ? error.message : String(error) };
    }
  }
  ```
- src/index.ts:854-875 (helper) Helper function that checks whether a corresponding Markdown file already exists for a given PDF, used to skip already-converted files.

  ```typescript
  async function checkMdExists(pdfPath: string): Promise<boolean> {
    const dir = path.dirname(pdfPath);
    const basename = path.basename(pdfPath, '.pdf');

    // Check various possible MD naming patterns
    const possibleMdPaths = [
      path.join(dir, `${basename}.md`),
      path.join(dir, `${basename.replace(/\s+/g, '_')}.md`),
      path.join(dir, `${basename.replace(/[^a-zA-Z0-9]/g, '_')}.md`)
    ];

    for (const mdPath of possibleMdPaths) {
      try {
        await fs.access(mdPath);
        return true;
      } catch {
        // File doesn't exist, continue checking
      }
    }

    return false;
  }
  ```
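To make the naming rules in the two helpers concrete, here is a small sketch that computes the three companion names `checkMdExists` looks for and the name `convertPdfToMd` requests for its output. The function names `candidateMdPaths` and `outputMdPath` and the sample path are hypothetical. The sketch uses `path.posix` so the results are the same on every platform, whereas the source uses the platform-default `path`, and it assumes the converter writes to the requested path rather than returning a different `output_file`.

```typescript
import * as path from "node:path";

// Hypothetical helper: the three companion names checkMdExists tries, in order.
function candidateMdPaths(pdfPath: string): string[] {
  const dir = path.posix.dirname(pdfPath);
  const base = path.posix.basename(pdfPath, ".pdf");
  return [
    path.posix.join(dir, `${base}.md`),                                // name unchanged
    path.posix.join(dir, `${base.replace(/\s+/g, "_")}.md`),           // whitespace runs -> "_"
    path.posix.join(dir, `${base.replace(/[^a-zA-Z0-9]/g, "_")}.md`),  // every non-alphanumeric -> "_"
  ];
}

// Hypothetical helper: the output path convertPdfToMd asks the converter to write.
function outputMdPath(pdfPath: string): string {
  const dir = path.posix.dirname(pdfPath);
  const base = path.posix.basename(pdfPath, ".pdf");
  return path.posix.join(dir, `${base.replace(/[^a-zA-Z0-9]/g, "_")}.md`);
}

const samplePdf = "/docs/Q3 Report (final).pdf"; // made-up example path
const candidates = candidateMdPaths(samplePdf);
// candidates:
//   /docs/Q3 Report (final).md
//   /docs/Q3_Report_(final).md
//   /docs/Q3_Report__final_.md
const written = outputMdPath(samplePdf);
// written is /docs/Q3_Report__final_.md, the same as candidates[2]
```

Since the requested output name equals the third candidate, a PDF converted on one run is recognised as already converted on the next, under the assumption stated above.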