cordlesssteve

Document Organizer MCP Server

convert_missing

Convert PDF files to Markdown format when companion Markdown files are missing in the specified directory.

Instructions

Convert only PDFs that lack companion Markdown files

Input Schema

Name | Required | Description | Default
directory_path | Yes | Path to directory containing PDFs to convert | (none)
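As a rough illustration of how a client might invoke this tool, the sketch below builds an MCP `tools/call` request and decodes the JSON report that the handler in the Implementation Reference packs into a text content item. The helper names (`buildConvertMissingRequest`, `parseReport`) and the interface names are hypothetical; only the tool name, the argument names, and the report fields are taken from this page. Note that `pdf_files` appears in the Zod schema further down but not in the table above, so treat it as optional and possibly unsupported in some versions.

```typescript
// Hypothetical client-side helpers; not part of the server's code.

interface ConvertMissingArgs {
  directory_path: string;
  pdf_files?: string[]; // optional, per the Zod schema in the Implementation Reference
}

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: ConvertMissingArgs };
}

// Build the JSON-RPC request an MCP client would send to call the tool.
function buildConvertMissingRequest(id: number, args: ConvertMissingArgs): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "document_organizer__convert_missing", arguments: args },
  };
}

// Report shape assumed from the handler's JSON.stringify call.
interface ConversionReport {
  conversions_attempted: number;
  successful_conversions: number;
  failed_conversions: number;
  results: Array<{ pdf_path: string; success: boolean; mdPath?: string; error?: string }>;
}

// The handler returns the report as a JSON string inside a text content item.
function parseReport(result: { content: Array<{ type: string; text?: string }> }): ConversionReport {
  const first = result.content.find((c) => c.type === "text" && typeof c.text === "string");
  if (!first || first.text === undefined) {
    throw new Error("no text content in tool result");
  }
  return JSON.parse(first.text) as ConversionReport;
}
```

A caller could check `failed_conversions` on the parsed report and inspect the matching `results` entries for their `error` strings.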

Implementation Reference

  • Main handler for 'document_organizer__convert_missing' tool. Discovers PDF files, checks for existing Markdown companions, selectively converts missing ones, and returns detailed conversion statistics.
    case "document_organizer__convert_missing": {
      const { directory_path, pdf_files } = ConvertMissingArgsSchema.parse(args);
      const pdfsToCheck = pdf_files || await findPdfFiles(directory_path);
      
      const conversions = [];
      for (const pdfPath of pdfsToCheck) {
        const hasMarkdown = await checkMdExists(pdfPath);
        if (!hasMarkdown) {
          const result = await convertPdfToMd(pdfPath);
          conversions.push({
            pdf_path: pdfPath,
            ...result
          });
        }
      }
      
      return {
        content: [
          {
            type: "text",
            text: JSON.stringify({
              conversions_attempted: conversions.length,
              successful_conversions: conversions.filter(c => c.success).length,
              failed_conversions: conversions.filter(c => !c.success).length,
              results: conversions
            }, null, 2)
          }
        ]
      };
    }
  • Zod schema defining input parameters for the convert_missing tool: directory_path (required) and optional pdf_files array.
    const ConvertMissingArgsSchema = z.object({
      directory_path: z.string().describe("Path to directory containing PDFs to convert"),
      pdf_files: z.array(z.string()).optional().describe("Specific PDF files to convert (if not provided, converts all missing)")
    });
  • src/index.ts:1311-1314 (registration)
    Tool registration in the tools array, defining name, description, and input schema reference.
      name: "document_organizer__convert_missing",
      description: "🔄 SELECTIVE PDF CONVERSION - Convert only PDFs that lack companion Markdown files using pymupdf4llm. Intelligently skips already-converted documents and provides detailed conversion reports with success/failure counts, processing statistics, and error diagnostics. Memory-efficient processing for large document collections.",
      inputSchema: zodToJsonSchema(ConvertMissingArgsSchema) as ToolInput,
    },
  • Core helper function that performs individual PDF to Markdown conversion, called by the main handler for each missing conversion.
    async function convertPdfToMd(pdfPath: string): Promise<{ success: boolean; mdPath?: string; error?: string }> {
      const dir = path.dirname(pdfPath);
      const basename = path.basename(pdfPath, '.pdf');
      const mdPath = path.join(dir, `${basename.replace(/[^a-zA-Z0-9]/g, '_')}.md`);
      
      try {
        // Use the marker engine by default for better quality
        // (note: the registered tool description mentions pymupdf4llm instead)
        const result = await convertPdfToMarkdown(pdfPath, mdPath, { engine: "marker", auto_clean: true });
        if (result.success) {
          return { success: true, mdPath: result.output_file || mdPath };
        } else {
          return { success: false, error: result.error };
        }
      } catch (error) {
        return { 
          success: false, 
          error: error instanceof Error ? error.message : String(error) 
        };
      }
    }
  • Helper function to check if a corresponding Markdown file already exists for a given PDF, used to skip already converted files.
    async function checkMdExists(pdfPath: string): Promise<boolean> {
      const dir = path.dirname(pdfPath);
      const basename = path.basename(pdfPath, '.pdf');
      
      // Check various possible MD naming patterns
      const possibleMdPaths = [
        path.join(dir, `${basename}.md`),
        path.join(dir, `${basename.replace(/\s+/g, '_')}.md`),
        path.join(dir, `${basename.replace(/[^a-zA-Z0-9]/g, '_')}.md`)
      ];
      
      for (const mdPath of possibleMdPaths) {
        try {
          await fs.access(mdPath);
          return true;
        } catch {
          // File doesn't exist, continue checking
        }
      }
      
      return false;
    }
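The naming rules in the two helpers above can be easier to follow with concrete paths. The sketch below reproduces only the path arithmetic from `convertPdfToMd` and `checkMdExists`, without touching the file system; the function names `outputMdPath` and `candidateMdPaths` are hypothetical, and `path.posix` is used here (an assumption, for platform-independent results) where the original uses `path`.

```typescript
import * as path from "node:path";

// Where convertPdfToMd would write its output: every non-alphanumeric
// character in the base name becomes an underscore.
function outputMdPath(pdfPath: string): string {
  const dir = path.posix.dirname(pdfPath);
  const basename = path.posix.basename(pdfPath, ".pdf");
  return path.posix.join(dir, `${basename.replace(/[^a-zA-Z0-9]/g, "_")}.md`);
}

// The three names checkMdExists probes, in order, before deciding a
// companion Markdown file is missing.
function candidateMdPaths(pdfPath: string): string[] {
  const dir = path.posix.dirname(pdfPath);
  const basename = path.posix.basename(pdfPath, ".pdf");
  return [
    path.posix.join(dir, `${basename}.md`),
    path.posix.join(dir, `${basename.replace(/\s+/g, "_")}.md`),
    path.posix.join(dir, `${basename.replace(/[^a-zA-Z0-9]/g, "_")}.md`),
  ];
}

console.log(outputMdPath("/docs/My Report (v2).pdf"));
// prints "/docs/My_Report__v2_.md"

console.log(candidateMdPaths("/docs/Q3 report-final.pdf"));
// prints the three candidates:
//   /docs/Q3 report-final.md
//   /docs/Q3_report-final.md
//   /docs/Q3_report_final.md

// path.basename matches the suffix case-sensitively, so an upper-case
// extension is not stripped and ends up inside the sanitized name.
console.log(outputMdPath("/docs/scan.PDF"));
// prints "/docs/scan_PDF.md"
```

Because the converter always writes the fully sanitized name, and that name is also the third candidate the checker probes, a file converted once is skipped on later runs. Whether upper-case `.PDF` paths ever reach these helpers depends on `findPdfFiles`, which is not shown on this page.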
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions the tool converts PDFs, implying a mutation operation, but doesn't disclose behavioral traits such as whether it overwrites existing files, requires specific permissions, handles errors, or what the output format is. The description is minimal and lacks crucial operational details.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that directly states the tool's function without unnecessary words. It is front-loaded with the core action and condition, making it easy to understand quickly. Every part of the sentence contributes to the tool's purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of a file conversion tool with no annotations and no output schema, the description is incomplete. It doesn't explain what the conversion outputs (e.g., Markdown files), how it handles errors, or any prerequisites. For a mutation tool with minimal structured data, more detail is needed to ensure proper usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with the parameter 'directory_path' clearly documented. The description doesn't add any meaning beyond the schema, as it doesn't elaborate on the parameter's role or constraints. With high schema coverage, the baseline score of 3 is appropriate, as the description doesn't compensate but also doesn't detract.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: converting PDFs that lack companion Markdown files. It specifies the resource (PDFs) and the condition (missing Markdown files), though it doesn't explicitly mention the output format or how the conversion is performed. It distinguishes from 'convert_pdf' by focusing on missing Markdown files, but doesn't fully differentiate from other siblings like 'full_workflow' or 'init_project_docs'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when PDFs need conversion and Markdown files are absent, but doesn't explicitly state when to use this tool versus alternatives like 'convert_pdf' or 'full_workflow'. It provides some context by mentioning the condition (lack of companion Markdown files), but lacks clear exclusions or named alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/cordlesssteve/document-organizer-mcp'
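The same request can be sketched in TypeScript. This is a minimal example that assumes a runtime with a global `fetch` (for instance Node 18 or later) and that the endpoint returns JSON; the response schema is not described on this page, so the result is left as `unknown`. The names `serverUrl` and `fetchServerInfo` are hypothetical.

```typescript
// Base URL taken from the curl command above.
const API_BASE = "https://glama.ai/api/mcp/v1/servers";

// Build the per-server URL from an owner and a server name.
function serverUrl(owner: string, repo: string): string {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Fetch the server record; the body is assumed to be JSON and is not validated.
async function fetchServerInfo(owner: string, repo: string): Promise<unknown> {
  const res = await fetch(serverUrl(owner, repo));
  if (!res.ok) {
    throw new Error(`request failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

Calling `fetchServerInfo("cordlesssteve", "document-organizer-mcp")` would issue the same GET request as the curl command.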