pdf-to-markdown

Convert PDF files to Markdown format for easier editing and sharing. Extract text content from PDFs and transform it into readable Markdown text.

Instructions

Convert a PDF file to markdown

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| filepath | Yes | Absolute path of the PDF file to convert | |
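
As a rough illustration of how a client might supply this input, the sketch below builds an MCP `tools/call` JSON-RPC request for the tool. The helper name `buildPdfToMarkdownRequest`, the request `id`, and the example path are assumptions made for illustration; the absolute-path check is an extra client-side guard suggested by the schema description, not something the server is known to enforce.

```typescript
// Shape of an MCP "tools/call" JSON-RPC request for this tool.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: { filepath: string } };
}

// Hypothetical helper: builds the request body for pdf-to-markdown.
function buildPdfToMarkdownRequest(filepath: string, id = 1): ToolCallRequest {
  // The schema asks for an absolute path, so reject obviously relative input
  // (POSIX "/..." or Windows "C:\..." style paths are accepted).
  const isAbsolute = filepath.startsWith("/") || /^[A-Za-z]:[\\/]/.test(filepath);
  if (!isAbsolute) {
    throw new Error(`filepath must be absolute, got: ${filepath}`);
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "pdf-to-markdown", arguments: { filepath } },
  };
}

// Example path is illustrative only.
console.log(JSON.stringify(buildPdfToMarkdownRequest("/tmp/report.pdf"), null, 2));
```

In practice an MCP client library would normally assemble this envelope; the sketch only shows which fields carry the tool name and the `filepath` argument.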

Implementation Reference

  • Handler dispatch for the pdf-to-markdown tool (shared with the other file-to-markdown tools): checks that filepath is present, then delegates to Markdownify.toMarkdown, passing it as the filePath parameter.
    case tools.PDFToMarkdownTool.name:
    case tools.ImageToMarkdownTool.name:
    case tools.AudioToMarkdownTool.name:
    case tools.DocxToMarkdownTool.name:
    case tools.XlsxToMarkdownTool.name:
    case tools.PptxToMarkdownTool.name:
      if (!validatedArgs.filepath) {
        throw new Error("File path is required for this tool");
      }
      result = await Markdownify.toMarkdown({
        filePath: validatedArgs.filepath,
        projectRoot: validatedArgs.projectRoot,
        uvPath: validatedArgs.uvPath || process.env.UV_PATH,
      });
      break;
  • Schema definition for the pdf-to-markdown tool, specifying a single required input: the absolute filepath of the PDF.
    export const PDFToMarkdownTool = ToolSchema.parse({
      name: "pdf-to-markdown",
      description: "Convert a PDF file to markdown",
      inputSchema: {
        type: "object",
        properties: {
          filepath: {
            type: "string",
            description: "Absolute path of the PDF file to convert",
          },
        },
        required: ["filepath"],
      },
    });
  • src/server.ts:33-37 (registration)
    Tool registration via listTools handler: exposes all tools from tools.ts, including pdf-to-markdown.
    server.setRequestHandler(ListToolsRequestSchema, async () => {
      return {
        tools: Object.values(tools),
      };
    });
  • Core conversion logic in Markdownify.toMarkdown: when filePath is given (the pdf-to-markdown case), it is used directly as inputPath; _markitdown then runs the markitdown converter, and the resulting text is saved to a temporary Markdown file.
    static async toMarkdown({
      filePath,
      url,
      projectRoot = path.resolve(__dirname, ".."),
      uvPath = "~/.local/bin/uv",
    }: {
      filePath?: string;
      url?: string;
      projectRoot?: string;
      uvPath?: string;
    }): Promise<MarkdownResult> {
      try {
        let inputPath: string;
        let isTemporary = false;
    
        if (url) {
          const response = await fetch(url);
    
          let extension = null;
    
          if (url.endsWith(".pdf")) {
            extension = "pdf";
          }
    
          const arrayBuffer = await response.arrayBuffer();
          const content = Buffer.from(arrayBuffer);
    
          inputPath = await this.saveToTempFile(content, extension);
          isTemporary = true;
        } else if (filePath) {
          inputPath = filePath;
        } else {
          throw new Error("Either filePath or url must be provided");
        }
    
        const text = await this._markitdown(inputPath, projectRoot, uvPath);
        const outputPath = await this.saveToTempFile(text);
    
        if (isTemporary) {
          fs.unlinkSync(inputPath);
        }
    
        return { path: outputPath, text };
      } catch (e: unknown) {
        if (e instanceof Error) {
          throw new Error(`Error processing to Markdown: ${e.message}`);
        } else {
          throw new Error("Error processing to Markdown: Unknown error occurred");
        }
      }
    }
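  • Runs the markitdown executable from the project's .venv via 'uv run' to perform the actual PDF-to-Markdown conversion; throws if the executable is missing or the command writes anything to stderr.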
    private static async _markitdown(
      filePath: string,
      projectRoot: string,
      uvPath: string,
    ): Promise<string> {
      const venvPath = path.join(projectRoot, ".venv");
      const markitdownPath = path.join(
        venvPath,
        process.platform === "win32" ? "Scripts" : "bin",
        `markitdown${process.platform === "win32" ? ".exe" : ""}`,
      );
    
      if (!fs.existsSync(markitdownPath)) {
        throw new Error("markitdown executable not found");
      }
    
      // Expand tilde in uvPath if present
      const expandedUvPath = expandHome(uvPath);
    
      // Use execFile to prevent command injection
      const { stdout, stderr } = await execFileAsync(expandedUvPath, [
        "run",
        markitdownPath,
        filePath,
      ]);
    
      if (stderr) {
        throw new Error(`Error executing command: ${stderr}`);
      }
    
      return stdout;
    }
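
The listing above calls `expandHome(uvPath)` without showing it. Because `execFile` does not go through a shell, a leading `~` in the default `"~/.local/bin/uv"` would not be expanded automatically, which is presumably why the helper exists. The sketch below is one plausible implementation, written as an assumption; the project's actual `expandHome` may differ.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical stand-in for the expandHome helper referenced in _markitdown.
// Replaces a leading "~" with the current user's home directory.
function expandHome(p: string): string {
  if (p === "~") {
    return os.homedir();
  }
  if (p.startsWith("~/") || p.startsWith("~\\")) {
    return path.join(os.homedir(), p.slice(2));
  }
  // Absolute paths, relative paths, and "~user" forms pass through unchanged.
  return p;
}

console.log(expandHome("~/.local/bin/uv"));
```

This version deliberately leaves `~user` paths alone, since resolving another user's home directory is platform-specific.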

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/zcaceres/markdownify-mcp'
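
The same request can be sketched in TypeScript. The function names `serverUrl` and `fetchServerInfo` are hypothetical, the owner/repository path layout is inferred from the curl URL above, and the response is typed as `unknown` because its shape is not described here. The sketch assumes a runtime with a global `fetch` (for example, Node 18 or later).

```typescript
// Base URL taken from the curl example above.
const API_BASE = "https://glama.ai/api/mcp/v1/servers";

// Hypothetical helper: builds the server URL from an owner and repository name.
function serverUrl(owner: string, repo: string): string {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Hypothetical helper: fetches the server record and returns the parsed JSON.
// The response shape is not assumed, so the result is left as `unknown`.
async function fetchServerInfo(owner: string, repo: string): Promise<unknown> {
  const response = await fetch(serverUrl(owner, repo));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

// Building the URL needs no network access.
console.log(serverUrl("zcaceres", "markdownify-mcp"));
```

Calling `fetchServerInfo("zcaceres", "markdownify-mcp")` would issue the same GET request as the curl command.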

If you have feedback or need assistance with the MCP directory API, please join our Discord server.