
read_pdf

Extract text content from PDF files stored on disk. Provide the file path to retrieve readable text from PDF documents.

Instructions

Read a PDF file from disk and return its text content

Input Schema

Name       Required  Description                                Default
file_path  Yes       Absolute or relative path to the PDF file  -
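
As a rough illustration of how a client might invoke this tool, the sketch below builds a JSON-RPC 2.0 `tools/call` request carrying the single `file_path` argument from the schema above. The helper name `buildReadPdfRequest`, the interfaces, and the sample path `./docs/report.pdf` are hypothetical and do not come from the server's source; only the tool name and the `file_path` parameter do.

```typescript
// Hypothetical client-side helper, not part of pdftools-mcp itself.
// Arguments accepted by read_pdf, per the input schema above.
interface ReadPdfArgs {
  file_path: string;
}

// Shape of an MCP "tools/call" request (JSON-RPC 2.0 envelope).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: ReadPdfArgs };
}

function buildReadPdfRequest(id: number, filePath: string): ToolCallRequest {
  // Mirror the server-side check so obvious mistakes fail before sending.
  if (typeof filePath !== "string" || filePath.length === 0) {
    throw new Error("file_path is required and must be a string");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "read_pdf", arguments: { file_path: filePath } },
  };
}

// Example usage with an assumed sample path.
const request = buildReadPdfRequest(1, "./docs/report.pdf");
console.log(JSON.stringify(request, null, 2));
```

How the request is transported (stdio, HTTP, or otherwise) depends on the client and is not covered here.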

Implementation Reference

  • Handler for the 'read_pdf' tool that validates the file path, reads the PDF file, parses it using the 'pdf-parse' library, and returns the extracted text content.
    case "read_pdf": {
      const { file_path } = args as any;
      if (!file_path || typeof file_path !== 'string') {
        throw new Error('file_path is required and must be a string');
      }
      const resolvedPath = path.resolve(file_path);
      const isAllowed = ALLOWED_DIRS.some(allowedDir => resolvedPath.startsWith(path.resolve(allowedDir)));
      if (!isAllowed) {
        throw new Error(`file_path must be inside allowed directories: ${ALLOWED_DIRS.join(', ')}`);
      }
      const data = await fs.readFile(resolvedPath);
      let pdfParse: any;
      try {
        const require = createRequire(import.meta.url);
        pdfParse = require('pdf-parse');
      } catch (e) {
        throw new Error('Dependency "pdf-parse" is not installed. Please run `npm install pdf-parse` in pdftools-mcp');
      }
      const parsed: any = await pdfParse(data);
      return { content: [ { type: 'text', text: parsed.text || '' } ] };
    }
  • index.ts:159-172 (registration)
    Registration of the 'read_pdf' tool in the tools list, including name, description, and input schema definition.
    {
      name: "read_pdf",
      description: "Read a PDF file from disk and return its text content",
      inputSchema: {
        type: "object",
        properties: {
          file_path: { type: "string", description: "Absolute or relative path to the PDF file" }
        },
        required: ["file_path"]
      }
    }
  • Input schema for the 'read_pdf' tool defining the required 'file_path' parameter.
    inputSchema: {
      type: "object",
      properties: {
        file_path: { type: "string", description: "Absolute or relative path to the PDF file" }
      },
      required: ["file_path"]
    }
  • TypeScript type definitions for the 'pdf-parse' library used in the read_pdf handler.
    declare module 'pdf-parse' {
      import { Buffer } from 'buffer';
      export default function pdfParse(data: Buffer | Uint8Array | string): Promise<{
        numpages?: number;
        numrender?: number;
        info?: any;
        metadata?: any;
        version?: string;
        text: string;
        textAsHtml?: string;
      }>;
    }
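
The handler above decides whether a path is allowed with a plain `startsWith` comparison on resolved paths. A string prefix test also matches sibling directories that merely share a prefix: with `/srv/pdfs` allowed, `/srv/pdfs-private/a.pdf` would pass. One possible stricter variant, sketched below, compares paths with `path.relative` instead. The function name `isInsideAllowedDir` and the sample directories are hypothetical, and how `ALLOWED_DIRS` is defined in the server is not shown in the source, so the list is passed in as a parameter here.

```typescript
import * as path from "node:path";

// Hypothetical helper: true when filePath resolves to a location inside
// (or equal to) one of allowedDirs. Not taken from the pdftools-mcp source.
function isInsideAllowedDir(filePath: string, allowedDirs: string[]): boolean {
  const resolved = path.resolve(filePath);
  return allowedDirs.some((dir) => {
    // Relative path from the allowed directory to the target.
    const rel = path.relative(path.resolve(dir), resolved);
    // An empty string means the directory itself. A result of "..", one that
    // begins with ".." plus a separator, or an absolute path (for example a
    // different drive on Windows) means the target is outside the directory.
    const escapes =
      rel === ".." || rel.startsWith(".." + path.sep) || path.isAbsolute(rel);
    return !escapes;
  });
}

// Example usage with assumed directories.
console.log(isInsideAllowedDir("/srv/pdfs/report.pdf", ["/srv/pdfs"]));
console.log(isInsideAllowedDir("/srv/pdfs-private/report.pdf", ["/srv/pdfs"]));
```

This sketch works on path strings only and does not resolve symbolic links, so a symlink inside an allowed directory could still point elsewhere; resolving with `fs.realpath` first would be a separate step.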

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Theorhd/Pdftools-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.