read_pdf
Extract text content from PDF files stored on disk. Provide the file path to retrieve readable text from PDF documents.
Instructions
Read a PDF file from disk and return its text content
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | Absolute or relative path to the PDF file | |
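As an illustration of the schema in the table above, the following sketch checks an arguments object before it is passed to the tool. The names `ReadPdfArgs`, `isReadPdfArgs`, and `exampleArgs` are hypothetical and do not appear in the original source, and the path is a placeholder.

```typescript
// Shape of the arguments described by the input schema table.
interface ReadPdfArgs {
  file_path: string;
}

// Hypothetical type guard (not part of the original code base):
// narrows an unknown value to ReadPdfArgs when file_path is a non-empty string.
function isReadPdfArgs(value: unknown): value is ReadPdfArgs {
  if (typeof value !== "object" || value === null) return false;
  const filePath = (value as Record<string, unknown>).file_path;
  return typeof filePath === "string" && filePath.length > 0;
}

// Example arguments object; the path is a placeholder, not a real file.
const exampleArgs: unknown = { file_path: "./docs/report.pdf" };

if (isReadPdfArgs(exampleArgs)) {
  console.log(`Would request text for: ${exampleArgs.file_path}`);
}
```

The guard rejects a missing, non-string, or empty `file_path`, which is the same condition the handler listed under Implementation Reference reports as an error.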
Implementation Reference
- index.ts:221-254 (handler) Handler for the 'read_pdf' tool that validates the file path, reads the PDF file, parses it using the 'pdf-parse' library, and returns the extracted text content.

  ```typescript
  case "read_pdf": {
    const { file_path } = args as any;
    if (!file_path || typeof file_path !== 'string') {
      throw new Error('file_path is required and must be a string');
    }

    const resolvedPath = path.resolve(file_path);
    const isAllowed = ALLOWED_DIRS.some(allowedDir => resolvedPath.startsWith(path.resolve(allowedDir)));
    if (!isAllowed) {
      throw new Error(`file_path must be inside allowed directories: ${ALLOWED_DIRS.join(', ')}`);
    }

    const data = await fs.readFile(resolvedPath);
    let pdfParse: any;
    try {
      const require = createRequire(import.meta.url);
      pdfParse = require('pdf-parse');
    } catch (e) {
      throw new Error('Dependency "pdf-parse" is not installed. Please run `npm install pdf-parse` in pdftools-mcp');
    }

    const parsed: any = await pdfParse(data);
    return {
      content: [
        {
          type: 'text',
          text: parsed.text || ''
        }
      ]
    };
  }
  ```
- index.ts:159-172 (registration) Registration of the 'read_pdf' tool in the tools list, including name, description, and input schema definition.

  ```typescript
  {
    name: "read_pdf",
    description: "Read a PDF file from disk and return its text content",
    inputSchema: {
      type: "object",
      properties: {
        file_path: {
          type: "string",
          description: "Absolute or relative path to the PDF file"
        }
      },
      required: ["file_path"]
    }
  }
  ```
- index.ts:162-171 (schema) Input schema for the 'read_pdf' tool defining the required 'file_path' parameter.

  ```typescript
  inputSchema: {
    type: "object",
    properties: {
      file_path: {
        type: "string",
        description: "Absolute or relative path to the PDF file"
      }
    },
    required: ["file_path"]
  }
  ```
- src/types/pdf-parse.d.ts:1-13 (helper) TypeScript type definitions for the 'pdf-parse' library used in the read_pdf handler.

  ```typescript
  declare module 'pdf-parse' {
    import { Buffer } from 'buffer';
    export default function pdfParse(data: Buffer | Uint8Array | string): Promise<{
      numpages?: number;
      numrender?: number;
      info?: any;
      metadata?: any;
      version?: string;
      text: string;
      textAsHtml?: string;
    }>;
  }
  ```
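The handler decides whether a path is allowed with a plain string-prefix test (`resolvedPath.startsWith(...)`). A prefix test also accepts sibling directories that share the prefix: with `/srv/pdfs` allowed, `/srv/pdfs-private/report.pdf` would pass. The sketch below shows one possible stricter check based on `path.relative`. The helper name `isInsideDir` and the example directories are hypothetical and not part of the original index.ts.

```typescript
import * as path from "node:path";

// Hypothetical helper, not taken from the original source.
// Returns true only when `candidate` is `baseDir` itself or lies beneath it.
function isInsideDir(baseDir: string, candidate: string): boolean {
  const base = path.resolve(baseDir);
  const target = path.resolve(candidate);
  const rel = path.relative(base, target);

  // An empty relative path means both resolve to the same location.
  if (rel === "") return true;

  // A leading ".." segment means the target escapes the base directory;
  // an absolute result can occur on Windows when the drives differ.
  return rel !== ".." && !rel.startsWith(".." + path.sep) && !path.isAbsolute(rel);
}

// Example directories are placeholders for illustration only.
const allowedDirs = ["/srv/pdfs"];
const requested = "/srv/pdfs-private/report.pdf";
const allowed = allowedDirs.some(dir => isInsideDir(dir, requested));
console.log(allowed ? "allowed" : "rejected");
```

Comparing path segments rather than raw string prefixes keeps the check aligned with directory boundaries. Note that this sketch does not resolve symbolic links, so a link inside an allowed directory could still point elsewhere.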