extract_pdf_metadata
Extracts metadata and document details from a PDF file. Provide the path to the file as input.
Instructions
Extract metadata and document information from PDF files
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | Path to the PDF file to extract metadata from | |
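As a rough illustration of the input schema above, the sketch below checks an arguments object and wraps it in a JSON-RPC 2.0 `tools/call` request, the method MCP clients use to invoke a tool. The helper names `isExtractPdfMetadataArgs` and `buildCallRequest` and the sample path are hypothetical and do not come from the server's source.

```typescript
// Shape of the tool arguments, taken from the input schema table.
interface ExtractPdfMetadataArgs {
  file_path: string;
}

// Hypothetical runtime check mirroring the schema:
// an object whose `file_path` property is a string.
function isExtractPdfMetadataArgs(value: unknown): value is ExtractPdfMetadataArgs {
  if (typeof value !== "object" || value === null) return false;
  const filePath = (value as Record<string, unknown>).file_path;
  return typeof filePath === "string";
}

// Hypothetical helper that builds a JSON-RPC 2.0 `tools/call` request
// addressed to the `extract_pdf_metadata` tool.
function buildCallRequest(id: number, args: ExtractPdfMetadataArgs) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "extract_pdf_metadata", arguments: args },
  };
}

// The path below is a placeholder for illustration only.
const request = buildCallRequest(1, { file_path: "/tmp/example.pdf" });
console.log(JSON.stringify(request, null, 2));
```

The server itself validates arguments with a Zod schema (see the Implementation Reference); the hand-written check here only approximates it and may be looser than the real `filePathValidation`.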
Implementation Reference
- src/tools/extract-metadata.ts:22-32 (handler): Main handler function that validates input parameters and delegates metadata extraction to MetadataParser.

      export async function handleExtractMetadata(args: unknown): Promise<PDFMetadata> {
        try {
          const params = ExtractMetadataParamsSchema.parse(args);
          const parser = new MetadataParser();
          return await parser.parseMetadata(params.file_path);
        } catch (error) {
          const mcpError = handleError(error, typeof args === 'object' && args !== null && 'file_path' in args ? String(args.file_path) : undefined);
          throw new Error(JSON.stringify(mcpError));
        }
      }

- Core parsing logic using the pdf-parse library to extract metadata from a PDF buffer, including validation, file reading, and timeout handling.

      async parseMetadata(filePath: string): Promise<PDFMetadata> {
        await validatePDFFile(filePath);
        const buffer = await fs.readFile(filePath);
        const stats = await fs.stat(filePath);
        const pdfData = await withTimeout(
          pdf(buffer),
          this.config.processingTimeout
        );
        return this.formatMetadata(pdfData, stats.size);
      }

- src/types/mcp-types.ts:14-16 (schema): Zod schema for validating the tool input parameters (file_path).

      export const ExtractMetadataParamsSchema = z.object({
        file_path: filePathValidation
      });

- src/index.ts:63-71 (registration): Tool dispatch/registration in the main server request handler switch statement.

      case 'extract_pdf_metadata':
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify(await handleExtractMetadata(args), null, 2),
            },
          ],
        };

- src/tools/extract-metadata.ts:8-20 (schema): Tool definition including name, description, and input schema for MCP registration.

        name: 'extract_pdf_metadata',
        description: 'Extract metadata and document information from PDF files',
        inputSchema: {
          type: 'object',
          properties: {
            file_path: {
              type: 'string',
              description: 'Path to the PDF file to extract metadata from'
            }
          },
          required: ['file_path']
        }
      };
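The registration entry returns the metadata as pretty-printed JSON inside a single text content item, and the handler rethrows failures as an `Error` whose message is a JSON string. A caller might unpack both along the lines of the sketch below. The names `parseMetadataResult` and `parseToolError` are hypothetical, and because the fields of `PDFMetadata` are not listed in this section, the result is typed loosely and the sample fields are invented for illustration.

```typescript
// One item of the `content` array returned by the dispatch case.
interface TextContent {
  type: string;
  text: string;
}

interface ToolResult {
  content: TextContent[];
}

// The metadata field names are not shown in this section, so the parsed
// object is typed loosely here (assumption).
type LooseMetadata = Record<string, unknown>;

// Find the first text item and parse the JSON it carries.
function parseMetadataResult(result: ToolResult): LooseMetadata {
  const first = result.content.find((item) => item.type === "text");
  if (first === undefined) {
    throw new Error("tool result has no text content");
  }
  const parsed: unknown = JSON.parse(first.text);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("expected a JSON object");
  }
  return parsed as LooseMetadata;
}

// The handler throws `new Error(JSON.stringify(mcpError))`, so the message
// is expected to be JSON. Fall back to the raw message when it is not.
function parseToolError(error: unknown): unknown {
  const message = error instanceof Error ? error.message : String(error);
  try {
    return JSON.parse(message);
  } catch {
    return message;
  }
}

// Hand-written sample result; `pages` and `title` are made-up fields.
const sample: ToolResult = {
  content: [{ type: "text", text: JSON.stringify({ pages: 3, title: "Demo" }, null, 2) }],
};
console.log(parseMetadataResult(sample));
```

Keeping the error parser tolerant of non-JSON messages is a defensive choice, since transport or runtime errors raised outside the handler would not follow the same encoding.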