Summarize PDF
summarize
Generate a quick overview of any PDF document, including page count, file size, text presence, image count, and a text preview from the first page. Use it as a first step to decide which detailed tools to apply next.
Instructions
Generate a quick overview report of a PDF document.
Combines metadata, text presence check, image count, and a text preview from the first page into a single summary. Useful as a first step before deciding which detailed tools to use.
Args:
- file_path (string): Absolute path to a local PDF file
- response_format ('markdown' | 'json'): Output format (default: 'markdown')
Returns: Summary including page count, PDF version, file size, tagged/encrypted/signature flags, text presence, image count, and a text preview from page 1.
Examples:
- Quick overview: { file_path: "/path/to/doc.pdf" }
- Machine-readable: { file_path: "/path/to/doc.pdf", response_format: "json" }
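With `response_format: "json"`, the handler serializes an object holding `filePath`, `metadata`, `textPreview`, `imageCount`, and `hasText`. A caller could use those fields to pick a follow-up step. The sketch below is illustrative only: the field names follow what the handler and formatter read, while the `NextStep` labels and `suggestNextSteps` are hypothetical and not part of the server.

```typescript
// Shape of the JSON summary, inferred from the fields the handler and
// formatter use. Nothing beyond these fields is assumed.
interface PdfMetadata {
  pageCount: number;
  pdfVersion?: string | null;
  fileSize: number;
  isTagged: boolean;
  isEncrypted: boolean;
  hasSignatures: boolean;
  title?: string;
  author?: string;
}

interface PdfSummary {
  filePath: string;
  metadata: PdfMetadata;
  textPreview: string;
  imageCount: number;
  hasText: boolean;
}

// Hypothetical labels for follow-up work; these are NOT tool names from the server.
type NextStep = 'check-encryption' | 'extract-text' | 'needs-ocr' | 'inspect-images';

function suggestNextSteps(summary: PdfSummary): NextStep[] {
  const steps: NextStep[] = [];
  if (summary.metadata.isEncrypted) steps.push('check-encryption');
  if (summary.hasText) {
    steps.push('extract-text');
  } else if (summary.imageCount > 0) {
    // No text on page 1 but images present: possibly a scanned document.
    steps.push('needs-ocr');
  }
  if (summary.imageCount > 0) steps.push('inspect-images');
  return steps;
}

// Usage: parse the text content returned by the tool, then decide.
const rawJson = JSON.stringify({
  filePath: '/path/to/doc.pdf',
  metadata: {
    pageCount: 3,
    pdfVersion: '1.7',
    fileSize: 20480,
    isTagged: false,
    isEncrypted: false,
    hasSignatures: false,
  },
  textPreview: '',
  imageCount: 3,
  hasText: false,
});
const parsed = JSON.parse(rawJson) as PdfSummary;
console.log(suggestNextSteps(parsed));
```

Note that `hasText` is derived from the first-page preview only, so a document with a blank or image-only first page reports no text even if later pages contain some.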
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | Absolute path to a local PDF file (e.g., "/path/to/document.pdf") | |
| response_format | No | Output format: "markdown" for human-readable, "json" for structured data | markdown |
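The table above amounts to a small validation step: one required string, one optional enum with a default, and no extra keys. In the server this is handled by a strict Zod schema; the dependency-free sketch below only approximates that behavior. `validateSummarizeInput`, the absolute-path check, and the error messages are assumptions, since the underlying `FilePathSchema` and `ResponseFormatSchema` are not shown.

```typescript
type ResponseFormat = 'markdown' | 'json';

interface SummarizeInput {
  file_path: string;
  response_format: ResponseFormat;
}

// Hypothetical hand-written validator approximating the documented input rules.
function validateSummarizeInput(raw: Record<string, unknown>): SummarizeInput {
  // Reject unknown keys, mirroring a strict object schema.
  const allowed = new Set(['file_path', 'response_format']);
  for (const key of Object.keys(raw)) {
    if (!allowed.has(key)) throw new Error(`Unrecognized key: ${key}`);
  }

  const filePath = raw.file_path;
  if (typeof filePath !== 'string' || filePath.length === 0) {
    throw new Error('file_path is required and must be a non-empty string');
  }
  // Assumption: "absolute" means a POSIX root or a Windows drive prefix.
  const isAbsolute = filePath.startsWith('/') || /^[A-Za-z]:[\\/]/.test(filePath);
  if (!isAbsolute) throw new Error('file_path must be an absolute path');

  // Apply the documented default when response_format is omitted.
  const rawFormat = raw.response_format;
  let format: ResponseFormat;
  if (rawFormat === undefined) {
    format = 'markdown';
  } else if (rawFormat === 'markdown' || rawFormat === 'json') {
    format = rawFormat;
  } else {
    throw new Error('response_format must be "markdown" or "json"');
  }

  return { file_path: filePath, response_format: format };
}

// Usage: response_format falls back to 'markdown' when omitted.
console.log(validateSummarizeInput({ file_path: '/path/to/doc.pdf' }));
```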
Implementation Reference
- src/tools/tier1/summarize.ts:18-88 (handler) Main handler function for the 'summarize' tool. Registers the tool with the MCP server, loads a PDF document, extracts metadata, text preview (first 500 chars), and image count, then returns the result as either markdown or JSON.
```typescript
export function registerSummarize(server: McpServer): void {
  server.registerTool(
    'summarize',
    {
      title: 'Summarize PDF',
      description: `Generate a quick overview report of a PDF document.

Combines metadata, text presence check, image count, and a text preview from the first page into a single summary. Useful as a first step before deciding which detailed tools to use.

Args:
  - file_path (string): Absolute path to a local PDF file
  - response_format ('markdown' | 'json'): Output format (default: 'markdown')

Returns:
  Summary including: page count, PDF version, file size, tagged/encrypted/signature flags, text presence, image count, and a text preview from page 1.

Examples:
  - Quick overview: { file_path: "/path/to/doc.pdf" }
  - Machine-readable: { file_path: "/path/to/doc.pdf", response_format: "json" }`,
      inputSchema: SummarizeSchema,
      annotations: {
        readOnlyHint: true,
        destructiveHint: false,
        idempotentHint: true,
        openWorldHint: false,
      },
    },
    async (params: SummarizeInput) => {
      try {
        // Load the PDF document once and reuse for all operations
        const doc = await loadDocument(params.file_path);

        try {
          const [metadata, firstPageTexts, imageCount] = await Promise.all([
            getMetadataFromDoc(doc, params.file_path),
            extractTextFromDoc(doc, '1'),
            countImagesFromDoc(doc),
          ]);

          const textPreview = firstPageTexts[0]?.text?.slice(0, 500) ?? '';
          const hasText = textPreview.trim().length > 0;

          const summary: PdfSummary = {
            filePath: params.file_path,
            metadata,
            textPreview,
            imageCount,
            hasText,
          };

          const text =
            params.response_format === ResponseFormat.JSON
              ? JSON.stringify(summary, null, 2)
              : formatSummaryMarkdown(summary);

          return {
            content: [{ type: 'text' as const, text }],
          };
        } finally {
          await doc.destroy();
        }
      } catch (error) {
        const err = handleStructuredError(error);
        return {
          content: [{ type: 'text' as const, text: JSON.stringify(err, null, 2) }],
          isError: true,
        };
      }
    },
  );
}
```
- src/schemas/tier1.ts:125-131 (schema) Zod schema for the summarize tool input.
Defines file_path (string, required) and response_format ('markdown' | 'json', optional, defaulting to 'markdown'), and rejects unknown keys via .strict().
```typescript
/** summarize */
export const SummarizeSchema = z
  .object({
    file_path: FilePathSchema,
    response_format: ResponseFormatSchema,
  })
  .strict();
```
- src/tools/index.ts:14-14 (registration) Import of registerSummarize in the central tool registration file.
```typescript
import { registerSummarize } from './tier1/summarize.js';
```
- src/tools/index.ts:38-38 (registration) Call to registerSummarize(server) from the central registerAllTools function.
```typescript
registerSummarize(server);
```
- src/utils/formatter.ts:117-141 (helper) Helper function formatSummaryMarkdown that formats the PdfSummary object into a Markdown table with properties like page count, PDF version, file size, tagged/encrypted/signature flags, text presence, and image count, followed by a text preview from the first page.
```typescript
export function formatSummaryMarkdown(summary: PdfSummary): string {
  const meta = summary.metadata;
  const lines: string[] = [
    `# PDF Summary`,
    '',
    `| Property | Value |`,
    `|---|---|`,
    `| Pages | ${meta.pageCount} |`,
    `| PDF Version | ${meta.pdfVersion ?? 'Unknown'} |`,
    `| File Size | ${formatFileSize(meta.fileSize)} |`,
    `| Tagged | ${meta.isTagged ? 'Yes' : 'No'} |`,
    `| Encrypted | ${meta.isEncrypted ? 'Yes' : 'No'} |`,
    `| Signatures | ${meta.hasSignatures ? 'Yes' : 'No'} |`,
    `| Has Text | ${summary.hasText ? 'Yes' : 'No'} |`,
    `| Images | ${summary.imageCount} |`,
  ];

  // Title and Author rows are appended only when the metadata provides them
  if (meta.title) lines.push(`| Title | ${meta.title} |`);
  if (meta.author) lines.push(`| Author | ${meta.author} |`);

  // The preview section is skipped entirely when page 1 yielded no text
  if (summary.textPreview) {
    lines.push('', '## Text Preview (first page)', '', summary.textPreview);
  }

  return lines.join('\n');
}
```
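formatSummaryMarkdown relies on a formatFileSize helper that is not shown in this excerpt. As a rough idea of what such a helper might do, here is a minimal sketch assuming binary units (1 KB = 1024 bytes) and one decimal place for anything larger than bytes. The name `formatFileSizeSketch`, the unit labels, and the rounding are all assumptions; the project's actual helper may behave differently.

```typescript
// Hypothetical stand-in for the project's formatFileSize helper.
function formatFileSizeSketch(bytes: number): string {
  const units = ['B', 'KB', 'MB', 'GB'];
  let value = bytes;
  let unit = 0;
  // Step up one unit at a time until the value drops below 1024
  // or the largest unit is reached.
  while (value >= 1024 && unit < units.length - 1) {
    value /= 1024;
    unit += 1;
  }
  // Whole bytes need no decimals; larger units keep one decimal place.
  return unit === 0 ? `${value} B` : `${value.toFixed(1)} ${units[unit]}`;
}

console.log(formatFileSizeSketch(512)); // 512 B
console.log(formatFileSizeSketch(1536)); // 1.5 KB
console.log(formatFileSizeSketch(5 * 1024 * 1024)); // 5.0 MB
```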