Compare PDF Structures
compare_structure
Compare internal structures of two PDFs to identify differences in properties, fonts, and objects. Useful for verifying exports, tracking changes, and diagnosing generation issues.
Instructions
Compare the internal structures of two PDF documents and identify differences.
Args:
- file_path_1 (string): Absolute path to the first PDF file
- file_path_2 (string): Absolute path to the second PDF file
- response_format ('markdown' | 'json'): Output format (default: 'markdown')
Returns: A structural comparison with three parts: a property-by-property diff (page count, PDF version, encryption, tagged status, object and stream counts, first-page dimensions, file size, catalog entries, signatures, total fonts), a font comparison (fonts unique to each file and fonts shared by both), and a summary.
Examples:
- Compare two versions of the same document
- Verify structural consistency across PDF exports
- Identify differences in PDF generation pipelines
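When `response_format` is `'json'`, the returned text can be parsed and filtered by a client. The sketch below is one possible way to do that; the field names follow the `StructureComparison` type in the implementation reference, while `listDifferences` and the sample values are hypothetical and exist only for illustration. It also assumes the JSON text was not truncated by the server.

```typescript
// Field names mirror the StructureComparison type from the implementation reference.
interface DiffRow {
  property: string;
  file1Value: string;
  file2Value: string;
  status: "match" | "differ";
}

interface ComparisonResult {
  file1: string;
  file2: string;
  diffs: DiffRow[];
  fontComparison: { onlyInFile1: string[]; onlyInFile2: string[]; inBoth: string[] };
  summary: string;
}

// Invented sample payload (not real tool output), used only to exercise the function.
const sampleJson: string = JSON.stringify({
  file1: "export-v1.pdf",
  file2: "export-v2.pdf",
  diffs: [
    { property: "Page Count", file1Value: "12", file2Value: "12", status: "match" },
    { property: "PDF Version", file1Value: "1.7", file2Value: "1.4", status: "differ" },
  ],
  fontComparison: { onlyInFile1: [], onlyInFile2: ["Courier"], inBoth: ["Helvetica"] },
  summary: "1 difference(s) found out of 2 properties compared.",
});

// Hypothetical helper: return one readable line per differing property.
function listDifferences(jsonText: string): string[] {
  const result = JSON.parse(jsonText) as ComparisonResult;
  return result.diffs
    .filter((row) => row.status === "differ")
    .map((row) => `${row.property}: ${row.file1Value} -> ${row.file2Value}`);
}

const changed: string[] = listDifferences(sampleJson);
console.log(changed.join("\n"));
```

Filtering on `status` rather than comparing the two value strings again keeps the client consistent with whatever rule the server used to decide a match.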
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| file_path_1 | Yes | Absolute path to the first PDF file for comparison | |
| file_path_2 | Yes | Absolute path to the second PDF file for comparison | |
| response_format | No | Output format: "markdown" for human-readable, "json" for structured data | markdown |
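The constraints in this table can be sketched in plain TypeScript as follows. This is not the tool's actual Zod schema: `validateArgs` and `looksAbsolute` are hypothetical names, and the rule used for "absolute path" (a leading `/` or a Windows drive prefix) is an assumption, since the real path schema is not shown here.

```typescript
type ResponseFormat = "markdown" | "json";

interface CompareStructureArgs {
  file_path_1: string;
  file_path_2: string;
  response_format: ResponseFormat;
}

const ALLOWED_KEYS = new Set(["file_path_1", "file_path_2", "response_format"]);

// Assumption: absolute means a POSIX path starting with "/" or a Windows drive path.
function looksAbsolute(p: string): boolean {
  return p.startsWith("/") || /^[A-Za-z]:[\\/]/.test(p);
}

// Read a required path argument or throw a descriptive error.
function requirePath(input: Record<string, unknown>, key: string): string {
  const value = input[key];
  if (typeof value !== "string" || !looksAbsolute(value)) {
    throw new Error(`${key} must be an absolute path string`);
  }
  return value;
}

// Hypothetical validator mirroring the table: two required paths, one optional format.
function validateArgs(input: Record<string, unknown>): CompareStructureArgs {
  // Reject keys that are not listed in the table, as a strict schema would.
  for (const key of Object.keys(input)) {
    if (!ALLOWED_KEYS.has(key)) throw new Error(`Unknown argument: ${key}`);
  }
  const raw = input["response_format"];
  let format: ResponseFormat = "markdown"; // default from the table
  if (raw === "markdown" || raw === "json") {
    format = raw;
  } else if (raw !== undefined) {
    throw new Error('response_format must be "markdown" or "json"');
  }
  return {
    file_path_1: requirePath(input, "file_path_1"),
    file_path_2: requirePath(input, "file_path_2"),
    response_format: format,
  };
}

// Example paths are placeholders, not files that ship with the tool.
const validated = validateArgs({
  file_path_1: "/reports/export-v1.pdf",
  file_path_2: "/reports/export-v2.pdf",
});
console.log(validated.response_format);
```

Omitting `response_format` leaves the default in place, so the example call resolves to the Markdown output.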
Implementation Reference
- Main handler: compares two PDF documents by analyzing their structure, fonts, and metadata in parallel, then building a diff table and font comparison.
```typescript
export async function compareStructure(
  filePath1: string,
  filePath2: string,
): Promise<StructureComparison> {
  // Analyze both files in parallel
  const [struct1, struct2, fonts1, fonts2, meta1, meta2] = await Promise.all([
    analyzeStructure(filePath1),
    analyzeStructure(filePath2),
    analyzeFontsWithPdfLib(filePath1),
    analyzeFontsWithPdfLib(filePath2),
    getMetadata(filePath1),
    getMetadata(filePath2),
  ]);

  const diffs: StructureDiffEntry[] = [];

  // Page count
  addDiff(
    diffs,
    'Page Count',
    String(struct1.pageTree.totalPages),
    String(struct2.pageTree.totalPages),
  );

  // PDF version
  addDiff(diffs, 'PDF Version', struct1.pdfVersion ?? 'Unknown', struct2.pdfVersion ?? 'Unknown');

  // Encrypted
  addDiff(diffs, 'Encrypted', String(struct1.isEncrypted), String(struct2.isEncrypted));

  // Tagged
  addDiff(diffs, 'Tagged', String(meta1.isTagged), String(meta2.isTagged));

  // Total objects
  addDiff(
    diffs,
    'Total Objects',
    String(struct1.objectStats.totalObjects),
    String(struct2.objectStats.totalObjects),
  );

  // Stream count
  addDiff(
    diffs,
    'Stream Count',
    String(struct1.objectStats.streamCount),
    String(struct2.objectStats.streamCount),
  );

  // First page dimensions
  const dim1 = struct1.pageTree.mediaBoxSamples[0];
  const dim2 = struct2.pageTree.mediaBoxSamples[0];
  addDiff(
    diffs,
    'Page 1 Dimensions (pt)',
    dim1 ? `${dim1.width} x ${dim1.height}` : 'N/A',
    dim2 ? `${dim2.width} x ${dim2.height}` : 'N/A',
  );

  // File size
  addDiff(diffs, 'File Size', formatFileSize(meta1.fileSize), formatFileSize(meta2.fileSize));

  // Catalog entry count
  addDiff(diffs, 'Catalog Entries', String(struct1.catalog.length), String(struct2.catalog.length));

  // Signatures
  addDiff(diffs, 'Has Signatures', String(meta1.hasSignatures), String(meta2.hasSignatures));

  // Font comparison
  const fontNames1 = new Set(fonts1.fontMap.keys());
  const fontNames2 = new Set(fonts2.fontMap.keys());
  const onlyInFile1 = [...fontNames1].filter((f) => !fontNames2.has(f));
  const onlyInFile2 = [...fontNames2].filter((f) => !fontNames1.has(f));
  const inBoth = [...fontNames1].filter((f) => fontNames2.has(f));
  addDiff(diffs, 'Total Fonts', String(fontNames1.size), String(fontNames2.size));

  // Summary
  const matchCount = diffs.filter((d) => d.status === 'match').length;
  const diffCount = diffs.filter((d) => d.status === 'differ').length;
  const summary =
    diffCount === 0
      ? `All ${matchCount} properties match between the two PDFs.`
      : `${diffCount} difference(s) found out of ${diffs.length} properties compared.`;

  return {
    file1: basename(filePath1),
    file2: basename(filePath2),
    diffs,
    fontComparison: { onlyInFile1, onlyInFile2, inBoth },
    summary,
  };
}
```
- src/schemas/tier3.ts:25-31 (schema) Zod input schema for compare_structure: requires file_path_1 and file_path_2, with an optional response_format.
```typescript
export const CompareStructureSchema = z
  .object({
    file_path_1: FilePathSchema.describe('Absolute path to the first PDF file for comparison'),
    file_path_2: FilePathSchema.describe('Absolute path to the second PDF file for comparison'),
    response_format: ResponseFormatSchema,
  })
  .strict();
```
- src/tools/tier3/compare-structure.ts:12-59 (registration) Registers the compare_structure tool on the MCP server with schema, annotations, and async handler callback.
```typescript
export function registerCompareStructure(server: McpServer): void {
  server.registerTool(
    'compare_structure',
    {
      title: 'Compare PDF Structures',
      description: `Compare the internal structures of two PDF documents and identify differences.

Args:
  - file_path_1 (string): Absolute path to the first PDF file
  - file_path_2 (string): Absolute path to the second PDF file
  - response_format ('markdown' | 'json'): Output format (default: 'markdown')

Returns:
  Structural comparison including: property-by-property diff (page count, PDF version, encryption, tagged status, object counts, page dimensions, file size, catalog entries, signatures), font comparison (fonts unique to each file and shared fonts), and a summary.

Examples:
  - Compare two versions of the same document
  - Verify structural consistency across PDF exports
  - Identify differences in PDF generation pipelines`,
      inputSchema: CompareStructureSchema,
      annotations: {
        readOnlyHint: true,
        destructiveHint: false,
        idempotentHint: true,
        openWorldHint: false,
      },
    },
    async (params: CompareStructureInput) => {
      try {
        const result = await compareStructure(params.file_path_1, params.file_path_2);
        const raw =
          params.response_format === ResponseFormat.JSON
            ? JSON.stringify(result, null, 2)
            : formatCompareStructureMarkdown(result);
        const { text } = truncateIfNeeded(raw);
        return { content: [{ type: 'text' as const, text }] };
      } catch (error) {
        const err = handleStructuredError(error);
        return {
          content: [{ type: 'text' as const, text: JSON.stringify(err, null, 2) }],
          isError: true,
        };
      }
    },
  );
}
```
- src/utils/formatter.ts:446-483 (helper) Helper that formats the StructureComparison result as Markdown with a property diff table, font comparison section, and summary.
```typescript
export function formatCompareStructureMarkdown(result: StructureComparison): string {
  const lines: string[] = ['# PDF Structure Comparison', ''];
  lines.push(`Comparing **${result.file1}** vs **${result.file2}**`);

  // Property diff table
  lines.push('', '## Property Comparison', '');
  lines.push('| Property | File 1 | File 2 | Status |', '|---|---|---|---|');
  for (const diff of result.diffs) {
    const statusIcon = diff.status === 'match' ? '\u2705' : '\u274c';
    lines.push(`| ${diff.property} | ${diff.file1Value} | ${diff.file2Value} | ${statusIcon} |`);
  }

  // Font comparison
  const fc = result.fontComparison;
  lines.push('', '## Font Comparison', '');
  if (fc.inBoth.length > 0) {
    lines.push(`- **Shared fonts** (${fc.inBoth.length}): ${fc.inBoth.join(', ')}`);
  }
  if (fc.onlyInFile1.length > 0) {
    lines.push(
      `- **Only in ${result.file1}** (${fc.onlyInFile1.length}): ${fc.onlyInFile1.join(', ')}`,
    );
  }
  if (fc.onlyInFile2.length > 0) {
    lines.push(
      `- **Only in ${result.file2}** (${fc.onlyInFile2.length}): ${fc.onlyInFile2.join(', ')}`,
    );
  }
  if (fc.inBoth.length === 0 && fc.onlyInFile1.length === 0 && fc.onlyInFile2.length === 0) {
    lines.push('No fonts found in either document.');
  }

  lines.push('', '---', '', `**Summary**: ${result.summary}`);
  return lines.join('\n');
}
```
- src/types.ts:292-310 (helper) Type definitions: StructureDiffEntry (single diff row) and StructureComparison (full output type).
```typescript
export interface StructureDiffEntry {
  property: string;
  file1Value: string;
  file2Value: string;
  status: 'match' | 'differ';
}

/** compare_structure output */
export interface StructureComparison {
  file1: string;
  file2: string;
  diffs: StructureDiffEntry[];
  fontComparison: {
    onlyInFile1: string[];
    onlyInFile2: string[];
    inBoth: string[];
  };
  summary: string;
}
```
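The main handler calls an `addDiff` helper whose body is not included in this reference. Given the `StructureDiffEntry` type, a minimal sketch could look like the following, assuming the helper compares the two stringified values for strict equality. The `partitionFonts` function is a hypothetical name that isolates the set logic the handler uses for its font comparison, and the sample values are invented.

```typescript
interface StructureDiffEntry {
  property: string;
  file1Value: string;
  file2Value: string;
  status: "match" | "differ";
}

// Assumed behavior: a row matches when both string values are identical.
function addDiff(
  diffs: StructureDiffEntry[],
  property: string,
  file1Value: string,
  file2Value: string,
): void {
  diffs.push({
    property,
    file1Value,
    file2Value,
    status: file1Value === file2Value ? "match" : "differ",
  });
}

// Split two collections of font names into unique-to-each and shared groups.
function partitionFonts(
  names1: Iterable<string>,
  names2: Iterable<string>,
): { onlyInFile1: string[]; onlyInFile2: string[]; inBoth: string[] } {
  const set1 = new Set(names1);
  const set2 = new Set(names2);
  return {
    onlyInFile1: Array.from(set1).filter((f) => !set2.has(f)),
    onlyInFile2: Array.from(set2).filter((f) => !set1.has(f)),
    inBoth: Array.from(set1).filter((f) => set2.has(f)),
  };
}

// Invented sample values, used only to show the shape of the results.
const rows: StructureDiffEntry[] = [];
addDiff(rows, "Page Count", "12", "12");
addDiff(rows, "PDF Version", "1.7", "1.4");

const fontGroups = partitionFonts(["Helvetica", "Times-Roman"], ["Helvetica", "Courier"]);
console.log(rows.map((r) => `${r.property}: ${r.status}`).join(", "));
console.log(fontGroups);
```

Because every value is stringified before comparison, this approach treats formatted values such as file sizes as equal whenever their display strings are equal, even if the underlying byte counts differ slightly.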