Analyze / validate a PDF
analyze_pdf
Inspect a PDF's structural validity, corruption, PDF/A compliance, or structural diff against another PDF. Validates file integrity and standards conformance.
Instructions
Inspect a PDF's structural health or conformance (does not read content).
Returns JSON keyed by the chosen check: validate → {valid, error_count, warning_count}; corruption → {corrupted, corruption_type, severity, found_pages, file_size, errors}; compliance → {level, is_valid, error_count, warning_count, compliance_percentage}; compare → {structurally_equivalent, content_equivalent, similarity_score, difference_count}. Read-only.
Use this to verify that a file is well-formed, archival-grade, or identical to another. To read the title, author, or page count, use read_pdf; to extract the text, use extract_text.
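As a rough illustration of how a caller might consume the four result shapes listed above, the sketch below turns a result into a one-line summary. The field names come from the description; the function name `summarize` is hypothetical, and the sketch assumes the fields for the chosen check have already been extracted into a flat mapping, since the exact JSON nesting is not specified here.

```python
from typing import Any, Mapping


def summarize(check: str, payload: Mapping[str, Any]) -> str:
    """Return a one-line summary for one analyze_pdf result.

    Assumption: `payload` holds the fields documented for `check`
    as a flat mapping; the real response nesting may differ.
    """
    if check == "validate":
        status = "valid" if payload["valid"] else "invalid"
        return (
            f"{status}: {payload['error_count']} errors, "
            f"{payload['warning_count']} warnings"
        )
    if check == "corruption":
        if not payload["corrupted"]:
            return "no corruption detected"
        return (
            f"corrupted: {payload['corruption_type']} "
            f"(severity {payload['severity']})"
        )
    if check == "compliance":
        status = "conforms" if payload["is_valid"] else "does not conform"
        return (
            f"{status} to {payload['level']}: "
            f"{payload['error_count']} errors, "
            f"{payload['warning_count']} warnings"
        )
    if check == "compare":
        if payload["structurally_equivalent"] and payload["content_equivalent"]:
            return "equivalent"
        return f"differs: {payload['difference_count']} differences"
    raise ValueError(f"unknown check: {check!r}")
```

Dispatching on the check name keeps each branch tied to exactly the fields documented for that check, so a missing field surfaces as a `KeyError` instead of a silently wrong summary.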
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Path to the PDF file to analyze, relative to the workspace. | |
| check | No | Which analysis to run: 'validate' = structural validity with error/warning counts; 'corruption' = damage severity and type; 'compliance' = PDF/A conformance at compliance_level; 'compare' = diff against compare_path. | validate |
| compare_path | No | Second PDF to diff against. Required when check='compare', ignored otherwise. | |
| compliance_level | No | PDF/A conformance level to test. Used only when check='compliance'. Letter = conformance class (a/b/u), number = PDF/A part (1/2/3). | a1b |
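
The parameter rules in the table can be checked on the client side before the tool is called. The following is a minimal sketch under stated assumptions: `build_arguments` is a hypothetical helper, not part of the tool, and how the resulting dictionary is actually sent to analyze_pdf depends on the client in use.

```python
from typing import Optional

# The four check names documented in the table above.
CHECKS = ("validate", "corruption", "compliance", "compare")


def build_arguments(
    path: str,
    check: str = "validate",
    compare_path: Optional[str] = None,
    compliance_level: str = "a1b",
) -> dict:
    """Assemble an argument dict for analyze_pdf (hypothetical helper)."""
    if check not in CHECKS:
        raise ValueError(f"check must be one of {CHECKS}, got {check!r}")

    args = {"path": path, "check": check}

    if check == "compare":
        # compare_path is required only for check='compare'.
        if not compare_path:
            raise ValueError("compare_path is required when check='compare'")
        args["compare_path"] = compare_path
    elif check == "compliance":
        # compliance_level is used only for check='compliance'.
        args["compliance_level"] = compliance_level

    return args
```

For example, `build_arguments("report.pdf", check="compare", compare_path="report_v2.pdf")` yields a dictionary with `path`, `check`, and `compare_path`, while a plain `build_arguments("report.pdf")` falls back to the default `validate` check and omits the parameters that would be ignored.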
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |