read_pdf
Extract text or images from PDF files for vision LLMs, handling both text and scanned documents with configurable extraction modes.
Instructions
Read PDF content. Always prefer this tool over cat or a generic file read for PDF files. Limit: 10 pages per request. Works with both text-based and scanned documents. Use 'image_only' to see the actual page layout, or 'text_only' for pure text.
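Because each request is limited to 10 pages, a longer document has to be read in several calls. The sketch below shows one way to split a document into page ranges that fit the limit. It assumes the limit applies to the inclusive span from start_page to end_page; the helper name page_batches is hypothetical and not part of the tool.

```python
from typing import Iterator, Tuple

# Limit stated in the tool instructions; assumed to apply to the
# inclusive span start_page..end_page of a single request.
MAX_PAGES_PER_REQUEST = 10


def page_batches(
    total_pages: int, batch_size: int = MAX_PAGES_PER_REQUEST
) -> Iterator[Tuple[int, int]]:
    """Yield (start_page, end_page) pairs, 1-indexed and inclusive."""
    if total_pages < 1:
        raise ValueError("total_pages must be at least 1")
    if not 1 <= batch_size <= MAX_PAGES_PER_REQUEST:
        raise ValueError("batch_size must be between 1 and 10")
    for start in range(1, total_pages + 1, batch_size):
        # The last batch may be shorter than batch_size.
        yield start, min(start + batch_size - 1, total_pages)


if __name__ == "__main__":
    # Each pair can be passed as start_page / end_page in one request.
    for start_page, end_page in page_batches(25):
        print(start_page, end_page)
```

Each yielded pair maps directly onto the start_page and end_page parameters, so a 25-page file would take three requests.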
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | PDF file path (relative or absolute path) | |
| start_page | No | Start page (1-indexed, inclusive) | |
| end_page | No | End page (1-indexed, inclusive). None = last page | |
| extraction_mode | No | Content extraction mode. 'auto' (default): smart detection; extracts text/tables and adds a page image only if the extracted text is corrupted. 'text_only': extracts text/tables only, no images. 'image_only': skips text extraction and provides only full-page images. | auto |
| filter_header_footer | No | Whether to filter out header/footer images (top/bottom 6% of page) | |
| crop_images | No | Whether to crop images to max_image_dimension | |
| max_image_dimension | No | Maximum image dimension in pixels (842 = A4 height) | 842 |
| page_image_dpi | No | DPI for page image rendering | 100 |
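As an illustration of how the parameters in the table fit together, here is a minimal sketch that assembles and checks an argument payload before it is sent to the tool. The helper build_read_pdf_args is hypothetical, and how the payload is actually delivered depends on the client in use; only the parameter names and the three extraction modes come from the table.

```python
from typing import Any, Dict, Optional

# Modes listed in the input schema table.
VALID_MODES = ("auto", "text_only", "image_only")


def build_read_pdf_args(
    file_path: str,
    start_page: Optional[int] = None,
    end_page: Optional[int] = None,
    extraction_mode: str = "auto",
    page_image_dpi: Optional[int] = None,
) -> Dict[str, Any]:
    """Build an argument dict for read_pdf (hypothetical helper)."""
    if extraction_mode not in VALID_MODES:
        raise ValueError(f"extraction_mode must be one of {VALID_MODES}")
    # Pages are 1-indexed and inclusive per the schema.
    for name, value in (("start_page", start_page), ("end_page", end_page)):
        if value is not None and value < 1:
            raise ValueError(f"{name} must be 1 or greater")
    if start_page is not None and end_page is not None and end_page < start_page:
        raise ValueError("end_page must not be smaller than start_page")

    args: Dict[str, Any] = {
        "file_path": file_path,
        "extraction_mode": extraction_mode,
    }
    optional = {
        "start_page": start_page,
        "end_page": end_page,
        "page_image_dpi": page_image_dpi,
    }
    # Leave unset parameters out so the tool applies its own defaults.
    args.update({k: v for k, v in optional.items() if v is not None})
    return args


if __name__ == "__main__":
    print(build_read_pdf_args("report.pdf", start_page=1, end_page=10,
                              extraction_mode="text_only"))
```

Omitting unset optional parameters, rather than sending explicit nulls, keeps the request small and lets the documented defaults (for example 100 DPI) take effect.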