extract_pdf
Extract figures, tables, and equations from PDF documents using layout detection. Returns base64-encoded images with metadata from academic papers or any PDF URL.
Instructions
Extract figures, tables, and equations from PDF documents using layout detection. Works on arXiv papers (by ID) or any direct PDF URL, and is well suited to pulling visual elements out of academic papers. Returns each detected element as a base64-encoded image with metadata.
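Because the tool returns base64-encoded images, a caller usually has to decode them before use. The sketch below shows one way to do that. The response shape is not documented here, so the `image_base64` and `type` keys, the `.png` extension, and the `save_elements` helper are all hypothetical; adjust them to whatever the tool actually returns.

```python
import base64
from pathlib import Path


def save_elements(elements, out_dir):
    """Decode base64 images and write them to out_dir.

    Assumes (hypothetically) that each element is a dict with an
    'image_base64' key and an optional 'type' key. The real field
    names and image format may differ.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, element in enumerate(elements):
        # b64decode turns the base64 text back into raw image bytes.
        raw = base64.b64decode(element["image_base64"])
        kind = element.get("type", "element")
        # The .png extension is an assumption about the image format.
        path = out / f"{index:03d}_{kind}.png"
        path.write_bytes(raw)
        paths.append(path)
    return paths
```

Writing the files in response order with a zero-padded index keeps them sorted the same way the elements were returned.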
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| id | No | arXiv paper ID (e.g., '2301.12345' or 'hep-th/9901001'). Either id or url is required. | |
| url | No | Direct PDF URL. Either id or url is required. | |
| max_edge | No | Maximum edge length of extracted images, in pixels. | 1024 |
| type | No | Comma-separated filter on element types: figure, table, equation. If omitted, all types are returned. | |
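As an illustration of the schema above, the following sketch builds an arguments object for an `extract_pdf` call and checks it against the documented rules before sending. The helper name `build_extract_pdf_args` and its Python parameter names are hypothetical. The schema does not say what happens when both `id` and `url` are supplied, so this sketch simply passes both through.

```python
import json

# Element types listed in the schema's `type` description.
ALLOWED_TYPES = {"figure", "table", "equation"}


def build_extract_pdf_args(paper_id=None, url=None, max_edge=None, types=None):
    """Build the arguments dict for extract_pdf (hypothetical helper)."""
    # The schema states that either id or url is required.
    if not paper_id and not url:
        raise ValueError("either id or url is required")
    args = {}
    if paper_id:
        args["id"] = paper_id
    if url:
        args["url"] = url
    # Omit max_edge to fall back on the documented default of 1024.
    if max_edge is not None:
        if max_edge <= 0:
            raise ValueError("max_edge must be a positive number of pixels")
        args["max_edge"] = max_edge
    # `type` is a comma-separated string; omit it to get all types.
    if types:
        unknown = set(types) - ALLOWED_TYPES
        if unknown:
            raise ValueError(f"unknown types: {sorted(unknown)}")
        args["type"] = ",".join(types)
    return args


if __name__ == "__main__":
    args = build_extract_pdf_args(paper_id="2301.12345", types=["figure", "table"])
    print(json.dumps(args))
```

Validating on the client side is optional; it only surfaces schema mistakes earlier than the tool itself would.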