read_pdf
Extract text from PDF files while preserving LaTeX mathematical notation and document layout structure. Ideal for processing scientific papers.
Instructions
Read and extract text content from a PDF file. The tool uses a Python backend (marker-pdf) to preserve mathematical notation (LaTeX) and layout structure. Best suited to scientific papers.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | Absolute path to the PDF file | |
| start_page | No | Start page (1-based) | |
| end_page | No | End page (1-based) | |
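
The schema above can be illustrated with a small sketch that assembles and checks an argument payload before calling the tool. The helper name `build_read_pdf_args` is hypothetical and not part of the tool. The check that `end_page` is not smaller than `start_page` is an assumption, since the schema does not say how an inverted range is handled.

```python
import os
from typing import Optional


def build_read_pdf_args(path: str,
                        start_page: Optional[int] = None,
                        end_page: Optional[int] = None) -> dict:
    """Hypothetical helper: build an arguments dict matching the read_pdf schema."""
    # The schema requires an absolute path to the PDF file.
    if not os.path.isabs(path):
        raise ValueError("path must be an absolute path")

    args = {"path": path}

    # Page numbers are optional and 1-based, so 0 and negatives are rejected.
    for name, value in (("start_page", start_page), ("end_page", end_page)):
        if value is None:
            continue
        if value < 1:
            raise ValueError(f"{name} is 1-based and must be >= 1")
        args[name] = value

    # Assumption: an inverted range is treated as a caller error here;
    # the tool's actual behavior for this case is not documented.
    if start_page is not None and end_page is not None and end_page < start_page:
        raise ValueError("end_page must not be smaller than start_page")

    return args


if __name__ == "__main__":
    # Example payload for a page range of an illustrative file path.
    print(build_read_pdf_args("/papers/example.pdf", start_page=2, end_page=5))
```

Omitting both page parameters leaves only `path` in the payload, so any page-range defaults are left to the tool itself.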