read_pdf
Extract text from PDF files with control over page range and chunking. Specify pages, chunk size, and overlap to process only needed content.
Instructions
Extract text from a PDF file, optionally limited to a page range and split into overlapping chunks.
Args:
file_path: Path to the PDF file
pages: Page range (e.g., '1,3,5-10,-1' for pages 1, 3, 5 to 10, and last page)
chunk_size: Maximum size of text chunks
chunk_overlap: Overlap between chunks to preserve context
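The page-range syntax and the chunking parameters can be illustrated with a short sketch. This is not the tool's actual implementation: `parse_pages` and `chunk_text` are hypothetical helpers, and the sketch assumes that pages are 1-based, that negative numbers count from the end (only `-1` as the last page is stated above), and that `chunk_size` and `chunk_overlap` are measured in characters.

```python
def parse_pages(spec: str, page_count: int) -> list[int]:
    """Expand a spec such as '1,3,5-10,-1' into 1-based page numbers."""
    pages: list[int] = []
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        if token.startswith("-"):
            # Assumption: negative values count from the end, -1 is the last page.
            start = end = page_count + int(token) + 1
        elif "-" in token:
            first, last = token.split("-", 1)
            start, end = int(first), int(last)
        else:
            start = end = int(token)
        for page in range(start, end + 1):
            # Skip pages outside the document and avoid duplicates.
            if 1 <= page <= page_count and page not in pages:
                pages.append(page)
    return pages


def chunk_text(text: str, chunk_size: int, chunk_overlap: int) -> list[str]:
    """Split text into chunks of at most chunk_size characters,
    where each chunk repeats the last chunk_overlap characters of the previous one."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    chunks: list[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final chunk already reaches the end of the text
    return chunks
```

For a 12-page document, `parse_pages("1,3,5-10,-1", 12)` expands to pages 1, 3, 5 through 10, and 12. A larger overlap keeps more shared context between neighbouring chunks at the cost of more total text.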
Returns:
JSON string with extracted text and metadata
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
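| file_path | Yes | Path to the PDF file | |
| pages | No | Page range (e.g., '1,3,5-10,-1' for pages 1, 3, 5 to 10, and last page) | |
| chunk_size | No | Maximum size of text chunks | |
| chunk_overlap | No | Overlap between chunks to preserve context | |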
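As a rough illustration of the input schema, the sketch below builds an arguments object for `read_pdf` and serializes it to JSON. Only the four parameter names come from the schema; the helper `build_arguments`, the file name `report.pdf`, and the numeric values are assumptions for the example, and how the payload is sent depends on the client in use.

```python
import json


def build_arguments(file_path, pages=None, chunk_size=None, chunk_overlap=None):
    """Collect read_pdf arguments, leaving out optional ones that were not set."""
    candidates = {
        "file_path": file_path,          # the only required parameter
        "pages": pages,
        "chunk_size": chunk_size,
        "chunk_overlap": chunk_overlap,
    }
    return {name: value for name, value in candidates.items() if value is not None}


# Hypothetical values: the schema does not state defaults or units.
arguments = build_arguments("report.pdf", pages="1,3,5-10,-1",
                            chunk_size=1000, chunk_overlap=100)
payload = json.dumps(arguments)
```

Omitting an optional parameter leaves the choice to the tool's own default, which this page does not list.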
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | JSON string with extracted text and metadata | |