extract_content
Extract text content from specified pages of a PDF file using precise page selection. Ideal for targeted content retrieval from NCCN clinical guideline documents.
Instructions
Extract content from specific pages of a PDF file.
Args:
pdf_path: Path to the PDF file (absolute, or relative to the downloads directory)
pages: Comma-separated page numbers or ranges to extract (e.g., "1,3,5-7").
If not specified, all pages are extracted. Negative indexing is supported (-1 for the last page).
Returns:
Extracted text content from the specified pages
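The `pages` syntax described above can be illustrated with a small parser. This is only a sketch under stated assumptions, not the tool's actual implementation: the function name `parse_pages` and the `total_pages` parameter are hypothetical, page numbers are assumed to be 1-based, a dash after the first character of a token is assumed to mark a range, and negative numbers are assumed to count back from the last page.

```python
def parse_pages(spec, total_pages):
    """Turn a spec such as "1,3,5-7" or "-1" into 0-based page indices.

    Hypothetical helper: the real tool may parse the string differently.
    """
    if spec is None or not spec.strip():
        # No selection given: every page, matching the documented default.
        return list(range(total_pages))

    indices = []
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue
        # A dash after the first character marks a range ("5-7");
        # a leading dash alone marks a negative index ("-1").
        dash = token.find("-", 1)
        if dash == -1:
            start = end = int(token)
        else:
            start, end = int(token[:dash]), int(token[dash + 1:])
        # Assumption: negative numbers count from the end (-1 is the last page).
        if start < 0:
            start = total_pages + start + 1
        if end < 0:
            end = total_pages + end + 1
        if not (1 <= start <= end <= total_pages):
            raise ValueError(f"page selection out of range: {token!r}")
        # Convert the inclusive 1-based range to 0-based indices.
        indices.extend(range(start - 1, end))
    return indices
```

In this sketch, duplicates and ordering are preserved exactly as written in the spec, so a caller that needs unique, sorted pages would have to deduplicate the result itself.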
Input Schema
Name | Required | Description | Default |
---|---|---|---|
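pages | No | Comma-separated page numbers or ranges to extract (e.g., "1,3,5-7"); all pages if omitted | null |
pdf_path | Yes | Path to the PDF file (absolute, or relative to the downloads directory) | |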
Input Schema (JSON Schema)
{
  "properties": {
    "pages": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Pages"
    },
    "pdf_path": {
      "title": "Pdf Path",
      "type": "string"
    }
  },
  "required": [
    "pdf_path"
  ],
  "title": "extract_contentArguments",
  "type": "object"
}
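As a rough illustration of the schema above, the following sketch builds an arguments object and checks it by hand against the two declared properties. The helper name `build_arguments` and the file name `guideline.pdf` are hypothetical, and how the payload is actually sent to the tool depends on the client and is not shown here.

```python
import json


def build_arguments(pdf_path, pages=None):
    """Build an arguments dict that matches the schema's two properties.

    Hypothetical helper: it mirrors the schema by hand and does not call the tool.
    """
    # "pdf_path" is required and must be a string.
    if not isinstance(pdf_path, str):
        raise TypeError("pdf_path must be a string")
    # "pages" may be a string or null (None in Python); it defaults to null.
    if pages is not None and not isinstance(pages, str):
        raise TypeError("pages must be a string or None")
    return {"pdf_path": pdf_path, "pages": pages}


# Example payload with a placeholder file name.
print(json.dumps(build_arguments("guideline.pdf", "1,3,5-7"), indent=2))
```

Passing `pages` as a list or an integer is rejected in this sketch because the schema only allows a string or null for that field.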