list_pdfs | List PDF files in configured base paths<br>Args:<br>path_filter: Optional string to filter PDF paths<br>Returns:<br>List of PDF paths matching the filter |
extract_form_fields | Extract all form fields from a PDF<br>Args:<br>pdf_path: Path to the PDF file<br>Returns:<br>Dictionary of form field names and their properties |
highlight_form_field | Generate an image with the specified form field highlighted with a red box<br>Args:<br>pdf_path: Path to the PDF file<br>field_name: Name of the form field to highlight<br>Returns:<br>Image of the page with the field highlighted |
render_pdf_page | Generate an image of a PDF page without any highlighting<br>Args:<br>pdf_path: Path to the PDF file<br>page_num: Page number to render (0-indexed)<br>zoom: Zoom factor for rendering (higher values produce a higher-resolution image)<br>Returns:<br>Image of the specified page |
extract_text | Extract text from PDF pages<br>Args:<br>pdf_path: Path to the PDF file<br>start_page: First page to extract (0-indexed). If None, extraction starts from the first page.<br>end_page: Last page to extract (0-indexed, inclusive). If None, defaults to start_page when start_page is given; otherwise all pages are extracted.<br>Returns:<br>If extracting a single page: string containing the page text<br>If extracting multiple pages: dictionary mapping page numbers to page text |
search_text | Search for a text pattern in a PDF file<br>Args:<br>pdf_path: Path to the PDF file<br>pattern: Regular expression pattern to search for<br>case_sensitive: Whether to perform case-sensitive matching<br>start_page: First page to search (0-indexed). If None, the search starts from the first page.<br>end_page: Last page to search (0-indexed, inclusive). If None, the search runs through the last page.<br>Returns:<br>List of matches with page number, match text, and context |
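
The page-range defaults of `extract_text` and `search_text` differ slightly when `end_page` is None, which is easy to miss in the table above. The following is a minimal sketch of those rules, not the server's actual implementation. It works on a plain list of page strings instead of a `pdf_path`, and the helper names (`resolve_extract_range`, `extract_text_sketch`, `search_text_sketch`), the result keys (`page`, `match`, `context`), the `context_chars` parameter, and the `case_sensitive=False` default are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Optional, Tuple, Union


def resolve_extract_range(
    start_page: Optional[int], end_page: Optional[int], page_count: int
) -> Tuple[int, int]:
    """Apply the extract_text defaults; returns a 0-indexed, inclusive range."""
    first = 0 if start_page is None else start_page
    if end_page is not None:
        last = end_page
    elif start_page is not None:
        last = start_page          # only start_page given: a single page
    else:
        last = page_count - 1      # neither given: all pages
    return first, last


def extract_text_sketch(
    pages: List[str],
    start_page: Optional[int] = None,
    end_page: Optional[int] = None,
) -> Union[str, Dict[int, str]]:
    """Return a string for one page, or a {page_number: text} dict for several."""
    first, last = resolve_extract_range(start_page, end_page, len(pages))
    if first == last:
        return pages[first]
    return {n: pages[n] for n in range(first, last + 1)}


def search_text_sketch(
    pages: List[str],
    pattern: str,
    case_sensitive: bool = False,   # assumed default
    start_page: Optional[int] = None,
    end_page: Optional[int] = None,
    context_chars: int = 20,        # hypothetical: characters of context per side
) -> List[dict]:
    """Regex search; unlike extraction, end_page=None runs to the last page."""
    regex = re.compile(pattern, 0 if case_sensitive else re.IGNORECASE)
    first = 0 if start_page is None else start_page
    last = len(pages) - 1 if end_page is None else end_page
    matches = []
    for n in range(first, last + 1):
        text = pages[n]
        for m in regex.finditer(text):
            lo = max(0, m.start() - context_chars)
            hi = min(len(text), m.end() + context_chars)
            matches.append({"page": n, "match": m.group(0), "context": text[lo:hi]})
    return matches
```

In this sketch, `extract_text_sketch(pages, 1)` yields a single string, while `extract_text_sketch(pages, 0, 1)` yields a dictionary keyed by page number. A caller that handles both shapes should check the return type before indexing.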