convert_pdf_to_markdown
Convert PDF files to markdown format with optional image extraction. Specify absolute paths for PDF input, image output, and markdown saving to streamline processing for LLMs or other use cases.
Instructions
Converts a PDF file to markdown format via pymupdf4llm. See pymupdf.readthedocs.io/en/latest/pymupdf4llm for more. The `file_path`, `image_path`, and `save_path` parameters must be absolute paths, not relative paths: `file_path` points to the PDF file, while `image_path` and `save_path` point to output directories. This tool will also convert the PDF to images and save them in the `image_path` directory. For larger PDF files, use `save_path` to save the markdown file, then read it partially.
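Reading a saved markdown file partially could look like the sketch below. It is not part of the tool: the helper name `read_markdown_lines` and its `start` and `count` parameters are hypothetical, and it assumes the saved file is UTF-8 text at the path the tool returned.

```python
from itertools import islice
from pathlib import Path


def read_markdown_lines(markdown_file: str, start: int = 0, count: int = 200) -> list[str]:
    """Return up to `count` lines of the file, starting at zero-based line `start`.

    Hypothetical helper for illustration; the tool itself only returns the
    path to the saved markdown file.
    """
    if start < 0 or count < 0:
        raise ValueError("start and count must be non-negative")
    with Path(markdown_file).open(encoding="utf-8") as handle:
        # islice streams through the file, so only the requested window
        # of lines is held in memory, not the whole document.
        return [line.rstrip("\n") for line in islice(handle, start, start + count)]
```

Calling it repeatedly with an increasing `start` walks through a long document one window at a time, which keeps each read small enough to pass to an LLM.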
Input Schema
Name | Required | Description | Default |
---|---|---|---|
file_path | Yes | Absolute path to the PDF file to convert | |
image_path | No | Optional. Absolute path to the directory to save the images. If not provided, the images will be saved in the same directory as the PDF file. | |
save_path | No | Optional. Absolute path to the directory to save the markdown file. If provided, will return the path to the markdown file. If not provided, will return the markdown string. | |
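
Since all three parameters must be absolute paths, a caller may want to check them before invoking the tool. The following is a minimal sketch under that assumption; `build_arguments` is a hypothetical helper, not something the tool provides, and it only mirrors the parameter names from the schema above.

```python
import os
from typing import Optional


def build_arguments(file_path: str,
                    image_path: Optional[str] = None,
                    save_path: Optional[str] = None) -> dict:
    """Assemble the tool arguments, rejecting any relative path.

    Hypothetical helper for illustration only.
    """
    candidates = {"file_path": file_path, "image_path": image_path, "save_path": save_path}
    arguments = {}
    for name, value in candidates.items():
        if value is None:
            continue  # optional parameter omitted; the tool applies its own behavior
        if not os.path.isabs(value):
            raise ValueError(f"{name} must be an absolute path, got {value!r}")
        arguments[name] = value
    return arguments
```

Omitted optional parameters are left out of the resulting dictionary rather than sent as empty values, so the tool's documented fallbacks (images next to the PDF, markdown returned as a string) still apply.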