convert_pdf_to_markdown
Convert PDF files to markdown format while extracting and saving images. Handles large files by saving markdown to a specified path. Ideal for LLM processing and structured document conversion.
Instructions
Converts a PDF file to markdown format via pymupdf4llm. See pymupdf.readthedocs.io/en/latest/pymupdf4llm for more. The file_path, image_path, and save_path parameters must all be absolute paths, not relative paths: file_path points to the PDF file, while image_path and save_path point to directories. This tool will also save images from the PDF in the image_path directory. For larger PDF files, use save_path to save the markdown file, then read it in parts.
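Reading the saved markdown file in parts can be sketched as follows. This is a minimal illustration that uses only the Python standard library; the helper name read_markdown_slice and its start_line and max_lines parameters are hypothetical and are not part of this tool.

```python
from itertools import islice
from pathlib import Path


def read_markdown_slice(markdown_file: str, start_line: int = 0, max_lines: int = 200) -> list[str]:
    """Return up to max_lines lines of a saved markdown file, starting at start_line (0-based).

    Hypothetical helper: the tool only returns the path of the saved file,
    so how that file is read afterwards is up to the caller.
    """
    path = Path(markdown_file)
    if not path.is_absolute():
        raise ValueError(f"expected an absolute path, got: {markdown_file}")
    with path.open(encoding="utf-8") as handle:
        # islice streams the file, so lines after the window are never loaded.
        window = islice(handle, start_line, start_line + max_lines)
        return [line.rstrip("\n") for line in window]
```

Calling the helper repeatedly with an increasing start_line walks through a large document one window at a time, which keeps each chunk small enough to pass to an LLM.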
Input Schema
Name | Required | Description | Default |
---|---|---|---|
file_path | Yes | Absolute path to the PDF file to convert | |
image_path | No | Optional. Absolute path to the directory to save the images. If not provided, the images will be saved in the same directory as the PDF file. | |
save_path | No | Optional. Absolute path to the directory to save the markdown file. If provided, will return the path to the markdown file. If not provided, will return the markdown string. | |
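As a rough sketch of how a caller might assemble arguments that match the schema above, the snippet below drops unset optional parameters and rejects relative paths before the tool is invoked. The function build_arguments is a hypothetical client-side helper, not part of the tool, and how the resulting dictionary is sent to the tool depends on the client in use.

```python
import os
from typing import Optional


def build_arguments(
    file_path: str,
    image_path: Optional[str] = None,
    save_path: Optional[str] = None,
) -> dict:
    """Build an argument dictionary for convert_pdf_to_markdown (hypothetical helper)."""
    candidates = {
        "file_path": file_path,    # required: the PDF file
        "image_path": image_path,  # optional: directory for images
        "save_path": save_path,    # optional: directory for the markdown file
    }
    arguments = {}
    for name, value in candidates.items():
        if value is None:
            continue  # leave optional parameters out so the tool applies its defaults
        if not os.path.isabs(value):
            raise ValueError(f"{name} must be an absolute path, got: {value}")
        arguments[name] = value
    return arguments
```

Validating on the client side is a design choice rather than a requirement: it surfaces a relative path as an immediate error instead of relying on how the tool itself reacts to one.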