parse_pdf

Extract content from PDF files via local paths or URLs and convert it to structured JSON or Markdown format for data processing.

Instructions

Parses a PDF file and returns the extracted content in the specified format. The function supports both local file paths and remote URLs as input sources, and formats the extracted content either as structured JSON or as a Markdown string.

:param source: The source of the PDF file to be parsed.
- If it is a string starting with "http://" or "https://", it is treated as a remote URL.
- Otherwise, it is treated as a local file path (absolute path recommended, e.g. "/Users/yourname/file.pdf").

:param format: The desired format for the parsed output. Supported values:
- "json": Returns the extracted content as a dictionary.
- "markdown": Returns the extracted content as a Markdown-formatted string.

:return: The extracted content in the specified format (JSON dictionary or Markdown string).
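The rule for interpreting `source` can be sketched in a few lines of Python. This is only an illustration of the prefix check described above, not the tool's actual implementation; the helper name `classify_source` is hypothetical, and the sketch assumes the prefix match is case-sensitive, which the description does not state.

```python
def classify_source(source: str) -> str:
    """Classify a parse_pdf source string as "url" or "path".

    Hypothetical helper: mirrors the documented rule that strings starting
    with "http://" or "https://" are remote URLs and everything else is
    treated as a local file path.
    """
    if source.startswith(("http://", "https://")):
        return "url"
    return "path"


if __name__ == "__main__":
    for candidate in ("https://example.com/report.pdf", "/Users/yourname/file.pdf"):
        print(candidate, "->", classify_source(candidate))
```

Under this rule, any other scheme (for example `ftp://`) falls through to the local-path branch, so a caller may want to reject such strings before invoking the tool.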

Input Schema

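Name	Required	Description	Default
`source`	Yes	PDF to parse: a remote URL (starting with "http://" or "https://") or a local file path
`format`	No	Output format: "json" or "markdown"	json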
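As a rough sketch of how a client might assemble arguments that match this schema, the Python below applies the documented default for `format` and rejects unsupported values. The function `build_arguments` and the constant `ALLOWED_FORMATS` are hypothetical names introduced for illustration; how the arguments are then sent to the tool depends on the MCP client in use and is not shown.

```python
import json

# The two output formats named in the tool description.
ALLOWED_FORMATS = ("json", "markdown")


def build_arguments(source: str, format: str = "json") -> dict:
    """Build a parse_pdf argument dictionary (hypothetical client-side helper).

    `source` is required; `format` is optional and defaults to "json",
    matching the input schema.
    """
    if not source:
        raise ValueError("source is required")
    if format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}, got {format!r}")
    return {"source": source, "format": format}


if __name__ == "__main__":
    # Default format, local file path.
    print(json.dumps(build_arguments("/Users/yourname/file.pdf")))
    # Explicit Markdown output, remote URL.
    print(json.dumps(build_arguments("https://example.com/report.pdf", "markdown")))
```

Validating on the client side is a design choice in this sketch; the description does not say how the tool itself responds to an unsupported `format` value.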

NetMind ParsePro
