pdf_to_xlsx
Convert PDF files and scanned images into XLSX Excel format. Extract data from public URLs, Google Drive, or Dropbox. Supports OCR for scanned documents, custom page ranges, and password-protected files.
Instructions
Convert PDFs and scanned images to XLSX (Excel 2007+) format.
Ref: https://developer.pdf.co/api-reference/pdf-to-excel/xlsx.md
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| api_key | No | PDF.co API key. If not provided, will use X_API_KEY environment variable. (Optional) | |
| httppassword | No | HTTP auth password, if required to access the source URL. (Optional) | |
| httpusername | No | HTTP auth user name, if required to access the source URL. (Optional) | |
| lang | No | OCR language for scanned documents. See PDF.co docs for supported languages. (Optional) | eng |
| line_grouping | No | Enables line grouping within table cells when set to '1'. (Optional) | 0 |
| name | No | File name for the generated output. (Optional) | |
| pages | No | Comma-separated page indices or ranges (e.g., '0, 1, 2-' or '1, 3-7'). Use '!' for inverted page numbers (e.g., '!0' for the last page). If omitted, all pages are processed. (Optional) | |
| password | No | Password of the PDF file. (Optional) | |
| rect | No | Defines coordinates for extraction (e.g., '51.8,114.8,235.5,204.0'). (Optional) | |
| unwrap | No | Unwraps lines into a single line within table cells when line_grouping is enabled. Must be true or false. (Optional) | |
| url | Yes | URL to the source file. Supports publicly accessible links, including Google Drive, Dropbox, and PDF.co Built-In Files Storage. Use the 'upload_file' tool to upload local files. | |
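
The table above can be turned into a small client-side check before calling the tool. The following is a minimal sketch, not part of the tool itself: the helper name `build_pdf_to_xlsx_args`, the `ALLOWED_PARAMS` set, and the example URL are assumptions made for illustration, and only the parameter names come from the table. How the resulting dictionary is sent to the tool depends on the client in use.

```python
from typing import Any

# Parameter names copied from the input schema table; the rest is illustrative.
ALLOWED_PARAMS = {
    "api_key", "httppassword", "httpusername", "lang", "line_grouping",
    "name", "pages", "password", "rect", "unwrap", "url",
}


def build_pdf_to_xlsx_args(url: str, **options: Any) -> dict[str, Any]:
    """Hypothetical helper: assemble an arguments dict for pdf_to_xlsx."""
    # 'url' is the only required parameter in the schema.
    if not url:
        raise ValueError("url is required")
    # Reject names that do not appear in the schema.
    unknown = set(options) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    # Drop options left as None so the tool applies its own defaults.
    args = {key: value for key, value in options.items() if value is not None}
    args["url"] = url
    return args


example_args = build_pdf_to_xlsx_args(
    "https://example.com/report.pdf",  # placeholder URL, not a real file
    pages="0,2-",   # first page, then the third page onward
    lang="eng",     # OCR language (the documented default)
    name=None,      # omitted from the result
)
print(example_args)
```

Leaving out unset options, rather than sending empty strings, keeps the request close to the documented defaults (for example `lang` falling back to `eng`).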