process_file
Convert PDF, Office documents, and ZIP files to markdown. Persist full unsliced output to disk with output_path and receive a slim response.
Instructions
Convert PDF, Word, Excel, PowerPoint, and ZIP files to markdown. Set output_path to persist the full, unsliced converted markdown to disk and receive a slim response.
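As a rough illustration of the instruction above, the sketch below assembles an arguments object for a call that persists the converted markdown to disk. The helper `build_process_file_args` is hypothetical and not part of the tool, the URL is a placeholder, and how the arguments are actually sent depends on the client in use.

```python
import json
import os
import tempfile


def build_process_file_args(url, output_path=None, **options):
    """Assemble an arguments dict for process_file (hypothetical helper)."""
    if not url:
        raise ValueError("url is required")
    args = {"url": url}
    if output_path is not None:
        # The input schema asks for an absolute file path.
        if not os.path.isabs(output_path):
            raise ValueError("output_path must be an absolute path")
        args["output_path"] = output_path
    # Any remaining optional parameters are passed through unchanged.
    args.update(options)
    return args


# Placeholder URL and a target in the system temp directory (both assumptions).
target = os.path.join(tempfile.gettempdir(), "report.md")
args = build_process_file_args(
    "https://example.com/report.pdf",
    output_path=target,
    overwrite=True,
)
print(json.dumps(args, indent=2))
```

Checking the path on the client side is a design choice for this sketch only; the tool may apply its own validation.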
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | File URL or local path (PDF, Office, ZIP). Supports http/https URLs, file:// URIs, and absolute paths. | |
| max_size_mb | No | Max file size in MB | |
| extract_all_from_zip | No | Extract ZIP contents | |
| include_metadata | No | Include metadata | |
| auto_summarize | No | Auto-summarize large content | |
| max_content_tokens | No | Max tokens before summarization | |
| summary_length | No | Summary length: 'short', 'medium', or 'long' | medium |
| llm_provider | No | LLM provider | |
| llm_model | No | LLM model | |
| content_limit | No | Max characters to return (0=unlimited) | |
| content_offset | No | Start position for content (0-indexed) | |
| output_path | No | Absolute file path (auto .md extension) to persist the full unsliced converted markdown. When set, the response is slimmed to metadata+file path. content_limit/content_offset still affect the response copy but not the on-disk file. | |
| include_content_in_response | No | When True (with output_path set), keep content in the response too. Note: the response copy is still subject to content_limit/content_offset slicing; only the on-disk file holds the full unsliced payload. Defaults to False. | |
| overwrite | No | Overwrite an existing output file at output_path. Defaults to False. | |
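The interaction between output_path, include_content_in_response, content_limit, and content_offset can be easier to follow in code. The following is a minimal simulation of the behavior described in the table, not the tool's implementation: the function names and the response keys (`output_path`, `content`) are assumptions made for illustration, and raising `FileExistsError` when overwrite is False is likewise a guess at how a refusal might surface.

```python
import os
import tempfile


def slice_content(markdown, content_limit=0, content_offset=0):
    """Response copy: start at content_offset, keep at most content_limit chars (0 = unlimited)."""
    start = max(content_offset, 0)
    if content_limit > 0:
        return markdown[start:start + content_limit]
    return markdown[start:]


def simulate_process_file(markdown, output_path=None, include_content_in_response=False,
                          overwrite=False, content_limit=0, content_offset=0):
    """Mimic the documented response shaping (hypothetical keys and error type)."""
    response = {}
    if output_path is not None:
        if os.path.exists(output_path) and not overwrite:
            raise FileExistsError(output_path)
        # The on-disk file always receives the full, unsliced markdown.
        with open(output_path, "w", encoding="utf-8") as fh:
            fh.write(markdown)
        response["output_path"] = output_path
    # Content is returned when nothing is persisted, or when explicitly requested.
    if output_path is None or include_content_in_response:
        response["content"] = slice_content(markdown, content_limit, content_offset)
    return response


full_text = "# Title\n\nBody of the converted document."
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "converted.md")
    slim = simulate_process_file(full_text, output_path=target, content_limit=7)
    both = simulate_process_file(full_text, output_path=target, overwrite=True,
                                 include_content_in_response=True, content_limit=7)
    with open(target, encoding="utf-8") as fh:
        on_disk = fh.read()
```

In this simulation `slim` carries only the file path, `both` also carries a response copy cut to seven characters, and `on_disk` holds the full text in either case.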
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |