extract_content
Extract and clean web page content into structured Markdown with citations, supporting multiple formats and image/link inclusion.
Instructions
Extract and clean content from a web page, returning Markdown with citations.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | The URL to fetch and extract content from | |
| format | No | Output format | markdown |
| includeImages | No | Whether to include images in the output | true |
| includeLinks | No | Whether to include links in the output | true |
| bypassRobots | No | Whether to bypass robots.txt restrictions | false |
| useCache | No | Whether to use cached content if available | true |
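As an illustration of how the parameters above combine, here is a minimal sketch that builds an arguments object for `extract_content`. The helper name `build_extract_content_args` is hypothetical and not part of the tool; the sketch only applies the documented defaults and rejects parameter names that the table does not list. How the resulting object is sent to the tool depends on your client and is not shown.

```python
import json

# Defaults as documented in the parameter table above.
DEFAULTS = {
    "format": "markdown",
    "includeImages": True,
    "includeLinks": True,
    "bypassRobots": False,
    "useCache": True,
}


def build_extract_content_args(url, **overrides):
    """Hypothetical helper: merge caller overrides with the documented defaults."""
    # url is the only required parameter.
    if not isinstance(url, str) or not url:
        raise ValueError("url is required and must be a non-empty string")
    # Reject names that do not appear in the input schema.
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    return {"url": url, **DEFAULTS, **overrides}


if __name__ == "__main__":
    # Example: fetch fresh content and leave images out of the output.
    args = build_extract_content_args(
        "https://example.com/article",
        includeImages=False,
        useCache=False,
    )
    print(json.dumps(args, indent=2))
```

Since every optional parameter has a default, passing only `url` should be sufficient for a typical call; the overrides are needed only when the default behavior is not wanted.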