crawl_url_with_fallback
Extract web content using fallback strategies to handle anti-bot protections, with options for pagination, media extraction, and markdown generation.
Instructions
Crawl sites that use anti-bot protections by trying fallback strategies. Use content_offset and content_limit together to page through long content.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL to crawl | |
| css_selector | No | CSS selector | |
| extract_media | No | Extract media | |
| take_screenshot | No | Take screenshot | |
| generate_markdown | No | Generate markdown | |
| wait_for_selector | No | Element to wait for | |
| timeout | No | Timeout in seconds | |
| wait_for_js | No | Wait for JavaScript | |
| auto_summarize | No | Auto-summarize content | |
| content_limit | No | Max characters to return (0=unlimited) | |
| content_offset | No | Start position for content (0-indexed) | |
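
The pagination parameters can be sketched as a loop that advances content_offset by the number of characters received. This is a minimal sketch, not the tool's documented client: `call_tool` is a hypothetical callable standing in for whatever client invokes the tool, and it is assumed to return the extracted content as a plain string. The real response shape is not specified here and may differ.

```python
from typing import Callable, Dict, Iterator


def paginate_crawl(
    call_tool: Callable[[str, Dict[str, object]], str],  # hypothetical client function
    url: str,
    page_size: int = 5000,
    max_pages: int = 100,
) -> Iterator[str]:
    """Yield successive chunks of crawled content, page_size characters at a time."""
    if page_size <= 0:
        # content_limit=0 means "unlimited", so it cannot be used as a page size.
        raise ValueError("page_size must be a positive number of characters")

    offset = 0  # content_offset is 0-indexed
    for _ in range(max_pages):
        arguments = {
            "url": url,
            "content_offset": offset,
            "content_limit": page_size,
        }
        # Assumption: the call returns the content slice as a string.
        chunk = call_tool("crawl_url_with_fallback", arguments)
        if not chunk:
            break  # nothing left to read
        yield chunk
        if len(chunk) < page_size:
            break  # a short chunk means the end of the content was reached
        offset += len(chunk)
```

The `max_pages` cap is a defensive choice for this sketch, so that an unexpectedly large page cannot drive an unbounded number of calls.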