scrape_crawl
Start a web crawl from a seed URL, returning a job ID for progress tracking. Use URL glob patterns to include or exclude pages within the crawl.
Instructions
Start a crawl job from a seed URL. Returns immediately with a job_id.
Poll progress with scrape_job_status(job_id). Use include_patterns / exclude_patterns (URL glob patterns) to scope the crawl.
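The start-then-poll flow described above can be sketched as follows. This is a minimal illustration, not the server's client API: `call_tool` is a hypothetical callable standing in for whatever client invokes the tools, and the `status` field with the values `"completed"` and `"failed"` is an assumed response shape that this page does not document. Only the tool names, the `job_id` return value, and the parameter names come from the text.

```python
import time
from typing import Any, Callable, Dict, Optional, Sequence

# Hypothetical: any function that invokes a tool by name with an arguments dict
# and returns the tool's result as a dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def crawl_and_wait(
    call_tool: ToolCaller,
    url: str,
    include_patterns: Optional[Sequence[str]] = None,
    exclude_patterns: Optional[Sequence[str]] = None,
    poll_seconds: float = 2.0,
    max_polls: int = 30,
) -> Dict[str, Any]:
    """Start a crawl, then poll scrape_job_status until the job finishes."""
    arguments: Dict[str, Any] = {"url": url}
    if include_patterns:
        arguments["include_patterns"] = list(include_patterns)
    if exclude_patterns:
        arguments["exclude_patterns"] = list(exclude_patterns)

    # scrape_crawl returns immediately with a job_id.
    started = call_tool("scrape_crawl", arguments)
    job_id = started["job_id"]

    for _ in range(max_polls):
        status = call_tool("scrape_job_status", {"job_id": job_id})
        # Assumed terminal states; adjust to the real response shape.
        if status.get("status") in ("completed", "failed"):
            return status
        time.sleep(poll_seconds)

    raise TimeoutError(f"crawl job {job_id} did not finish after {max_polls} polls")
```

Bounding the loop with `max_polls` keeps a stalled job from blocking the caller indefinitely.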
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | | |
| format | No | | markdown |
| max_depth | No | | |
| max_pages | No | | |
| concurrency | No | | |
| include_patterns | No | | |
| exclude_patterns | No | | |
| main_content | No | | |
| timeout_ms | No | | |
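To give a feel for how include_patterns and exclude_patterns might scope a crawl, here is a rough sketch. The page does not say which glob dialect the server uses or how the two lists interact, so this uses Python's `fnmatch` as a stand-in and assumes that a URL is kept when it matches at least one include pattern (or no include patterns are given) and matches no exclude pattern. The helper name `in_scope` and the example URLs are illustrative only.

```python
from fnmatch import fnmatchcase
from typing import Sequence


def in_scope(url: str, include: Sequence[str], exclude: Sequence[str]) -> bool:
    """Approximate include/exclude filtering with fnmatch-style globs.

    Assumption: in fnmatch, '*' matches any run of characters, including '/'.
    The server's real matching rules may differ.
    """
    if include and not any(fnmatchcase(url, pattern) for pattern in include):
        return False
    return not any(fnmatchcase(url, pattern) for pattern in exclude)


include = ["https://example.com/docs/*"]
exclude = ["*/private/*"]

print(in_scope("https://example.com/docs/intro", include, exclude))      # True
print(in_scope("https://example.com/docs/private/x", include, exclude))  # False
print(in_scope("https://example.com/blog/post", include, exclude))       # False
```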