deep_crawl_site
Crawl multiple pages from a website with configurable depth and parameters to extract structured content for analysis.
Instructions
Crawl multiple pages from a site with configurable depth.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Starting URL | |
| max_depth | No | Link depth (1-2) | |
| max_pages | No | Maximum number of pages to crawl (limit: 10) | |
| crawl_strategy | No | One of 'bfs', 'dfs', or 'best_first' | bfs |
| include_external | No | Follow external links | |
| url_pattern | No | URL filter pattern | |
| score_threshold | No | Min relevance 0-1 | |
| extract_media | No | Extract media | |
| base_timeout | No | Timeout per page | |
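
The limits in the table can be checked on the client side before the tool is called. The following is a minimal sketch, not part of the tool itself: the helper name `validate_crawl_args`, the integer and boolean value types, and the lower bound of 1 for `max_pages` are assumptions, since the table states only the ranges shown above.

```python
from urllib.parse import urlparse

# Strategy names as listed in the parameter table.
ALLOWED_STRATEGIES = {"bfs", "dfs", "best_first"}


def validate_crawl_args(args: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments
    fit the documented limits. Hypothetical helper for illustration."""
    problems = []

    # url is the only required parameter.
    url = args.get("url")
    if not isinstance(url, str) or not urlparse(url).scheme:
        problems.append("url is required and should be an absolute URL")

    # max_depth is documented as 1-2 (integer type is an assumption).
    depth = args.get("max_depth")
    if depth is not None and depth not in (1, 2):
        problems.append("max_depth must be 1 or 2")

    # max_pages is capped at 10 (the lower bound of 1 is an assumption).
    pages = args.get("max_pages")
    if pages is not None and not (1 <= pages <= 10):
        problems.append("max_pages must be between 1 and 10")

    # crawl_strategy defaults to 'bfs' per the table.
    strategy = args.get("crawl_strategy", "bfs")
    if strategy not in ALLOWED_STRATEGIES:
        problems.append("crawl_strategy must be 'bfs', 'dfs', or 'best_first'")

    # score_threshold is a minimum relevance between 0 and 1.
    threshold = args.get("score_threshold")
    if threshold is not None and not (0 <= threshold <= 1):
        problems.append("score_threshold must be between 0 and 1")

    return problems


# Example arguments; the URL is a placeholder.
example_args = {
    "url": "https://example.com/docs",
    "max_depth": 2,
    "max_pages": 10,
    "crawl_strategy": "best_first",
    "include_external": False,
    "score_threshold": 0.3,
}
```

Calling `validate_crawl_args(example_args)` returns an empty list, so this dictionary could be passed as the tool's arguments. The sketch leaves `url_pattern`, `extract_media`, and `base_timeout` unchecked because the table does not state their formats or units.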