scout_crawl
Crawl a website from a given start URL and return its pages, with an option to limit the number of pages and stay on the same host.
Instructions
Crawl a website from a start URL and return its pages. Set max_pages to cap how many pages are crawled.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| start_url | Yes | The URL to start crawling from | |
| max_pages | No | Maximum pages to crawl | |
| same_host_only | No | Stay on the same host | |
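
The sketch below illustrates how the three inputs could interact in a bounded breadth-first crawl. It is not the tool's actual implementation, which this page does not describe. The `crawl` function, the injected `fetch` callable, the returned page shape, and the default values (`max_pages=10`, `same_host_only=True`) are all assumptions made for illustration; the schema above lists no defaults.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class _LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url, fetch, max_pages=10, same_host_only=True):
    """Hypothetical breadth-first crawl.

    fetch(url) is an assumed callable returning HTML text, or None on failure.
    The defaults here are illustrative, not the tool's documented defaults.
    """
    start_host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    pages = []

    # Stop when the frontier is empty or the page budget is spent.
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages.append({"url": url, "html": html})

        parser = _LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop #fragments before deduplicating.
            link, _ = urldefrag(urljoin(url, href))
            parsed = urlparse(link)
            if parsed.scheme not in ("http", "https"):
                continue
            if same_host_only and parsed.netloc != start_host:
                continue
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

Passing `fetch` in as a parameter keeps the sketch free of network code, so it can be exercised against an in-memory dictionary of pages (for example `crawl("https://example.com/", site.get, max_pages=2)`, where `site` maps URLs to HTML strings).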