# crawl-site
Scan and extract all unique URLs from a website by recursively crawling from a given URL up to a specified depth. Designed for web content analysis.
## Instructions
Recursively crawls a website starting from a given URL, up to a specified maximum depth. It follows only links within the same origin and returns a list of all unique URLs found during the crawl.
## Input Schema
| Name | Required | Description | Default |
| --- | --- | --- | --- |
| maxDepth | No | The maximum depth to crawl relative to the starting URL. 0 means only the starting URL is fetched. The maximum allowed depth is 5, to prevent excessive crawling. | 2 |
| url | Yes | The starting URL for the crawl. Must be a valid HTTP or HTTPS URL. | |
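The crawl described above can be sketched in Python using only the standard library. This is an illustrative approximation, not the tool's actual implementation: the function name `crawl_site`, the `_LinkParser` helper, and the injectable `fetch` parameter (useful for testing without network access) are all assumptions.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _LinkParser(HTMLParser):
    """Collects href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl_site(url, max_depth=2, fetch=None):
    """Return all unique same-origin URLs reachable within max_depth link hops.

    max_depth=0 fetches only the starting URL; the cap of 5 mirrors the
    schema's limit. `fetch` maps a URL to its HTML (injectable for tests).
    """
    max_depth = min(max_depth, 5)
    if fetch is None:
        from urllib.request import urlopen
        fetch = lambda u: urlopen(u, timeout=10).read().decode("utf-8", "replace")

    parsed_start = urlparse(url)
    origin = (parsed_start.scheme, parsed_start.netloc)
    seen = {url}
    frontier = [url]

    for _ in range(max_depth):
        next_frontier = []
        for page in frontier:
            try:
                html = fetch(page)
            except OSError:
                continue  # skip pages that fail to load
            parser = _LinkParser()
            parser.feed(html)
            for href in parser.links:
                absolute = urljoin(page, href)
                parsed = urlparse(absolute)
                if (parsed.scheme, parsed.netloc) != origin:
                    continue  # follow same-origin links only
                absolute = absolute.split("#", 1)[0]  # drop fragments
                if absolute not in seen:
                    seen.add(absolute)
                    next_frontier.append(absolute)
        frontier = next_frontier

    return sorted(seen)
```

With a fake `fetch`, `crawl_site("https://example.com/", max_depth=2, fetch=fake)` stops after two hops and never follows off-origin links, matching the behavior described above.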