extract_urls
Crawl a webpage to extract, validate, and optionally queue all hyperlinks for processing, aiding in documentation discovery and graph building.
Instructions
Extract and analyze all URLs from a given web page. This tool crawls the specified webpage, identifies all hyperlinks, and optionally adds them to the processing queue. Useful for discovering related documentation pages, API references, or building a documentation graph. Handles various URL formats and validates links before extraction.
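The extraction step described above can be pictured with a minimal sketch. This is not the tool's actual implementation, which this page does not show. It assumes links come from `<a href>` attributes, that relative links are resolved against the page URL, and that "validation" means keeping only absolute `http`/`https` links. The names `LinkCollector` and `extract_links` are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects raw href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_links(html, base_url):
    """Return absolute, de-duplicated http(s) links found in html.

    Assumption: a link counts as valid when it resolves to an
    http or https URL with a host; the real tool may apply other checks.
    """
    collector = LinkCollector()
    collector.feed(html)
    links = []
    for href in collector.hrefs:
        # Resolve relative links and drop the #fragment part.
        absolute, _fragment = urldefrag(urljoin(base_url, href.strip()))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc:
            if absolute not in links:
                links.append(absolute)
    return links
```

Calling `extract_links(page_html, "https://docs.example.com/start/index.html")` on already-fetched HTML returns the resolved links in document order, skipping schemes such as `mailto:` and repeated targets.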
Input Schema
Name | Required | Description | Default |
---|---|---|---|
add_to_queue | No | If true, automatically add extracted URLs to the processing queue for later indexing. This enables recursive documentation discovery. Use with caution on large sites to avoid excessive queuing. | |
url | Yes | The complete URL of the webpage to analyze (must include protocol, e.g., https://). The page must be publicly accessible. | |
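As an illustration of the schema above, the following sketch builds an arguments object for a call to `extract_urls` and checks the `url` requirement before sending it. The helper name `build_extract_urls_arguments` is hypothetical, and restricting the protocol to `http`/`https` is an assumption: the schema only says the protocol must be included. Because no default is documented for `add_to_queue`, the sketch omits the field unless the caller sets it.

```python
from urllib.parse import urlparse


def build_extract_urls_arguments(url, add_to_queue=None):
    """Build the arguments dict for extract_urls (hypothetical helper)."""
    parts = urlparse(url)
    # Assumption: only http and https count as an acceptable protocol.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError("url must be complete and include a protocol, e.g. https://")
    arguments = {"url": url}
    # Leave add_to_queue out when unset, since its default is undocumented.
    if add_to_queue is not None:
        arguments["add_to_queue"] = bool(add_to_queue)
    return arguments
```

For example, `build_extract_urls_arguments("https://docs.example.com/", add_to_queue=True)` produces a dict with both fields, while a value such as `"docs.example.com"` is rejected because it lacks a protocol. How the arguments are then sent depends on the client in use.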