x402_crawl_site
Crawl websites via BFS to extract markdown, links, tables, images, and metadata from multiple pages. Configure depth, page limits, and path filters for structured data collection.
Instructions
Crawl a website via breadth-first search (BFS) and return per-page extraction results (markdown, links, tables, images, metadata). Price: $0.10 USDC per crawl in paid mode. The free test mode returns fixture data.
Crawls up to max_pages pages starting from the seed URL, following links up to max_depth hops deep. Each page goes through the same extraction pipeline as x402_scrape_url and returns markdown, links, tables, images, and metadata. Optional include_paths and exclude_paths glob filters (e.g. '/blog/*') restrict which URLs are followed. Hard limits: at most 15 pages and a maximum depth of 5. The response reports pages_requested, pages_crawled, and pages_skipped. Without X402_PRIVATE_KEY, only the free test endpoint is available.
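The traversal described above can be sketched as a bounded breadth-first search. This is a minimal illustration, not the tool's actual implementation: `bfs_crawl`, `path_allowed`, and the in-memory `link_graph` (standing in for real page fetches) are hypothetical names, the seed is assumed to sit at depth 0 and to bypass the path filters, and Python's `fnmatch` is assumed as the glob matcher (its `*` also matches `/`, which may differ from the tool's own glob rules).

```python
from collections import deque
from fnmatch import fnmatch
from urllib.parse import urlparse

# Hard limits stated in the tool description.
HARD_MAX_PAGES = 15
HARD_MAX_DEPTH = 5


def path_allowed(url, include_paths=None, exclude_paths=None):
    """Return True if the URL's path passes the include/exclude glob filters."""
    path = urlparse(url).path or "/"
    if exclude_paths and any(fnmatch(path, p) for p in exclude_paths):
        return False
    if include_paths and not any(fnmatch(path, p) for p in include_paths):
        return False
    return True


def bfs_crawl(seed_url, link_graph, max_pages=10, max_depth=2,
              include_paths=None, exclude_paths=None):
    """Visit pages breadth-first; link_graph maps a URL to its outgoing links."""
    max_pages = min(max_pages, HARD_MAX_PAGES)
    max_depth = min(max_depth, HARD_MAX_DEPTH)
    queue = deque([(seed_url, 0)])  # (url, depth); seed assumed to be depth 0
    seen = {seed_url}
    crawled = []
    while queue and len(crawled) < max_pages:
        url, depth = queue.popleft()
        crawled.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for link in link_graph.get(url, []):
            if link in seen:
                continue
            if not path_allowed(link, include_paths, exclude_paths):
                continue
            seen.add(link)
            queue.append((link, depth + 1))
    return crawled
```

Because the queue is first-in, first-out, every page at depth 1 is visited before any page at depth 2, so a small max_pages budget is spent on the pages closest to the seed.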
Returns: seed_url, pages_requested, pages_crawled, pages_skipped, reasons_skipped, results array.
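As a rough illustration of consuming that response, the helper below condenses it into a few totals. The top-level field names come from the list above; everything else is an assumption: `summarize_crawl` is a hypothetical helper, the response is assumed to arrive as a parsed JSON dict, and each entry in results is assumed to be a dict with a markdown string and a links list.

```python
def summarize_crawl(response):
    """Condense a crawl response dict into a few totals (assumed shapes)."""
    results = response.get("results", [])
    return {
        "seed_url": response.get("seed_url"),
        # Fall back to the number of result entries if the counter is absent.
        "crawled": response.get("pages_crawled", len(results)),
        "skipped": response.get("pages_skipped", 0),
        # Per-page keys are assumptions based on the extraction fields named above.
        "markdown_chars": sum(len(page.get("markdown") or "") for page in results),
        "link_count": sum(len(page.get("links") or []) for page in results),
    }
```

The `.get` lookups with defaults keep the helper tolerant of missing fields, which is useful because the exact response schema is not spelled out here.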
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Seed URL to begin crawling (http/https, max 2048 chars) | |
| max_pages | No | Maximum pages to crawl (1-15) | 10 |
| max_depth | No | Maximum link depth from seed URL (1-5) | 2 |
| include_paths | No | Only follow URLs matching these path glob patterns (e.g. '/blog/*', max 20) | |
| exclude_paths | No | Skip URLs matching these path glob patterns (e.g. '/admin/*', max 20) | |
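
The constraints in the table can be checked on the client side before a paid call is made. The sketch below is one possible pre-flight check, not part of the tool itself: `validate_crawl_args` is a hypothetical helper, and "max 20" is assumed to mean at most 20 patterns per filter list.

```python
from urllib.parse import urlparse


def validate_crawl_args(url, max_pages=10, max_depth=2,
                        include_paths=None, exclude_paths=None):
    """Validate arguments against the documented limits and return them as a dict."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError("url must be an http or https URL")
    if len(url) > 2048:
        raise ValueError("url must be at most 2048 characters")
    if not 1 <= max_pages <= 15:
        raise ValueError("max_pages must be between 1 and 15")
    if not 1 <= max_depth <= 5:
        raise ValueError("max_depth must be between 1 and 5")

    args = {"url": url, "max_pages": max_pages, "max_depth": max_depth}
    for name, patterns in (("include_paths", include_paths),
                           ("exclude_paths", exclude_paths)):
        if patterns is None:
            continue  # optional filters are omitted when not supplied
        if len(patterns) > 20:  # assumed meaning of "max 20"
            raise ValueError(f"{name} accepts at most 20 patterns")
        args[name] = list(patterns)
    return args
```

Failing fast on an out-of-range value avoids spending a paid crawl on a request that would be rejected anyway.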