fetch_url
Extract readable content from URLs using a cascade of fetch methods (Firecrawl, Crawl4AI, raw HTTP). Returns clean markdown, truncated to 8000 characters, with 24-hour caching.
Instructions
Fetch and extract readable content from any URL. GitHub URLs are fetched via the GitHub API; all other URLs go through a fetch cascade: Firecrawl → Crawl4AI → raw HTTP. The tool returns clean markdown where possible. Content is truncated to 8000 characters, and results are cached for 24 hours. URLs on blocked domains are refused.
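
The behavior described above (refuse blocked domains, check the cache, try each fetch method in order, truncate the result) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the tool's actual implementation: `fetch_with_cascade`, `BlockedDomainError`, the in-memory dict cache, and the idea that a fetcher signals failure by raising or returning nothing are all hypothetical. The fetchers are passed in as plain callables, so no Firecrawl or Crawl4AI API is assumed, and the GitHub API route is left out.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple
from urllib.parse import urlparse

MAX_CHARS = 8000                   # truncation limit stated in the description
CACHE_TTL_SECONDS = 24 * 60 * 60   # 24-hour cache lifetime

# Hypothetical fetcher shape: takes a URL, returns markdown or None on failure.
Fetcher = Callable[[str], Optional[str]]


class BlockedDomainError(Exception):
    """Raised when the URL's host is on the blocked list (hypothetical name)."""


def fetch_with_cascade(
    url: str,
    fetchers: Iterable[Fetcher],
    blocked_domains: Iterable[str] = (),
    cache: Optional[Dict[str, Tuple[float, str]]] = None,
    now: Callable[[], float] = time.time,
) -> str:
    # Refuse blocked domains before doing any work.
    host = urlparse(url).hostname or ""
    if host in blocked_domains:
        raise BlockedDomainError(host)

    # Serve a cached result if it is younger than 24 hours.
    if cache is not None and url in cache:
        stored_at, content = cache[url]
        if now() - stored_at < CACHE_TTL_SECONDS:
            return content

    # Try each fetch method in order; fall through on error or empty result.
    for fetch in fetchers:
        try:
            content = fetch(url)
        except Exception:
            continue
        if content:
            content = content[:MAX_CHARS]
            if cache is not None:
                cache[url] = (now(), content)
            return content

    raise RuntimeError("all fetch methods failed for " + url)
```

In this sketch the caller would pass the fetchers in cascade order, for example `[firecrawl_fetch, crawl4ai_fetch, raw_http_fetch]`, where each of those names stands for a hypothetical wrapper around the corresponding method.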
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL to fetch and extract content from | |
| domain_profile | No | Named domain profile to apply: 'homelab' or 'dev'. Omit to use the default filters. | |
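
As a rough illustration of the schema above, the following sketch builds an argument payload for the tool and checks it against the two documented parameters. The helper name `build_fetch_url_arguments` and the client-side validation are assumptions made for this example; how the arguments are actually sent to the tool depends on the client and is not shown.

```python
from typing import Dict, Optional

# Profile names taken from the parameter table.
ALLOWED_PROFILES = {"homelab", "dev"}


def build_fetch_url_arguments(
    url: str, domain_profile: Optional[str] = None
) -> Dict[str, str]:
    """Return an arguments dict for fetch_url (hypothetical helper)."""
    if not url:
        raise ValueError("url is required")
    arguments = {"url": url}
    # Omitting domain_profile leaves the default filters in effect.
    if domain_profile is not None:
        if domain_profile not in ALLOWED_PROFILES:
            raise ValueError("unknown domain_profile: " + domain_profile)
        arguments["domain_profile"] = domain_profile
    return arguments
```

A call such as `build_fetch_url_arguments("https://example.com", "dev")` yields a dict with both keys, while leaving out the second argument yields only `url`.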