fetch_url
Fetch any URL with automatic JavaScript rendering and bot-protection handling. Optionally strip HTML noise for efficient LLM consumption.
Instructions
Fetch any URL with automatic JavaScript rendering and handling of common bot protection. Advanced behavioral fingerprinting may still block header retrieval; when it does, the response reports headersAvailable: false. Returns body, headers, and cleanStats. The optional cleanHtml flag strips HTML noise (scripts, styles, comments) while preserving text content, which lowers token cost for LLM consumption.
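A caller may want to guard against the blocked-header case before reading headers. The sketch below is illustrative only: it assumes the tool result has already been parsed into a Python dict and that headers is a name-to-value mapping, neither of which is stated above. The helper name usable_headers is hypothetical.

```python
def usable_headers(result: dict) -> dict:
    """Return response headers, or an empty dict when none could be retrieved.

    Assumption: `result` is the parsed tool response and `headers` is a mapping.
    """
    # headersAvailable: false signals that bot protection blocked header retrieval.
    if result.get("headersAvailable") is False:
        return {}
    # headers is absent when headersNeeded was not set, so fall back to empty.
    return result.get("headers") or {}
```

Returning an empty dict in both the blocked and the not-requested case keeps downstream lookups simple; a caller that needs to tell the two apart can check headersAvailable directly.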
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Target URL. Must be http:// or https:// and resolve to a public host (IPv4/IPv6 literals and localhost are rejected). | |
| cleanHtml | No | Strip scripts/styles/comments from text/html responses. Requires bodyNeeded. Significant token-cost reduction for LLM consumption. | false |
| bodyNeeded | No | Include body + contentType in response. | true |
| bodyMaxBytes | No | Per-request body cap in bytes. Range: 1024–104857600 (1 KiB–100 MiB). | |
| maxTimeoutMs | No | Caller timeout budget in ms. Range: 1000–120000. | |
| headersNeeded | No | Include headers + headersAvailable in response. At least one of bodyNeeded or headersNeeded must be true. | false |
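The constraints in the table can be checked on the client before a call is made. The following is a minimal sketch, not the tool's own validation: the function name validate_fetch_url_args is hypothetical, the ranges are assumed to be inclusive, and the public-host check is only approximated (it rejects IP literals and localhost but does no DNS resolution).

```python
import ipaddress
from urllib.parse import urlsplit

# Ranges copied from the table; treated as inclusive (an assumption).
BODY_MAX_BYTES_RANGE = (1024, 104_857_600)
MAX_TIMEOUT_MS_RANGE = (1_000, 120_000)


def validate_fetch_url_args(args: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    url = args.get("url")
    if not isinstance(url, str):
        return ["url is required"]

    problems = []
    parts = urlsplit(url)
    host = parts.hostname or ""
    if parts.scheme not in ("http", "https"):
        problems.append("url must start with http:// or https://")
    if not host:
        problems.append("url has no host")
    elif host == "localhost":
        problems.append("localhost is rejected")
    else:
        try:
            ipaddress.ip_address(host)
            problems.append("IP literals are rejected")
        except ValueError:
            pass  # not an IP literal, so it is a hostname

    # Apply the documented defaults before checking flag combinations.
    body_needed = args.get("bodyNeeded", True)
    headers_needed = args.get("headersNeeded", False)
    if not (body_needed or headers_needed):
        problems.append("at least one of bodyNeeded or headersNeeded must be true")
    if args.get("cleanHtml", False) and not body_needed:
        problems.append("cleanHtml requires bodyNeeded")

    for name, (low, high) in (
        ("bodyMaxBytes", BODY_MAX_BYTES_RANGE),
        ("maxTimeoutMs", MAX_TIMEOUT_MS_RANGE),
    ):
        value = args.get(name)
        if value is not None and not (isinstance(value, int) and low <= value <= high):
            problems.append(f"{name} must be an integer in {low}..{high}")
    return problems
```

For example, validate_fetch_url_args({"url": "https://example.com", "cleanHtml": True}) returns an empty list, while passing bodyNeeded as False without headersNeeded reports the "at least one" rule.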