crw_scrape
Scrape any web page to get clean markdown, HTML, or links. Optionally render JavaScript, filter with CSS selectors, and wait for dynamic content.
Instructions
Scrape a single URL and return its content as markdown, HTML, or links. Use this to extract content from any web page.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| excludeTags | No | CSS selectors to exclude from output | |
| formats | No | Output formats to return | ["markdown"] |
| includeTags | No | CSS selectors to include (only content matching these selectors) | |
| onlyMainContent | No | Extract only the main content, removing nav, footer, and similar page chrome | true |
| renderJs | No | Render JavaScript before extracting (true = force JS, false = HTTP only, omit = auto-detect or use the server's render_js_default) | |
| renderer | No | Pin this request to a specific renderer. "auto" uses the configured fallback chain. Any other value hard-pins the request to a single renderer with no fallback, and implies renderJs: true unless renderJs: false is set explicitly. | "auto" |
| url | Yes | The URL to scrape | |
| waitFor | No | Milliseconds to wait after JS rendering for late content/XHRs | |
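
The parameters in the table can be assembled into an arguments object before the tool is called. The following is a minimal sketch, not the server's own client code: the helper names `build_scrape_args` and `js_rendering_requested` are hypothetical, the renderer name `"example-renderer"` is a placeholder, and how the arguments are actually sent depends on your client. Only the parameter names and the renderer/renderJs rule come from the table above.

```python
from typing import Any, Optional


def build_scrape_args(
    url: str,
    formats: Optional[list[str]] = None,
    only_main_content: Optional[bool] = None,
    include_tags: Optional[list[str]] = None,
    exclude_tags: Optional[list[str]] = None,
    render_js: Optional[bool] = None,
    renderer: Optional[str] = None,
    wait_for: Optional[int] = None,
) -> dict[str, Any]:
    """Build a crw_scrape arguments dict, leaving out anything not set.

    Omitted keys let the server apply its documented defaults
    (for example formats=["markdown"] and onlyMainContent=true).
    """
    if not url:
        raise ValueError("url is required")
    if wait_for is not None and wait_for < 0:
        raise ValueError("waitFor is a duration in milliseconds and cannot be negative")

    candidates: dict[str, Any] = {
        "url": url,
        "formats": formats,
        "onlyMainContent": only_main_content,
        "includeTags": include_tags,
        "excludeTags": exclude_tags,
        "renderJs": render_js,
        "renderer": renderer,
        "waitFor": wait_for,
    }
    # Drop unset values so that "omit" keeps its documented meaning.
    return {key: value for key, value in candidates.items() if value is not None}


def js_rendering_requested(args: dict[str, Any]) -> Optional[bool]:
    """Mirror the documented renderer/renderJs rule on the client side.

    Returns True or False when the request decides the question,
    or None when it is left to auto-detection or the server default.
    """
    if "renderJs" in args:
        return args["renderJs"]
    if args.get("renderer", "auto") != "auto":
        # Pinning a non-auto renderer implies renderJs: true.
        return True
    return None


if __name__ == "__main__":
    args = build_scrape_args(
        "https://example.com",
        exclude_tags=["nav", "footer"],
        renderer="example-renderer",  # placeholder, not a real renderer name
        wait_for=2000,
    )
    print(args)
    print(js_rendering_requested(args))
```

Leaving unset parameters out of the dict, rather than sending explicit nulls, keeps the distinction between `renderJs: false` (HTTP only) and an omitted `renderJs` (auto-detect or server default).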