webextrator_render
Render web pages with JavaScript execution and return fully rendered HTML. Capture dynamic content from single-page applications and JavaScript-heavy sites.
Instructions
Render a web page and return the fully rendered HTML.
Uses a headless browser to navigate to the specified URL, waits for JavaScript
to execute, and returns the final rendered HTML source.
Use this when:
- You need the fully rendered HTML of a JavaScript-heavy page
- You want to inspect the DOM after dynamic content has loaded
- You need to capture single-page application (SPA) content
Returns:
JSON response containing the rendered HTML content.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | The URL of the web page to render. | |
| delay | No | Extra delay in seconds after page load before capturing HTML. | |
| headers | No | Extra HTTP headers to include with the page request. | |
| timeout | No | Total timeout in seconds for page load. | 30 |
| user_agent | No | Override the User-Agent header for the page request. | |
| wait_until | No | Page load wait condition before capturing HTML. Options: 'load', 'domcontentloaded', 'networkidle', 'commit'. | 'networkidle' |
| callback_url | No | Callback URL for async processing. If provided, the task runs asynchronously and results are sent to this URL when complete. | |
| block_resources | No | Resource types to block during page load to speed up rendering. Options: 'image', 'font', 'media', 'stylesheet', 'xhr', 'fetch'. | |
| wait_for_selector | No | CSS selector to wait for before capturing HTML. | |
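
The parameters above can be assembled into an arguments object before the tool is called. The following is a minimal sketch, not part of the tool itself: `build_render_args` is a hypothetical helper, and the value types are assumptions (numbers for `delay` and `timeout`, a string-to-string mapping for `headers`, a list of strings for `block_resources`), since the schema table does not state them. Only the parameter names and option lists come from the table.

```python
from typing import Any, Dict, Iterable, Optional

# Option lists copied from the Input Schema table.
WAIT_UNTIL_OPTIONS = {"load", "domcontentloaded", "networkidle", "commit"}
BLOCKABLE_RESOURCES = {"image", "font", "media", "stylesheet", "xhr", "fetch"}


def build_render_args(
    url: str,
    *,
    delay: Optional[float] = None,
    headers: Optional[Dict[str, str]] = None,
    timeout: Optional[float] = None,
    user_agent: Optional[str] = None,
    wait_until: Optional[str] = None,
    callback_url: Optional[str] = None,
    block_resources: Optional[Iterable[str]] = None,
    wait_for_selector: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate and assemble webextrator_render arguments."""
    if not url:
        raise ValueError("url is required")
    if wait_until is not None and wait_until not in WAIT_UNTIL_OPTIONS:
        raise ValueError(f"unsupported wait_until: {wait_until!r}")

    # Assumed type: a list of resource-type strings.
    blocked = list(block_resources) if block_resources is not None else None
    if blocked is not None:
        unknown = [r for r in blocked if r not in BLOCKABLE_RESOURCES]
        if unknown:
            raise ValueError(f"unsupported block_resources: {unknown!r}")

    optional = {
        "delay": delay,
        "headers": headers,
        "timeout": timeout,
        "user_agent": user_agent,
        "wait_until": wait_until,
        "callback_url": callback_url,
        "block_resources": blocked,
        "wait_for_selector": wait_for_selector,
    }
    # Leave out unset parameters so the tool applies its own defaults
    # (for example timeout 30 and wait_until 'networkidle').
    args: Dict[str, Any] = {"url": url}
    args.update({k: v for k, v in optional.items() if v is not None})
    return args


# Example: render a single-page application, skipping images and fonts and
# waiting for a placeholder root element ("#app") to appear.
spa_args = build_render_args(
    "https://example.com/app",
    wait_for_selector="#app",
    block_resources=["image", "font"],
)
```

Validating `wait_until` and `block_resources` on the client side catches a misspelled option before a render is attempted. How the resulting dictionary is sent to the tool depends on the client in use and is not shown here.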
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |