fetch_webpage
Retrieve text content from web pages with options for resource blocking, pagination, authentication, and custom JavaScript execution.
Instructions
Retrieve text content from a web page
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | The URL of the webpage to fetch | |
| blockResources | No | Whether to block images, stylesheets, and fonts to improve performance | true |
| resourceTypesToBlock | No | List of resource types to block (e.g., "image", "stylesheet", "font") | |
| timeout | No | Navigation timeout in milliseconds | 120000 |
| maxLength | No | Maximum number of characters to return | 10000 |
| startIndex | Yes | Start character index for content extraction | 0 |
| headers | No | Custom headers to include in the request | |
| username | No | Username for basic authentication | |
| password | No | Password for basic authentication | |
| nextPageSelector | No | CSS selector for the next-page button or link (used for auto-pagination) | |
| maxPages | No | Maximum number of pages to crawl (used for auto-pagination) | 1 |
| evaluateScript | No | JavaScript code to execute on the page after loading. The result of the script will be returned. | |
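
As a rough illustration of how the parameters in the table fit together, the sketch below builds an arguments object for `fetch_webpage` and advances `startIndex` to read a long page in chunks. The helper names `build_arguments` and `next_chunk` are hypothetical and not part of the tool. The chunking approach (moving `startIndex` forward by the number of characters received) is an assumption inferred from the `startIndex` and `maxLength` descriptions, not documented behavior.

```python
import json

# Parameter names copied from the input schema table.
KNOWN_PARAMETERS = {
    "url", "blockResources", "resourceTypesToBlock", "timeout", "maxLength",
    "startIndex", "headers", "username", "password", "nextPageSelector",
    "maxPages", "evaluateScript",
}


def build_arguments(url, **overrides):
    """Hypothetical helper: assemble a fetch_webpage arguments dict.

    `url` and `startIndex` are the two fields marked as required, so both
    are always present in the result. Other fields are left out unless
    given, on the assumption that the tool applies its own defaults.
    """
    if not url:
        raise ValueError("url is required")
    unknown = set(overrides) - KNOWN_PARAMETERS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    arguments = {"url": url, "startIndex": 0}
    arguments.update(overrides)
    return arguments


def next_chunk(arguments, returned_length):
    """Hypothetical helper: arguments for the next slice of the same page.

    Assumes the tool returns at most `maxLength` characters beginning at
    `startIndex`, so the next request starts where the last one ended.
    """
    following = dict(arguments)
    following["startIndex"] = arguments["startIndex"] + returned_length
    return following


first = build_arguments(
    "https://example.com/docs",  # placeholder URL
    maxLength=5000,
    resourceTypesToBlock=["image", "font"],
)
second = next_chunk(first, returned_length=5000)

# The JSON payload a client might send as the tool's arguments.
payload = json.dumps(first, indent=2)
```

How the arguments are actually delivered depends on the client that hosts the tool; the dictionary shape is the only part taken from the schema.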