fetch_url
Retrieve web page content from a specified URL using a headless browser, with options to extract main content, disable media, and manage timeouts for efficient data collection.
Instructions
Retrieve web page content from a specified URL
Input Schema
Name | Required | Description | Default |
---|---|---|---|
debug | No | Whether to enable debug mode (showing the browser window); overrides the --debug command line flag if specified | |
disableMedia | No | Whether to disable media resources (images, stylesheets, fonts, media) | true |
extractContent | No | Whether to intelligently extract the main content | true |
maxLength | No | Maximum length of returned content, in characters | no limit |
navigationTimeout | No | Maximum time to wait for additional navigation, in milliseconds | 10000 (10 seconds) |
returnHtml | No | Whether to return HTML content instead of Markdown | false |
timeout | No | Page loading timeout, in milliseconds | 30000 (30 seconds) |
url | Yes | URL to fetch | |
waitForNavigation | No | Whether to wait for additional navigation after the initial page load (useful for sites with anti-bot verification) | false |
waitUntil | No | When navigation is considered complete; one of 'load', 'domcontentloaded', 'networkidle', 'commit' | 'load' |
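
As an illustration of how the parameters above combine, here is a minimal Python sketch that merges caller overrides onto the documented defaults and wraps the result in a request. The helper names `build_fetch_url_args` and `build_tool_call` are hypothetical, and the JSON-RPC `tools/call` envelope assumes the tool is exposed through an MCP-style server; neither is confirmed by this page, so adjust both to your client.

```python
import json

# Defaults as stated in the parameter descriptions. "debug" and "maxLength"
# have no concrete default value, so they are omitted unless the caller sets them.
DOCUMENTED_DEFAULTS = {
    "disableMedia": True,
    "extractContent": True,
    "navigationTimeout": 10000,  # milliseconds
    "returnHtml": False,
    "timeout": 30000,            # milliseconds
    "waitForNavigation": False,
    "waitUntil": "load",
}
OPTIONAL_WITHOUT_DEFAULT = {"debug", "maxLength"}


def build_fetch_url_args(url, **overrides):
    """Hypothetical helper: merge caller overrides onto the documented defaults."""
    unknown = set(overrides) - set(DOCUMENTED_DEFAULTS) - OPTIONAL_WITHOUT_DEFAULT
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {"url": url, **DOCUMENTED_DEFAULTS, **overrides}


def build_tool_call(arguments, request_id=1):
    """Assumed MCP-style JSON-RPC envelope; not taken from this page."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "fetch_url", "arguments": arguments},
    }


args = build_fetch_url_args(
    "https://example.com",
    waitUntil="domcontentloaded",  # return earlier than the default 'load'
    maxLength=5000,                # cap the returned content at 5000 characters
)
request = build_tool_call(args)
print(json.dumps(request, indent=2))
```

Sending every default explicitly is optional, since only `url` is required; spelling them out simply makes the effective settings visible in logs.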
Input Schema (JSON Schema)
{
  "properties": {
    "debug": {
      "description": "Whether to enable debug mode (showing browser window), overrides the --debug command line flag if specified",
      "type": "boolean"
    },
    "disableMedia": {
      "description": "Whether to disable media resources (images, stylesheets, fonts, media), default is true",
      "type": "boolean"
    },
    "extractContent": {
      "description": "Whether to intelligently extract the main content, default is true",
      "type": "boolean"
    },
    "maxLength": {
      "description": "Maximum length of returned content (in characters), default is no limit",
      "type": "number"
    },
    "navigationTimeout": {
      "description": "Maximum time to wait for additional navigation in milliseconds, default is 10000 (10 seconds)",
      "type": "number"
    },
    "returnHtml": {
      "description": "Whether to return HTML content instead of Markdown, default is false",
      "type": "boolean"
    },
    "timeout": {
      "description": "Page loading timeout in milliseconds, default is 30000 (30 seconds)",
      "type": "number"
    },
    "url": {
      "description": "URL to fetch",
      "type": "string"
    },
    "waitForNavigation": {
      "description": "Whether to wait for additional navigation after initial page load (useful for sites with anti-bot verification), default is false",
      "type": "boolean"
    },
    "waitUntil": {
      "description": "Specifies when navigation is considered complete, options: 'load', 'domcontentloaded', 'networkidle', 'commit', default is 'load'",
      "type": "string"
    }
  },
  "required": [
    "url"
  ],
  "type": "object"
}
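
The schema above only constrains property types and the one required field, so a client may want to check arguments before sending them. The following is a minimal sketch of such a check in plain Python, without a JSON Schema library. The function name `validate_fetch_url_args` is hypothetical, and the `waitUntil` option check is deliberately stricter than the schema, which lists the options only in the description text.

```python
# Property types copied from the JSON Schema above.
PROPERTY_TYPES = {
    "debug": "boolean",
    "disableMedia": "boolean",
    "extractContent": "boolean",
    "maxLength": "number",
    "navigationTimeout": "number",
    "returnHtml": "boolean",
    "timeout": "number",
    "url": "string",
    "waitForNavigation": "boolean",
    "waitUntil": "string",
}
REQUIRED = ["url"]
# Options named in the waitUntil description (not enforced by the schema itself).
WAIT_UNTIL_OPTIONS = {"load", "domcontentloaded", "networkidle", "commit"}


def matches(value, json_type):
    """Return True if a Python value fits the given JSON Schema primitive type."""
    if json_type == "boolean":
        return isinstance(value, bool)
    if json_type == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if json_type == "string":
        return isinstance(value, str)
    return False


def validate_fetch_url_args(args):
    """Hypothetical helper: return a list of problems (empty if none found)."""
    errors = []
    for name in REQUIRED:
        if name not in args:
            errors.append(f"missing required property: {name}")
    for name, value in args.items():
        expected = PROPERTY_TYPES.get(name)
        if expected is None:
            continue  # the schema does not forbid additional properties
        if not matches(value, expected):
            errors.append(f"{name}: expected {expected}")
    wait_until = args.get("waitUntil")
    if isinstance(wait_until, str) and wait_until not in WAIT_UNTIL_OPTIONS:
        errors.append(f"waitUntil: unsupported option {wait_until!r}")
    return errors


print(validate_fetch_url_args({"url": "https://example.com", "timeout": 15000}))
print(validate_fetch_url_args({"timeout": "15000", "waitUntil": "idle"}))
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.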