fetch_urls
Extract and retrieve web page content from multiple URLs using a headless browser, with options for timeout, content extraction, and HTML/Markdown formatting.
Instructions
Retrieve web page content from multiple specified URLs
Input Schema
Name | Required | Description | Default |
---|---|---|---|
debug | No | Whether to enable debug mode (showing the browser window). Overrides the --debug command line flag if specified. | |
disableMedia | No | Whether to disable media resources (images, stylesheets, fonts, media). | true |
extractContent | No | Whether to intelligently extract the main content. | true |
maxLength | No | Maximum length of returned content, in characters. | no limit |
navigationTimeout | No | Maximum time to wait for additional navigation, in milliseconds. | 10000 (10 seconds) |
returnHtml | No | Whether to return HTML content instead of Markdown. | false |
timeout | No | Page loading timeout, in milliseconds. | 30000 (30 seconds) |
urls | Yes | Array of URLs to fetch. | |
waitForNavigation | No | Whether to wait for additional navigation after the initial page load (useful for sites with anti-bot verification). | false |
waitUntil | No | When navigation is considered complete. Options: 'load', 'domcontentloaded', 'networkidle', 'commit'. | 'load' |
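As an illustration of how these parameters combine, here is a minimal sketch in Python that assembles an arguments object for fetch_urls. The helper names `build_fetch_urls_args` and `effective_options` are hypothetical and not part of the tool. The defaults are copied from the table above, and the example URLs are placeholders. How the arguments are actually sent to the tool depends on the client and is not shown.

```python
import json

# Defaults as documented in the table above. debug, maxLength and urls are
# omitted because the table gives no concrete default value for them.
DOCUMENTED_DEFAULTS = {
    "disableMedia": True,
    "extractContent": True,
    "navigationTimeout": 10000,  # milliseconds
    "returnHtml": False,
    "timeout": 30000,  # milliseconds
    "waitForNavigation": False,
    "waitUntil": "load",
}


def build_fetch_urls_args(urls, **overrides):
    """Hypothetical helper: return only the arguments that need to be sent.

    Parameters left out fall back to the documented defaults on the tool side.
    """
    if not urls:
        # Assumption: a call with no URLs is not useful, so reject it early.
        raise ValueError("urls is required")
    args = {"urls": list(urls)}
    args.update(overrides)
    return args


def effective_options(args):
    """Hypothetical helper: show the documented defaults merged with overrides."""
    merged = dict(DOCUMENTED_DEFAULTS)
    merged.update(args)
    return merged


# Example: a site that redirects after an anti-bot check, so we wait for the
# additional navigation and allow it 15 seconds instead of the default 10.
args = build_fetch_urls_args(
    ["https://example.com/a", "https://example.com/b"],
    waitForNavigation=True,
    navigationTimeout=15000,
)
print(json.dumps(args, indent=2))
print(effective_options(args)["timeout"])
```

Sending only the overridden parameters keeps the request short and leaves everything else to the documented defaults.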
Input Schema (JSON Schema)
{
  "properties": {
    "debug": {
      "description": "Whether to enable debug mode (showing browser window), overrides the --debug command line flag if specified",
      "type": "boolean"
    },
    "disableMedia": {
      "description": "Whether to disable media resources (images, stylesheets, fonts, media), default is true",
      "type": "boolean"
    },
    "extractContent": {
      "description": "Whether to intelligently extract the main content, default is true",
      "type": "boolean"
    },
    "maxLength": {
      "description": "Maximum length of returned content (in characters), default is no limit",
      "type": "number"
    },
    "navigationTimeout": {
      "description": "Maximum time to wait for additional navigation in milliseconds, default is 10000 (10 seconds)",
      "type": "number"
    },
    "returnHtml": {
      "description": "Whether to return HTML content instead of Markdown, default is false",
      "type": "boolean"
    },
    "timeout": {
      "description": "Page loading timeout in milliseconds, default is 30000 (30 seconds)",
      "type": "number"
    },
    "urls": {
      "description": "Array of URLs to fetch",
      "items": {
        "type": "string"
      },
      "type": "array"
    },
    "waitForNavigation": {
      "description": "Whether to wait for additional navigation after initial page load (useful for sites with anti-bot verification), default is false",
      "type": "boolean"
    },
    "waitUntil": {
      "description": "Specifies when navigation is considered complete, options: 'load', 'domcontentloaded', 'networkidle', 'commit', default is 'load'",
      "type": "string"
    }
  },
  "required": [
    "urls"
  ],
  "type": "object"
}
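The schema above can be checked on the client side before a call is made. The following is a minimal sketch using only the Python standard library. `validate_fetch_urls_args` is a hypothetical helper, not something the tool provides. The property types mirror the schema. The check on `waitUntil` goes beyond the schema, which only declares a string, and relies on the options listed in the description.

```python
# Property types copied from the JSON Schema above.
PROPERTY_TYPES = {
    "debug": "boolean",
    "disableMedia": "boolean",
    "extractContent": "boolean",
    "maxLength": "number",
    "navigationTimeout": "number",
    "returnHtml": "boolean",
    "timeout": "number",
    "urls": "array",
    "waitForNavigation": "boolean",
    "waitUntil": "string",
}

# Options named in the waitUntil description (not enforced by the schema).
WAIT_UNTIL_OPTIONS = {"load", "domcontentloaded", "networkidle", "commit"}


def _matches(value, json_type):
    """Return True if value fits the given JSON Schema type."""
    if json_type == "boolean":
        return isinstance(value, bool)
    if json_type == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if json_type == "string":
        return isinstance(value, str)
    if json_type == "array":
        # The only array in this schema is urls, whose items are strings.
        return isinstance(value, list) and all(isinstance(v, str) for v in value)
    return True


def validate_fetch_urls_args(args):
    """Hypothetical helper: return a list of problems, empty if none found."""
    errors = []
    if "urls" not in args:
        errors.append("urls: required")
    for name, value in args.items():
        json_type = PROPERTY_TYPES.get(name)
        # Unknown properties are ignored because the schema does not forbid them.
        if json_type is not None and not _matches(value, json_type):
            errors.append(f"{name}: expected {json_type}")
    wait_until = args.get("waitUntil")
    if isinstance(wait_until, str) and wait_until not in WAIT_UNTIL_OPTIONS:
        errors.append(f"waitUntil: unsupported option {wait_until!r}")
    return errors


print(validate_fetch_urls_args({"urls": ["https://example.com"], "timeout": 5000}))
print(validate_fetch_urls_args({"timeout": "30s", "waitUntil": "idle"}))
```

For full JSON Schema validation, a dedicated validator library would be a better fit; this sketch only covers the handful of types the schema uses.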