batch_crawl
Crawl multiple URLs in parallel to process URL lists, compare web pages, or extract data in bulk.
Instructions
[STATELESS] Crawl multiple URLs concurrently for efficiency. Use when: processing URL lists, comparing multiple pages, or extracting data in bulk. Faster than sequential crawling. Max 5 concurrent by default. Each URL gets a fresh browser. Cannot maintain state between URLs. For stateful operations, use create_session + crawl.
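For context, a client built on the MCP Python SDK might invoke this tool as in the minimal sketch below. The server launch command is a placeholder and the stdio wiring is an assumption about how the tool is hosted; it is not part of this tool's contract:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Hypothetical command that starts the MCP server exposing batch_crawl.
    params = StdioServerParameters(command="crawler-mcp-server", args=[])

    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # One stateless call; the server crawls all URLs concurrently.
            result = await session.call_tool(
                "batch_crawl",
                arguments={
                    "urls": [
                        "https://example.com/a",
                        "https://example.com/b",
                        "https://example.com/c",
                    ],
                    "max_concurrent": 3,
                },
            )
            for block in result.content:
                print(block)


asyncio.run(main())
```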
Input Schema
Name | Required | Description | Default |
---|---|---|---|
bypass_cache | No | Bypass cache for all URLs | false |
max_concurrent | No | Parallel request limit. Higher = faster but more resource intensive. Adjust based on server capacity and rate limits | 5 |
remove_images | No | Remove images from output by excluding img, picture, and svg tags | false |
urls | Yes | List of URLs to crawl | |
Input Schema (JSON Schema)
{
"properties": {
"bypass_cache": {
"default": false,
"description": "Bypass cache for all URLs",
"type": "boolean"
},
"max_concurrent": {
"default": 5,
"description": "Parallel request limit. Higher = faster but more resource intensive. Adjust based on server capacity and rate limits",
"type": "number"
},
"remove_images": {
"default": false,
"description": "Remove images from output by excluding img, picture, and svg tags",
"type": "boolean"
},
"urls": {
"description": "List of URLs to crawl",
"items": {
"type": "string"
},
"type": "array"
}
},
"required": [
"urls"
],
"type": "object"
}
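As a quick sanity check, an input payload can be validated against this schema with the third-party jsonschema package; the sketch below copies the schema from above, and the URLs are placeholders:

```python
from jsonschema import validate  # pip install jsonschema

# The batch_crawl input schema, copied from the JSON Schema above.
BATCH_CRAWL_SCHEMA = {
    "properties": {
        "bypass_cache": {"default": False, "type": "boolean"},
        "max_concurrent": {"default": 5, "type": "number"},
        "remove_images": {"default": False, "type": "boolean"},
        "urls": {"items": {"type": "string"}, "type": "array"},
    },
    "required": ["urls"],
    "type": "object",
}

# A valid call: only "urls" is required; omitted fields fall back to defaults.
arguments = {
    "urls": ["https://example.com/a", "https://example.com/b"],
    "bypass_cache": True,
}

# Raises jsonschema.ValidationError if the payload does not match the schema.
validate(instance=arguments, schema=BATCH_CRAWL_SCHEMA)
print("payload is valid")
```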