scrapeDeep
Extracts web content, including images, by scrolling through a page and following pagination. Scroll depth, scroll delay, minimum image size, image count, and page count are configurable, and downloaded images are written to a specified output directory.
Instructions
Maximum-extraction web scraping: slower, but thorough.
Input Schema
Name | Required | Description | Default |
---|---|---|---|
downloadImages | No | Whether to download images locally | |
maxImages | No | Maximum number of images to extract | |
maxScrolls | No | Maximum number of scroll attempts | 20 |
minImageSize | No | Minimum width/height for images in pixels | |
output | No | Output directory for downloaded images | |
pages | No | Number of pages to scrape (if pagination is present) | |
scrapeImages | No | Whether to include images in the scrape result | |
scrollDelay | No | Delay between scrolls in ms | 3000 |
url | Yes | URL of the webpage to scrape | |
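As a rough illustration of how these parameters combine, the sketch below assembles an arguments object for scrapeDeep and estimates how long the scroll delays alone could take. The helper names `build_scrape_deep_args` and `max_scroll_wait_seconds`, the example URL, and the output path are hypothetical; the estimate assumes the tool uses every scroll attempt and waits the full `scrollDelay` after each one, which the listing does not confirm.

```python
from typing import Any

# Defaults stated in the parameter descriptions above.
DOCUMENTED_DEFAULTS = {"maxScrolls": 20, "scrollDelay": 3000}

# Parameter names taken from the input schema table.
KNOWN_PARAMS = {
    "downloadImages", "maxImages", "maxScrolls", "minImageSize",
    "output", "pages", "scrapeImages", "scrollDelay", "url",
}


def build_scrape_deep_args(url: str, **options: Any) -> dict[str, Any]:
    """Hypothetical helper: assemble an arguments object for scrapeDeep."""
    unknown = set(options) - KNOWN_PARAMS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return {"url": url, **options}


def max_scroll_wait_seconds(args: dict[str, Any]) -> float:
    """Upper bound on per-page scroll delay, assuming every attempt is used."""
    scrolls = args.get("maxScrolls", DOCUMENTED_DEFAULTS["maxScrolls"])
    delay_ms = args.get("scrollDelay", DOCUMENTED_DEFAULTS["scrollDelay"])
    return scrolls * delay_ms / 1000


# Example values are placeholders, not taken from the tool listing.
args = build_scrape_deep_args(
    "https://example.com/gallery",
    scrapeImages=True,
    downloadImages=True,
    minImageSize=200,
    output="./scrape-output",
)
print(args)
print(max_scroll_wait_seconds(args))  # 20 scrolls * 3000 ms = 60.0
```

With the documented defaults, scrolling alone could account for up to a minute per page under these assumptions, which is consistent with the "slower but thorough" note; lowering `maxScrolls` or `scrollDelay` trades thoroughness for speed.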
Input Schema (JSON Schema)
{
  "properties": {
    "downloadImages": {
      "description": "Whether to download images locally",
      "type": "boolean"
    },
    "maxImages": {
      "description": "Maximum number of images to extract",
      "type": "number"
    },
    "maxScrolls": {
      "description": "Maximum number of scroll attempts (default: 20)",
      "type": "number"
    },
    "minImageSize": {
      "description": "Minimum width/height for images in pixels",
      "type": "number"
    },
    "output": {
      "description": "Output directory for downloaded images",
      "type": "string"
    },
    "pages": {
      "description": "Number of pages to scrape (if pagination is present)",
      "type": "number"
    },
    "scrapeImages": {
      "description": "Whether to include images in the scrape result",
      "type": "boolean"
    },
    "scrollDelay": {
      "description": "Delay between scrolls in ms (default: 3000)",
      "type": "number"
    },
    "url": {
      "description": "URL of the webpage to scrape",
      "type": "string"
    }
  },
  "required": [
    "url"
  ],
  "type": "object"
}
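To show what the schema above accepts and rejects, here is a minimal standard-library sketch that checks an arguments object against it before a call is made. The function `validate_args` is a hypothetical helper, and it covers only the keywords this schema uses (`required` and per-property `type`); it is not a general JSON Schema validator.

```python
import json

# Condensed copy of the schema above (descriptions omitted).
SCHEMA = json.loads("""
{
  "properties": {
    "downloadImages": {"type": "boolean"},
    "maxImages": {"type": "number"},
    "maxScrolls": {"type": "number"},
    "minImageSize": {"type": "number"},
    "output": {"type": "string"},
    "pages": {"type": "number"},
    "scrapeImages": {"type": "boolean"},
    "scrollDelay": {"type": "number"},
    "url": {"type": "string"}
  },
  "required": ["url"],
  "type": "object"
}
""")


def _matches(value: object, json_type: str) -> bool:
    """Check a Python value against one of the JSON types used in the schema."""
    if json_type == "boolean":
        return isinstance(value, bool)
    if json_type == "number":
        # bool is a subclass of int in Python, but is not a JSON number.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if json_type == "string":
        return isinstance(value, str)
    return False


def validate_args(args: dict) -> list[str]:
    """Hypothetical helper: return a list of problems (empty if args are valid)."""
    errors = []
    for name in SCHEMA["required"]:
        if name not in args:
            errors.append(f"missing required property: {name}")
    for name, value in args.items():
        prop = SCHEMA["properties"].get(name)
        if prop is None:
            continue  # the schema does not forbid additional properties
        if not _matches(value, prop["type"]):
            errors.append(
                f"{name}: expected {prop['type']}, got {type(value).__name__}"
            )
    return errors


print(validate_args({"url": "https://example.com", "pages": 3}))
print(validate_args({"maxScrolls": "20"}))
```

The second call reports two problems: `url` is missing, and `maxScrolls` was passed as a string rather than a number. Note that the schema types `maxImages`, `maxScrolls`, and `pages` as `number`, so fractional values pass validation even though whole numbers are presumably intended.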