scrape_webpage
Extract and scrape webpage content using auto, simple, Scrapy, or Selenium methods. Define extraction rules or wait for specific elements to retrieve targeted data.
Instructions
Scrape a single webpage and extract its content.
This tool can scrape web pages using different methods:
- auto: Automatically choose the best method
- simple: Fast HTTP requests (no JavaScript)
- scrapy: Robust scraping with the Scrapy framework
- selenium: Full browser rendering (supports JavaScript)
You can specify extraction rules to get specific data from the page.
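As a rough illustration of what a call might look like, the sketch below builds the arguments for a Selenium scrape that waits for an element before extracting. The top-level `request` key and the field names come from the input schema; the URL, the selector values, and the keys inside `extract_config` are hypothetical, since the schema accepts any object there and does not document its shape.

```python
import json

# Arguments for a scrape_webpage call.
# Field names follow the input schema; all values are illustrative only.
arguments = {
    "request": {
        "url": "https://example.com/articles",      # hypothetical URL
        "method": "selenium",                       # one of: auto, simple, scrapy, selenium
        "wait_for_element": "div.article-list",     # hypothetical CSS selector (Selenium only)
        "extract_config": {                         # hypothetical rule layout; not documented
            "title": "h1",
            "links": "a.article-link",
        },
    }
}

# Serialize to JSON, the form in which tool arguments are typically sent.
payload = json.dumps(arguments, indent=2)
print(payload)
```

Only `url` is required; omitting the other fields falls back to the defaults listed in the schema (`method` is `auto`, the rest are `null`).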
Input Schema
Name | Required | Description | Default |
---|---|---|---|
request | Yes | ScrapeRequest object with url (required), method, extract_config, and wait_for_element | |
Input Schema (JSON Schema)
{
  "$defs": {
    "ScrapeRequest": {
      "description": "Request model for scraping operations.",
      "properties": {
        "extract_config": {
          "anyOf": [
            {
              "additionalProperties": true,
              "type": "object"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "description": "Configuration for data extraction",
          "title": "Extract Config"
        },
        "method": {
          "default": "auto",
          "description": "Scraping method: auto, simple, scrapy, selenium",
          "title": "Method",
          "type": "string"
        },
        "url": {
          "description": "URL to scrape",
          "title": "Url",
          "type": "string"
        },
        "wait_for_element": {
          "anyOf": [
            {
              "type": "string"
            },
            {
              "type": "null"
            }
          ],
          "default": null,
          "description": "CSS selector to wait for (Selenium only)",
          "title": "Wait For Element"
        }
      },
      "required": [
        "url"
      ],
      "title": "ScrapeRequest",
      "type": "object"
    }
  },
  "properties": {
    "request": {
      "$ref": "#/$defs/ScrapeRequest",
      "title": "Request"
    }
  },
  "required": [
    "request"
  ],
  "type": "object"
}
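The schema above can be mirrored on the client side with a small checker that runs before a call is made. The following is a minimal sketch using only the Python standard library; `build_scrape_arguments` and `KNOWN_METHODS` are hypothetical names, not part of the tool. Note that the schema types `method` as a plain string, so rejecting values outside the four documented names is an assumption of this sketch rather than something the schema enforces.

```python
from typing import Any, Dict, Optional

# Method names listed in the tool description. The schema itself only requires
# a string, so rejecting other values is an assumption of this sketch.
KNOWN_METHODS = ("auto", "simple", "scrapy", "selenium")


def build_scrape_arguments(
    url: str,
    method: str = "auto",
    extract_config: Optional[Dict[str, Any]] = None,
    wait_for_element: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: check field types and wrap them under "request"."""
    if not isinstance(url, str):
        raise TypeError("url must be a string")
    if method not in KNOWN_METHODS:
        raise ValueError(f"unknown method: {method!r}")
    if extract_config is not None and not isinstance(extract_config, dict):
        raise TypeError("extract_config must be an object (dict) or None")
    if wait_for_element is not None and not isinstance(wait_for_element, str):
        raise TypeError("wait_for_element must be a string or None")

    # Defaults match the schema: method "auto", the optional fields null (None).
    return {
        "request": {
            "url": url,
            "method": method,
            "extract_config": extract_config,
            "wait_for_element": wait_for_element,
        }
    }


args = build_scrape_arguments("https://example.com", method="simple")
print(args["request"]["method"])  # simple
```

The helper does not check that `wait_for_element` is paired with the `selenium` method, because `auto` may also end up selecting Selenium; the schema description only states that the selector is used by Selenium.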