one_scrape
Extract webpage content with precision using advanced options like custom actions, specific HTML tags, and multiple formats such as markdown, HTML, and screenshots. Ideal for structured data extraction and dynamic content handling.
Instructions
Scrape a single webpage with advanced options for content extraction. The tool supports several output formats, including markdown, HTML, and screenshots, and can run custom actions such as clicking or scrolling before it scrapes the page.
Input Schema
Name | Required | Description | Default |
---|---|---|---|
actions | No | List of actions to perform before scraping | |
excludeTags | No | HTML tags to exclude from extraction | |
extract | No | Configuration for structured data extraction | |
formats | No | Content formats to extract | ['markdown'] |
includeTags | No | HTML tags to specifically include in extraction | |
location | No | Location settings for scraping | |
mobile | No | Use mobile viewport | |
onlyMainContent | No | Extract only the main content, filtering out navigation, footers, etc. | |
removeBase64Images | No | Remove base64-encoded images from output | |
skipTlsVerification | No | Skip TLS certificate verification | |
timeout | No | Maximum time in milliseconds to wait for the page to load | |
url | Yes | The URL to scrape | |
waitFor | No | Time in milliseconds to wait for dynamic content to load | |
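
The schema above can be illustrated with a small sketch that assembles an arguments object for one_scrape. It is only an illustration under stated assumptions: `build_scrape_args` is a hypothetical helper, not part of the tool; the format identifiers (`"markdown"`, `"html"`) and the shape of each entry in `actions` are assumed, since the table names the parameters but does not spell out their value types; and the URL is a placeholder.

```python
import json

# Parameter names copied from the Input Schema table.
ALLOWED_PARAMS = {
    "actions", "excludeTags", "extract", "formats", "includeTags",
    "location", "mobile", "onlyMainContent", "removeBase64Images",
    "skipTlsVerification", "timeout", "url", "waitFor",
}


def build_scrape_args(url, **options):
    """Hypothetical helper: assemble and sanity-check one_scrape arguments."""
    # `url` is the only required parameter in the table.
    if not url:
        raise ValueError("url is required")
    # Reject names that do not appear in the schema.
    unknown = set(options) - ALLOWED_PARAMS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    args = {"url": url, **options}
    # The table documents ['markdown'] as the default for `formats`.
    args.setdefault("formats", ["markdown"])
    return args


args = build_scrape_args(
    "https://example.com/articles",      # placeholder URL
    formats=["markdown", "html"],        # assumed format identifiers
    onlyMainContent=True,                # drop navigation, footers, etc.
    excludeTags=["nav", "footer"],
    waitFor=2000,                        # milliseconds, per the table
    timeout=30000,                       # milliseconds, per the table
    # Assumed action shape; the table only says "list of actions".
    actions=[{"type": "scroll", "direction": "down"}],
)

print(json.dumps(args, indent=2))
```

Checking parameter names on the client side catches typos such as `waitfor` before the request is sent. The actual value types accepted for `actions`, `extract`, and `location` should be confirmed against the tool's full JSON schema.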