scrape
Extract content from any webpage as markdown, LLM-optimized text, plain text, or JSON. Automatically bypasses bot protection and handles JavaScript rendering for reliable data collection.
Instructions
Scrape a single URL and extract its content as markdown, LLM-optimized text, plain text, or full JSON. Automatically falls back to the webclaw cloud API when bot protection or JS rendering is detected.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| browser | No | Browser profile: "chrome", "firefox", or "random" | "chrome" |
| exclude_selectors | No | CSS selectors to exclude from output | |
| format | No | Output format: "markdown", "llm", "text", or "json" | "markdown" |
| include_selectors | No | CSS selectors to include (only extract matching elements) | |
| only_main_content | No | If true, extract only the main content (article/main element) | |
| url | Yes | URL to scrape | |
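
As an illustration of the schema above, the following sketch assembles and validates an arguments object for the scrape tool before it is sent. The helper name `build_scrape_args` and its `output_format` parameter are hypothetical and not part of the tool. The sketch also assumes that the selector fields take lists of CSS selector strings, which the table does not state explicitly. It only builds the payload; how the payload is delivered depends on the client you use.

```python
import json
from typing import Any, Optional, Sequence

# Allowed values as listed in the parameter table.
ALLOWED_FORMATS = {"markdown", "llm", "text", "json"}
ALLOWED_BROWSERS = {"chrome", "firefox", "random"}


def build_scrape_args(
    url: str,
    output_format: str = "markdown",
    browser: str = "chrome",
    include_selectors: Optional[Sequence[str]] = None,
    exclude_selectors: Optional[Sequence[str]] = None,
    only_main_content: Optional[bool] = None,
) -> dict[str, Any]:
    """Hypothetical helper: build the arguments object for the scrape tool."""
    if not url:
        raise ValueError("url is required")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    if browser not in ALLOWED_BROWSERS:
        raise ValueError(f"browser must be one of {sorted(ALLOWED_BROWSERS)}")

    args: dict[str, Any] = {"url": url}
    # Omit optional fields that match the documented defaults.
    if output_format != "markdown":
        args["format"] = output_format
    if browser != "chrome":
        args["browser"] = browser
    # Assumption: selectors are passed as lists of CSS selector strings.
    if include_selectors:
        args["include_selectors"] = list(include_selectors)
    if exclude_selectors:
        args["exclude_selectors"] = list(exclude_selectors)
    if only_main_content is not None:
        args["only_main_content"] = only_main_content
    return args


if __name__ == "__main__":
    example = build_scrape_args(
        "https://example.com/post",
        output_format="llm",
        exclude_selectors=["nav", "footer"],
        only_main_content=True,
    )
    print(json.dumps(example, indent=2))
```

Only `url` is required, so `build_scrape_args("https://example.com")` yields an object with a single key and leaves every other setting to the tool's defaults.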