scrape_url
Extract structured data from any public URL: text, links, images, and metadata. Automatically detects when JavaScript rendering is required.
Instructions
Scrape any public URL and return structured data: text, links, images, and metadata. Automatically detects whether JavaScript rendering is needed. Use this when you need to read the content of a webpage.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | The full URL to scrape (must include https://) | |
| extract_text | No | Return the visible text content of the page | |
| extract_links | No | Return all href links found on the page | |
| extract_images | No | Return all image URLs found on the page | |
| extract_metadata | No | Return page title, description, and Open Graph tags | |
| javascript | No | Force JavaScript rendering via headless browser. Use for SPAs or pages that require JS to load content. | |
| proxy_country | No | ISO 3166-1 alpha-2 country code to geo-target the scrape (e.g. US, GB, KE, DE) | |
| wait_for | No | CSS selector to wait for before extracting content. Useful for lazy-loaded content. | |
| timeout | No | Request timeout in milliseconds (1000–60000) | |
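
As an illustration of the schema above, here is a minimal sketch of assembling and checking arguments before calling `scrape_url`. The helper name `build_scrape_args` is hypothetical, and the sketch assumes the `extract_*` and `javascript` parameters are booleans, which the table does not state. It only checks the constraints the table documents (the `https://` prefix, the two-letter country code, and the timeout range) and does not cover how the arguments are sent to the tool.

```python
import re

# Flag-style parameters listed in the Input Schema table.
# Assumption: these are booleans; the table does not give types.
ALLOWED_FLAGS = {
    "extract_text",
    "extract_links",
    "extract_images",
    "extract_metadata",
    "javascript",
}


def build_scrape_args(url, timeout=None, proxy_country=None, wait_for=None, **flags):
    """Hypothetical helper: build a scrape_url argument dict and
    check the constraints documented in the Input Schema table."""
    if not url.startswith("https://"):
        raise ValueError("url must include https://")

    unknown = set(flags) - ALLOWED_FLAGS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    args = {"url": url}
    for name, value in flags.items():
        args[name] = bool(value)

    if proxy_country is not None:
        # Shape check only: two uppercase letters, e.g. US, GB, KE, DE.
        if not re.fullmatch(r"[A-Z]{2}", proxy_country):
            raise ValueError("proxy_country must be a two-letter uppercase code")
        args["proxy_country"] = proxy_country

    if wait_for is not None:
        args["wait_for"] = wait_for  # CSS selector to wait for

    if timeout is not None:
        if not 1000 <= timeout <= 60000:
            raise ValueError("timeout must be between 1000 and 60000 ms")
        args["timeout"] = timeout

    return args


if __name__ == "__main__":
    # A JavaScript-heavy page: force rendering and wait for a selector.
    print(build_scrape_args(
        "https://example.com",
        extract_text=True,
        javascript=True,
        wait_for="#content",
        timeout=15000,
    ))
```

Optional parameters that are not supplied are left out of the result, so whatever defaults the tool applies remain in effect.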