scrape
Extract data from JavaScript-rendered web pages using a real browser. Returns Markdown, raw HTML, or a list of links, with multi-page following and optional residential proxy routing.
Instructions
Scrape a URL using a real browser and return the page content as Markdown text, raw HTML, or a list of links. Works on JavaScript-rendered pages.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | The URL to scrape. | |
| type | No | Return format: 'text' for Markdown-converted content, 'html' for raw HTML, 'links' for extracted hyperlinks. | text |
| pages | No | Number of pages to follow and scrape (1–10). | 1 |
| waitMs | No | Extra wait time in milliseconds after page load (0–30000). | 0 |
| blockResources | No | Block images, media, and fonts to speed up scraping. | false |
| locale | No | Browser locale, e.g. en-US, de-DE. | system |
| premiumProxy | No | Route through a residential proxy to bypass bot detection. | false |
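
How the tool is actually invoked depends on the client, which this page does not specify. As a minimal sketch, the Python helper below only assembles and validates an argument object that matches the schema above. The function name `build_scrape_args` and its snake_case parameters are hypothetical; only the emitted keys (`url`, `type`, `pages`, `waitMs`, `blockResources`, `locale`, `premiumProxy`) and their ranges come from the table.

```python
import json
from typing import Any, Dict, Optional

# Allowed values for the 'type' parameter, per the schema table.
VALID_TYPES = {"text", "html", "links"}


def build_scrape_args(
    url: str,
    output_type: str = "text",
    pages: int = 1,
    wait_ms: int = 0,
    block_resources: bool = False,
    locale: Optional[str] = None,
    premium_proxy: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: validate inputs and return the argument dict.

    Optional parameters are included only when they differ from the
    documented defaults, so the tool's own defaults apply otherwise.
    """
    if not url:
        raise ValueError("url is required")
    if output_type not in VALID_TYPES:
        raise ValueError(f"type must be one of {sorted(VALID_TYPES)}")
    if not 1 <= pages <= 10:
        raise ValueError("pages must be between 1 and 10")
    if not 0 <= wait_ms <= 30000:
        raise ValueError("waitMs must be between 0 and 30000")

    args: Dict[str, Any] = {"url": url}
    if output_type != "text":
        args["type"] = output_type
    if pages != 1:
        args["pages"] = pages
    if wait_ms != 0:
        args["waitMs"] = wait_ms
    if block_resources:
        args["blockResources"] = True
    if locale is not None:
        args["locale"] = locale
    if premium_proxy:
        args["premiumProxy"] = True
    return args


if __name__ == "__main__":
    # Example: collect links from the first three pages, waiting 2 s per load.
    example = build_scrape_args(
        "https://example.com", output_type="links", pages=3, wait_ms=2000
    )
    print(json.dumps(example, indent=2))
```

Leaving defaulted keys out keeps the payload small and avoids hard-coding a value such as the system locale, which only the tool itself can resolve.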