extract_url
Extract structured data from web pages by providing a URL and describing what you need. Returns clean JSON with specified fields, handling JavaScript-rendered and Cloudflare-protected sites automatically.
Instructions
Extract structured data from any web page by providing a URL and describing what you want. Returns clean JSON with exactly the fields you asked for, so no HTML parsing is needed. Handles JavaScript-rendered pages and Cloudflare-protected sites automatically. This is the general-purpose extraction tool. For full article content or page meta tags, use extract_article or extract_metadata instead; they are optimised shortcuts. Read-only: makes no changes to any external system. Requires the HAUNT_API_KEY environment variable. Free tier: 100 requests/month. Returns an error if the rate limit is exceeded or the API key is invalid.
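Because the tool requires HAUNT_API_KEY, a caller may want to fail early when the variable is missing rather than wait for an error response. The following is a minimal sketch of such a pre-flight check; the helper name `get_haunt_api_key` is hypothetical and not part of the tool itself.

```python
import os


def get_haunt_api_key(env=None):
    """Return the key stored in HAUNT_API_KEY, or raise if it is missing or blank.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    This helper is illustrative only and is not provided by the tool.
    """
    env = os.environ if env is None else env
    key = env.get("HAUNT_API_KEY", "").strip()
    if not key:
        raise RuntimeError("HAUNT_API_KEY is not set; extract_url requires it")
    return key
```

This only confirms that a key is present. Whether the key is valid, and whether the monthly quota has been used up, can only be determined by the service when the tool is actually called.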
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | The full URL of the page to extract data from. Must be a valid HTTP or HTTPS URL. Supports any public web page including JavaScript-heavy SPAs and Cloudflare-protected sites. | |
| prompt | Yes | A plain-English description of what data to extract. Be specific about which fields you want. Examples: "product name, price, and availability", "all email addresses and phone numbers", "the main heading and first paragraph". | |
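As an illustration of the schema above, the sketch below validates the two required arguments on the client side before they are handed to the tool. The helper `build_extract_url_args` is hypothetical; how the arguments are actually transmitted depends on the client you use and is not shown here.

```python
from urllib.parse import urlparse


def build_extract_url_args(url: str, prompt: str) -> dict:
    """Check `url` and `prompt` against the input schema and return the argument dict.

    Illustrative helper only; the tool performs its own validation.
    """
    parsed = urlparse(url)
    # The schema requires a full HTTP or HTTPS URL.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"url must be a valid HTTP or HTTPS URL, got {url!r}")
    # The prompt is required, so reject empty or whitespace-only text.
    if not prompt.strip():
        raise ValueError("prompt must describe the fields to extract")
    return {"url": url, "prompt": prompt.strip()}


args = build_extract_url_args(
    "https://example.com/item/1",
    "product name, price, and availability",
)
```

Both arguments are required and neither has a default, so the helper raises rather than substituting a fallback value.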