Extract structured data from a URL
extract
Extract structured data from any public web page by providing a URL and a JSON schema; returns only matching fields as validated, typed JSON, with missing values as null.
Instructions
Extract structured data from a public web page and return it as validated, typed JSON. Extracto renders the page (JavaScript included), runs a schema-constrained extraction, and returns ONLY fields that match the schema. Missing data comes back as null rather than a hallucinated guess. Best for a single known URL. This call is synchronous (up to ~90s); for heavy or anti-bot pages prefer extract_async.
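Because missing data comes back as null, a caller will usually want to check which fields were not found before using the result. The following is a minimal sketch of that check, assuming the returned JSON has already been parsed into Python dicts and lists (so null appears as `None`). The helper name `missing_fields` and the sample `result` are hypothetical and only illustrate the idea; the actual response envelope is not specified here.

```python
def missing_fields(data, prefix=""):
    """Return the dotted paths of every value that is None (JSON null)."""
    if data is None:
        return [prefix]
    if isinstance(data, dict):
        paths = []
        for key, value in data.items():
            # Build a dotted path such as "author.name" for nested objects.
            path = f"{prefix}.{key}" if prefix else key
            paths.extend(missing_fields(value, path))
        return paths
    if isinstance(data, list):
        paths = []
        for i, item in enumerate(data):
            # List items are addressed by index, e.g. "reviews[0].stars".
            paths.extend(missing_fields(item, f"{prefix}[{i}]"))
        return paths
    return []  # a present scalar value: nothing is missing


# Hypothetical parsed result for the schema
# {"title": "string", "price": "number", "author": {"name": "string"}}
result = {"title": "Widget", "price": None, "author": {"name": None}}

print(missing_fields(result))  # ['price', 'author.name']
```

A null here means the page did not contain the field, so it is safer to treat it as "unknown" than to substitute a default value.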
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | The public HTTPS URL to extract from. | |
| schema | Yes | An object mapping each field name to a type. A type is one of the literals "string", "number", "boolean", "array", "object"; OR a one-element array for a list (e.g. ["string"] for a list of strings, or [{ "title": "string", "price": "number" }] for a list of objects); OR a nested object (e.g. { "author": { "name": "string" } }). Use the most specific shape you can. Example: { "title": "string", "price": "number", "tags": ["string"], "reviews": [{ "user": "string", "stars": "number" }] }. | |
| examples | No | Up to 3 few-shot examples to anchor the output format. | |
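
The shape rules for `schema` can be checked on the client before a call is made. The sketch below is one possible reading of those rules, not the tool's own validator: the names `LITERALS`, `is_valid_type`, and `is_valid_schema` are hypothetical, and accepting an empty nested object is an assumption, since the description does not say either way.

```python
# The five type literals listed in the schema description.
LITERALS = {"string", "number", "boolean", "array", "object"}


def is_valid_type(t):
    """True if t is a type literal, a one-element list, or a nested object."""
    if isinstance(t, str):
        return t in LITERALS
    if isinstance(t, list):
        # A list type must have exactly one element describing its items.
        return len(t) == 1 and is_valid_type(t[0])
    if isinstance(t, dict):
        return is_valid_schema(t)
    return False


def is_valid_schema(schema):
    """True if schema maps string field names to valid types.

    Assumption: an empty object passes, because the rules do not forbid it.
    """
    return isinstance(schema, dict) and all(
        isinstance(name, str) and is_valid_type(t)
        for name, t in schema.items()
    )


# The example schema from the table above.
example = {
    "title": "string",
    "price": "number",
    "tags": ["string"],
    "reviews": [{"user": "string", "stars": "number"}],
}

print(is_valid_schema(example))                       # True
print(is_valid_schema({"tags": ["string", "number"]}))  # False
```

The second call fails because a list type must contain exactly one element; a schema that uses an unlisted literal such as "integer" would be rejected for the same kind of reason.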