extract_entities
Extract emails, phone numbers, URLs, dates, IPs, and prices from web pages. Optionally save full entity data to a JSON file.
Instructions
Extract entities (emails, phone numbers, URLs, dates, IPs, prices) from web pages. Set output_path to persist the full extraction output to disk as JSON and receive a slimmed response.
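As a rough illustration, an argument payload for this tool might look like the sketch below. The parameter names come from the input schema. The values, the list type for entity_types, and the example URL and file path are assumptions made for illustration.

```python
import json

# Hypothetical argument payload. Parameter names are taken from the input
# schema; every value here, and the list type for entity_types, is an assumption.
arguments = {
    "url": "https://example.com/contact",
    "entity_types": ["email", "phone", "url"],
    "deduplicate": True,
    "output_path": "/tmp/entities.json",  # absolute path, per the schema
    "overwrite": False,
}

# Serialize to JSON, as a client would typically do before sending a tool call.
payload = json.dumps(arguments, indent=2)
print(payload)
```

With output_path set as above, the response would be expected to come back slimmed, with the full extraction written to the JSON file.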
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Target URL | |
| entity_types | Yes | Types: email, phone, url, date, ip, price | |
| custom_patterns | No | Custom regex patterns | |
| include_context | No | Include context | |
| deduplicate | No | Remove duplicates | |
| use_llm | No | Use LLM for NER | |
| llm_provider | No | LLM provider | |
| llm_model | No | LLM model | |
| output_path | No | Absolute file path (auto .json extension) to persist the full entity extraction as JSON. When set, the response is slimmed (content, markdown, extracted_data.raw_content removed). | |
| include_content_in_response | No | When True (with output_path set), also keep the entity data in the response. | False |
| overwrite | No | Overwrite an existing output file at output_path. | False |
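The constraints in the table above can be checked on the client side before a call is made. The following is a minimal sketch under stated assumptions, not the tool's own validation: `validate_arguments` is a hypothetical helper, entity_types is assumed to be a list restricted to the six listed types, and POSIX path rules are assumed for the "absolute path" check.

```python
from pathlib import PurePosixPath

# The six entity types listed in the input schema.
ALLOWED_ENTITY_TYPES = {"email", "phone", "url", "date", "ip", "price"}


def validate_arguments(args: dict) -> list:
    """Return a list of problems found in a hypothetical argument payload."""
    problems = []

    # url and entity_types are marked as required in the schema.
    if not args.get("url"):
        problems.append("url is required")
    entity_types = args.get("entity_types") or []
    if not entity_types:
        problems.append("entity_types is required")

    # Assumption: only the six listed type names are accepted.
    unknown = sorted(set(entity_types) - ALLOWED_ENTITY_TYPES)
    if unknown:
        problems.append("unknown entity types: " + ", ".join(unknown))

    # output_path is described as an absolute file path (POSIX rules assumed).
    output_path = args.get("output_path")
    if output_path is not None and not PurePosixPath(output_path).is_absolute():
        problems.append("output_path must be an absolute path")

    # include_content_in_response is documented as applying with output_path set.
    if args.get("include_content_in_response") and output_path is None:
        problems.append(
            "include_content_in_response only applies when output_path is set"
        )
    return problems
```

An empty list means the payload is consistent with the documented constraints; the server may still apply checks of its own.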
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |