one_extract
Extract structured data from web pages using an LLM. Specify the target URLs and, optionally, a prompt and a JSON schema describing the output. Works with cloud AI or a self-hosted LLM for customizable, precise web content extraction.
Instructions
Extract structured information from web pages using an LLM. Supports both cloud AI and self-hosted LLM extraction.
Input Schema
Name | Required | Description | Default |
---|---|---|---|
allowExternalLinks | No | Allow extraction from external links | |
enableWebSearch | No | Enable web search for additional context | |
includeSubdomains | No | Include subdomains in extraction | |
prompt | No | Prompt for the LLM extraction | |
schema | No | JSON schema for structured data extraction | |
systemPrompt | No | System prompt for LLM extraction | |
urls | Yes | List of URLs to extract information from | |
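The parameters in the table above can be assembled into an arguments object before calling the tool. The following is a minimal sketch, not part of the tool itself: the helper name `build_extract_args` and the example URL, prompt, and schema are hypothetical, and how the resulting dictionary is sent to `one_extract` depends on the client you use.

```python
import json

# Boolean flags listed in the input table.
BOOLEAN_FLAGS = {"allowExternalLinks", "enableWebSearch", "includeSubdomains"}


def build_extract_args(urls, prompt=None, schema=None, system_prompt=None, **flags):
    """Assemble an arguments dict for one_extract (hypothetical helper)."""
    urls = list(urls)
    if not urls:
        # `urls` is the only required parameter.
        raise ValueError("urls must contain at least one URL")

    args = {"urls": urls}
    if prompt is not None:
        args["prompt"] = prompt
    if schema is not None:
        args["schema"] = schema
    if system_prompt is not None:
        args["systemPrompt"] = system_prompt

    for name, value in flags.items():
        if name not in BOOLEAN_FLAGS:
            raise ValueError(f"unknown flag: {name}")
        args[name] = bool(value)
    return args


# Illustrative values only; the URL and schema are made up for this example.
example_args = build_extract_args(
    ["https://example.com/products"],
    prompt="Extract the product name and price.",
    schema={
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "price": {"type": "number"},
        },
    },
    includeSubdomains=False,
)
print(json.dumps(example_args, indent=2))
```

Optional parameters are left out entirely when not supplied, since the table lists no default values for them.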
Input Schema (JSON Schema)
{
  "properties": {
    "allowExternalLinks": {
      "description": "Allow extraction from external links",
      "type": "boolean"
    },
    "enableWebSearch": {
      "description": "Enable web search for additional context",
      "type": "boolean"
    },
    "includeSubdomains": {
      "description": "Include subdomains in extraction",
      "type": "boolean"
    },
    "prompt": {
      "description": "Prompt for the LLM extraction",
      "type": "string"
    },
    "schema": {
      "description": "JSON schema for structured data extraction",
      "type": "object"
    },
    "systemPrompt": {
      "description": "System prompt for LLM extraction",
      "type": "string"
    },
    "urls": {
      "description": "List of URLs to extract information from",
      "items": {
        "type": "string"
      },
      "type": "array"
    }
  },
  "required": [
    "urls"
  ],
  "type": "object"
}
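As a rough illustration of what the schema above enforces, the sketch below checks an arguments dictionary against it using only the Python standard library. It is a simplified stand-in, not a full JSON Schema validator: it covers only the `required` list and the `type` keywords that appear in this schema, and the function name `validate_args` is hypothetical. Fields not named in the schema are ignored, since the schema does not forbid additional properties.

```python
# Property types copied from the input schema above (descriptions omitted).
INPUT_SCHEMA = {
    "properties": {
        "allowExternalLinks": {"type": "boolean"},
        "enableWebSearch": {"type": "boolean"},
        "includeSubdomains": {"type": "boolean"},
        "prompt": {"type": "string"},
        "schema": {"type": "object"},
        "systemPrompt": {"type": "string"},
        "urls": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["urls"],
    "type": "object",
}

# Mapping from the JSON Schema type names used here to Python types.
TYPE_MAP = {"boolean": bool, "string": str, "object": dict, "array": list}


def validate_args(args):
    """Return a list of error messages; an empty list means args look valid."""
    errors = []
    for name in INPUT_SCHEMA["required"]:
        if name not in args:
            errors.append(f"missing required field: {name}")

    for name, value in args.items():
        spec = INPUT_SCHEMA["properties"].get(name)
        if spec is None:
            continue  # the schema does not prohibit extra fields
        if not isinstance(value, TYPE_MAP[spec["type"]]):
            errors.append(f"{name}: expected {spec['type']}")
        elif spec["type"] == "array":
            item_type = TYPE_MAP[spec["items"]["type"]]
            if not all(isinstance(item, item_type) for item in value):
                errors.append(f"{name}: every item must be a {spec['items']['type']}")
    return errors


print(validate_args({"urls": ["https://example.com"], "enableWebSearch": True}))
print(validate_args({"prompt": 42}))
```

For anything beyond a quick sanity check, a dedicated JSON Schema library is the safer choice.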