one_extract
Extract structured data from web pages using LLM prompts and JSON schemas to organize information from URLs.
Instructions
Extract structured information from web pages using an LLM. Supports both cloud AI and self-hosted LLM extraction.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| urls | Yes | List of URLs to extract information from | |
| prompt | No | Prompt for the LLM extraction | |
| systemPrompt | No | System prompt for LLM extraction | |
| schema | No | JSON schema for structured data extraction | |
| allowExternalLinks | No | Allow extraction from external links | |
| enableWebSearch | No | Enable web search for additional context | |
| includeSubdomains | No | Include subdomains in extraction | |
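
As a rough illustration of the table above, the arguments can be modeled as a TypeScript type with a small runtime check. This is only a sketch: the names `ExtractArgs`, `validateExtractArgs`, and `exampleArgs`, along with the example URL, prompt, and schema, are assumptions made for illustration and are not part of the server's source.

```typescript
// Hypothetical type mirroring the input schema table; not taken from the server source.
type ExtractArgs = {
  urls: string[];                    // required
  prompt?: string;
  systemPrompt?: string;
  schema?: Record<string, unknown>;  // a JSON schema describing the desired output
  allowExternalLinks?: boolean;
  enableWebSearch?: boolean;
  includeSubdomains?: boolean;
};

// Hypothetical helper: returns a list of problems, empty when the arguments look valid.
function validateExtractArgs(args: Record<string, unknown>): string[] {
  const errors: string[] = [];

  // `urls` is the only required field and must be an array of strings.
  const urls = args.urls;
  if (!Array.isArray(urls) || !urls.every((u) => typeof u === "string")) {
    errors.push("urls must be an array of strings");
  }

  // Optional string fields.
  for (const key of ["prompt", "systemPrompt"]) {
    if (key in args && typeof args[key] !== "string") {
      errors.push(`${key} must be a string`);
    }
  }

  // Optional JSON schema: must be a plain object when present.
  const schema = args.schema;
  if ("schema" in args && (typeof schema !== "object" || schema === null || Array.isArray(schema))) {
    errors.push("schema must be an object");
  }

  // Optional boolean flags.
  for (const key of ["allowExternalLinks", "enableWebSearch", "includeSubdomains"]) {
    if (key in args && typeof args[key] !== "boolean") {
      errors.push(`${key} must be a boolean`);
    }
  }

  return errors;
}

// Example arguments (URL, prompt, and schema are made up for illustration).
const exampleArgs: ExtractArgs = {
  urls: ["https://example.com/pricing"],
  prompt: "Extract the plan names and monthly prices.",
  schema: {
    type: "object",
    properties: {
      plans: {
        type: "array",
        items: {
          type: "object",
          properties: { name: { type: "string" }, price: { type: "string" } },
        },
      },
    },
  },
};

console.log(validateExtractArgs(exampleArgs));
```

Only `urls` is mandatory; the remaining fields can be omitted, in which case the check above simply skips them.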
Implementation Reference
- src/tools.ts:255-295 (schema): Defines the Tool object for 'one_extract', including name, description, and input schema.

  ```typescript
  export const EXTRACT_TOOL: Tool = {
    name: 'one_extract',
    description:
      'Extract structured information from web pages using LLM. ' +
      'Supports both cloud AI and self-hosted LLM extraction.',
    inputSchema: {
      type: 'object',
      properties: {
        urls: {
          type: 'array',
          items: { type: 'string' },
          description: 'List of URLs to extract information from',
        },
        prompt: {
          type: 'string',
          description: 'Prompt for the LLM extraction',
        },
        systemPrompt: {
          type: 'string',
          description: 'System prompt for LLM extraction',
        },
        schema: {
          type: 'object',
          description: 'JSON schema for structured data extraction',
        },
        allowExternalLinks: {
          type: 'boolean',
          description: 'Allow extraction from external links',
        },
        enableWebSearch: {
          type: 'boolean',
          description: 'Enable web search for additional context',
        },
        includeSubdomains: {
          type: 'boolean',
          description: 'Include subdomains in extraction',
        },
      },
      required: ['urls'],
    },
  };
  ```
- src/index.ts:66-73 (registration): Registers EXTRACT_TOOL (one_extract) in the list of available tools for ListToolsRequest.

  ```typescript
  server.setRequestHandler(ListToolsRequestSchema, async () => ({
    tools: [
      SEARCH_TOOL,
      EXTRACT_TOOL,
      SCRAPE_TOOL,
      MAP_TOOL,
    ],
  }));
  ```
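
Once the tool is listed, an MCP client can invoke it by name with a `tools/call` request. The following is a minimal sketch of what that JSON-RPC message might look like, built without any SDK. The helper `buildToolCallRequest`, the type `ToolCallRequest`, and the example URL and prompt are assumptions for illustration; how the message is transported to the server is out of scope here.

```typescript
// Hypothetical shape of the JSON-RPC 2.0 message an MCP client sends to invoke a tool.
type ToolCallRequest = {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
};

// Hypothetical helper: wraps a tool name and its arguments in a tools/call request.
function buildToolCallRequest(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Example request for one_extract (URL and prompt are made up for illustration).
const request = buildToolCallRequest(1, "one_extract", {
  urls: ["https://example.com/about"],
  prompt: "Summarize the company's mission in one sentence.",
});

console.log(JSON.stringify(request, null, 2));
```

In practice a client library would typically construct and send this message; the sketch only shows how the tool name and the arguments from the input schema fit together.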