extract
Extract structured data from any URL using a JSON schema. Define the data structure you need and get organized results from web content.
Instructions
Extract structured data from a URL using a JSON schema. Costs 5 credits.
Input Schema
TableJSON Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL to extract data from | |
| schema | No | JSON schema defining the structure to extract | |
| prompt | No | Natural language instruction for extraction |
Implementation Reference
- src/index.ts:117-122 (handler)The handler function for the 'extract' tool that processes the URL, schema, and prompt parameters, constructs the request body, and calls the API endpoint via apiPost('/extract', body)
async ({ url, schema, prompt }) => { const body: Record<string, unknown> = { url }; if (schema) body.schema = schema; if (prompt) body.prompt = prompt; return jsonResult(await apiPost("/extract", body)); } - src/index.ts:109-123 (registration)Tool registration using server.tool() that defines the 'extract' tool name, description, input schema using Zod (url, optional schema, optional prompt), and the handler function
server.tool( "extract", "Extract structured data from a URL using a JSON schema. Costs 5 credits.", { url: z.string().describe("URL to extract data from"), schema: z.record(z.unknown()).optional().describe("JSON schema defining the structure to extract"), prompt: z.string().optional().describe("Natural language instruction for extraction"), }, async ({ url, schema, prompt }) => { const body: Record<string, unknown> = { url }; if (schema) body.schema = schema; if (prompt) body.prompt = prompt; return jsonResult(await apiPost("/extract", body)); } ); - src/index.ts:112-116 (schema)Input schema definition using Zod: url (required string), schema (optional record for JSON structure), and prompt (optional string for natural language instructions)
{ url: z.string().describe("URL to extract data from"), schema: z.record(z.unknown()).optional().describe("JSON schema defining the structure to extract"), prompt: z.string().optional().describe("Natural language instruction for extraction"), }, - src/index.ts:41-59 (helper)Helper function apiPost() that makes POST requests to the SearchClaw API with 30-second timeout, proper headers, and error handling
async function apiPost(path: string, body: Record<string, unknown>) { const controller = new AbortController(); const timeout = setTimeout(() => controller.abort(), 30000); try { const response = await fetch(`${API_BASE}${path}`, { method: "POST", headers: { ...headers, "Content-Type": "application/json" }, body: JSON.stringify(body), signal: controller.signal, }); if (!response.ok) { const text = await response.text(); throw new Error(`SearchClaw API error ${response.status}: ${text}`); } return response.json(); } finally { clearTimeout(timeout); } } - src/index.ts:61-63 (helper)Helper function jsonResult() that formats API responses into MCP content format with JSON stringification
function jsonResult(data: unknown) { return { content: [{ type: "text" as const, text: JSON.stringify(data, null, 2) }] }; }