extract
Pull structured data from a web page by passing its URL and a schema that describes the fields to extract; the extraction itself is performed by the AI-powered Dumpling AI API. Suited to scraping and organizing targeted information.
Instructions
Extract structured data from web pages using AI-powered instructions.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| schema | Yes | Schema defining the data to extract | |
| url | Yes | URL to extract from | |
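
To make the two parameters concrete, here is a minimal sketch of an argument object and a stand-alone check that roughly mirrors the table. The names `ExtractArgs` and `checkExtractArgs` are hypothetical. The example schema's field names and type labels are also invented for illustration, since the format the extraction service expects is not documented here.

```typescript
// Hypothetical argument shape mirroring the table: both fields are required.
interface ExtractArgs {
  url: string;
  schema: Record<string, unknown>;
}

// Minimal check, roughly what the tool's input validation enforces.
function checkExtractArgs(args: ExtractArgs): ExtractArgs {
  new URL(args.url); // throws a TypeError if the string is not a valid URL
  if (typeof args.schema !== "object" || args.schema === null || Array.isArray(args.schema)) {
    throw new TypeError("schema must be an object");
  }
  return args;
}

// Illustrative schema; the field names and type labels are made up.
const exampleArgs: ExtractArgs = checkExtractArgs({
  url: "https://example.com/products/42",
  schema: { title: "string", price: "number", inStock: "boolean" },
});
```

An MCP client would send an object of this shape as the tool's arguments; the check above only approximates the server-side validation.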
Implementation Reference
- src/index.ts:423-438 (handler) The handler function that executes the 'extract' tool logic by sending a POST request to the external Dumpling AI API endpoint '/api/v1/extract' with the provided URL and schema.

  ```typescript
  async ({ url, schema }) => {
    const apiKey = process.env.DUMPLING_API_KEY;
    if (!apiKey) throw new Error("DUMPLING_API_KEY not set");
    const response = await fetch(`${NWS_API_BASE}/api/v1/extract`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ url, schema }),
    });
    if (!response.ok)
      throw new Error(`Failed: ${response.status} ${await response.text()}`);
    const data = await response.json();
    return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
  }
  ```
- src/index.ts:419-422 (schema) The input schema for the 'extract' tool, defining the parameters 'url' (a required URL string) and 'schema' (a record of arbitrary values describing the data to extract).

  ```typescript
  {
    url: z.string().url().describe("URL to extract from"),
    schema: z.record(z.any()).describe("Schema defining the data to extract"),
  },
  ```
- src/index.ts:416-439 (registration) The registration of the 'extract' tool using server.tool(), including the name, description, schema, and inline handler function.

  ```typescript
  server.tool(
    "extract",
    "Extract structured data from web pages using AI-powered instructions.",
    {
      url: z.string().url().describe("URL to extract from"),
      schema: z.record(z.any()).describe("Schema defining the data to extract"),
    },
    async ({ url, schema }) => {
      const apiKey = process.env.DUMPLING_API_KEY;
      if (!apiKey) throw new Error("DUMPLING_API_KEY not set");
      const response = await fetch(`${NWS_API_BASE}/api/v1/extract`, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${apiKey}`,
        },
        body: JSON.stringify({ url, schema }),
      });
      if (!response.ok)
        throw new Error(`Failed: ${response.status} ${await response.text()}`);
      const data = await response.json();
      return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
    }
  );
  ```
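
The request the handler sends and the result it returns can be sketched as two pure functions, which makes their shapes easy to inspect without any network call. This is a sketch under stated assumptions, not the server's code. `buildExtractRequest` and `toToolResult` are hypothetical helper names. The base URL is a placeholder, because the value behind `NWS_API_BASE` is not shown in the source.

```typescript
// Hypothetical helper: builds the fetch arguments without sending anything.
interface ExtractRequest {
  endpoint: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

function buildExtractRequest(
  baseUrl: string, // assumption: stands in for NWS_API_BASE, whose value is not shown
  apiKey: string,
  url: string,
  schema: Record<string, unknown>,
): ExtractRequest {
  return {
    endpoint: `${baseUrl}/api/v1/extract`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ url, schema }),
    },
  };
}

// Wraps any JSON-serializable value in the text content shape the handler returns.
function toToolResult(data: unknown): { content: { type: "text"; text: string }[] } {
  return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
}

const request = buildExtractRequest(
  "https://api.example.invalid", // placeholder base URL
  "test-key", // placeholder API key
  "https://example.com",
  { title: "string" },
);
const result = toToolResult({ title: "Example Domain" });
```

Passing `request.endpoint` and `request.init` to `fetch` would reproduce the call the handler makes. Keeping request construction separate from the network call is a design choice made here for testability; the original handler does everything inline.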