extract
Extract structured data from web pages by defining a schema and providing a URL, using AI to process and organize information from online sources.
Instructions
Extract structured data from web pages using AI-powered instructions.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL to extract from | |
| schema | Yes | Schema defining the data to extract | |
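
As an illustration of the two inputs, the sketch below shows one possible argument object and a dependency-free check of its shape. The source only says that `schema` is a record of arbitrary values, so the field names (`title`, `price`), the example URL, and the `isExtractInput` helper are assumptions for illustration. The helper approximates the server's Zod validation and is not the server's actual code.

```typescript
// Shape of the tool's input, as described in the table above.
interface ExtractInput {
  url: string;
  schema: Record<string, unknown>;
}

// Hypothetical helper: approximates the checks the table implies.
function isExtractInput(value: unknown): value is ExtractInput {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  if (typeof v.url !== "string") return false;
  try {
    new URL(v.url); // throws on strings that are not absolute URLs
  } catch {
    return false;
  }
  // `schema` must be a plain key/value object, not an array or null.
  return typeof v.schema === "object" && v.schema !== null && !Array.isArray(v.schema);
}

// Assumed example values; the real schema vocabulary is defined by the API.
const exampleInput = {
  url: "https://example.com/products/1",
  schema: { title: "string", price: "number" },
};
```

Checking the shape before calling the tool surfaces a malformed URL or a missing `schema` locally instead of as a validation error from the server.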
Implementation Reference
- src/index.ts:423-437 (handler): The handler function for the 'extract' tool. It reads the API key from the DUMPLING_API_KEY environment variable, sends a POST request to the Dumpling AI extract endpoint with the url and schema, throws if the key is missing or the response is not OK, and returns the JSON response as pretty-printed text content.

      async ({ url, schema }) => {
        const apiKey = process.env.DUMPLING_API_KEY;
        if (!apiKey) throw new Error("DUMPLING_API_KEY not set");

        const response = await fetch(`${NWS_API_BASE}/api/v1/extract`, {
          method: "POST",
          headers: {
            "Content-Type": "application/json",
            Authorization: `Bearer ${apiKey}`,
          },
          body: JSON.stringify({ url, schema }),
        });

        if (!response.ok)
          throw new Error(`Failed: ${response.status} ${await response.text()}`);

        const data = await response.json();
        return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
      }
- src/index.ts:419-422 (schema): Input schema for the 'extract' tool using Zod, defining 'url' as a URL string and 'schema' as a record of any type.

      {
        url: z.string().url().describe("URL to extract from"),
        schema: z.record(z.any()).describe("Schema defining the data to extract"),
      },
- src/index.ts:415-439 (registration): Full registration of the 'extract' tool on the MCP server, including name, description, schema, and inline handler.

      // Tool to extract structured data from web pages
      server.tool(
        "extract",
        "Extract structured data from web pages using AI-powered instructions.",
        {
          url: z.string().url().describe("URL to extract from"),
          schema: z.record(z.any()).describe("Schema defining the data to extract"),
        },
        async ({ url, schema }) => {
          const apiKey = process.env.DUMPLING_API_KEY;
          if (!apiKey) throw new Error("DUMPLING_API_KEY not set");

          const response = await fetch(`${NWS_API_BASE}/api/v1/extract`, {
            method: "POST",
            headers: {
              "Content-Type": "application/json",
              Authorization: `Bearer ${apiKey}`,
            },
            body: JSON.stringify({ url, schema }),
          });

          if (!response.ok)
            throw new Error(`Failed: ${response.status} ${await response.text()}`);

          const data = await response.json();
          return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] };
        }
      );
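
The request that the handler sends can be looked at on its own. The sketch below builds the same POST (path, headers, and JSON body as in the handler) without performing any network call. `buildExtractRequest` is a hypothetical helper, not part of the server, and the base URL and API key are placeholder assumptions, since the source does not show the value of `NWS_API_BASE`.

```typescript
// Request description in the shape accepted by fetch(url, init).
interface ExtractRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: mirrors the handler's request, but only builds it.
// `apiBase` is an assumption; the source does not show NWS_API_BASE.
function buildExtractRequest(
  apiBase: string,
  apiKey: string,
  input: { url: string; schema: Record<string, unknown> },
): ExtractRequest {
  // Same guard as the handler: refuse to continue without a key.
  if (!apiKey) throw new Error("DUMPLING_API_KEY not set");
  return {
    url: `${apiBase}/api/v1/extract`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ url: input.url, schema: input.schema }),
    },
  };
}

// Placeholder base URL and key, for illustration only.
const extractRequest = buildExtractRequest("https://api.example.invalid", "test-key", {
  url: "https://example.com/products/1",
  schema: { title: "string" },
});
// To send it: const response = await fetch(extractRequest.url, extractRequest.init);
```

Keeping request construction separate from the `fetch` call is one way to unit-test the payload and headers without an API key or network access.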