extract-document
Extract structured data from documents using a prompt for precise information retrieval. Supports URLs or base64-encoded files and optional JSON output.
Instructions
Extract structured data from documents based on a prompt.
Input Schema
TableJSON Schema
| Name | Required | Description | Default |
|---|---|---|---|
| files | Yes | Array of URLs or base64-encoded documents | |
| inputMethod | Yes | Input method | |
| jsonMode | No | Return in JSON format | |
| prompt | Yes | Extraction prompt |
Implementation Reference
- src/index.ts:666-687 (handler)Handler function for 'extract-document' tool that proxies requests to the external Dumpling AI API endpoint /api/v1/extract-document, handling authentication and error checking.const apiKey = process.env.DUMPLING_API_KEY; if (!apiKey) throw new Error("DUMPLING_API_KEY not set"); const response = await fetch(`${NWS_API_BASE}/api/v1/extract-document`, { method: "POST", headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}`, }, body: JSON.stringify({ inputMethod, files, prompt, jsonMode, requestSource: "mcp", }), }); if (!response.ok) throw new Error(`Failed: ${response.status} ${await response.text()}`); const data = await response.json(); return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] }; } );
- src/index.ts:658-665 (schema)Input schema using Zod validation for the 'extract-document' tool parameters.inputMethod: z.enum(["url", "base64"]).describe("Input method"), files: z .array(z.string()) .describe("Array of URLs or base64-encoded documents"), prompt: z.string().describe("Extraction prompt"), jsonMode: z.boolean().optional().describe("Return in JSON format"), }, async ({ inputMethod, files, prompt, jsonMode }) => {
- src/index.ts:655-688 (registration)Registration of the 'extract-document' tool using McpServer.tool() method, including name, description, input schema, and handler."extract-document", "Extract structured data from documents based on a prompt.", { inputMethod: z.enum(["url", "base64"]).describe("Input method"), files: z .array(z.string()) .describe("Array of URLs or base64-encoded documents"), prompt: z.string().describe("Extraction prompt"), jsonMode: z.boolean().optional().describe("Return in JSON format"), }, async ({ inputMethod, files, prompt, jsonMode }) => { const apiKey = process.env.DUMPLING_API_KEY; if (!apiKey) throw new Error("DUMPLING_API_KEY not set"); const response = await fetch(`${NWS_API_BASE}/api/v1/extract-document`, { method: "POST", headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}`, }, body: JSON.stringify({ inputMethod, files, prompt, jsonMode, requestSource: "mcp", }), }); if (!response.ok) throw new Error(`Failed: ${response.status} ${await response.text()}`); const data = await response.json(); return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] }; } );