extract-data
Extract structured data from websites using custom schemas and prompts to convert web content into organized formats for analysis and processing.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
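| urls | Yes | Array of valid URLs to extract data from. | |
| prompt | Yes | Prompt string passed to the extraction. | |
| schema | Yes | Record mapping each field name to `'string'`, `'boolean'`, `'number'`, an array, or an object. | |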
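
To make the three inputs concrete, here is a minimal sketch of one possible arguments object, plus a dependency-free check that approximates the constraints given for the tool's input schema. The URL, the prompt text, the field names, and the `looksLikeValidArgs` helper are all illustrative assumptions; the server itself validates with Zod, and `new URL()` is only a rough stand-in for Zod's URL check.

```javascript
// Hypothetical arguments for an "extract-data" call. The URL is a placeholder.
const exampleArgs = {
  urls: ["https://example.com/products"],
  prompt: "Extract the product name, price, and tags.",
  schema: {
    name: "string",
    price: "number",
    inStock: "boolean",
    tags: ["string"],                            // array of strings
    seller: { name: "string", rating: "number" } // nested object
  }
};

// Hypothetical helper: approximates the input schema without Zod.
function looksLikeValidArgs(args) {
  // Rough URL check; Zod's own URL validation may differ in edge cases.
  const isUrl = (value) => {
    if (typeof value !== "string") return false;
    try {
      new URL(value);
      return true;
    } catch {
      return false;
    }
  };

  // Each top-level schema value must be a primitive type name,
  // an array, or an object.
  const isSchemaValue = (value) =>
    ["string", "boolean", "number"].includes(value) ||
    Array.isArray(value) ||
    (value !== null && typeof value === "object");

  return (
    Array.isArray(args.urls) &&
    args.urls.every(isUrl) &&
    typeof args.prompt === "string" &&
    args.schema !== null &&
    typeof args.schema === "object" &&
    !Array.isArray(args.schema) &&
    Object.values(args.schema).every(isSchemaValue)
  );
}

console.log(looksLikeValidArgs(exampleArgs)); // true
```

Only the top-level values of `schema` are constrained here; nested arrays and objects are accepted as-is at this stage.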
Implementation Reference
- src/server.js:143-196 (handler) The handler function for the "extract-data" tool. It dynamically creates a Zod schema, calls Firecrawl's extract API with the URLs, prompt, and schema, handles errors with Sentry, and returns the extracted data as JSON or error message.

  ```javascript
  async ({ urls, prompt, schema }) => {
    return await Sentry.startSpan(
      { name: "extract-data" },
      async () => {
        try {
          // Add Sentry breadcrumb for debugging
          Sentry.addBreadcrumb({
            category: 'extract-data',
            message: `Extracting data from URLs`,
            data: { urlCount: urls.length, prompt },
            level: 'info'
          });

          // Create the Zod schema from the provided definition
          const zodSchema = createDynamicSchema(schema);

          // Extract data using Firecrawl
          const extractResponse = await firecrawl.extract(urls, {
            prompt: prompt,
            schema: zodSchema
          });

          if (!extractResponse.success) {
            // Capture error in Sentry
            Sentry.captureMessage(`Failed to extract data: ${extractResponse.error}`, 'error');
            return {
              content: [{ type: "text", text: `Failed to extract data: ${extractResponse.error}` }],
              isError: true
            };
          }

          return {
            content: [{ type: "text", text: JSON.stringify(extractResponse.data, null, 2) }]
          };
        } catch (error) {
          // Capture exception in Sentry
          Sentry.captureException(error);
          return {
            content: [{ type: "text", text: `Error extracting data: ${error.message}` }],
            isError: true
          };
        }
      }
    );
  }
  ```
- src/server.js:132-142 (schema) Input schema (Zod) for the "extract-data" tool defining urls (array of valid URLs), prompt (string), and schema (record supporting string, boolean, number, array, object types).

  ```javascript
  {
    urls: z.array(z.string().url()),
    prompt: z.string(),
    schema: z.record(z.union([
      z.literal('string'),
      z.literal('boolean'),
      z.literal('number'),
      z.array(z.any()),
      z.record(z.any())
    ]))
  },
  ```
- src/server.js:130-131 (registration) Registration of the "extract-data" tool using McpServer's tool() method with the specified name.

  ```javascript
  server.tool(
    "extract-data",
  ```
- src/server.js:35-59 (helper) Helper function used by the extract-data handler to recursively build a Zod schema from the user-provided schema definition supporting primitive types, arrays, and objects.

  ```javascript
  function createDynamicSchema(schemaDefinition) {
    const schemaMap = {
      string: z.string(),
      boolean: z.boolean(),
      number: z.number(),
      array: (itemType) => z.array(createDynamicSchema(itemType)),
      object: (properties) => {
        const shape = {};
        for (const [key, type] of Object.entries(properties)) {
          shape[key] = createDynamicSchema(type);
        }
        return z.object(shape);
      }
    };

    if (typeof schemaDefinition === 'string') {
      return schemaMap[schemaDefinition];
    } else if (Array.isArray(schemaDefinition)) {
      return schemaMap.array(schemaDefinition[0]);
    } else if (typeof schemaDefinition === 'object') {
      return schemaMap.object(schemaDefinition);
    }

    throw new Error(`Unsupported schema type: ${typeof schemaDefinition}`);
  }
  ```
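
The recursive dispatch in the helper (string, then array, then object) can be easier to follow without Zod in the way. The sketch below is not part of the server: `describeSchema` is a hypothetical function that walks a schema definition in the same order but returns a readable type string instead of a Zod schema. As a deliberate difference, it rejects unknown type names and `null` explicitly, which is an assumption about desirable behavior rather than something the helper does.

```javascript
// Hypothetical, dependency-free illustration of the helper's recursion.
const PRIMITIVES = new Set(["string", "boolean", "number"]);

function describeSchema(definition) {
  if (typeof definition === "string") {
    // Assumption: fail fast on names other than the three primitives.
    if (!PRIMITIVES.has(definition)) {
      throw new Error(`Unsupported primitive: ${definition}`);
    }
    return definition;
  }

  if (Array.isArray(definition)) {
    // As in the helper, only the first element describes the item type.
    return `${describeSchema(definition[0])}[]`;
  }

  if (definition !== null && typeof definition === "object") {
    // Recurse into each property to build the nested description.
    const fields = Object.entries(definition).map(
      ([key, type]) => `${key}: ${describeSchema(type)}`
    );
    return `{ ${fields.join(", ")} }`;
  }

  throw new Error(`Unsupported schema type: ${typeof definition}`);
}

const described = describeSchema({
  title: "string",
  tags: ["string"],
  author: { name: "string", verified: "boolean" }
});

console.log(described);
// { title: string, tags: string[], author: { name: string, verified: boolean } }
```

An empty array has no first element to describe, so this sketch throws for `[]`, which matches the final `throw` in the helper.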