scrape

scrape

Extract and parse web content in markdown, HTML, or screenshot formats from any URL, with options to clean content and render JavaScript.

Instructions

Extract and parse content from any web page.

Input Schema

TableJSON Schema

Name	Required	Description
`url`	Yes	URL to scrape
`format`	No	Output format
`cleaned`	No	Whether to clean content
`renderJs`	No	Whether to render JavaScript

Implementation Reference

src/index.ts:356-371 (handler)
The handler function for the 'scrape' tool. It sends a POST request to the external Dumpling AI API endpoint '/api/v1/scrape' with the URL and options, then returns the JSON response formatted as MCP content.
async ({ url, format, cleaned, renderJs }) => { const apiKey = process.env.DUMPLING_API_KEY; if (!apiKey) throw new Error("DUMPLING_API_KEY not set"); const response = await fetch(`${NWS_API_BASE}/api/v1/scrape`, { method: "POST", headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}`, }, body: JSON.stringify({ url, format, cleaned, renderJs }), }); if (!response.ok) throw new Error(`Failed: ${response.status} ${await response.text()}`); const data = await response.json(); return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] }; }
src/index.ts:343-355 (schema)
Input schema for the 'scrape' tool defined using Zod, validating the URL and optional scraping options like format, cleaned content, and JS rendering.
{ url: z.string().url().describe("URL to scrape"), format: z .enum(["markdown", "html", "screenshot"]) .optional() .describe("Output format"), cleaned: z.boolean().optional().describe("Whether to clean content"), renderJs: z .boolean() .optional() .default(true) .describe("Whether to render JavaScript"), },
src/index.ts:340-372 (registration)
Full registration of the 'scrape' tool on the MCP server, specifying name, description, input schema, and handler function.
server.tool( "scrape", "Extract and parse content from any web page.", { url: z.string().url().describe("URL to scrape"), format: z .enum(["markdown", "html", "screenshot"]) .optional() .describe("Output format"), cleaned: z.boolean().optional().describe("Whether to clean content"), renderJs: z .boolean() .optional() .default(true) .describe("Whether to render JavaScript"), }, async ({ url, format, cleaned, renderJs }) => { const apiKey = process.env.DUMPLING_API_KEY; if (!apiKey) throw new Error("DUMPLING_API_KEY not set"); const response = await fetch(`${NWS_API_BASE}/api/v1/scrape`, { method: "POST", headers: { "Content-Type": "application/json", Authorization: `Bearer ${apiKey}`, }, body: JSON.stringify({ url, format, cleaned, renderJs }), }); if (!response.ok) throw new Error(`Failed: ${response.status} ${await response.text()}`); const data = await response.json(); return { content: [{ type: "text", text: JSON.stringify(data, null, 2) }] }; } );

Dumpling AI MCP Server

Instructions

Input Schema

Implementation Reference

Other Tools

Latest Blog Posts

MCP directory API