scrape
Extract webpage content including text, metadata, and optional markdown formatting from any URL to gather structured information for analysis or processing.
Instructions
Tool to scrape a webpage and retrieve its text and, optionally, its markdown content. It also retrieves the JSON-LD metadata and the head metadata.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | The URL of the webpage to scrape. | |
| includeMarkdown | No | Whether to include markdown content. | false |
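As a minimal sketch of how a caller might assemble arguments that satisfy this schema, the helper below checks that `url` is a non-empty string and falls back to `false` for `includeMarkdown`. The names `ScrapeArgs` and `buildScrapeCall` are hypothetical and do not appear in the server's source. The returned `{ name, arguments }` shape assumes the standard MCP `tools/call` parameters.

```typescript
// Hypothetical helper, not part of the server: builds the params for a
// "scrape" tool call from loosely typed input.
interface ScrapeArgs {
  url: string;
  includeMarkdown: boolean;
}

function buildScrapeCall(raw: Record<string, unknown>): {
  name: "scrape";
  arguments: ScrapeArgs;
} {
  // `url` is the only required property in the input schema.
  if (typeof raw.url !== "string" || raw.url.length === 0) {
    throw new Error("scrape: 'url' is required and must be a non-empty string");
  }
  // `includeMarkdown` is optional; the schema declares a default of false.
  const includeMarkdown =
    typeof raw.includeMarkdown === "boolean" ? raw.includeMarkdown : false;
  return { name: "scrape", arguments: { url: raw.url, includeMarkdown } };
}
```

Applying the default on the caller side keeps the request explicit, which can help when inspecting logged tool calls.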
Implementation Reference
- src/tools/search-tool.ts:46-53 (handler) The scrape method in the SerperSearchTools class is the primary handler that executes the scrape tool logic, delegating to the SerperClient and handling errors.
```typescript
async scrape(params: IScrapeParams): Promise<IScrapeResult> {
  try {
    const result = await this.serperClient.scrape(params);
    return result;
  } catch (error) {
    throw new Error(`SearchTool: failed to scrape. ${error}`);
  }
}
```

- src/index.ts:239-252 (handler) The MCP tool call handler for 'scrape': it extracts the url and includeMarkdown parameters from the request, calls searchTools.scrape(), and returns the JSON result.
```typescript
case "scrape": {
  const url = request.params.arguments?.url as string;
  const includeMarkdown = request.params.arguments
    ?.includeMarkdown as boolean;
  const result = await searchTools.scrape({ url, includeMarkdown });
  return {
    content: [
      {
        type: "text",
        text: JSON.stringify(result, null, 2),
      },
    ],
  };
}
```

- src/index.ts:143-162 (registration) Registration of the 'scrape' tool with its name, description, and inputSchema, which defines the url (required) and includeMarkdown (optional boolean) parameters.
```typescript
{
  name: "scrape",
  description:
    "Tool to scrape a webpage and retrieve the text and, optionally, the markdown content. It will retrieve also the JSON-LD metadata and the head metadata.",
  inputSchema: {
    type: "object",
    properties: {
      url: {
        type: "string",
        description: "The URL of the webpage to scrape.",
      },
      includeMarkdown: {
        type: "boolean",
        description: "Whether to include markdown content.",
        default: false,
      },
    },
    required: ["url"],
  },
},
```

- src/types/serper.ts:50-55 (schema) The IScrapeParams interface defines the input schema for the scrape tool: url (string) and an optional includeMarkdown (boolean).
```typescript
/**
 * Scrape parameters for Serper API.
 */
export interface IScrapeParams {
  url: string;
  includeMarkdown?: boolean;
}
```

- src/types/serper.ts:122-130 (schema) The IScrapeResult interface defines the output schema returned by the scrape operation: text, plus optional markdown, metadata, jsonld, and credits fields.
```typescript
/**
 * Represents the result of a scrape operation from the Serper API.
 */
export interface IScrapeResult {
  text: string;
  markdown?: string;
  metadata?: Record<string, string>;
  jsonld?: Record<string, any>;
  credits?: number;
}
```
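
To see how the pieces referenced above fit together, the following is a rough, self-contained sketch of the call path: arguments go in, a client produces an IScrapeResult, and the handler wraps it as JSON text. The `fakeClient` object, `handleScrapeCall`, and `parseScrapeResponse` are hypothetical stand-ins written for illustration. The real server delegates to SerperClient, whose internals are not shown here, and the canned values returned below are invented placeholders.

```typescript
// Interfaces copied from the schema snippets above.
interface IScrapeParams {
  url: string;
  includeMarkdown?: boolean;
}

interface IScrapeResult {
  text: string;
  markdown?: string;
  metadata?: Record<string, string>;
  jsonld?: Record<string, any>;
  credits?: number;
}

// Hypothetical stand-in for SerperClient: returns canned data, no network.
const fakeClient = {
  async scrape(params: IScrapeParams): Promise<IScrapeResult> {
    return {
      text: `Contents of ${params.url}`,
      // Markdown is only present when the caller asked for it.
      ...(params.includeMarkdown ? { markdown: `# ${params.url}` } : {}),
      credits: 1,
    };
  },
};

// Mirrors the "scrape" case: read the arguments, call the client,
// and wrap the pretty-printed JSON result as a single text content item.
async function handleScrapeCall(args: Record<string, unknown> | undefined) {
  const url = args?.url as string;
  const includeMarkdown = args?.includeMarkdown as boolean;
  const result = await fakeClient.scrape({ url, includeMarkdown });
  return {
    content: [{ type: "text", text: JSON.stringify(result, null, 2) }],
  };
}

// A caller can recover the typed result by parsing the text payload.
function parseScrapeResponse(response: {
  content: { type: string; text: string }[];
}): IScrapeResult {
  return JSON.parse(response.content[0].text) as IScrapeResult;
}
```

Because the result travels as a JSON string inside a text content item, consumers need a parse step like `parseScrapeResponse` before they can read fields such as `markdown` or `credits`.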