Glama

scrape_url_markdown

Scrape website content bypassing bot detection, captchas, or geolocation restrictions and retrieve the output in Markdown format for simplified text extraction.

Instructions

Use a URL to scrape a website using the ScrAPI service and retrieve the result as Markdown. Use this for scraping website content that is difficult to access because of bot detection, captchas, or even geolocation restrictions. The result will be in Markdown, which is preferable when the text content of the webpage matters more than the structural information of the page.

Input Schema

| Name | Required | Description       | Default |
| ---- | -------- | ----------------- | ------- |
| url  | Yes      | The URL to scrape |         |
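
Only the `url` argument is required, so a client's `tools/call` request for this tool reduces to a payload like the following (the URL is illustrative):

```json
{
  "name": "scrape_url_markdown",
  "arguments": {
    "url": "https://example.com/article"
  }
}
```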

Implementation Reference

  • index.ts:71-87 (registration)
    Registration of the scrape_url_markdown tool with input schema, description, and handler that delegates to scrapeUrl helper.
    server.registerTool(
      "scrape_url_markdown",
      {
        title: "Scrape URL and respond with Markdown",
        description:
          "Use a URL to scrape a website using the ScrAPI service and retrieve the result as Markdown. " +
          "Use this for scraping website content that is difficult to access because of bot detection, captchas or even geolocation restrictions. " +
          "The result will be in Markdown which is preferable if the text content of the webpage is important and not the structural information of the page.",
        inputSchema: {
          url: z
            .string()
            .url({ message: "Invalid URL" })
            .describe("The URL to scrape"),
        },
      },
      async ({ url }) => await scrapeUrl(url, "Markdown")
    );
  • index.ts:89-163 (handler)
    The scrapeUrl helper implements the core tool logic: it scrapes the given URL via ScrAPI with browser emulation, captcha solving, and a residential proxy, returns the Markdown or HTML content as a CallToolResult, and falls back to the bundled API key if the first request fails.
    async function scrapeUrl(
      url: string,
      format: "HTML" | "Markdown"
    ): Promise<CallToolResult> {
      var body = {
        url: url,
        useBrowser: true,
        solveCaptchas: true,
        acceptDialogs: true,
        proxyType: "Residential",
        responseFormat: format,
      };

      try {
        const response = await fetch("https://api.scrapi.tech/v1/scrape", {
          method: "POST",
          headers: {
            "User-Agent": `${SCRAPI_SERVER_NAME} - ${SCRAPI_SERVER_VERSION}`,
            "Content-Type": "application/json",
            "X-API-KEY": config.scrapiApiKey || SCRAPI_API_KEY,
          },
          body: JSON.stringify(body),
          signal: AbortSignal.timeout(30000),
        });

        const data = await response.text();

        if (response.ok) {
          return {
            content: [
              {
                type: "text" as const,
                mimeType: `text/${format.toLowerCase()}`,
                text: data,
              },
            ],
          };
        }

        return {
          content: [
            {
              type: "text" as const,
              text: data,
            },
          ],
          isError: true,
        };
      } catch (error) {
        console.error("Error calling API:", error);
      }

      // Fallback: retry once using the bundled API key.
      const response = await fetch("https://api.scrapi.tech/v1/scrape", {
        method: "POST",
        headers: {
          "User-Agent": `${SCRAPI_SERVER_NAME} - ${SCRAPI_SERVER_VERSION}`,
          "Content-Type": "application/json",
          "X-API-KEY": SCRAPI_API_KEY,
        },
        body: JSON.stringify(body),
        signal: AbortSignal.timeout(30000),
      });

      const data = await response.text();

      return {
        content: [
          {
            type: "text",
            mimeType: `text/${format.toLowerCase()}`,
            text: data,
          },
        ],
      };
    }
  • Input schema definition using Zod: requires a single 'url' parameter that is a valid URL.
    inputSchema: {
      url: z
        .string()
        .url({ message: "Invalid URL" })
        .describe("The URL to scrape"),
    },
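
The request body the handler posts to ScrAPI can be sketched as a small standalone helper. This is a minimal sketch for illustration: `buildScrapePayload` is a hypothetical name, not part of index.ts, but the field names and values mirror the `body` object in the code above.

```typescript
// Builds the JSON payload that the scrapeUrl handler posts to
// https://api.scrapi.tech/v1/scrape. Hypothetical helper; field
// names are taken from the body object shown in the handler.
function buildScrapePayload(url: string, format: "HTML" | "Markdown") {
  return {
    url,
    useBrowser: true,        // full browser emulation
    solveCaptchas: true,     // let ScrAPI solve captchas
    acceptDialogs: true,     // auto-accept dialogs (e.g. cookie prompts)
    proxyType: "Residential",
    responseFormat: format,  // "HTML" or "Markdown"
  };
}

const payload = buildScrapePayload("https://example.com", "Markdown");
console.log(JSON.stringify(payload));
```

Serialized with `JSON.stringify`, this is exactly the shape the handler sends in the POST body.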

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/DevEnterpriseSoftware/scrapi-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.