
scrape_url_markdown

Extract website content as Markdown by bypassing bot detection, captchas, and geolocation restrictions using the ScrAPI service.

Instructions

Use a URL to scrape a website via the ScrAPI service and retrieve the result as Markdown. Use this for scraping website content that is difficult to access because of bot detection, captchas, or geolocation restrictions. The result will be in Markdown, which is preferable when the text content of the webpage matters more than the structural information of the page.
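An MCP client invokes this tool with a standard `tools/call` request. A minimal example payload (the target URL is illustrative):

```json
{
  "method": "tools/call",
  "params": {
    "name": "scrape_url_markdown",
    "arguments": {
      "url": "https://example.com"
    }
  }
}
```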

Input Schema

| Name | Required | Description       | Default |
| ---- | -------- | ----------------- | ------- |
| url  | Yes      | The URL to scrape |         |

Implementation Reference

  • index.ts:71-87 (registration)
    Registration of the scrape_url_markdown tool, specifying its title, description, input schema (URL), and handler that calls the shared scrapeUrl helper.
    ```typescript
    server.registerTool(
      "scrape_url_markdown",
      {
        title: "Scrape URL and respond with Markdown",
        description:
          "Use a URL to scrape a website using the ScrAPI service and retrieve the result as Markdown. " +
          "Use this for scraping website content that is difficult to access because of bot detection, captchas or even geolocation restrictions. " +
          "The result will be in Markdown which is preferable if the text content of the webpage is important and not the structural information of the page.",
        inputSchema: {
          url: z
            .string()
            .url({ message: "Invalid URL" })
            .describe("The URL to scrape"),
        },
      },
      async ({ url }) => await scrapeUrl(url, "Markdown")
    );
    ```
  • index.ts:89-167 (handler)
    Shared handler function that performs the actual URL scraping using ScrAPI, supporting both HTML and Markdown formats. Called by the tool's registered handler.
    ```typescript
    async function scrapeUrl(
      url: string,
      format: "HTML" | "Markdown"
    ): Promise<CallToolResult> {
      const body = {
        url: url,
        useBrowser: true,
        solveCaptchas: true,
        acceptDialogs: true,
        proxyType: "Residential",
        responseFormat: format,
      };

      try {
        const response = await fetch("https://api.scrapi.tech/v1/scrape", {
          method: "POST",
          headers: {
            "User-Agent": `${SCRAPI_SERVER_NAME} - ${SCRAPI_SERVER_VERSION}`,
            "Content-Type": "application/json",
            "X-API-KEY": config.scrapiApiKey || SCRAPI_API_KEY,
          },
          body: JSON.stringify(body),
          signal: AbortSignal.timeout(30000),
        });

        const data = await response.text();

        if (response.ok) {
          return {
            content: [
              {
                type: "text" as const,
                text: data,
                _meta: {
                  mimeType: `text/${format.toLowerCase()}`,
                },
              },
            ],
          };
        }

        return {
          content: [
            {
              type: "text" as const,
              text: data,
            },
          ],
          isError: true,
        };
      } catch (error) {
        console.error("Error calling API:", error);
      }

      // Fallback: if the first request threw, retry once using the default API key.
      const response = await fetch("https://api.scrapi.tech/v1/scrape", {
        method: "POST",
        headers: {
          "User-Agent": `${SCRAPI_SERVER_NAME} - ${SCRAPI_SERVER_VERSION}`,
          "Content-Type": "application/json",
          "X-API-KEY": SCRAPI_API_KEY,
        },
        body: JSON.stringify(body),
        signal: AbortSignal.timeout(30000),
      });

      const data = await response.text();

      return {
        content: [
          {
            type: "text",
            text: data,
            _meta: {
              mimeType: `text/${format.toLowerCase()}`,
            },
          },
        ],
      };
    }
    ```
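The request body the handler sends to ScrAPI can be sketched as a small standalone helper (a hypothetical refactor for illustration; field names and values are taken from the handler above):

```typescript
type ScrapeFormat = "HTML" | "Markdown";

// Builds the ScrAPI request body used by scrapeUrl (field names from the handler above).
function buildScrapeBody(url: string, format: ScrapeFormat) {
  return {
    url,
    useBrowser: true,      // render the page in a real browser
    solveCaptchas: true,   // have ScrAPI solve captchas automatically
    acceptDialogs: true,   // auto-accept cookie/consent dialogs
    proxyType: "Residential",
    responseFormat: format,
  };
}

const body = buildScrapeBody("https://example.com", "Markdown");
console.log(body.responseFormat); // "Markdown"
```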
  • Input schema for the scrape_url_markdown tool, validating the 'url' parameter.
    ```typescript
    inputSchema: {
      url: z
        .string()
        .url({ message: "Invalid URL" })
        .describe("The URL to scrape"),
    },
    ```
