
ScrAPI MCP Server

Scrape URL and respond with Markdown

scrape_url_markdown

Extract website content as Markdown by bypassing bot detection, captchas, and geolocation restrictions using the ScrAPI service.

Instructions

Use a URL to scrape a website using the ScrAPI service and retrieve the result as Markdown. Use this for scraping website content that is difficult to access because of bot detection, captchas or even geolocation restrictions. The result will be in Markdown which is preferable if the text content of the webpage is important and not the structural information of the page.

Input Schema

| Name | Required | Description       | Default |
| ---- | -------- | ----------------- | ------- |
| url  | Yes      | The URL to scrape |         |

Implementation Reference

  • index.ts:71-87 (registration)
    Registration of the scrape_url_markdown tool, specifying its title, description, input schema (URL), and handler that calls the shared scrapeUrl helper.
    server.registerTool(
      "scrape_url_markdown",
      {
        title: "Scrape URL and respond with Markdown",
        description:
          "Use a URL to scrape a website using the ScrAPI service and retrieve the result as Markdown. " +
          "Use this for scraping website content that is difficult to access because of bot detection, captchas or even geolocation restrictions. " +
          "The result will be in Markdown which is preferable if the text content of the webpage is important and not the structural information of the page.",
        inputSchema: {
          url: z
            .string()
            .url({ message: "Invalid URL" })
            .describe("The URL to scrape"),
        },
      },
      async ({ url }) => await scrapeUrl(url, "Markdown")
    );
  • index.ts:89-167 (handler)
    Shared handler function that performs the actual URL scraping using ScrAPI, supporting both HTML and Markdown formats. Called by the tool's registered handler.
    async function scrapeUrl(
      url: string,
      format: "HTML" | "Markdown"
    ): Promise<CallToolResult> {
      const body = {
        url: url,
        useBrowser: true,
        solveCaptchas: true,
        acceptDialogs: true,
        proxyType: "Residential",
        responseFormat: format,
      };
    
      try {
        const response = await fetch("https://api.scrapi.tech/v1/scrape", {
          method: "POST",
          headers: {
            "User-Agent": `${SCRAPI_SERVER_NAME} - ${SCRAPI_SERVER_VERSION}`,
            "Content-Type": "application/json",
            "X-API-KEY": config.scrapiApiKey || SCRAPI_API_KEY,
          },
          body: JSON.stringify(body),
          signal: AbortSignal.timeout(30000),
        });
    
        const data = await response.text();
    
        if (response.ok) {
          return {
            content: [
              {
                type: "text" as const,
                text: data,
                _meta: {
                  mimeType: `text/${format.toLowerCase()}`,
                },
              },
            ],
          };
        }
    
        return {
          content: [
            {
              type: "text" as const,
              text: data,
            },
          ],
          isError: true,
        };
      } catch (error) {
        console.error("Error calling API:", error);
      }
    
      // Fallback: the first attempt threw (e.g. a timeout), so retry the request once.
      const response = await fetch("https://api.scrapi.tech/v1/scrape", {
        method: "POST",
        headers: {
          "User-Agent": `${SCRAPI_SERVER_NAME} - ${SCRAPI_SERVER_VERSION}`,
          "Content-Type": "application/json",
          "X-API-KEY": SCRAPI_API_KEY,
        },
        body: JSON.stringify(body),
        signal: AbortSignal.timeout(30000),
      });
    
      const data = await response.text();
    
      return {
        content: [
          {
            type: "text",
            text: data,
            _meta: {
              mimeType: `text/${format.toLowerCase()}`,
            },
          },
        ],
      };
    }
  • Input schema for the scrape_url_markdown tool, validating the 'url' parameter.
    inputSchema: {
      url: z
        .string()
        .url({ message: "Invalid URL" })
        .describe("The URL to scrape"),
    },
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses that the tool uses ScrAPI service to handle access challenges like bot detection and geolocation restrictions, which adds useful behavioral context. However, it lacks details on rate limits, authentication needs, or error handling, leaving gaps in transparency for a tool that interacts with external services.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose and usage, consisting of two concise sentences that efficiently convey key information without redundancy. Every sentence adds value, such as explaining the service used and when to prefer this tool over alternatives, making it well-structured and appropriately sized.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (external scraping service with access challenges) and lack of annotations or output schema, the description does a good job covering purpose, usage, and behavioral context. However, it does not detail the output format beyond 'Markdown' (e.g., structure or limitations), which could be improved for completeness, though it's not critical since the sibling tool implies HTML as the alternative.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with the single parameter 'url' well-documented in the schema. The description does not add any additional semantic information about the parameter beyond what the schema provides, such as URL format constraints or examples. Baseline score of 3 is appropriate as the schema handles the parameter documentation adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('scrape a website using the ScrAPI service') and resource ('URL'), and distinguishes it from its sibling tool by specifying the output format ('retrieve the result as Markdown'). It explicitly contrasts with the sibling tool scrape_url_html by emphasizing the Markdown output for text content rather than structural information.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool ('for scraping website content that is difficult to access because of bot detection, captchas or even geolocation restrictions') and when it is preferable ('if the text content of the webpage is important and not the structural information of the page'), effectively differentiating it from the HTML-focused sibling tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP directory API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/DevEnterpriseSoftware/scrapi-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.