Glama
DevEnterpriseSoftware

ScrAPI MCP Server

Scrape URL and respond with HTML

scrape_url_html

Extract a website's HTML content by scraping URLs that are blocked by bot detection, captchas, or geolocation restrictions, for advanced parsing needs.

Instructions

Use a URL to scrape a website using the ScrAPI service and retrieve the result as HTML. Use this for scraping website content that is difficult to access because of bot detection, captchas, or even geolocation restrictions. The result will be in HTML, which is preferable if advanced parsing is required.

Input Schema

| Name | Required | Description       | Default |
| ---- | -------- | ----------------- | ------- |
| url  | Yes      | The URL to scrape |         |
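For example, a minimal tool-call input satisfying this schema (the URL is illustrative):

```json
{
  "url": "https://example.com/page-to-scrape"
}
```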

Implementation Reference

  • index.ts:89-167 (handler)
    Core handler function that performs the actual scraping by calling the ScrAPI service with the provided URL and format (HTML for this tool), handles API key config, errors, and returns the HTML content as text with mime type.
    async function scrapeUrl(
      url: string,
      format: "HTML" | "Markdown"
    ): Promise<CallToolResult> {
      const body = {
        url: url,
        useBrowser: true,
        solveCaptchas: true,
        acceptDialogs: true,
        proxyType: "Residential",
        responseFormat: format,
      };
    
      try {
        const response = await fetch("https://api.scrapi.tech/v1/scrape", {
          method: "POST",
          headers: {
            "User-Agent": `${SCRAPI_SERVER_NAME} - ${SCRAPI_SERVER_VERSION}`,
            "Content-Type": "application/json",
            "X-API-KEY": config.scrapiApiKey || SCRAPI_API_KEY,
          },
          body: JSON.stringify(body),
          signal: AbortSignal.timeout(30000),
        });
    
        const data = await response.text();
    
        if (response.ok) {
          return {
            content: [
              {
                type: "text" as const,
                text: data,
                _meta: {
                  mimeType: `text/${format.toLowerCase()}`,
                },
              },
            ],
          };
        }
    
        return {
          content: [
            {
              type: "text" as const,
              text: data,
            },
          ],
          isError: true,
        };
      } catch (error) {
        console.error("Error calling API:", error);
      }
    
      // Fallback: if the first attempt threw (e.g. network error or timeout),
      // retry the request once using the default API key. Note this second call
      // is outside the try block, so a second failure propagates to the caller.
      const response = await fetch("https://api.scrapi.tech/v1/scrape", {
        method: "POST",
        headers: {
          "User-Agent": `${SCRAPI_SERVER_NAME} - ${SCRAPI_SERVER_VERSION}`,
          "Content-Type": "application/json",
          "X-API-KEY": SCRAPI_API_KEY,
        },
        body: JSON.stringify(body),
        signal: AbortSignal.timeout(30000),
      });
    
      const data = await response.text();
    
      return {
        content: [
          {
            type: "text",
            text: data,
            _meta: {
              mimeType: `text/${format.toLowerCase()}`,
            },
          },
        ],
      };
    }
  • Input schema defining the 'url' parameter as a valid URL string.
    inputSchema: {
      url: z
        .string()
        .url({ message: "Invalid URL" })
        .describe("The URL to scrape"),
    },
  • index.ts:53-69 (registration)
    Tool registration call that defines the name, metadata, input schema, and delegates to the scrapeUrl handler with HTML format.
    server.registerTool(
      "scrape_url_html",
      {
        title: "Scrape URL and respond with HTML",
        description:
          "Use a URL to scrape a website using the ScrAPI service and retrieve the result as HTML. " +
          "Use this for scraping website content that is difficult to access because of bot detection, captchas or even geolocation restrictions. " +
          "The result will be in HTML which is preferable if advanced parsing is required.",
        inputSchema: {
          url: z
            .string()
            .url({ message: "Invalid URL" })
            .describe("The URL to scrape"),
        },
      },
      async ({ url }) => await scrapeUrl(url, "HTML")
    );
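As a hedged illustration of what the handler above sends, the request payload can be factored into a small helper. This is a sketch only: `buildScrapeBody` is a hypothetical name, not a function in the actual index.ts; the field values mirror the quoted handler.

```typescript
// Sketch: builds the JSON body the handler POSTs to https://api.scrapi.tech/v1/scrape.
// Field values mirror the handler above; the helper name itself is hypothetical.
function buildScrapeBody(url: string, format: "HTML" | "Markdown") {
  return {
    url,
    useBrowser: true,        // render the page in a real browser
    solveCaptchas: true,     // ask ScrAPI to solve captchas automatically
    acceptDialogs: true,     // dismiss cookie/consent dialogs
    proxyType: "Residential",
    responseFormat: format,  // "HTML" for this tool, "Markdown" for its sibling
  };
}

// Example: the body scrape_url_html would send.
const body = buildScrapeBody("https://example.com", "HTML");
console.log(JSON.stringify(body));
```

The sibling tool `scrape_url_markdown` would pass `"Markdown"` as the second argument; everything else in the payload stays the same.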
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions the service used ('ScrAPI service') and the problem domain (bypassing restrictions), but doesn't cover important behavioral aspects like rate limits, authentication requirements, error handling, or response time expectations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences that each serve distinct purposes: the first states the core functionality, the second provides usage context. While concise, the second sentence could be slightly more streamlined.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with no annotations and no output schema, the description provides adequate context about what the tool does and when to use it, but lacks details about the HTML output structure, error conditions, or service limitations that would be helpful for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%: the single 'url' parameter is well documented in the schema. The description doesn't add any meaningful parameter semantics beyond what the schema already provides, so it meets the baseline expected for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('scrape a website'), resource ('URL'), and output format ('retrieve the result as HTML'). It specifically distinguishes this tool from its sibling 'scrape_url_markdown' by emphasizing HTML output for advanced parsing needs.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool ('for scraping website content that is difficult to access because of bot detection, captchas or even geolocation restrictions') and implicitly suggests an alternative (the sibling tool for markdown output when HTML parsing isn't required).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/DevEnterpriseSoftware/scrapi-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.