Skip to main content
Glama

x402_scrape_url

Extract structured data from web pages including markdown content, links, tables, images, and metadata. Supports JavaScript-rendered pages with optional CSS selector waiting for dynamic content.

Instructions

Scrape a web page and return structured JSON with markdown content, links, tables, images, and metadata. Price: $0.02 USDC per scrape (paid mode) | Free test: returns fixture data.

Supports JS-rendered pages via Playwright. Optional wait_for CSS selector for async SPA content. Hard timeout: 8 seconds total (page load + selector wait combined). Without X402_PRIVATE_KEY, only the free test endpoint is available.

Returns: markdown text, extracted links, tables, images, page metadata, and success/failure status.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
urlYesURL to scrape (http/https, max 2048 chars)
wait_forNoCSS selector to wait for before extracting (for SPAs, e.g. '.article-body')

Implementation Reference

  • Implementation of the x402_scrape_url tool which calls the scraping API with either a paid or test endpoint.
    // ─── Tool: x402_scrape_url ──────────────────────────────────────────────────
    
    server.tool(
      "x402_scrape_url",
      `Scrape a web page and return structured JSON with markdown content, links, tables, images, and metadata.
    Price: $0.02 USDC per scrape (paid mode) | Free test: returns fixture data.
    
    Supports JS-rendered pages via Playwright. Optional wait_for CSS selector for async SPA content.
    Hard timeout: 8 seconds total (page load + selector wait combined).
    Without X402_PRIVATE_KEY, only the free test endpoint is available.
    
    Returns: markdown text, extracted links, tables, images, page metadata, and success/failure status.`,
      {
        url: z.string().url().describe("URL to scrape (http/https, max 2048 chars)"),
        wait_for: z.string().max(500).optional()
          .describe("CSS selector to wait for before extracting (for SPAs, e.g. '.article-body')"),
      },
      async (params) => {
        const base = APIS.scraping.baseUrl;
        try {
          const usePaid = !!PRIVATE_KEY;
          if (usePaid) {
            const payload: Record<string, unknown> = { url: params.url };
            if (params.wait_for) payload.wait_for = params.wait_for;
            const data = await apiPost(base, "/scrape", payload, true);
            return textResult({ mode: "paid", cost: "$0.02", ...data });
          } else {
            const data = await apiGet(base, "/scrape/test");
            return textResult({
              mode: "free_test",
              note: "Free test — returns fixture data. Set X402_PRIVATE_KEY for live scraping.",
              ...data,
            });
          }
        } catch (err: any) {
          return errorResult(err.message);
        }
      }
    );

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jameswilliamwisdom/x402-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server