Skip to main content
Glama

x402_screenshot

Capture screenshots of web pages as base64-encoded images. Use free test domains or pay per capture with USDC to screenshot any URL.

Instructions

Capture a screenshot of any URL and return it as a base64-encoded image. Price: $0.01 USDC per capture (paid mode) | Free test: example.com, example.org, httpbin.org only.

Without X402_PRIVATE_KEY, only test domains are available. With a wallet key, any URL can be captured via the paid endpoint.

Returns: base64 PNG/JPEG/WebP image data.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
urlYesURL to capture (full URL including https://)
widthNoViewport width in pixels (default: 1280)
heightNoViewport height in pixels (default: 720)
full_pageNoCapture the full scrollable page (default: false)
formatNoImage format (default: png)png

Implementation Reference

  • The handler for 'x402_screenshot' which implements the MCP tool registration, schema definition, and the logic to call the screenshot API (with optional X402 payment handling).
    // ─── Tool: x402_screenshot ──────────────────────────────────────────────────
    
    server.tool(
      "x402_screenshot",
      `Capture a screenshot of any URL and return it as a base64-encoded image.
    Price: $0.01 USDC per capture (paid mode) | Free test: example.com, example.org, httpbin.org only.
    
    Without X402_PRIVATE_KEY, only test domains are available.
    With a wallet key, any URL can be captured via the paid endpoint.
    
    Returns: base64 PNG/JPEG/WebP image data.`,
      {
        url: z.string().url().describe("URL to capture (full URL including https://)"),
        width: z
          .number()
          .int()
          .min(320)
          .max(3840)
          .default(1280)
          .describe("Viewport width in pixels (default: 1280)"),
        height: z
          .number()
          .int()
          .min(240)
          .max(2160)
          .default(720)
          .describe("Viewport height in pixels (default: 720)"),
        full_page: z
          .boolean()
          .default(false)
          .describe("Capture the full scrollable page (default: false)"),
        format: z
          .enum(["png", "jpeg", "webp"])
          .default("png")
          .describe("Image format (default: png)"),
      },
      async (params) => {
        const base = APIS.screenshot.baseUrl;
    
        try {
          const usePaid = !!PRIVATE_KEY;
          const query = new URLSearchParams({
            url: params.url,
            width: String(params.width),
            height: String(params.height),
            full_page: String(params.full_page),
            format: params.format,
          });
    
          if (usePaid) {
            // x402 paid mode — GET /screenshot returns 402, x402-fetch handles payment
            const data = await apiGet(base, `/screenshot?${query}`, true);
            return textResult({
              mode: "paid",
              cost: "$0.01",
              ...data,
            });
          } else {
            // Free test mode — limited to example.com, example.org, httpbin.org
            const testQuery = new URLSearchParams({
              url: params.url,
              width: String(params.width),
              height: String(params.height),
            });
            const data = await apiGet(base, `/test/screenshot?${testQuery}`);
            return textResult({
              mode: "free_test",
              note: "Free test — limited to example.com, example.org, httpbin.org. Set X402_PRIVATE_KEY for any URL.",
              ...data,
            });
          }
        } catch (err: any) {
          return errorResult(err.message);
        }
      }
    );
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully reveals critical operational traits: pricing model ($0.01 USDC), authentication requirements (private key dependency), access restrictions (limited test domains without key), and output specifications (base64 image data). It omits rate limits or timeout behaviors, but covers the essential business logic.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured with four high-density sentences: purpose statement, pricing tier, authentication limitation, and authenticated capability with return value. Every sentence provides essential information (especially the pricing and auth details for a paid API) with no redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity as a paid authentication-required service, the description is complete. It explains what the tool does, how much it costs, how to authenticate, domain restrictions, and what it returns. With rich schema coverage (100%) and no output schema, the description adequately covers the return value ('base64 PNG/JPEG/WebP image data').

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents all five parameters (url, width, height, full_page, format) including defaults and constraints. The description adds value by mentioning the output encoding (base64) and explicitly listing the format options (PNG/JPEG/WebP), but doesn't need to compensate for missing schema documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a specific action ('Capture a screenshot') and clearly defines the resource ('any URL') and output format ('base64-encoded image'). It effectively distinguishes from siblings like x402_scrape_url and x402_crawl_site by emphasizing visual capture rather than text extraction or multi-page crawling.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear operational context regarding authentication prerequisites ('Without X402_PRIVATE_KEY, only test domains are available') and cost constraints ('$0.01 USDC per capture'). While it doesn't explicitly name sibling alternatives, it establishes clear boundaries for when the tool is usable (test mode vs. paid mode).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jameswilliamwisdom/x402-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server