Glama

screenshot

Capture a base64 PNG screenshot of a browser tab. Use for visual verification of CSS, layout, or proof. Prefer snapshot for most tasks due to higher token efficiency.

Instructions

Take visual screenshot in base64 PNG. Use ONLY for visual verification (CSS, layout, proof). Prefer snapshot for most tasks — much more token-efficient.
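
One reason the description steers agents toward snapshot is payload size: base64 encodes every 3 bytes as 4 characters, so the returned image string is about a third larger than the PNG itself. The sketch below only illustrates that arithmetic. `base64Length` is a hypothetical helper, not part of the server, and it counts characters rather than model tokens.

```typescript
// Hypothetical helper: length of the padded base64 string for `byteCount` raw bytes.
// Standard base64 maps each 3-byte group (the last one padded) to 4 characters.
function base64Length(byteCount: number): number {
  if (!isFinite(byteCount) || byteCount < 0 || Math.floor(byteCount) !== byteCount) {
    throw new RangeError("byteCount must be a non-negative integer");
  }
  return 4 * Math.ceil(byteCount / 3);
}

// Example: a 300 kB PNG screenshot becomes a 400,000-character string.
const pngBytes = 300000;
const encodedChars = base64Length(pngBytes);
```

How many tokens that string costs depends on the model's tokenizer, so treat the character count as a lower-level proxy only.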

Input Schema

Name     Required   Description              Default
tabId    Yes        Tab ID from create_tab   -
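
To make the schema concrete, here is a sketch of the JSON-RPC `tools/call` request an MCP client would send for this tool. The `buildScreenshotCall` helper, the request `id`, and the sample tab ID `"tab-123"` are hypothetical; only the tool name, the `tabId` argument, and its non-empty constraint come from this page.

```typescript
// Shape of an MCP `tools/call` request, reduced to the fields used here.
interface ScreenshotCall {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: "screenshot"; arguments: { tabId: string } };
}

// Hypothetical helper: mirrors the schema's non-empty string check by hand.
function buildScreenshotCall(id: number, tabId: string): ScreenshotCall {
  if (typeof tabId !== "string" || tabId.length < 1) {
    throw new Error("tabId must be a non-empty string");
  }
  return {
    jsonrpc: "2.0",
    id: id,
    method: "tools/call",
    params: { name: "screenshot", arguments: { tabId: tabId } },
  };
}

// "tab-123" stands in for an ID previously returned by create_tab.
const screenshotCall = buildScreenshotCall(1, "tab-123");
const wireMessage = JSON.stringify(screenshotCall);
```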

Implementation Reference

  • Tool handler for 'screenshot' — calls client.screenshot() and returns base64 PNG image result.
    server.tool(
      "screenshot",
      "Take visual screenshot in base64 PNG. Use ONLY for visual verification (CSS, layout, proof). Prefer snapshot for most tasks — much more token-efficient.",
      {
        tabId: z.string().min(1).describe("Tab ID from create_tab")
      },
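      async (input: unknown) => {
        try {
          // Validate the raw input against the same schema declared above.
          const parsed = z.object({ tabId: z.string().min(1).describe("Tab ID from create_tab") }).parse(input);
          // Look up the tracked tab; its userId is needed for the API request.
          const tracked = getTrackedTab(parsed.tabId);
          const screenshotBuffer = await deps.client.screenshot(parsed.tabId, tracked.userId);
          incrementToolCall(parsed.tabId);
          // Return the PNG bytes as a base64 image result.
          return imageResult(screenshotBuffer.toString("base64"));
        } catch (error) {
          // Convert any thrown error into a tool result instead of rethrowing.
          return toErrorResult(error);
        }
      }
    );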
  • Input schema for screenshot tool — requires tabId string.
    {
      tabId: z.string().min(1).describe("Tab ID from create_tab")
    }
  • Registration of the 'screenshot' tool via server.tool()
    server.tool(
      "screenshot",
      "Take visual screenshot in base64 PNG. Use ONLY for visual verification (CSS, layout, proof). Prefer snapshot for most tasks — much more token-efficient.",
      {
        tabId: z.string().min(1).describe("Tab ID from create_tab")
      },
      async (input: unknown) => {
        try {
          const parsed = z.object({ tabId: z.string().min(1).describe("Tab ID from create_tab") }).parse(input);
          const tracked = getTrackedTab(parsed.tabId);
          const screenshotBuffer = await deps.client.screenshot(parsed.tabId, tracked.userId);
          incrementToolCall(parsed.tabId);
          return imageResult(screenshotBuffer.toString("base64"));
        } catch (error) {
          return toErrorResult(error);
        }
      }
    );
  • Client method that fetches screenshot binary from the API.
    async screenshot(tabId: string, userId: string): Promise<Buffer> {
      const binary = await this.requestBinary(
        `/tabs/${encodeURIComponent(tabId)}/screenshot?userId=${encodeURIComponent(userId)}`,
        {
          method: "GET"
        }
      );
      return Buffer.from(binary);
    }
  • Helper function to wrap base64 PNG into a ToolResult with image content type.
    export function imageResult(base64Png: string): ToolResult {
      return {
        content: [{ type: "image", data: base64Png, mimeType: "image/png" }]
      };
    }
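
The last two reference items can be exercised without a running server. The sketch below rebuilds the request path the same way the client method does and wraps a base64 string the way `imageResult` does, then adds a cheap sanity check on the payload. `screenshotPath` and `looksLikePng` are hypothetical helpers, and the `ToolResult` type is a reduced stand-in for the server's real type, which is not shown on this page.

```typescript
// Reduced stand-in for the server's ToolResult type (assumption: image content only).
interface ImageContent { type: "image"; data: string; mimeType: string; }
interface ToolResult { content: ImageContent[]; }

// Hypothetical helper: same path construction as the client method.
// encodeURIComponent keeps characters such as "/" or "?" in an ID from altering the URL.
function screenshotPath(tabId: string, userId: string): string {
  return "/tabs/" + encodeURIComponent(tabId) + "/screenshot?userId=" + encodeURIComponent(userId);
}

// Same wrapping as the imageResult helper.
function imageResult(base64Png: string): ToolResult {
  return { content: [{ type: "image", data: base64Png, mimeType: "image/png" }] };
}

// Hypothetical check: the 8-byte PNG signature (89 50 4E 47 0D 0A 1A 0A)
// fixes the first 10 base64 characters of any PNG as "iVBORw0KGg".
function looksLikePng(result: ToolResult): boolean {
  const first = result.content[0];
  return first !== undefined
    && first.mimeType === "image/png"
    && first.data.slice(0, 10) === "iVBORw0KGg";
}

// Sample IDs are made up; the space and slash show the effect of encoding.
const examplePath = screenshotPath("tab 1", "user/42");
const pngDetected = looksLikePng(imageResult("iVBORw0KGgoAAAANSUhEUg"));
```

A prefix check like this only confirms the signature, not that the rest of the image is valid.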
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The tool has no annotations, and its description mentions only the output format (base64 PNG). It does not disclose side effects, required permissions, or limitations. Adequate but minimal.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
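
As a rough illustration of what the missing annotations could look like, the sketch below fills in two of the hint fields that the MCP specification defines for tool annotations. The values are assumptions inferred from the handler code (a GET request for the image), not statements from the server's author.

```typescript
// Hint fields from the MCP tool annotations; all are optional and advisory.
interface ToolAnnotations {
  title?: string;
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
  openWorldHint?: boolean;
}

// Assumed values, for illustration only.
const screenshotAnnotations: ToolAnnotations = {
  title: "Screenshot",
  readOnlyHint: true,  // assumption: capturing an image does not modify the page
  openWorldHint: true, // assumption: the tab may display arbitrary web content
};
```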

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences, front-loaded with purpose, no redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one parameter and no output schema, the description covers purpose and usage guidance adequately, though it lacks details on error conditions and prerequisites.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers tabId with description; tool description adds no extra meaning. Baseline 3 due to high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

States 'Take visual screenshot in base64 PNG' clearly. Distinguishes from sibling 'snapshot' by saying to prefer snapshot for most tasks.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use ONLY for visual verification (CSS, layout, proof)' and 'Prefer snapshot for most tasks — much more token-efficient', which gives clear guidance on when to use the tool and when not to.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/redf0x1/camofox-mcp'
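
The same request can be issued from TypeScript. This sketch assumes a runtime with a global `fetch` (for example Node 18 or later). `serverUrl` and `fetchServer` are hypothetical names, the parameter names `owner` and `server` are guesses at what the two path segments mean, and the response is left untyped because its schema is not described here.

```typescript
const API_BASE = "https://glama.ai/api/mcp/v1/servers";

// Hypothetical helper: builds the directory URL for one server entry.
function serverUrl(owner: string, server: string): string {
  return API_BASE + "/" + encodeURIComponent(owner) + "/" + encodeURIComponent(server);
}

// Hypothetical helper: GETs the entry and returns the parsed JSON body as-is.
function fetchServer(owner: string, server: string): Promise<unknown> {
  return fetch(serverUrl(owner, server)).then(function (response) {
    if (!response.ok) {
      throw new Error("Request failed with HTTP status " + response.status);
    }
    return response.json();
  });
}

// Same URL as the curl command above.
const camofoxUrl = serverUrl("redf0x1", "camofox-mcp");
```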

If you have feedback or need assistance with the MCP directory API, please join our Discord server.