viewpo_screenshot

Capture screenshots of webpages at multiple viewport widths to verify responsive design across different screen sizes.

Instructions

Capture screenshots of a URL at one or more viewport widths. Returns base64 JPEG images. Use this to SEE what a webpage looks like at different screen sizes.
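A caller that receives this tool's result may want to pull the images back out of the content array. The sketch below assumes the content shape produced by the handler in the Implementation Reference (a text label followed by an image block, once per viewport). The helper names `decodeBase64`, `looksLikeJpeg`, and `extractScreenshots`, and the sample data, are illustrative and not part of the server.

```typescript
// Content blocks as the handler in the Implementation Reference builds them:
// a text label followed by a base64 JPEG image, once per viewport.
type TextBlock = { type: "text"; text: string };
type ImageBlock = { type: "image"; data: string; mimeType: "image/jpeg" };
type ContentBlock = TextBlock | ImageBlock;

interface Screenshot {
  label: string;
  bytes: Uint8Array;
}

// Decode a base64 string into raw bytes (atob is global in browsers and Node 16+).
function decodeBase64(data: string): Uint8Array {
  const binary = atob(data);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// A JPEG stream starts with the SOI marker, bytes 0xFF 0xD8.
function looksLikeJpeg(bytes: Uint8Array): boolean {
  return bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xd8;
}

// Pair each image block with the most recent text label before it.
function extractScreenshots(content: ContentBlock[]): Screenshot[] {
  const shots: Screenshot[] = [];
  let label = "";
  for (const block of content) {
    if (block.type === "text") {
      label = block.text;
    } else {
      shots.push({ label, bytes: decodeBase64(block.data) });
    }
  }
  return shots;
}

// Hypothetical sample: "/9j/4A==" is the base64 form of the bytes FF D8 FF E0,
// standing in for a full image.
const sampleContent: ContentBlock[] = [
  { type: "text", text: "**phone** (375\u00d7812)" },
  { type: "image", data: "/9j/4A==", mimeType: "image/jpeg" },
];

const shots = extractScreenshots(sampleContent);
console.log(shots[0].label, looksLikeJpeg(shots[0].bytes));
```

Checking the first two bytes is only a quick sanity test; it does not validate the rest of the image.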

Input Schema

Name | Required | Description | Default
url | Yes | The URL to screenshot |
viewports | No | Viewports to capture. Defaults to desktop (1920px) if omitted. Common widths: 375 (phone), 820 (tablet), 1920 (desktop). |
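As an illustration of what valid arguments look like, the sketch below builds one argument object and checks it by hand. The 100 to 3840 integer range for `width` comes from the zod schema in the Implementation Reference; `checkArgs` is a hypothetical helper, and its use of `new URL` only approximates the server's `z.string().url()` check.

```typescript
// Shape of the tool arguments, following the input schema.
interface ViewportSpec {
  width: number;
  name?: string;
}

interface ScreenshotArgs {
  url: string;
  viewports?: ViewportSpec[];
}

// Hypothetical helper: returns a list of problems; an empty list means the
// arguments look valid under the constraints described above.
function checkArgs(args: ScreenshotArgs): string[] {
  const problems: string[] = [];

  // Approximation of the schema's URL check: new URL throws on unparsable input.
  try {
    new URL(args.url);
  } catch {
    problems.push(`url is not a valid URL: ${args.url}`);
  }

  // viewports is optional; when present, each width must be an integer in range.
  (args.viewports ?? []).forEach((vp, i) => {
    if (!Number.isInteger(vp.width) || vp.width < 100 || vp.width > 3840) {
      problems.push(`viewports[${i}].width must be an integer from 100 to 3840`);
    }
  });

  return problems;
}

// The three common widths named in the schema description.
const exampleArgs: ScreenshotArgs = {
  url: "https://example.com",
  viewports: [
    { width: 375, name: "phone" },
    { width: 820, name: "tablet" },
    { width: 1920, name: "desktop" },
  ],
};

console.log(checkArgs(exampleArgs));
```

Omitting `viewports` entirely is also valid; per the schema description, the capture then defaults to a 1920px desktop viewport.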

Implementation Reference

  • Tool registration and handler implementation for viewpo_screenshot. Defines the tool schema using zod, handles the request by calling client.screenshot(), processes the response into text and image content, and returns formatted results or error messages.
    // --- Tool: viewpo_screenshot ---
    
    server.tool(
      "viewpo_screenshot",
      "Capture screenshots of a URL at one or more viewport widths. Returns base64 JPEG images. Use this to SEE what a webpage looks like at different screen sizes.",
      {
        url: z.string().url().describe("The URL to screenshot"),
        viewports: z
          .array(
            z.object({
              width: z
                .number()
                .int()
                .min(100)
                .max(3840)
                .describe("Viewport width in CSS pixels"),
              name: z
                .string()
                .optional()
                .describe('Label for this viewport (e.g. "phone", "desktop")'),
            })
          )
          .optional()
          .describe(
            "Viewports to capture. Defaults to desktop (1920px) if omitted. Common widths: 375 (phone), 820 (tablet), 1920 (desktop)."
          ),
      },
      async ({ url, viewports }) => {
        try {
          const result = await client.screenshot({ url, viewports });
    
          const content = result.viewports.flatMap((vp) => [
            {
              type: "text" as const,
              text: `**${vp.viewport}** (${vp.width}\u00d7${vp.height})`,
            },
            {
              type: "image" as const,
              data: vp.image_base64,
              mimeType: "image/jpeg" as const,
            },
          ]);
    
          return { content };
        } catch (error) {
          return {
            content: [
              {
                type: "text" as const,
                text: `Screenshot failed: ${error instanceof Error ? error.message : String(error)}`,
              },
            ],
            isError: true,
          };
        }
      }
    );
  • Type definitions for the screenshot tool. Includes ViewportSpec (width and optional name), ScreenshotRequest (url and optional viewports array), ViewportResult (viewport dimensions and base64 image), and ScreenshotResponse (array of viewport results).
    // --- Screenshot ---
    
    export interface ViewportSpec {
      width: number;
      name?: string;
    }
    
    export interface ScreenshotRequest {
      url: string;
      viewports?: ViewportSpec[];
    }
    
    export interface ViewportResult {
      viewport: string;
      width: number;
      height: number;
      image_base64: string;
    }
    
    export interface ScreenshotResponse {
      viewports: ViewportResult[];
    }
  • HTTP client method that sends a POST request to the /screenshot endpoint. Takes a ScreenshotRequest and returns a ScreenshotResponse, handling communication with the Viewpo macOS app's bridge API.
    async screenshot(request: ScreenshotRequest): Promise<ScreenshotResponse> {
      return this.post<ScreenshotResponse>("/screenshot", request);
    }
  • Generic POST method implementation that handles HTTP requests with JSON body, including authentication headers and response processing via handleResponse method.
    private async post<T>(path: string, body: unknown): Promise<T> {
      const response = await fetch(`${this.baseURL}${path}`, {
        method: "POST",
        headers: this.headers(true),
        body: JSON.stringify(body),
      });
      return this.handleResponse<T>(response);
    }
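The snippets above call `this.headers(true)` and `this.handleResponse`, neither of which is shown. The sketch below is one way the pieces might fit together, not the server's actual client: the bearer-token header, the error handling in `handleResponse`, the class name `BridgeClientSketch`, the base URL, and the injected `fetchImpl` (used so the example runs without the Viewpo app) are all assumptions.

```typescript
interface ViewportSpec { width: number; name?: string }
interface ScreenshotRequest { url: string; viewports?: ViewportSpec[] }
interface ViewportResult { viewport: string; width: number; height: number; image_base64: string }
interface ScreenshotResponse { viewports: ViewportResult[] }

// Minimal slices of the fetch API, so a stub can stand in for the network.
interface ResponseLike { ok: boolean; status: number; text(): Promise<string> }
interface RequestInitLike { method: string; headers: Record<string, string>; body: string }
type FetchLike = (url: string, init: RequestInitLike) => Promise<ResponseLike>;

class BridgeClientSketch {
  private baseURL: string;
  private token: string; // hypothetical: the source does not show the auth scheme
  private fetchImpl: FetchLike;

  constructor(baseURL: string, token: string, fetchImpl: FetchLike) {
    this.baseURL = baseURL;
    this.token = token;
    this.fetchImpl = fetchImpl;
  }

  // Assumed header layout; the real headers() is not shown in the source.
  private headers(json: boolean): Record<string, string> {
    const h: Record<string, string> = { Authorization: `Bearer ${this.token}` };
    if (json) h["Content-Type"] = "application/json";
    return h;
  }

  // Assumed behavior: throw on a non-2xx status, otherwise parse the JSON body.
  private async handleResponse<T>(response: ResponseLike): Promise<T> {
    const text = await response.text();
    if (!response.ok) {
      throw new Error(`Bridge returned ${response.status}: ${text}`);
    }
    return JSON.parse(text) as T;
  }

  // Same structure as the post<T> method shown above, with fetch injected.
  private async post<T>(path: string, body: unknown): Promise<T> {
    const response = await this.fetchImpl(`${this.baseURL}${path}`, {
      method: "POST",
      headers: this.headers(true),
      body: JSON.stringify(body),
    });
    return this.handleResponse<T>(response);
  }

  async screenshot(request: ScreenshotRequest): Promise<ScreenshotResponse> {
    return this.post<ScreenshotResponse>("/screenshot", request);
  }
}

// Stub fetch that records each call and returns one canned viewport result.
const calls: { url: string; init: RequestInitLike }[] = [];
const stubFetch: FetchLike = async (url, init) => {
  calls.push({ url, init });
  const body: ScreenshotResponse = {
    viewports: [{ viewport: "phone", width: 375, height: 812, image_base64: "/9j/4A==" }],
  };
  return { ok: true, status: 200, text: async () => JSON.stringify(body) };
};

// Hypothetical base URL and token, for illustration only.
const client = new BridgeClientSketch("http://127.0.0.1:9999", "example-token", stubFetch);
client
  .screenshot({ url: "https://example.com", viewports: [{ width: 375, name: "phone" }] })
  .then((res) => console.log(res.viewports[0].viewport, calls[0].url));
```

Because `handleResponse` throws in this sketch, a failed request would surface in the tool handler's catch block as a "Screenshot failed: ..." text result with `isError: true`.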
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes the core behavior (capturing screenshots and returning base64 JPEGs) but lacks details about potential side effects (e.g., network requests, rate limits, authentication needs, or error handling). The description doesn't contradict annotations, but it's incomplete for a tool with no annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and front-loaded, with two sentences that efficiently convey the tool's purpose and usage. Every word earns its place, and there's no redundant or verbose language, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (capturing screenshots with configurable viewports), no annotations, and no output schema, the description is somewhat incomplete. It covers the basic purpose and output format but lacks details on behavioral aspects like error cases, performance implications, or how the output is structured. This is adequate but leaves clear gaps for an agent to fully understand the tool's behavior.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 100%, so the schema already fully documents the parameters (url and viewports). The description adds minimal value beyond the schema by implying the tool's purpose relates to viewport widths, but it doesn't provide additional semantic context or usage examples for the parameters. This meets the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Capture screenshots'), the resource ('a URL at one or more viewport widths'), and the output format ('base64 JPEG images'). It distinguishes from siblings by focusing on visual capture rather than comparison or layout analysis, making the purpose immediately understandable and distinct.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to SEE what a webpage looks like at different screen sizes'), which implicitly suggests it's for visual inspection. However, it doesn't explicitly state when NOT to use it or name alternatives like the sibling tools, leaving some guidance gaps.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/littlebearapps/viewpo-mcp'