vnc_screenshot

Capture a screenshot of the remote screen via VNC. Optionally add a delay up to 300 seconds to wait for processes to complete before capturing.

Instructions

Take a screenshot of the current screen

Input Schema

Name | Required | Description | Default
delay | No | Delay in milliseconds before taking screenshot (useful for waiting for processes to complete) | 0
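
The bounds on delay can be checked on the caller's side before the tool is invoked. The following is a minimal sketch, assuming the range given in the schema registration further down (0 to 300000 ms, default 0); normalizeDelay and MAX_DELAY_MS are hypothetical names used for illustration and are not part of the server.

```typescript
// Hypothetical client-side helper (not part of the server): mirrors the
// schema bounds for the optional `delay` argument.
const MAX_DELAY_MS = 300_000; // 5 minutes, the schema's `maximum`

function normalizeDelay(delay?: number): number {
  if (delay === undefined) {
    return 0; // schema default
  }
  if (!Number.isFinite(delay) || delay < 0 || delay > MAX_DELAY_MS) {
    throw new RangeError(
      `delay must be between 0 and ${MAX_DELAY_MS} ms, got ${delay}`
    );
  }
  return delay;
}

// Example tool arguments: wait 2 seconds before capturing.
const toolArgs = { delay: normalizeDelay(2000) };
console.log(JSON.stringify(toolArgs)); // {"delay":2000}
```

Note that the unit is milliseconds, so the 300-second limit mentioned in the summary corresponds to a value of 300000.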

Implementation Reference

  • Main handler function for the 'vnc_screenshot' tool. Accepts an optional delay parameter, connects via VNC, requests a full framebuffer update (falling back to the cached framebuffer if no update arrives within 2 seconds), converts the pixel format to RGBA when needed, and calls captureScreenshotWithDimensions to produce a base64-encoded JPEG image.
    export async function handleScreenshot(
      vncManager: VncConnectionManager,
      args: { delay?: number } = {}
    ) {
      const delay = args.delay || 0;
      if (delay > 0) {
        if (delay > 300000) { // Max 5 minutes
          throw new Error('Delay cannot exceed 300000ms (5 minutes)');
        }
        console.error(`Waiting ${delay}ms before taking screenshot...`);
        await new Promise(resolve => setTimeout(resolve, delay));
      }
    
      return vncManager.executeWithConnection(async (client) => {
        const width = client.clientWidth || 0;
        const height = client.clientHeight || 0;
        
        if (!width || !height) {
          throw new Error(`Invalid screen dimensions: ${width}x${height}`);
        }
        
        // Try to get a fresh framebuffer, but fall back to existing one if event doesn't fire
        let framebuffer: Buffer | null = null;
        
        try {
          // Request full frame update first
          client.requestFrameUpdate(true, 0, 0, width, height);
          
          // Wait for frame update event with shorter timeout
          framebuffer = await new Promise<Buffer>((resolve, reject) => {
            let timeoutId: NodeJS.Timeout | null = null;
    
            const frameUpdateHandler = (fb: Buffer) => {
              if (timeoutId) {
                clearTimeout(timeoutId);
              }
              resolve(fb);
            };
    
            client.once('frameUpdated', frameUpdateHandler);
    
            timeoutId = setTimeout(() => {
              client.removeListener('frameUpdated', frameUpdateHandler);
              reject(new Error('Frame update timeout'));
            }, 2000); // Shorter timeout
          });
        } catch (error) {
          console.warn('Frame update failed, using existing framebuffer:', error);
          // Fall back to existing framebuffer
          framebuffer = client.fb;
        }
        
        if (!framebuffer) {
          throw new Error('No framebuffer available');
        }
    
        // Log pixel format for debugging
        const pixelFormat = client.pixelFormat;
        console.error(`VNC Pixel Format: bpp=${pixelFormat.bitsPerPixel}, depth=${pixelFormat.depth}, trueColor=${pixelFormat.trueColorFlag}, bigEndian=${pixelFormat.bigEndianFlag}`);
        console.error(`Color shifts: R=${pixelFormat.redShift}, G=${pixelFormat.greenShift}, B=${pixelFormat.blueShift}`);
        console.error(`Color max: R=${pixelFormat.redMax}, G=${pixelFormat.greenMax}, B=${pixelFormat.blueMax}`);
    
        // Handle different pixel formats if VNC client didn't convert properly
        const actualBytesPerPixel = framebuffer.length / (width * height);
        console.error(`Framebuffer analysis: ${framebuffer.length} bytes for ${width}x${height} = ${actualBytesPerPixel} bytes/pixel`);
        
        if (actualBytesPerPixel !== 4) {
          console.error(`Converting from ${actualBytesPerPixel * 8}-bit format to RGBA...`);
          framebuffer = convertToRGBA(framebuffer, width, height, pixelFormat);
        }
    
        // Validate final framebuffer size
        const expectedBufferSize = width * height * 4; // RGBA = 4 bytes per pixel
        if (framebuffer.length !== expectedBufferSize) {
          console.error(`CRITICAL: Framebuffer size mismatch after conversion. Expected: ${expectedBufferSize} for ${width}x${height}, Got: ${framebuffer.length}`);
          throw new Error(`Framebuffer size mismatch: expected ${expectedBufferSize}, got ${framebuffer.length}`);
        }
    
        return captureScreenshotWithDimensions(width, height, framebuffer, delay);
      });
    }
  • Helper function that encodes the raw RGBA framebuffer as a JPEG using sharp, re-encodes it at reduced dimensions if the JPEG exceeds 800,000 bytes, and returns a content array containing a text summary and the base64-encoded image (mimeType 'image/jpeg').
    export async function captureScreenshotWithDimensions(
      width: number, 
      height: number, 
      framebuffer: Buffer, 
      delay: number
    ) {
      // The framebuffer from VNC should be in RGBA format (4 bytes per pixel)
      // However, some VNC servers may have format conversion issues
      
      // Validate buffer is divisible by expected pixel size
      const pixelCount = width * height;
      const bytesPerPixel = framebuffer.length / pixelCount;
      
      if (bytesPerPixel !== 4) {
        throw new Error(`Invalid bytes per pixel: expected 4 (RGBA), got ${bytesPerPixel}. This indicates a VNC pixel format conversion problem.`);
      }
      
      // Additional validation: check for obviously corrupted data patterns
      if (hasCorruptionPatterns(framebuffer, width, height)) {
        console.warn('Warning: Framebuffer may contain corrupted data patterns, but proceeding with conversion...');
      }
      
      // Convert to compressed JPEG for smaller file size
      // For screenshots, JPEG compression is usually acceptable
      const imageBuffer = await sharp(framebuffer, {
        raw: {
          width: width,
          height: height,
          channels: 4 // RGBA
        }
      })
      .jpeg({
        quality: 80, // Good balance of quality vs size
        progressive: true
      })
      .toBuffer();
    
      // If still too large, resize down
      let finalBuffer = imageBuffer;
      let finalWidth = width;
      let finalHeight = height;
      
      if (imageBuffer.length > 800000) { // If > 800KB
        console.error(`Image too large (${imageBuffer.length} bytes), resizing...`);
        const scaleFactor = Math.sqrt(800000 / imageBuffer.length);
        finalWidth = Math.floor(width * scaleFactor);
        finalHeight = Math.floor(height * scaleFactor);
        
        finalBuffer = await sharp(framebuffer, {
          raw: {
            width: width,
            height: height,
            channels: 4
          }
        })
        .resize(finalWidth, finalHeight)
        .jpeg({
          quality: 75
        })
        .toBuffer();
      }
    
      const base64Data = finalBuffer.toString('base64');
      
      const delayText = delay > 0 ? ` (after ${delay}ms delay)` : '';
      const sizeInfo = finalWidth !== width ? ` (resized from ${width}x${height})` : '';
      
      console.error(`Final image: ${finalBuffer.length} bytes, ${finalWidth}x${finalHeight}`);
      
      return {
        content: [
          { 
            type: 'text', 
            text: `Screenshot captured (${finalWidth}x${finalHeight})${sizeInfo}${delayText}` 
          },
          {
            type: 'image',
            data: base64Data,
            mimeType: 'image/jpeg'
          }
        ]
      };
    }
  • Schema registration for the 'vnc_screenshot' tool. Defines input as an object with an optional 'delay' property (number, 0-300000ms, default 0).
    {
      name: 'vnc_screenshot',
      description: 'Take a screenshot of the current screen',
      inputSchema: {
        type: 'object',
        properties: {
          delay: { 
            type: 'number', 
            description: 'Delay in milliseconds before taking screenshot (useful for waiting for processes to complete)',
            minimum: 0,
            maximum: 300000,
            default: 0
          }
        }
      }
    }
  • src/server.ts:147-148 (registration)
    Routes the 'vnc_screenshot' tool call to the handleScreenshot function via a switch statement in the CallToolRequestSchema handler.
    case 'vnc_screenshot':
      return await handleScreenshot(this.vncManager, args as any);
  • Converts BGRX (non-standard 32-bit pixel format) to standard RGBA using the pixel format's color shift and max values.
    function convertBGRXToRGBA(buffer: Buffer, width: number, height: number, pixelFormat: any): Buffer {
      const pixelCount = width * height;
      const targetBuffer = Buffer.alloc(pixelCount * 4);
      
      console.error(`Converting with shifts R=${pixelFormat.redShift}, G=${pixelFormat.greenShift}, B=${pixelFormat.blueShift}`);
      console.error(`Color max values R=${pixelFormat.redMax}, G=${pixelFormat.greenMax}, B=${pixelFormat.blueMax}`);
      
      for (let i = 0; i < pixelCount; i++) {
        const srcOffset = i * 4;
        const dstOffset = i * 4;
        
        // Read 32-bit pixel value (little-endian)
        const pixel32 = buffer.readUInt32LE(srcOffset);
        
        // Extract color components based on shifts and max values
        let r, g, b;
        
        if (pixelFormat.redMax === 65280) { // 0xFF00 - high byte only
          r = (pixel32 >> (pixelFormat.redShift + 8)) & 0xFF;
          g = (pixel32 >> (pixelFormat.greenShift + 8)) & 0xFF;
          b = (pixel32 >> (pixelFormat.blueShift + 8)) & 0xFF;
        } else {
          // Standard extraction
          r = (pixel32 >> pixelFormat.redShift) & 0xFF;
          g = (pixel32 >> pixelFormat.greenShift) & 0xFF;
          b = (pixel32 >> pixelFormat.blueShift) & 0xFF;
        }
        
        // Write as RGBA
        targetBuffer[dstOffset] = r;     // R
        targetBuffer[dstOffset + 1] = g; // G
        targetBuffer[dstOffset + 2] = b; // B
        targetBuffer[dstOffset + 3] = 255; // A (fully opaque)
      }
      
      return targetBuffer;
    }
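
The shift-and-mask step in convertBGRXToRGBA can be checked on a single pixel. The following is a minimal sketch of the "standard extraction" branch only, assuming a little-endian BGRX layout with redShift=16, greenShift=8, blueShift=0; pixelToRGBA and the Shifts interface are hypothetical names, and a DataView is used in place of Buffer.readUInt32LE so the snippet has no Node-specific dependency.

```typescript
// Hypothetical single-pixel version of the standard extraction branch.
interface Shifts {
  redShift: number;
  greenShift: number;
  blueShift: number;
}

function pixelToRGBA(
  pixel32: number,
  fmt: Shifts
): [number, number, number, number] {
  // Shift each channel down to the low byte, then mask to 8 bits.
  const r = (pixel32 >>> fmt.redShift) & 0xff;
  const g = (pixel32 >>> fmt.greenShift) & 0xff;
  const b = (pixel32 >>> fmt.blueShift) & 0xff;
  return [r, g, b, 255]; // alpha forced to fully opaque
}

// One BGRX pixel as it sits in memory: B=0x10, G=0x20, R=0x30, X=0x00.
const srcBytes = new Uint8Array([0x10, 0x20, 0x30, 0x00]);
// Little-endian 32-bit read, equivalent to Buffer.readUInt32LE(0).
const pixel = new DataView(srcBytes.buffer).getUint32(0, true); // 0x00302010

const rgba = pixelToRGBA(pixel, { redShift: 16, greenShift: 8, blueShift: 0 });
console.log(rgba.join(',')); // 48,32,16,255
```

The red and blue bytes swap positions while green stays in place, which is the whole effect of a BGRX-to-RGBA conversion under these assumed shifts.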
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the burden, but it only states 'Take a screenshot of the current screen'. It does not disclose behavior such as whether the screenshot is returned or saved, if it captures all monitors, or any side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence with no wasted words, but it is too terse and omits important context. It could be improved by front-loading purpose and adding brief usage hints.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one optional parameter, no output schema), the description is incomplete. It fails to mention what happens after taking the screenshot (e.g., returns an image file or base64 data), leaving the agent without critical information.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
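
For reference, the handler shown in the Implementation Reference returns a content array with one text item and one image item carrying base64 JPEG data. The following is a minimal sketch of how a caller might pick the image out of such a result; the type names and firstImage are hypothetical, and the sample values are made-up placeholders rather than real tool output.

```typescript
// Hypothetical types modelling the shape returned by the handler.
type TextContent = { type: 'text'; text: string };
type ImageContent = { type: 'image'; data: string; mimeType: string };
type ToolResult = { content: Array<TextContent | ImageContent> };

// Return the first image item, if any.
function firstImage(result: ToolResult): ImageContent | undefined {
  return result.content.find(
    (item): item is ImageContent => item.type === 'image'
  );
}

// Placeholder result: the base64 string is a truncated stand-in, not a real image.
const sample: ToolResult = {
  content: [
    { type: 'text', text: 'Screenshot captured (1024x768)' },
    { type: 'image', data: '/9j/4AAQ', mimeType: 'image/jpeg' }
  ]
};

const image = firstImage(sample);
console.log(image?.mimeType); // image/jpeg
```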

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and the schema already describes the delay parameter. The description adds no extra meaning to the parameter beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: taking a screenshot. It uses a specific verb 'Take' and resource 'screenshot', and it is distinct from sibling tools (e.g., vnc_click).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives. The description does not mention use cases, prerequisites, or situations where other tools would be more appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/hrrrsn/mcp-vnc'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.