Skip to main content
Glama

tauri_webview_screenshot

Read-only

Capture screenshots from Tauri app webviews during development and testing. Use this tool to document UI states, verify visual elements, or debug interface issues in running Tauri applications.

Instructions

[Tauri Apps Only] Screenshot a running Tauri app's webview. Requires active tauri_driver_session. Captures only visible viewport. Targets the only connected app, or the default app if multiple are connected. Specify appIdentifier (port or bundle ID) to target a specific app. For browser screenshots, use Chrome DevTools MCP instead. For Electron apps, this will NOT work.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
windowIdNoWindow label to target (defaults to "main")
appIdentifierNoApp port or bundle ID to target. Defaults to the only connected app or the default app if multiple are connected.
formatNoImage formatpng
qualityNoJPEG quality (0-100, only for jpeg format)
filePathNoFile path to save the screenshot to instead of returning as base64
maxWidthNoMaximum width in pixels. Images wider than this will be scaled down proportionally. Can also be set via TAURI_MCP_SCREENSHOT_MAX_WIDTH environment variable.

Implementation Reference

  • Tool registration defining name, description, schema, annotations, and wrapper handler that parses input and delegates to screenshot() function
    {
       name: 'tauri_webview_screenshot',
       description:
          '[Tauri Apps Only] Screenshot a running Tauri app\'s webview. ' +
          'Requires active tauri_driver_session. Captures only visible viewport. ' +
          MULTI_APP_DESC + ' ' +
          'For browser screenshots, use Chrome DevTools MCP instead. ' +
          'For Electron apps, this will NOT work.',
       category: TOOL_CATEGORIES.UI_AUTOMATION,
       schema: ScreenshotSchema,
       annotations: {
          title: 'Screenshot Tauri Webview',
          readOnlyHint: true,
          openWorldHint: false,
       },
       handler: async (args) => {
          const parsed = ScreenshotSchema.parse(args);
    
          const result = await screenshot({
             quality: parsed.quality,
             format: parsed.format,
             windowId: parsed.windowId,
             filePath: parsed.filePath,
             appIdentifier: parsed.appIdentifier,
             maxWidth: parsed.maxWidth,
          });
    
          // If saved to file, return text confirmation
          if ('filePath' in result) {
             return `Screenshot saved to: ${result.filePath}`;
          }
    
          // Return the content array directly for proper image handling
          return result.content;
       },
    },
  • Input schema for the tool: extends WindowTargetSchema with screenshot-specific options (format, quality, filePath, maxWidth)
    export const ScreenshotSchema = WindowTargetSchema.extend({
       format: z.enum([ 'png', 'jpeg' ]).optional().default('png').describe('Image format'),
       quality: z.number().min(0).max(100).optional().describe('JPEG quality (0-100, only for jpeg format)'),
       filePath: z.string().optional().describe('File path to save the screenshot to instead of returning as base64'),
       maxWidth: z.number().int().positive().optional().describe(
          'Maximum width in pixels. Images wider than this will be scaled down proportionally. ' +
          'Can also be set via TAURI_MCP_SCREENSHOT_MAX_WIDTH environment variable.'
       ),
    });
  • Primary handler function: optionally saves screenshot to file or returns image content; delegates core capture to captureScreenshot()
    export interface ScreenshotOptions {
       quality?: number;
       format?: 'png' | 'jpeg';
       windowId?: string;
       filePath?: string;
       appIdentifier?: string | number;
       maxWidth?: number;
    }
    
    export interface ScreenshotFileResult {
       filePath: string;
       format: 'png' | 'jpeg';
    }
    
    export async function screenshot(options: ScreenshotOptions = {}): Promise<ScreenshotResult | ScreenshotFileResult> {
       const { quality, format = 'png', windowId, filePath, appIdentifier, maxWidth } = options;
    
       // Use the native screenshot function from webview-executor
       const result = await captureScreenshot({ format, quality, windowId, appIdentifier, maxWidth });
    
       // If filePath is provided, write to file instead of returning base64
       if (filePath) {
          // Find the image content in the result
          const imageContent = result.content.find((c) => { return c.type === 'image'; });
    
          if (!imageContent || imageContent.type !== 'image') {
             throw new Error('Screenshot capture failed: no image data');
          }
    
          // Decode base64 and write to file
          const buffer = Buffer.from(imageContent.data, 'base64');
    
          const resolvedPath = resolve(filePath);
    
          await writeFile(resolvedPath, buffer);
    
          return { filePath: resolvedPath, format };
       }
    
       return result;
    }
  • Core screenshot implementation: native IPC screenshot via 'capture_native_screenshot', with fallbacks to html2canvas library and Screen Capture API; constructs ToolContent image response
    export async function captureScreenshot(options: CaptureScreenshotOptions = {}): Promise<ScreenshotResult> {
       const { format = 'png', quality = 90, windowId, appIdentifier, maxWidth } = options;
    
       // Primary implementation: Use native platform-specific APIs
       // - macOS: WKWebView takeSnapshot
       // - Windows: WebView2 CapturePreview
       // - Linux: Chromium/WebKit screenshot APIs
       try {
          // Ensure we're fully initialized
          await ensureReady();
    
          // Resolve target session
          const session = resolveTargetApp(appIdentifier);
    
          const client = session.client;
    
          // Use longer timeout (15s) for native screenshot - the Rust code waits up to 10s
          const response = await client.sendCommand({
             command: 'capture_native_screenshot',
             args: {
                format,
                quality,
                windowLabel: windowId,
                maxWidth,
             },
          }, 15000);
    
          if (!response.success || !response.data) {
             throw new Error(response.error || 'Native screenshot returned invalid data');
          }
    
          // The native command returns a base64 data URL
          const dataUrl = response.data as string;
    
          if (!dataUrl || !dataUrl.startsWith('data:image/')) {
             throw new Error('Native screenshot returned invalid data');
          }
    
          // Build response with window context
          return buildScreenshotResult(dataUrl, 'native API', response.windowContext);
       } catch(nativeError: unknown) {
          // Log the native error for debugging, then fall back
          const nativeMsg = nativeError instanceof Error ? nativeError.message : String(nativeError);
    
          driverLogger.error(`Native screenshot failed: ${nativeMsg}, falling back to html2canvas`);
       }
    
       // Fallback 1: Use html2canvas library for high-quality DOM rendering
       // Try to use the script manager to register html2canvas for persistence
       const html2canvasScript = await prepareHtml2canvasScript(format, quality);
    
       // Fallback: Try Screen Capture API if available
       // Note: This script is wrapped by executeAsyncInWebview, so we don't need an IIFE
       const screenCaptureScript = `
          // Check if Screen Capture API is available
          if (!navigator.mediaDevices || !navigator.mediaDevices.getDisplayMedia) {
             throw new Error('Screen Capture API not available');
          }
    
          // Request screen capture permission and get the stream
          const stream = await navigator.mediaDevices.getDisplayMedia({
             video: {
                displaySurface: 'window',
                cursor: 'never'
             },
             audio: false
          });
    
          // Get the video track
          const videoTrack = stream.getVideoTracks()[0];
          if (!videoTrack) {
             throw new Error('No video track available');
          }
    
          // Create a video element to display the stream
          const video = document.createElement('video');
          video.srcObject = stream;
          video.autoplay = true;
    
          // Wait for the video to load metadata
          await new Promise((resolve, reject) => {
             video.onloadedmetadata = resolve;
             video.onerror = reject;
             setTimeout(() => reject(new Error('Video load timeout')), 5000);
          });
    
          // Play the video
          await video.play();
    
          // Create canvas to capture the frame
          const canvas = document.createElement('canvas');
          const ctx = canvas.getContext('2d');
    
          // Set canvas dimensions to match video
          canvas.width = video.videoWidth;
          canvas.height = video.videoHeight;
    
          // Draw the video frame to canvas
          ctx.drawImage(video, 0, 0, canvas.width, canvas.height);
    
          // Stop all tracks to release the capture
          stream.getTracks().forEach(track => track.stop());
    
          // Convert to data URL with specified format and quality
          const mimeType = '${format}' === 'jpeg' ? 'image/jpeg' : 'image/png';
          return canvas.toDataURL(mimeType, ${quality / 100});
       `;
    
       try {
          // Try html2canvas second (after native APIs)
          const result = await executeAsyncInWebview(html2canvasScript, undefined, 10000); // Longer timeout for library loading
    
          // Validate that we got a real data URL, not 'null' or empty
          if (result && result !== 'null' && result.startsWith('data:image/')) {
             return buildScreenshotResult(result, 'html2canvas');
          }
    
          throw new Error(`html2canvas returned invalid result: ${result?.substring(0, 100) || 'null'}`);
       } catch(html2canvasError: unknown) {
          try {
             // Fallback to Screen Capture API
             const result = await executeAsyncInWebview(screenCaptureScript);
    
             // Validate that we got a real data URL
             if (result && result.startsWith('data:image/')) {
                return buildScreenshotResult(result, 'Screen Capture API');
             }
    
             throw new Error(`Screen Capture API returned invalid result: ${result?.substring(0, 50) || 'null'}`);
          } catch(screenCaptureError: unknown) {
             // All methods failed - throw a proper error
             const html2canvasMsg = html2canvasError instanceof Error ? html2canvasError.message : 'html2canvas failed';
    
             const screenCaptureMsg = screenCaptureError instanceof Error ? screenCaptureError.message : 'Screen Capture API failed';
    
             throw new Error(
                'Screenshot capture failed. Native API not available, ' +
                `html2canvas error: ${html2canvasMsg}, ` +
                `Screen Capture API error: ${screenCaptureMsg}`
             );
          }
       }
    }
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable behavioral context beyond the annotations: it specifies the prerequisite of an active tauri_driver_session, notes that it captures only the visible viewport, and explains targeting behavior for multiple connected apps. While annotations indicate readOnlyHint=true and openWorldHint=false, the description provides practical constraints without contradicting them.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose and key constraints, followed by specific usage notes. Each sentence adds distinct value: scope, prerequisites, capture behavior, targeting rules, and exclusions. There is no redundant or wasted information, making it highly efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, no output schema) and rich annotations, the description is largely complete. It covers scope, prerequisites, behavioral traits, and exclusions. However, it doesn't detail the return format (e.g., base64 vs. file save) or error handling, leaving minor gaps for a tool with no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the input schema already documents all 6 parameters thoroughly. The description adds minimal parameter semantics, only briefly mentioning appIdentifier for targeting specifics. It doesn't provide additional syntax or format details beyond the schema, so the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Screenshot a running Tauri app's webview') and resource ('Tauri app's webview'), distinguishing it from siblings like tauri_webview_execute_js or tauri_webview_interact. It explicitly mentions the Tauri-only scope, which differentiates it from browser or Electron alternatives.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool ('[Tauri Apps Only]', 'Requires active tauri_driver_session') and when not to use it ('For browser screenshots, use Chrome DevTools MCP instead. For Electron apps, this will NOT work'). It also mentions targeting specifics for multiple connected apps, offering clear alternatives and exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/hypothesi/mcp-server-tauri'

If you have feedback or need assistance with the MCP directory API, please join our Discord server