take_system_screenshot
Capture a fullscreen, window, or region screenshot on macOS, Linux, or Windows. Save to a configurable output directory for documentation or debugging.
Instructions
Capture desktop, window, or region screenshot. Cross-platform: macOS (screencapture), Linux (maim/scrot/gnome-screenshot/etc.), Windows (PowerShell+.NET). Saves to ~/Documents/screenshots by default (configurable via SCREENSHOT_OUTPUT_DIR env var). For window mode, provide windowName (app name like "Safari") or windowId.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| mode | Yes | fullscreen=entire screen, window=specific app (requires windowName or windowId), region=coordinates | |
| windowId | No | Window ID (for window mode) | |
| windowName | No | App name like "Safari", "Firefox" (for window mode) | |
| region | No | Region {x,y,width,height} | |
| display | No | Display number | |
| includeCursor | No | Include cursor | |
| format | No | Image format (png or jpg) | |
| delay | No | Delay seconds | |
| outputPath | No | Absolute path, or relative to home dir |
Implementation Reference
- Main tool registration and handler function. Defines the 'take_system_screenshot' tool with Zod inputSchema (mode, windowId, windowName, region, display, includeCursor, format, delay, outputPath). The handler gets the platform-specific screenshot provider and delegates to captureFullscreen, captureWindow, or captureRegion based on mode.
export function registerTakeSystemScreenshot(server: McpServer): void { server.registerTool( 'take_system_screenshot', { description: 'Capture desktop, window, or region screenshot. Cross-platform: macOS (screencapture), Linux (maim/scrot/gnome-screenshot/etc.), Windows (PowerShell+.NET). Saves to ~/Documents/screenshots by default (configurable via SCREENSHOT_OUTPUT_DIR env var). For window mode, provide windowName (app name like "Safari") or windowId.', inputSchema: { mode: z .enum(['fullscreen', 'window', 'region']) .describe( 'fullscreen=entire screen, window=specific app (requires windowName or windowId), region=coordinates' ), windowId: z.number().int().min(0).optional().describe('Window ID (for window mode)'), windowName: z .string() .optional() .describe('App name like "Safari", "Firefox" (for window mode)'), region: z .object({ x: z.number().int().min(0), y: z.number().int().min(0), width: z.number().int().min(1).max(7680), height: z.number().int().min(1).max(4320), }) .optional() .describe('Region {x,y,width,height}'), display: z.number().int().min(1).optional().describe('Display number'), includeCursor: z.boolean().optional().describe('Include cursor'), format: z.enum(['png', 'jpg']).optional().describe('Image format (png or jpg)'), delay: z.number().min(0).max(10).optional().describe('Delay seconds'), outputPath: z .string() .optional() .describe('Absolute path, or relative to home dir'), }, }, async ({ mode, windowId, windowName, region, display, includeCursor, format, delay, outputPath: custom, }) => { ensureDefaultDirectory(); try { // Get the platform-appropriate screenshot provider const provider = await getScreenshotProvider(); const ext = format || 'png'; // Security: Validate output path (path traversal prevention) const pathValidation = await validateOutputPath( custom, `system-screenshot-${timestamp()}.${ext}`, { allowedOutputDirs: ALLOWED_OUTPUT_DIRS, defaultOutDir } ); if (!pathValidation.valid) { return err(`Output path validation failed: ${pathValidation.error}`); } const dest = pathValidation.path!; ensureDir(dest); const captureOpts = { outputPath: dest, format: format as 'png' | 'jpg' | undefined, includeCursor, delay, display, }; if (mode === 'fullscreen') { await provider.captureFullscreen(captureOpts); } else if (mode === 'window') { await provider.captureWindow({ ...captureOpts, windowId, windowName }); } else if (mode === 'region') { if (!region) return err('Region mode requires region coordinates'); await provider.captureRegion({ ...captureOpts, ...region }); } if (!existsSync(dest)) return err('Screenshot failed — file not created'); return ok(`System screenshot saved: ${dest}`); } catch (e) { return err( `System screenshot error: ${e instanceof Error ? e.message : e}` ); } } ); } - ScreenshotProvider interface defining CaptureOptions, WindowTarget, RegionTarget types and the platform-agnostic interface (captureFullscreen, captureWindow, captureRegion). Also contains the factory function getScreenshotProvider() which dispatches to MacOSProvider, LinuxProvider, or WindowsProvider based on process.platform.
export interface CaptureOptions { outputPath: string; format?: 'png' | 'jpg'; includeCursor?: boolean; delay?: number; display?: number; } export interface WindowTarget { windowId?: number; windowName?: string; } export interface RegionTarget { x: number; y: number; width: number; height: number; } // ── Provider interface ───────────────────────────────────────────────────── export interface ScreenshotProvider { /** Human-readable platform name for error messages */ readonly platform: string; /** Check whether the required capture tools are available on this system */ isAvailable(): Promise<boolean>; /** Capture the entire screen (or a specific display) */ captureFullscreen(opts: CaptureOptions): Promise<void>; /** Capture a specific window by name or ID */ captureWindow(opts: CaptureOptions & WindowTarget): Promise<void>; /** Capture a rectangular region of the screen */ captureRegion(opts: CaptureOptions & RegionTarget): Promise<void>; } - src/utils/macos-provider.ts:9-58 (helper)macOS screenshot provider using native screencapture CLI. Supports fullscreen (-x), window (-l <id>), region (-R x,y,w,h), format, cursor, delay, and display options.
export class MacOSProvider implements ScreenshotProvider { readonly platform = 'macOS'; async isAvailable(): Promise<boolean> { return commandExists('screencapture'); } async captureFullscreen(opts: CaptureOptions): Promise<void> { const args = this.buildBaseArgs(opts); args.push(opts.outputPath); await execFileAsync('screencapture', args); } async captureWindow(opts: CaptureOptions & WindowTarget): Promise<void> { let wid = opts.windowId; if (!wid && opts.windowName) { wid = (await getWindowId(opts.windowName)) ?? undefined; } if (!wid) { throw new Error( opts.windowName ? `Window not found: ${opts.windowName}` : 'Window mode requires windowId or windowName' ); } const args = this.buildBaseArgs(opts); args.push('-l', String(wid)); args.push(opts.outputPath); await execFileAsync('screencapture', args); } async captureRegion(opts: CaptureOptions & RegionTarget): Promise<void> { const args = this.buildBaseArgs(opts); args.push('-R', `${opts.x},${opts.y},${opts.width},${opts.height}`); args.push(opts.outputPath); await execFileAsync('screencapture', args); } // ── Private helpers ──────────────────────────────────────────────────── private buildBaseArgs(opts: CaptureOptions): string[] { const args = ['-x']; // no sound if (opts.includeCursor) args.push('-C'); if (opts.format) args.push('-t', opts.format); if (opts.delay && opts.delay > 0) args.push('-T', String(opts.delay)); if (opts.display) args.push('-D', String(opts.display)); return args; } } - src/utils/linux-provider.ts:15-238 (helper)Linux screenshot provider with auto-detection fallback chain (maim > scrot > gnome-screenshot > spectacle > grim > import). Uses xdotool for window-by-name resolution. Throws helpful errors for missing tools including distro-specific install commands.
export class LinuxProvider implements ScreenshotProvider { readonly platform = 'Linux'; private _backend: LinuxBackend | null = null; private _detected = false; async isAvailable(): Promise<boolean> { await this.detectBackend(); return this._backend !== null; } async captureFullscreen(opts: CaptureOptions): Promise<void> { const backend = await this.getBackend(); this.assertSupportedOptions(opts); if (opts.delay && opts.delay > 0) await sleep(opts.delay); switch (backend) { case 'gnome-screenshot': await execFileAsync('gnome-screenshot', ['-f', opts.outputPath]); break; case 'spectacle': await execFileAsync('spectacle', ['-b', '-n', '-f', '-o', opts.outputPath]); break; case 'scrot': await execFileAsync('scrot', [opts.outputPath]); break; case 'maim': await execFileAsync('maim', [opts.outputPath]); break; case 'grim': await execFileAsync('grim', [opts.outputPath]); break; case 'import': await execFileAsync('import', ['-window', 'root', opts.outputPath]); break; } } async captureWindow(opts: CaptureOptions & WindowTarget): Promise<void> { const backend = await this.getBackend(); this.assertSupportedOptions(opts); // grim cannot capture per-window on Wayland — fail fast with the // backend-specific error before attempting xdotool resolution, which // would otherwise mask the real reason with a misleading message. if (backend === 'grim') { throw new Error('Window capture is not supported on Wayland with grim. Use fullscreen or region mode.'); } if (opts.delay && opts.delay > 0) await sleep(opts.delay); // Try to resolve window ID from name using xdotool (X11 only) let xWindowId: string | undefined; if (opts.windowName && !opts.windowId) { xWindowId = await this.findXWindowId(opts.windowName); } else if (opts.windowId) { xWindowId = String(opts.windowId); } switch (backend) { case 'gnome-screenshot': // gnome-screenshot -w captures the focused window // If we have a window ID, try to focus it first via xdotool if (xWindowId) await this.focusWindow(xWindowId); await execFileAsync('gnome-screenshot', ['-w', '-f', opts.outputPath]); break; case 'spectacle': if (xWindowId) await this.focusWindow(xWindowId); await execFileAsync('spectacle', ['-b', '-n', '-a', '-o', opts.outputPath]); break; case 'scrot': if (xWindowId) await this.focusWindow(xWindowId); await execFileAsync('scrot', ['-u', opts.outputPath]); break; case 'maim': if (xWindowId) { await execFileAsync('maim', ['-i', xWindowId, opts.outputPath]); } else { // Fallback: capture focused window throw new Error('maim requires a window ID or xdotool for window-by-name capture'); } break; case 'import': if (xWindowId) { await execFileAsync('import', ['-window', xWindowId, opts.outputPath]); } else { throw new Error('import requires a window ID for window capture'); } break; } } async captureRegion(opts: CaptureOptions & RegionTarget): Promise<void> { const backend = await this.getBackend(); this.assertSupportedOptions(opts); if (opts.delay && opts.delay > 0) await sleep(opts.delay); const geometry = `${opts.width}x${opts.height}+${opts.x}+${opts.y}`; switch (backend) { case 'gnome-screenshot': // gnome-screenshot doesn't support arbitrary region coordinates // Fall back to using maim/import if available, otherwise error throw new Error( 'gnome-screenshot does not support region capture with coordinates. ' + 'Install maim or scrot for region support.' ); case 'spectacle': // spectacle --region requires interactive selection throw new Error( 'spectacle does not support non-interactive region capture. ' + 'Install maim or scrot for region support.' ); case 'scrot': // scrot -a x,y,w,h await execFileAsync('scrot', ['-a', `${opts.x},${opts.y},${opts.width},${opts.height}`, opts.outputPath]); break; case 'maim': await execFileAsync('maim', ['-g', geometry, opts.outputPath]); break; case 'grim': await execFileAsync('grim', ['-g', `${opts.x},${opts.y} ${opts.width}x${opts.height}`, opts.outputPath]); break; case 'import': await execFileAsync('import', ['-crop', geometry, '-window', 'root', opts.outputPath]); break; } } // ── Private helpers ──────────────────────────────────────────────────── private assertSupportedOptions(opts: CaptureOptions): void { if (opts.includeCursor) { throw new Error( 'Linux provider does not support includeCursor — none of the supported backends ' + '(maim, scrot, gnome-screenshot, spectacle, grim, import) expose a cursor flag in our argv. ' + 'Use macOS or Windows for cursor capture.' ); } } private async detectBackend(): Promise<void> { if (this._detected) return; this._detected = true; // Priority order: most feature-complete first const candidates: LinuxBackend[] = [ 'maim', 'scrot', 'gnome-screenshot', 'spectacle', 'grim', 'import', ]; for (const cmd of candidates) { if (await commandExists(cmd)) { this._backend = cmd; return; } } } private async getBackend(): Promise<LinuxBackend> { await this.detectBackend(); if (!this._backend) { const distro = await detectLinuxDistro(); const installCmd = getInstallCommand(distro.packageManager, ['maim', 'xdotool']); throw new Error( 'No screenshot tool found on this Linux system. ' + 'take_system_screenshot needs one of: maim, scrot, gnome-screenshot, spectacle, grim (Wayland), or ImageMagick (import). ' + `For a typical X11 install: ${installCmd}. ` + 'See README for distro-specific instructions.' ); } return this._backend; } /** * Find an X11 window ID by application name using xdotool. Throws a helpful * install hint when xdotool is missing — without it, no Linux backend can * honor a windowName request. */ private async findXWindowId(name: string): Promise<string | undefined> { if (!(await commandExists('xdotool'))) { const distro = await detectLinuxDistro(); const installCmd = getInstallCommand(distro.packageManager, ['xdotool']); throw new Error( 'Window-by-name capture requires xdotool, which is not installed. ' + `Install it with: ${installCmd}. ` + 'Alternatively, pass an explicit windowId.' ); } try { const { stdout } = await execFileAsync('xdotool', ['search', '--name', name]); const ids = stdout.trim().split('\n').filter(Boolean); return ids[0]; // Return first match } catch { return undefined; } } /** * Focus a window by X11 window ID using xdotool. */ private async focusWindow(windowId: string): Promise<void> { try { await execFileAsync('xdotool', ['windowactivate', '--sync', windowId]); // Brief pause to let the window manager bring it to front await sleep(0.3); } catch { // Non-fatal — window may still be capturable } } } - src/utils/windows-provider.ts:28-197 (helper)Windows screenshot provider using PowerShell + .NET System.Drawing. Zero external dependencies. Supports fullscreen (virtual screen or specific display), window (by name or HWND), region capture, cursor drawing, and DPI awareness.
export class WindowsProvider implements ScreenshotProvider { readonly platform = 'Windows'; private powershellPath: string | null = null; private static readonly DPI_AWARE_SNIPPET = ` Add-Type -TypeDefinition @" using System; using System.Runtime.InteropServices; public class DpiAwareness { [DllImport("user32.dll")] public static extern bool SetProcessDPIAware(); [DllImport("user32.dll", SetLastError = true)] public static extern bool SetProcessDpiAwarenessContext(IntPtr value); } "@ try { [DpiAwareness]::SetProcessDpiAwarenessContext([IntPtr]::new(-4)) | Out-Null } catch { [DpiAwareness]::SetProcessDPIAware() | Out-Null }`.trim(); async isAvailable(): Promise<boolean> { this.powershellPath = await resolvePowerShell(); return this.powershellPath !== null; } async captureFullscreen(opts: CaptureOptions): Promise<void> { if (opts.delay && opts.delay > 0) await sleep(opts.delay); const format = this.dotNetFormat(opts.format); // No display specified → capture entire virtual screen (all monitors combined) // display specified → capture that specific monitor const captureAllDisplays = opts.display === undefined || opts.display === null; const displayIndex = captureAllDisplays ? 0 : (opts.display! - 1); const boundsScript = captureAllDisplays ? ` $bounds = [System.Windows.Forms.SystemInformation]::VirtualScreen` : ` $screens = [System.Windows.Forms.Screen]::AllScreens $screen = if (${displayIndex} -lt $screens.Length) { $screens[${displayIndex}] } else { [System.Windows.Forms.Screen]::PrimaryScreen } $bounds = $screen.Bounds`; const script = ` ${WindowsProvider.DPI_AWARE_SNIPPET} Add-Type -AssemblyName System.Windows.Forms Add-Type -AssemblyName System.Drawing ${boundsScript} $bitmap = New-Object System.Drawing.Bitmap($bounds.Width, $bounds.Height) $graphics = [System.Drawing.Graphics]::FromImage($bitmap) ${opts.includeCursor ? '$cursorPos = [System.Windows.Forms.Cursor]::Position\n' : ''} $graphics.CopyFromScreen($bounds.Location, [System.Drawing.Point]::Empty, $bounds.Size) ${opts.includeCursor ? this.cursorDrawScript() : ''} $bitmap.Save('${this.escapePath(opts.outputPath)}', [System.Drawing.Imaging.ImageFormat]::${format}) $graphics.Dispose() $bitmap.Dispose() `.trim(); await this.runPowerShell(script); } async captureWindow(opts: CaptureOptions & WindowTarget): Promise<void> { if (opts.delay && opts.delay > 0) await sleep(opts.delay); if (!opts.windowName && !opts.windowId) { throw new Error('Window mode requires windowName or windowId'); } const format = this.dotNetFormat(opts.format); // Use windowName to find the process, or windowId as HWND const findWindowScript = opts.windowName ? ` $proc = Get-Process | Where-Object { $_.MainWindowTitle -like '*${this.escapeString(opts.windowName)}*' -and $_.MainWindowHandle -ne 0 } | Select-Object -First 1 if (-not $proc) { throw "Window not found: ${this.escapeString(opts.windowName)}" } $hwnd = $proc.MainWindowHandle ` : `$hwnd = [IntPtr]::new(${opts.windowId})`; const script = ` ${WindowsProvider.DPI_AWARE_SNIPPET} Add-Type -AssemblyName System.Drawing Add-Type @" using System; using System.Runtime.InteropServices; public class Win32 { [DllImport("user32.dll")] public static extern bool GetWindowRect(IntPtr hWnd, out RECT lpRect); [DllImport("user32.dll")] public static extern bool SetForegroundWindow(IntPtr hWnd); [StructLayout(LayoutKind.Sequential)] public struct RECT { public int Left, Top, Right, Bottom; } } "@ ${findWindowScript} [Win32]::SetForegroundWindow($hwnd) | Out-Null Start-Sleep -Milliseconds 300 $rect = New-Object Win32+RECT [Win32]::GetWindowRect($hwnd, [ref]$rect) | Out-Null $width = $rect.Right - $rect.Left $height = $rect.Bottom - $rect.Top if ($width -le 0 -or $height -le 0) { throw "Invalid window dimensions" } $bitmap = New-Object System.Drawing.Bitmap($width, $height) $graphics = [System.Drawing.Graphics]::FromImage($bitmap) $graphics.CopyFromScreen($rect.Left, $rect.Top, 0, 0, (New-Object System.Drawing.Size($width, $height))) $bitmap.Save('${this.escapePath(opts.outputPath)}', [System.Drawing.Imaging.ImageFormat]::${format}) $graphics.Dispose() $bitmap.Dispose() `.trim(); await this.runPowerShell(script); } async captureRegion(opts: CaptureOptions & RegionTarget): Promise<void> { if (opts.delay && opts.delay > 0) await sleep(opts.delay); const format = this.dotNetFormat(opts.format); const script = ` ${WindowsProvider.DPI_AWARE_SNIPPET} Add-Type -AssemblyName System.Drawing $bitmap = New-Object System.Drawing.Bitmap(${opts.width}, ${opts.height}) $graphics = [System.Drawing.Graphics]::FromImage($bitmap) $graphics.CopyFromScreen(${opts.x}, ${opts.y}, 0, 0, (New-Object System.Drawing.Size(${opts.width}, ${opts.height}))) $bitmap.Save('${this.escapePath(opts.outputPath)}', [System.Drawing.Imaging.ImageFormat]::${format}) $graphics.Dispose() $bitmap.Dispose() `.trim(); await this.runPowerShell(script); } // ── Private helpers ──────────────────────────────────────────────────── private async runPowerShell(script: string): Promise<void> { const cmd = this.powershellPath ?? 'powershell'; const encoded = Buffer.from(script, 'utf16le').toString('base64'); await execFileAsync(cmd, [ '-ExecutionPolicy', 'Bypass', '-NoProfile', '-NonInteractive', '-EncodedCommand', encoded, ]); } private dotNetFormat(format?: 'png' | 'jpg'): string { switch (format) { case 'jpg': return 'Jpeg'; default: return 'Png'; } } private escapePath(p: string): string { // Escape single quotes for PowerShell string literals return p.replace(/'/g, "''"); } private escapeString(s: string): string { // Escape characters that could break PowerShell string interpolation return s.replace(/'/g, "''").replace(/[`$"]/g, ''); } private cursorDrawScript(): string { return ` try { $cursorBounds = New-Object System.Drawing.Rectangle($cursorPos.X - $bounds.X, $cursorPos.Y - $bounds.Y, 32, 32) [System.Windows.Forms.Cursors]::Default.Draw($graphics, $cursorBounds) } catch {} `; } }