screenshot
Capture a desktop screenshot, optionally cropping around the cursor. Saves to disk and returns base64 data when requested.
Instructions
Capture a single screenshot of the desktop. Persists the file to disk and optionally returns a base64 payload. Set cursorRadius>0 to crop a square region around the mouse cursor instead of the full screen.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| cursorRadius | No | If >0, crop a square of (2*radius)x(2*radius) px centered on the cursor. 0 = full screen. | |
| format | No | jpeg | |
| quality | No | ||
| maxEdge | No | Resize the longest edge to this many pixels. 0 disables resizing. Default 2400 for full screen; with cursorRadius>0 the cursor crop is kept at native resolution unless overridden. | |
| display | No | Optional display index for multi-monitor setups (omit for primary). | |
| includeBase64 | No | If true, include the image bytes inline in the response. Default true. |
Implementation Reference
- src/index.ts:37-71 (registration)Tool registration: 'screenshot' is registered in ListToolsRequestSchema with name, description, and inputSchema.
server.setRequestHandler(ListToolsRequestSchema, async () => ({ tools: [ { name: "screenshot", description: "Capture a single screenshot of the desktop. Persists the file to disk and " + "optionally returns a base64 payload. Set cursorRadius>0 to crop a square " + "region around the mouse cursor instead of the full screen.", inputSchema: { type: "object", properties: { cursorRadius: { type: "integer", description: "If >0, crop a square of (2*radius)x(2*radius) px centered on the cursor. 0 = full screen.", default: 0, }, format: { type: "string", enum: ["png", "jpeg", "webp"], default: "jpeg" }, quality: { type: "integer", minimum: 1, maximum: 100, default: 82 }, maxEdge: { type: "integer", description: "Resize the longest edge to this many pixels. 0 disables resizing. " + "Default 2400 for full screen; with cursorRadius>0 the cursor crop is kept at native resolution unless overridden.", }, display: { type: "integer", description: "Optional display index for multi-monitor setups (omit for primary).", }, includeBase64: { type: "boolean", description: "If true, include the image bytes inline in the response. Default true.", default: true, }, }, }, }, - src/index.ts:167-183 (handler)Handler for 'screenshot' tool call: invokes capture() and formats the result via jsonAndImage().
server.setRequestHandler(CallToolRequestSchema, async (req) => { const { name, arguments: args = {} } = req.params; try { switch (name) { case "screenshot": { const r = await capture({ outDir: DEFAULT_OUT_DIR, cursorRadius: numArg(args, "cursorRadius", 0), format: strEnum(args, "format", ["png", "jpeg", "webp"] as const, "jpeg"), quality: numArg(args, "quality", 82), maxEdge: optionalNumArg(args, "maxEdge"), display: args.display === undefined ? undefined : Number(args.display), includeBase64: args.includeBase64 !== false, }); return jsonAndImage(r); } - src/index.ts:45-70 (schema)Input schema for the screenshot tool, defining parameters: cursorRadius, format, quality, maxEdge, display, includeBase64.
inputSchema: { type: "object", properties: { cursorRadius: { type: "integer", description: "If >0, crop a square of (2*radius)x(2*radius) px centered on the cursor. 0 = full screen.", default: 0, }, format: { type: "string", enum: ["png", "jpeg", "webp"], default: "jpeg" }, quality: { type: "integer", minimum: 1, maximum: 100, default: 82 }, maxEdge: { type: "integer", description: "Resize the longest edge to this many pixels. 0 disables resizing. " + "Default 2400 for full screen; with cursorRadius>0 the cursor crop is kept at native resolution unless overridden.", }, display: { type: "integer", description: "Optional display index for multi-monitor setups (omit for primary).", }, includeBase64: { type: "boolean", description: "If true, include the image bytes inline in the response. Default true.", default: true, }, }, }, - src/capture.ts:19-45 (helper)CaptureOptions interface and CaptureResult type defining the capture input/output schema.
export interface CaptureOptions { /** Crop a square region around the current cursor position with this radius (px). 0 = full screen. */ cursorRadius?: number; /** Output format. */ format?: "png" | "jpeg" | "webp"; /** JPEG/WebP quality 1..100. */ quality?: number; /** Resize the longest edge of the image to this many pixels (preserves aspect). 0 = no resize. */ maxEdge?: number; /** Monitor display id (screenshot-desktop displays). Defaults to primary. */ display?: number; /** If false, omit base64 from the result and only persist to disk. */ includeBase64?: boolean; /** Where to write the captured image. */ outDir: string; } export interface CaptureResult { filePath: string; bytes: number; width: number; height: number; format: string; takenAt: string; cursor?: { x: number; y: number; foregroundWindow?: string; windowUnderCursor?: string }; base64?: string; } - src/capture.ts:49-111 (helper)Core capture() function that takes screenshot using screenshot-desktop or Windows PowerShell, applies cursor cropping, resizing, format conversion, and writes to disk with optional base64.
export async function capture(opts: CaptureOptions): Promise<CaptureResult> { const fmt = opts.format ?? "jpeg"; const quality = clamp(opts.quality ?? 82, 1, 100); /* If the caller cropped a region around the cursor it's already small enough * for the LLM context; don't downscale it further unless they explicitly ask * for a maxEdge. Full-screen captures default to 2400px on the long edge, * which keeps UI text legible on 4K monitors while staying token-friendly. */ const cropping = (opts.cursorRadius ?? 0) > 0; const maxEdge = opts.maxEdge ?? (cropping ? 0 : 2400); const includeBase64 = opts.includeBase64 ?? true; const rawPng: Buffer = await capturePngBackend(opts.display); let pipeline = sharp(rawPng); const meta = await pipeline.metadata(); const fullW = meta.width ?? 0; const fullH = meta.height ?? 0; let cursorMeta: CaptureResult["cursor"] | undefined; if (opts.cursorRadius && opts.cursorRadius > 0) { const ci = await getCursorInfo(); cursorMeta = ci; const r = Math.max(32, Math.floor(opts.cursorRadius)); const left = clamp(ci.x - r, 0, Math.max(0, fullW - 1)); const top = clamp(ci.y - r, 0, Math.max(0, fullH - 1)); const width = clamp(2 * r, 1, Math.max(1, fullW - left)); const height = clamp(2 * r, 1, Math.max(1, fullH - top)); pipeline = pipeline.extract({ left, top, width, height }); } if (maxEdge > 0) { pipeline = pipeline.resize({ width: maxEdge, height: maxEdge, fit: "inside", withoutEnlargement: true, }); } if (fmt === "jpeg") pipeline = pipeline.jpeg({ quality, mozjpeg: true }); else if (fmt === "webp") pipeline = pipeline.webp({ quality }); else pipeline = pipeline.png({ compressionLevel: 9 }); const out = await pipeline.toBuffer({ resolveWithObject: true }); if (!existsSync(opts.outDir)) await mkdir(opts.outDir, { recursive: true }); const stamp = new Date().toISOString().replace(/[:.]/g, "-"); const filename = `cap-${stamp}.${fmt === "jpeg" ? "jpg" : fmt}`; const filePath = path.join(opts.outDir, filename); await writeFile(filePath, out.data); const result: CaptureResult = { filePath, bytes: out.data.length, width: out.info.width, height: out.info.height, format: fmt, takenAt: new Date().toISOString(), cursor: cursorMeta, }; if (includeBase64) result.base64 = out.data.toString("base64"); return result; }