Glama

Take Screenshot

mobile_take_screenshot
Read-only

Capture screenshots from mobile devices to analyze on-screen content and identify interactive elements for mobile automation tasks.

Instructions

Take a screenshot of the mobile device. Use this to understand what's on screen, if you need to press an element that is available through view hierarchy then you must list elements on screen instead. Do not cache this result.
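The selection rule in these instructions can be sketched as a small helper. This is only an illustration: `pickTool` and the `Intent` type are hypothetical names, and the sibling tool name `mobile_list_elements_on_screen` is taken from the Purpose review further down this page.

```typescript
// Hypothetical intent labels for an agent deciding which tool to call.
type Intent = "inspect_screen" | "press_element";

// Sketch of the rule: when the goal is to press an element that is exposed
// through the view hierarchy, list elements instead of taking a screenshot.
function pickTool(intent: Intent, elementInViewHierarchy: boolean): string {
  if (intent === "press_element" && elementInViewHierarchy) {
    return "mobile_list_elements_on_screen";
  }
  return "mobile_take_screenshot";
}

console.log(pickTool("inspect_screen", false));
console.log(pickTool("press_element", true));
```

Because the instructions say not to cache the result, a caller following this sketch would request a fresh screenshot each time rather than reusing an earlier one.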

Input Schema

Name | Required | Description | Default
device | Yes | The device identifier to use. Use mobile_list_available_devices to find which devices are available to you. |
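As an illustration, a client would pass the `device` argument inside a standard MCP `tools/call` request. The sketch below only builds the request object and sends nothing. The helper `buildScreenshotCall` is hypothetical, and the identifier "emulator-5554" is a made-up placeholder; real identifiers come from mobile_list_available_devices.

```typescript
// Shape of a JSON-RPC 2.0 "tools/call" request as used by MCP.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: build the request for mobile_take_screenshot.
function buildScreenshotCall(id: number, device: string): ToolCallRequest {
  // The schema marks `device` as required, so reject an empty value early.
  if (device.trim() === "") {
    throw new Error("device is required; see mobile_list_available_devices");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "mobile_take_screenshot", arguments: { device } },
  };
}

// Placeholder identifier, assumed for illustration only.
const request = buildScreenshotCall(1, "emulator-5554");
console.log(JSON.stringify(request));
```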

Implementation Reference

  • The complete handler and registration for the 'mobile_take_screenshot' tool. It captures a screenshot from the selected robot (device), validates that the result is a PNG, resizes and compresses it to JPEG if ImageMagick is installed, base64-encodes it, and returns it as an image content block in the MCP tool response.
    server.tool(
    	"mobile_take_screenshot",
    	"Take a screenshot of the mobile device. Use this to understand what's on screen, if you need to press an element that is available through view hierarchy then you must list elements on screen instead. Do not cache this result.",
    	{
    		noParams
    	},
    	async ({}) => {
    		requireRobot();
    
    		try {
    			const screenSize = await robot!.getScreenSize();
    
    			let screenshot = await robot!.getScreenshot();
    			let mimeType = "image/png";
    
    			// validate we received a png, will throw exception otherwise
    			const image = new PNG(screenshot);
    			const pngSize = image.getDimensions();
    			if (pngSize.width <= 0 || pngSize.height <= 0) {
    				throw new ActionableError("Screenshot is invalid. Please try again.");
    			}
    
    			if (isImageMagickInstalled()) {
    				trace("ImageMagick is installed, resizing screenshot");
    				const image = Image.fromBuffer(screenshot);
    				const beforeSize = screenshot.length;
    				screenshot = image.resize(Math.floor(pngSize.width / screenSize.scale))
    					.jpeg({ quality: 75 })
    					.toBuffer();
    
    				const afterSize = screenshot.length;
    				trace(`Screenshot resized from ${beforeSize} bytes to ${afterSize} bytes`);
    
    				mimeType = "image/jpeg";
    			}
    
    			const screenshot64 = screenshot.toString("base64");
    			trace(`Screenshot taken: ${screenshot.length} bytes`);
    
    			return {
    				content: [{ type: "image", data: screenshot64, mimeType }]
    			};
    		} catch (err: any) {
    			error(`Error taking screenshot: ${err.message} ${err.stack}`);
    			return {
    				content: [{ type: "text", text: `Error: ${err.message}` }],
    				isError: true,
    			};
    		}
    	}
    );
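On the client side, the result of the handler above is either an image content block (base64 data with a mimeType of image/png, or image/jpeg when ImageMagick compressed it) or a text block with isError set. The following is a minimal sketch of decoding such a result. The type names and the `decodeScreenshot` helper are hypothetical and not part of the server's code.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical types mirroring the two result shapes returned by the handler.
type ImageBlock = { type: "image"; data: string; mimeType: string };
type TextBlock = { type: "text"; text: string };
type ToolResult = { content: Array<ImageBlock | TextBlock>; isError?: boolean };

// Decode a mobile_take_screenshot result into raw bytes plus a file extension.
// Throws if the tool reported an error or returned no image block.
function decodeScreenshot(result: ToolResult): { bytes: Buffer; extension: string } {
  if (result.isError) {
    const message = result.content.find((b): b is TextBlock => b.type === "text");
    throw new Error(message ? message.text : "Screenshot failed");
  }
  const image = result.content.find((b): b is ImageBlock => b.type === "image");
  if (!image) {
    throw new Error("No image block in tool result");
  }
  // The handler reports image/jpeg after ImageMagick compression, image/png otherwise.
  const extension = image.mimeType === "image/jpeg" ? "jpg" : "png";
  return { bytes: Buffer.from(image.data, "base64"), extension };
}
```

Checking `mimeType` rather than assuming PNG matters here because the format depends on whether ImageMagick is installed on the machine running the server.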
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true, indicating a safe read operation, which the description aligns with by not implying any destructive action. The description adds valuable behavioral context beyond annotations by specifying 'Do not cache this result,' which informs the agent about result handling, though it doesn't detail output format or error conditions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and well-structured, with three sentences that each serve a distinct purpose: stating the action, providing usage guidelines, and adding behavioral context. There is no redundant or unnecessary information, making it efficient and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (one parameter, read-only), the absence of an output schema, and rich annotations, the description is largely complete. It covers purpose, usage, and a key behavioral note. Mentioning the screenshot format or storage details would make it more complete, though neither is critical here.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with the parameter 'device' fully documented in the schema. The description does not add any additional semantic information about parameters beyond what the schema provides, such as device format or constraints, so it meets the baseline for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('take a screenshot') and resource ('mobile device'), specifying the action precisely. It distinguishes from sibling tools like 'mobile_list_elements_on_screen' by emphasizing visual capture rather than element listing, making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool ('to understand what's on screen') and when not to use it ('if you need to press an element... then you must list elements on screen instead'), directly naming the alternative tool 'list elements on screen' for interaction purposes.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/EmpathySlainLovers/MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.