by Lokii0911

screenshot

Capture a PNG screenshot of the current browser view and return it as base64-encoded data for use in automation workflows.

Instructions

Return a PNG screenshot as base64.

Input Schema

No arguments

Output Schema

No arguments

Implementation Reference

  • The 'screenshot' tool is registered as an MCP tool via the @mcp.tool() decorator. The handler function calls browser.screenshot() through the _run helper.
    @mcp.tool()
    def screenshot() -> dict[str, str]:
        """Return a PNG screenshot as base64."""
        return _run("screenshot", browser.screenshot)
  • Actual implementation of the screenshot logic in BrowserManager.screenshot(). Takes a PNG screenshot via Selenium's get_screenshot_as_png() and returns it as a base64-encoded string wrapped in a ScreenshotResult model.
    def screenshot(self) -> ScreenshotResult:
        with self._lock:
            png = self._require_driver().get_screenshot_as_png()
            return ScreenshotResult(base64_png=base64.b64encode(png).decode("ascii"))
  • ScreenshotResult Pydantic model defining the return type: mime_type (default 'image/png') and base64_png (the base64-encoded screenshot data).
    class ScreenshotResult(BaseModel):
        mime_type: str = "image/png"
        base64_png: str
  • The _run helper function used by all tool handlers to execute browser actions, convert results via _as_dict, and handle exceptions.
    def _run(action: str, func: Any, *args: Any, **kwargs: Any) -> Any:
        try:
            return _as_dict(func(*args, **kwargs))
        except BrowserError:
            logger.exception("Browser action failed: %s", action)
            raise
        except Exception as exc:
            logger.exception("Unexpected Selenium MCP error during %s", action)
            raise BrowserError(f"{action} failed: {exc}") from exc
  • The 'save_screenshot' tool is a related tool that saves the screenshot to a server-local file path instead of returning base64 data.
    @mcp.tool()
    def save_screenshot(path: str) -> dict[str, str]:
        """Save a PNG screenshot to a server-local path and return the path."""
        return _run("save_screenshot", browser.save_screenshot, path)
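On the client side, the returned data has to be decoded before it can be used as an image. The sketch below assumes the tool result arrives as a dict with the 'mime_type' and 'base64_png' keys of ScreenshotResult; the exact shape produced by _as_dict is not shown above, so treat that as an assumption. The decode_screenshot helper and the sample payload are hypothetical names used for illustration only.

```python
import base64

# Every valid PNG file starts with this 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_screenshot(result: dict[str, str]) -> bytes:
    """Decode a screenshot result dict into raw PNG bytes.

    Assumes the keys 'mime_type' and 'base64_png' (hypothetical result shape
    mirroring the ScreenshotResult model).
    """
    mime_type = result.get("mime_type", "image/png")
    if mime_type != "image/png":
        raise ValueError(f"unexpected mime type: {mime_type}")
    # validate=True rejects characters outside the base64 alphabet.
    png = base64.b64decode(result["base64_png"], validate=True)
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("decoded data does not start with the PNG signature")
    return png


# Stand-in payload: a PNG signature plus filler bytes, not a real screenshot.
sample = {
    "mime_type": "image/png",
    "base64_png": base64.b64encode(PNG_SIGNATURE + b"demo").decode("ascii"),
}
png_bytes = decode_screenshot(sample)
```

With a real result, the returned bytes could be written to a file opened in binary mode or passed to an image library.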
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided; description lacks details on what the screenshot captures (viewport vs full page), permissions, or side effects. Minimal behavioral disclosure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence, front-loaded with key action and output format, no extraneous text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No annotations or output schema details provided; description covers output format but not behavior (e.g., what part of page is captured). Adequate for zero-param tool but lacks context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters defined (0 params), so baseline score is 4. Description adds meaning by specifying output as base64-encoded PNG.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool returns a PNG screenshot as base64, which is a specific verb+resource (screenshot) with output format, distinct from sibling 'save_screenshot'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool vs alternatives like 'save_screenshot' or other browser tools; implicit context only.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
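A longer docstring is one way to close the gaps noted under Behavior and Usage Guidelines. The sketch below is hypothetical wording, not the server's actual text. The viewport statement is an assumption about what Selenium's get_screenshot_as_png() captures and should be checked against the driver in use. The function body is a stub that only illustrates the docstring.

```python
def screenshot() -> dict[str, str]:
    """Return a PNG screenshot of the current browser view as base64.

    Assumed behavior (verify against your driver): captures the visible
    viewport of the current window, not the full scrollable page.
    Requires an active browser driver.

    Use this tool when the caller needs the image data inline. Use
    save_screenshot instead when the image should be written to a
    server-local file path.
    """
    # Stub only: the real handler delegates to the browser manager.
    raise NotImplementedError("illustrates docstring wording only")
```

The first line keeps the original one-sentence summary, so the extra guidance adds tokens only below the front-loaded description.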

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Lokii0911/SeleniumMCP'