screen-mcp

A FastMCP server that runs on the client machine and exposes screenshot tools to a host MCP. It supports both direct screenshot capture and session-based chunked transfers so an LLM can consume images reliably.

Official FastMCP documentation: gofastmcp.com/getting-started/welcome

Exposed tools

  • list_monitors: returns detected monitors (index and dimensions)

  • capture_screenshot: captures a screen image in hybrid mode (base64 JSON for non-vision clients, native MCP image for vision clients)

  • capture_timeline: captures a timed screen sequence (ordered frames with timestamps)

  • start_timeline_capture: starts a timeline session and returns a timeline_id

  • get_timeline_manifest: returns chunked timeline metadata

  • get_timeline_chunk: retrieves a timeline JSON chunk

  • release_timeline_capture: explicitly releases a timeline session

  • start_screenshot_capture: starts a screenshot session and returns a capture_id

  • get_screenshot_manifest: returns metadata plus ASCII preview for non-vision LLMs

  • get_screenshot_chunk: returns a chunk of base64 image data

  • release_screenshot_capture: releases the screenshot session and frees memory

Quick tool guidance

  • Need available monitor info: list_monitors

  • Need a fast single screenshot with moderate payload: capture_screenshot

  • Need a more robust single screenshot with chunking: start_screenshot_capture -> get_screenshot_manifest -> get_screenshot_chunk (0..N-1) -> release_screenshot_capture

  • Need a short timeline in one call: capture_timeline

  • Need a robust timeline for large payloads: start_timeline_capture -> get_timeline_manifest -> get_timeline_chunk (0..N-1) -> release_timeline_capture

Best practices:

  • Always concatenate chunks in ascending chunk_index order.

  • Always call release_* after reading session data to free memory.

  • For non-vision models, consume preview_text from the manifest before loading full payload.
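The ordering rule can be sketched in a few lines of Python. This is a minimal illustration, not the project's client code: the chunk_index field comes from the text above, while the data key and the overall shape of each chunk dictionary are assumptions made for the example.

```python
import base64


def reassemble_chunks(chunks):
    """Join base64 chunks in ascending chunk_index order and decode them.

    Each chunk is assumed to be a dict with 'chunk_index' (int) and
    'data' (str). The 'data' key name is hypothetical.
    """
    ordered = sorted(chunks, key=lambda c: c["chunk_index"])
    # Fail early if a chunk is missing or duplicated.
    indices = [c["chunk_index"] for c in ordered]
    if indices != list(range(len(ordered))):
        raise ValueError(f"expected chunks 0..{len(ordered) - 1}, got {indices}")
    # Decode once, after joining the text of all chunks.
    return base64.b64decode("".join(c["data"] for c in ordered))


# Usage: chunks that arrive out of order still decode correctly.
payload = base64.b64encode(b"fake image bytes").decode("ascii")
received = [
    {"chunk_index": 1, "data": payload[8:16]},
    {"chunk_index": 0, "data": payload[:8]},
    {"chunk_index": 2, "data": payload[16:]},
]
image_bytes = reassemble_chunks(received)
```

Decoding happens once, after joining, because chunk boundaries are not guaranteed to fall on 4-character base64 groups.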

Prerequisites

  • Linux with an active graphical session (X11/Wayland capture support)

  • DISPLAY environment variable available to the server process (mss requires it on Linux)

  • Python 3.10+
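A small preflight check can catch a missing prerequisite before the server starts. The sketch below is an illustration only and is not part of the project; check_prerequisites is a hypothetical helper, and it tests just the two conditions that can be read from the process itself (Python version and DISPLAY).

```python
import os
import sys


def check_prerequisites(env=None, version=None):
    """Return a list of human-readable problems (empty if none were found).

    Hypothetical helper: 'env' and 'version' default to the current
    process but can be passed in explicitly for testing.
    """
    env = os.environ if env is None else env
    version = sys.version_info if version is None else version
    problems = []
    if tuple(version[:2]) < (3, 10):
        problems.append("Python 3.10+ is required")
    if not env.get("DISPLAY"):
        problems.append("DISPLAY is not set for this process")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(f"prerequisite check: {problem}")
```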

Local installation

uv sync

Or via Taskfile:

task setup

Run the MCP server (stdio)

task server

This task starts the server using mcpm run screen-mcp through uvx. It also registers or updates the local MCP server automatically when needed. Display-related environment variables are propagated during registration: DISPLAY, WAYLAND_DISPLAY, XAUTHORITY, XDG_RUNTIME_DIR.

MCP-compatible smoke-test client

task client

The smoke-test script is located in scripts/smoke_client.py and exercises:

  • list_monitors

  • start_screenshot_capture

  • get_screenshot_manifest

  • get_screenshot_chunk

  • release_screenshot_capture

It writes a verification image to artifacts/smoke_capture.jpg.

You can also run a specific action via --action:

uv run python scripts/smoke_client.py --action list-monitors
uv run python scripts/smoke_client.py --action capture-screenshot --monitor-index 0 --output artifacts/capture.jpg
uv run python scripts/smoke_client.py --action capture-timeline --duration-seconds 6 --output artifacts/timeline.json
uv run python scripts/smoke_client.py --action capture-timeline-session --duration-seconds 6 --chunk-size 120000 --output artifacts/timeline_session.json

Debugging and real-time inspection

task inspector

This launches the MCP Inspector against the mcpm run screen-mcp server.

Using the server in VS Code

  1. Open this project folder in VS Code.

  2. Create a .vscode/mcp.json file.

  3. Add a servers entry to that file using one of the examples below.

Recommended local example for a cloned repo (unpublished package):

{
  "servers": {
    "screen-mcp": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "--project", "/absolute/path/to/screen-mcp", "screen-mcp"]
    }
  }
}

Example for running directly from a Git repo without global installation:

{
  "servers": {
    "screen-mcp": {
      "type": "stdio",
      "command": "uvx",
      "args": ["--from", "git+https://github.com/<owner>/screen-mcp.git", "screen-mcp"]
    }
  }
}

Alternative via MCPM:

{
  "servers": {
    "screen-mcp": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcpm", "run", "screen-mcp"]
    }
  }
}

Example tool calls

  • list_monitors()

  • capture_screenshot(monitor_index=0, image_format="jpeg", max_width=1600, quality=80)

  • capture_screenshot(monitor_index=0, image_format="jpeg", max_width=1600, quality=80, response_mode="image")

  • capture_timeline(duration_seconds=10, monitor_index=0, image_format="jpeg", max_width=900, quality=70)

  • start_timeline_capture(duration_seconds=10, monitor_index=0, image_format="jpeg", max_width=900, quality=70, chunk_size=120000)

  • get_timeline_manifest(timeline_id)

  • get_timeline_chunk(timeline_id, chunk_index)

  • release_timeline_capture(timeline_id)

Timeline behavior in capture_timeline:

  • fixed cadence: TIMELINE_FPS (default 2 images/s, configurable in source)

  • maximum duration: TIMELINE_MAX_DURATION_SECONDS (default 30s, configurable in source)

  • each frame includes: frame_index, t_offset_ms, captured_at, preview_text, image_sha256, image_size_bytes

  • temporal_hint makes chronological order explicit for an LLM
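To make the cadence concrete, the following sketch estimates how many frames a request yields and what their t_offset_ms values would look like. It assumes the first frame is taken at offset 0, that frames are evenly spaced, and that an over-long request is clamped to the maximum duration; the server may instead reject such a request, so treat the clamping as an assumption. expected_offsets_ms is a hypothetical helper, not a function from the project.

```python
TIMELINE_FPS = 2                     # default cadence stated above
TIMELINE_MAX_DURATION_SECONDS = 30   # default cap stated above


def expected_offsets_ms(duration_seconds,
                        fps=TIMELINE_FPS,
                        max_duration=TIMELINE_MAX_DURATION_SECONDS):
    """Return the expected t_offset_ms of each frame at a fixed cadence."""
    # Assumption: durations above the cap are clamped rather than rejected.
    duration = min(duration_seconds, max_duration)
    frame_count = int(duration * fps)
    interval_ms = 1000 / fps
    return [round(i * interval_ms) for i in range(frame_count)]


offsets = expected_offsets_ms(10)
# 10 s at 2 images/s gives 20 frames, spaced 500 ms apart.
```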

Robust flow recommendation:

  1. start_screenshot_capture(...) -> obtain capture_id

  2. get_screenshot_manifest(capture_id) -> metadata + preview_text

  3. get_screenshot_chunk(capture_id, chunk_index) -> reassemble chunks

  4. release_screenshot_capture(capture_id)
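The four steps above can be sketched as a single helper that always releases the session, even when a chunk request fails. This is a sketch under stated assumptions: call_tool stands in for whatever MCP client call you use and is expected to return a plain dict, and the response keys capture_id, chunk_count, and data are guesses at the payload shape rather than documented field names.

```python
import base64


def fetch_screenshot(call_tool, **capture_args):
    """Run the session flow and return the decoded image bytes.

    call_tool(name, arguments) -> dict is a hypothetical stand-in for a
    real MCP client call. The response keys used here are assumptions.
    """
    # Step 1: start the session and obtain its id.
    started = call_tool("start_screenshot_capture", capture_args)
    capture_id = started["capture_id"]
    try:
        # Step 2: the manifest tells us how many chunks to request.
        manifest = call_tool("get_screenshot_manifest",
                             {"capture_id": capture_id})
        # Step 3: request chunks 0..N-1 in ascending order.
        parts = [
            call_tool("get_screenshot_chunk",
                      {"capture_id": capture_id, "chunk_index": i})["data"]
            for i in range(manifest["chunk_count"])
        ]
        return base64.b64decode("".join(parts))
    finally:
        # Step 4: release even on failure so the server can free memory.
        call_tool("release_screenshot_capture", {"capture_id": capture_id})
```

Passing the client call in as a function keeps the flow independent of any particular MCP client library and makes it easy to exercise with a fake.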

Base64 notes

  • For multi-client MCP, base64 is the most interoperable format: simple, JSON-friendly, compatible with vision and non-vision clients.

  • Tradeoff: the encoded payload is about 33% larger than the raw image, and a large response sent as a single block risks being truncated.

  • This project uses session-based chunked base64 transfer (capture_id) to make large exchanges reliable.

  • For non-vision LLMs, prefer get_screenshot_manifest (metadata + ASCII preview) before downloading the full image.
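The roughly 33% figure follows from base64 mapping every 3 input bytes to 4 output characters. The sketch below works through the arithmetic and estimates how many chunks a capture needs. It assumes chunk_size counts characters of base64 text, which the text above does not state explicitly; both helper names are hypothetical.

```python
import math


def base64_length(raw_bytes):
    """Length of padded base64 text for raw_bytes bytes of input."""
    return 4 * math.ceil(raw_bytes / 3)


def chunk_count(raw_bytes, chunk_size=120_000):
    """Chunks needed, assuming chunk_size is measured in base64 characters."""
    return math.ceil(base64_length(raw_bytes) / chunk_size)


raw = 300_000                   # a 300 kB image
encoded = base64_length(raw)    # 400000 characters
overhead = encoded / raw - 1    # one third, i.e. about 33%
chunks = chunk_count(raw)       # 4 chunks of at most 120000 characters
```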

Hybrid mode in capture_screenshot:

  • response_mode="base64" (default): legacy behavior, JSON output with image_base64.

  • response_mode="image": native MCP image output for vision models, with metadata in structured_content.

  • response_mode="auto": reads SCREEN_MCP_CAPTURE_RESPONSE_MODE (base64 or image) and chooses automatically based on the client/host.
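One plausible way to resolve the three modes is sketched below. It is not the server's actual logic, which is not shown here: resolve_response_mode is a hypothetical helper, and falling back to base64 when the variable is unset or holds an unexpected value is an assumption chosen because base64 is the stated default.

```python
import os

VALID_MODES = ("base64", "image")


def resolve_response_mode(requested="base64", env=None):
    """Map a requested response_mode to a concrete 'base64' or 'image'."""
    env = os.environ if env is None else env
    if requested in VALID_MODES:
        return requested
    if requested != "auto":
        raise ValueError(f"unknown response_mode: {requested!r}")
    configured = env.get("SCREEN_MCP_CAPTURE_RESPONSE_MODE", "").strip().lower()
    # Assumption: fall back to base64 when the variable is unset or invalid.
    return configured if configured in VALID_MODES else "base64"
```

For example, resolve_response_mode("auto", {"SCREEN_MCP_CAPTURE_RESPONSE_MODE": "image"}) selects the native image output, while an empty environment yields base64.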

Security and privacy

Screen captures may contain sensitive data. Add an explicit client-side policy for production use (consent, masking, window whitelisting, etc.).
