claude-screen-mcp

An MCP server that exposes read-only screen state. Capture, OCR, and change detection only — no input control.

Quick start

git clone https://github.com/lfzds4399-cpu/claude-screen-mcp
cd claude-screen-mcp
npm install
npm run build

claude mcp add screen -- node "$(pwd)/dist/index.js"

Restart the MCP host after registration.

Tools

| Tool | Purpose |
| --- | --- |
| screenshot | Capture a full display and return a resized image. |
| screenshot_region | Capture a rectangular region. |
| list_displays | Enumerate connected displays. |
| list_windows | List visible top-level windows, with an optional title filter. |
| read_screen_text | Run OCR on the full display or a region. |
| find_text_on_screen | Search OCR text and return matching bounding boxes. |
| screenshot_if_changed | Capture only when the perceptual-hash distance exceeds a threshold. |
| get_screen_diff | Return hash-distance diagnostics without an image. |
| wait_for_change | Poll until the screen changes or a timeout elapses. |
| record_screen | Sample a short interval and return deduplicated keyframes. |
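
The change-detection tools (screenshot_if_changed, get_screen_diff, wait_for_change) rely on a perceptual-hash distance compared against a threshold. This README does not say which hash the server uses, so the following is only a minimal sketch of the general idea, assuming a 64-bit average hash over an 8x8 grayscale grid and a Hamming distance. The names averageHash, hammingDistance, and hasChanged are hypothetical and are not part of the server's API.

```typescript
// Illustrative sketch only: the server's real hash algorithm and
// default threshold are not documented in this README.

// A hash is 64 bits, stored as an array of 0/1 values for clarity.
type Hash = number[];

// Average hash: each bit records whether a sample is brighter than
// the mean brightness of the 8x8 grayscale grid (values 0-255).
function averageHash(gray8x8: number[]): Hash {
  if (gray8x8.length !== 64) {
    throw new Error("expected 64 grayscale samples (8x8 grid)");
  }
  const mean = gray8x8.reduce((sum, v) => sum + v, 0) / gray8x8.length;
  return gray8x8.map((v) => (v > mean ? 1 : 0));
}

// Hamming distance: the number of bit positions where two hashes differ.
function hammingDistance(a: Hash, b: Hash): number {
  if (a.length !== b.length) {
    throw new Error("hashes must have the same length");
  }
  let distance = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] !== b[i]) {
      distance++;
    }
  }
  return distance;
}

// A frame counts as changed only when the distance exceeds the threshold.
function hasChanged(prev: Hash, next: Hash, threshold: number): boolean {
  return hammingDistance(prev, next) > threshold;
}
```

Under these assumptions, a higher threshold ignores small differences such as a blinking cursor, while a threshold of 0 reports any single-bit difference as a change.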

Platform support

Windows 10+ is the primary target and is exercised on every release. macOS 11+ and Linux (X11 and Wayland) pass CI and smoke tests but are not regularly exercised. Window enumeration on macOS and Linux requires platform tooling; multi-monitor display enumeration is supported on Windows.

Security and privacy

All processing is local. No screenshot, OCR text, or telemetry leaves the machine; the only network call is the initial Tesseract language data download.

OCR output is untrusted input. Text rendered on screen may contain instructions that attempt to influence the model (prompt injection). Treat the output as user-supplied data and avoid auto-executing commands derived from it. Scope read_screen_text to a region, such as the bounds of a single window, when full-desktop capture is not required.
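
One way a client could act on this advice is to mark OCR text as data before it reaches a prompt. The sketch below is a hypothetical client-side helper and not something claude-screen-mcp provides. The delimiter name untrusted_screen_text and the function wrapOcrText are assumptions made for illustration.

```typescript
// Hypothetical client-side helper; not part of claude-screen-mcp.
const OPEN_TAG = "<untrusted_screen_text>";
const CLOSE_TAG = "</untrusted_screen_text>";

// Wrap OCR output in explicit delimiters so downstream prompts can
// treat it as quoted data rather than as instructions.
function wrapOcrText(ocrText: string): string {
  // Escape every "<" so on-screen text cannot forge the closing
  // delimiter and break out of the quoted block.
  const escaped = ocrText.replace(/</g, "&lt;");
  return OPEN_TAG + "\n" + escaped + "\n" + CLOSE_TAG;
}
```

Delimiting does not make the text safe to execute. It only makes the trust boundary visible, so any command derived from the wrapped text should still go through explicit user confirmation.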

Configuration

| Variable | Default | Purpose |
| --- | --- | --- |
| SCREEN_MCP_LOG_LEVEL | info | debug, info, warn, or error. |
| SCREEN_MCP_OCR_LANGS | eng+chi_sim | Tesseract language list (allowlist enforced). |

The first OCR call downloads language data (~10 MB per language); subsequent calls reuse the local cache.
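
SCREEN_MCP_OCR_LANGS takes a plus-separated Tesseract language list that is checked against an allowlist. The README does not list the permitted languages or describe how a rejected value is handled, so the sketch below assumes an allowlist containing only the two default languages and assumes that an unsupported entry raises an error. ALLOWED_LANGS and parseOcrLangs are hypothetical names.

```typescript
// Assumed allowlist for illustration; the server's real list is not
// given in this README.
const ALLOWED_LANGS: string[] = ["eng", "chi_sim"];
const DEFAULT_LANGS = "eng+chi_sim";

// Parse a plus-separated language list, falling back to the documented
// default when the variable is unset or empty.
function parseOcrLangs(raw: string | undefined): string[] {
  const value = raw !== undefined && raw.trim() !== "" ? raw : DEFAULT_LANGS;
  const langs = value
    .split("+")
    .map((lang) => lang.trim())
    .filter((lang) => lang.length > 0);
  // Reject anything outside the allowlist instead of passing it on.
  const rejected = langs.filter((lang) => ALLOWED_LANGS.indexOf(lang) === -1);
  if (rejected.length > 0) {
    throw new Error("unsupported OCR language(s): " + rejected.join(", "));
  }
  return langs;
}
```

In a Node process this would typically be called as parseOcrLangs(process.env.SCREEN_MCP_OCR_LANGS). Failing fast on an unknown code keeps an arbitrary string from deciding which language data file gets downloaded.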

Development

npm install
npm run build
npm test
node tests/e2e-wire.mjs

Roadmap

  • screenshot_window(title) for direct single-window capture.

  • Improved multi-display enumeration on macOS and Linux.

License — MIT, see LICENSE.
