realhands
Allows controlling the real computer to interact with Google Play Console for app deployment tasks such as uploading apps, adding release notes, and submitting releases.
Allows controlling the real computer to interact with YouTube, e.g., searching and playing videos.
computer-use-mcp ("realhands")
An MCP server that lets Claude operate your real computer the way a human does — moving the actual mouse, clicking, typing, and reading the actual screen.
Unlike OpenAI Operator, browser-use, or Playwright agents (which spin up a separate, isolated, logged-out Chrome), this drives the physical OS cursor and keyboard. So it works in your own Chrome with your own logged-in sessions — and in every other app — because it's just a human at the keyboard, as far as any website can tell.
Status: LIVE and battle-tested. Registered with Claude Code as the user-scope MCP
realhands (tool mcp__realhands__computer) and ✓Connected since 2026-06-02.
On 2026-06-09 it drove the user's real, logged-in Chrome through a complete Google
Play Console deployment (app upload, release notes, submission) end-to-end.
The server is registered as realhands rather than computer-use because the name "computer-use" is reserved in Claude Code.
How it works
Claude (Desktop or Code) is the agent loop. You type a task; Claude calls the single
computer tool in a see → think → act cycle:
See: screenshot returns the real screen (downscaled to ~1280px for grounding accuracy).
Think: Claude picks the next action and its pixel coordinates.
Act: the server moves the real mouse or types on the real keyboard. A fresh screenshot comes back automatically after every action, and the cycle repeats.
Two Windows-specific details make clicks land accurately (src/screen.py):
DPI awareness: SetProcessDpiAwareness(2) is set at import time so screenshot pixels == pyautogui cursor coordinates even under display scaling (125% / 150% / …).
Stateless coordinate scaling: screenshots are downscaled (LANCZOS) to at most COMPUTER_USE_MAX_DIM on the longest side before sending; incoming click coordinates are scaled back up to real pixels. The scale factor is a pure function of monitor geometry + MAX_DIM, so mapping never depends on which screenshot ran last. Coordinates are clamped inside the target monitor so a stray click can't fly off-screen.
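The DPI step can be sketched with ctypes. This is a minimal illustration, not the project's src/screen.py: the function name enable_per_monitor_dpi_awareness is hypothetical, and the non-Windows guard is an assumption added so the snippet is safe to import anywhere. SetProcessDpiAwareness is exported by shcore.dll, and the value 2 selects per-monitor DPI awareness.

```python
import ctypes
import sys


def enable_per_monitor_dpi_awareness() -> bool:
    """Try to opt this process into per-monitor DPI awareness (hypothetical helper).

    Returns True only if Windows reports success. On other platforms,
    or on Windows versions without shcore.dll, it returns False.
    """
    if sys.platform != "win32":
        return False
    try:
        # 2 == PROCESS_PER_MONITOR_DPI_AWARE. The call returns an HRESULT;
        # 0 (S_OK) means the mode was applied. A non-zero value is returned
        # when the awareness was already set for this process.
        hresult = ctypes.windll.shcore.SetProcessDpiAwareness(2)
    except (AttributeError, OSError):
        return False
    return hresult == 0


if __name__ == "__main__":
    print("per-monitor DPI awareness applied:", enable_per_monitor_dpi_awareness())
```

The call has to happen before any window or screenshot API is used, which is presumably why the README says it runs at import time.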
Multi-monitor: every call takes an optional monitor index (1 = primary, 2.. =
others, 0 = the whole virtual desktop). action="monitors" enumerates the setup.
Origins may be negative for screens left/above the primary — to_real() handles the
offset. Use the same monitor for a click as for the screenshot you're clicking on.
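The mapping described above can be sketched as a pair of pure functions. This is an illustration under stated assumptions, not the code in src/screen.py: the Monitor dataclass, the function signatures, and the default of 1280 for max_dim are hypothetical (the README only says screenshots are downscaled to roughly 1280px and that a function named to_real handles the offset).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Monitor:
    """Monitor geometry in real (virtual desktop) pixels. Hypothetical type."""
    left: int
    top: int
    width: int
    height: int


def scale_factor(mon: Monitor, max_dim: int = 1280) -> float:
    """Downscale factor: depends only on monitor geometry and max_dim."""
    longest = max(mon.width, mon.height)
    return min(1.0, max_dim / longest)  # never upscale small screens


def to_real(x: int, y: int, mon: Monitor, max_dim: int = 1280) -> tuple:
    """Map screenshot-space (x, y) to real desktop pixels on `mon`."""
    s = scale_factor(mon, max_dim)
    rx = round(x / s)
    ry = round(y / s)
    # Clamp inside the monitor so a stray coordinate cannot leave the screen.
    rx = min(max(rx, 0), mon.width - 1)
    ry = min(max(ry, 0), mon.height - 1)
    # Add the monitor origin, which may be negative for screens
    # placed left of or above the primary.
    return mon.left + rx, mon.top + ry


if __name__ == "__main__":
    primary = Monitor(0, 0, 2560, 1440)
    left_screen = Monitor(-1920, 0, 1920, 1080)
    print(to_real(640, 360, primary))
    print(to_real(0, 0, left_screen))
```

Because the factor is recomputed from geometry on every call, nothing needs to be remembered between a screenshot and the click that follows it, as long as both use the same monitor index.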
Architecture
    src/
      server.py   FastMCP server "computer-use"; the single `computer` tool (action enum
                  modeled on Anthropic's reference computer_20250124 tool); returns
                  status text + a fresh screenshot after every action
      screen.py   DPI awareness, mss capture, downscale, model-space -> real-pixel mapping
      input.py    pyautogui mouse/keyboard execution; xdotool-style key-name translation
                  (Return, Page_Down, ctrl+a, super, ...); clipboard-paste fast path for
                  long/Unicode/multiline typing (preserves your existing clipboard);
                  activate_window via win32 AttachThreadInput
      safety.py   kill switches + lazy arm / stand-down lifecycle
      config.py   .env-driven configuration (all defaults are sensible; .env is optional)
    install.py    one-shot installer: venv, deps, .env, Claude Desktop registration

Stack: Python 3.10/3.11 · mcp (FastMCP, stdio) · pyautogui · mss · pillow ·
pynput · keyboard · pyperclip · python-dotenv, plus pygetwindow and pywin32
for activate_window.
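As an illustration of the xdotool-style key-name translation attributed to input.py, a chord such as ctrl+a or Page_Down could be mapped to pyautogui key names roughly as follows. The table contents and the function name translate_chord are assumptions made for this sketch; the project's actual mapping may be larger or differ in detail.

```python
# Hypothetical mapping from xdotool-style names (lowercased) to pyautogui names.
XDOTOOL_TO_PYAUTOGUI = {
    "return": "enter",
    "page_down": "pagedown",
    "page_up": "pageup",
    "super": "win",
    "escape": "esc",
    "control": "ctrl",
}


def translate_chord(spec: str) -> list:
    """Split 'ctrl+shift+Tab' into pyautogui key names, in press order."""
    keys = []
    for part in spec.split("+"):
        name = part.strip().lower()
        # Names without an entry (letters, 'ctrl', 'tab', ...) pass through.
        keys.append(XDOTOOL_TO_PYAUTOGUI.get(name, name))
    return keys


if __name__ == "__main__":
    for chord in ("Return", "Page_Down", "ctrl+a", "super"):
        print(chord, "->", translate_chord(chord))
```

The resulting list could then be passed to pyautogui.hotkey(*keys), which presses the keys in order and releases them in reverse.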
The computer tool
A single tool with an action parameter:
| Action | What it does |
| screenshot | Capture the screen (always start a task with this) |
| | Report the real mouse position |
| monitors | List detected monitors (for multi-screen setups) |
| | Glide the cursor to |
| | Click at |
| | Drag from |
| | Press / release the left button |
| | Scroll at |
| | Type |
| | Press a key or chord |
| | Hold keys for |
| activate_window | Bring an app to the front by title substring (beats Windows' foreground-lock; far more reliable than clicking the taskbar) |
| | Sleep |
| stop | Stand down: close the STOP overlay + release the panic hotkey (call as the final action) |
Coordinates are in the pixel space of the most recent screenshot; its size is reported with every capture. After every non-screenshot action the tool waits ~0.4s for the UI to settle and returns a fresh screenshot.
Safety — it controls your REAL machine
This is fully autonomous: it does not ask before each action. Three independent
kill switches (src/safety.py):
Fail-safe corner: slam the mouse into the top-left corner → pyautogui raises FailSafeException and the action aborts instantly.
Panic hotkey: Ctrl+Alt+Q (configurable) → hard-kills the server process (os._exit(1)).
STOP overlay: an always-on-top window (top-right) showing the current action, with a big red ■ STOP AGENT button that also hard-kills the process.
Lazy arm / stand-down: the overlay and the global panic hotkey are armed lazily on
the first action of a task, not at server startup — idle sessions show nothing and
grab no hotkeys. They stand down again when the agent calls action="stop", or
automatically after COMPUTER_USE_IDLE_STOP seconds (default 30) of inactivity. The
lightweight stdio process stays connected so the next task is instant. Everything
re-arms automatically on the next action.
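The lazy arm / stand-down lifecycle can be modelled as a small state machine. The sketch below is not the code in src/safety.py: the class name ArmState and its methods are hypothetical, the overlay and hotkey side effects are reduced to a boolean flag, and the clock is injectable purely so the behaviour can be checked without waiting.

```python
import time


class ArmState:
    """Tracks whether the safety overlay/hotkey should currently be armed."""

    def __init__(self, idle_stop: float = 30.0, clock=time.monotonic):
        self.idle_stop = idle_stop      # seconds; 0 disables auto stand-down
        self._clock = clock
        self.armed = False              # nothing is armed at server startup
        self._last_action = None

    def on_action(self) -> None:
        """Called before every tool action: arm lazily and record activity."""
        # A real server would show the overlay and register the hotkey here.
        self.armed = True
        self._last_action = self._clock()

    def stop(self) -> None:
        """Explicit stand-down, e.g. when the agent sends action='stop'."""
        self.armed = False

    def tick(self) -> bool:
        """Periodic check: stand down after idle_stop seconds of inactivity."""
        if self.armed and self.idle_stop > 0:
            if self._clock() - self._last_action >= self.idle_stop:
                self.stop()
        return self.armed


if __name__ == "__main__":
    state = ArmState(idle_stop=30.0)
    state.on_action()
    print("armed after first action:", state.tick())
```

Keeping the armed flag separate from the process lifetime is what lets the stdio connection stay open while the overlay and hotkey come and go.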
Pacing also helps you stay in control: every action is followed by a configurable pause
(COMPUTER_USE_PAUSE) and the cursor glides rather than teleports
(COMPUTER_USE_MOVE_DURATION), so you can watch and interrupt.
Don't leave it unsupervised on anything that can spend money, send messages, or delete data.
Install
Requires Python 3.10 or 3.11 (3.13+ untested; avoid the 3.14 beta).
    cd d:\Company\AIOBDCODE\computer-use-mcp
    py -3.10 install.py

This creates .venv/, installs the package + deps (editable), copies .env.example to
.env if missing, and registers the server in Claude Desktop's config (backing up any
existing config). Restart Claude Desktop, then look for the computer-use tool.
Claude Code (how this machine runs it)
Registered as a user-scope stdio server named realhands:
    claude mcp add realhands --scope user -- `
      "d:/Company/AIOBDCODE/computer-use-mcp/.venv/Scripts/python.exe" `
      "d:/Company/AIOBDCODE/computer-use-mcp/src/server.py"

The tool then appears as mcp__realhands__computer in every project.
Use
Just ask. For example:
Take a screenshot, open Chrome, go to YouTube, and search for "lofi".
Watch your real cursor move and your logged-in Chrome respond. Real-world proof: it has autonomously completed a full Google Play Console release flow in the user's own signed-in Chrome session.
Configuration (.env, optional — defaults are fine)
| Var | Default | Meaning |
| COMPUTER_USE_MAX_DIM | | Longest screenshot side sent to Claude (sweet spot for accuracy + token cost) |
| | | Default monitor (1 = primary, 2.. = others, 0 = all screens); overridable per call |
| COMPUTER_USE_PAUSE | | Delay after each pyautogui action (interruptibility) |
| | Ctrl+Alt+Q | Global hard-stop hotkey |
| | | Show the STOP overlay window |
| COMPUTER_USE_MOVE_DURATION | | Cursor glide time (human-like movement) |
| COMPUTER_USE_IDLE_STOP | 30 | Auto stand-down after this many idle seconds (0 = never) |
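For readers wondering how optional, environment-driven settings like these are typically read, here is a minimal stdlib-only sketch. It is not the project's src/config.py (which uses python-dotenv): the helper name env_number is hypothetical, and the only default taken from this README is 30 seconds for COMPUTER_USE_IDLE_STOP.

```python
import os


def env_number(name: str, default: float, env=None) -> float:
    """Read a numeric setting, falling back to `default` if unset or invalid."""
    env = os.environ if env is None else env
    raw = env.get(name, "").strip()
    if not raw:
        return default
    try:
        return float(raw)
    except ValueError:
        # A malformed value should not stop the server from starting.
        return default


if __name__ == "__main__":
    # With no override the documented default of 30 seconds applies.
    print(env_number("COMPUTER_USE_IDLE_STOP", 30.0, env={}))
    # Setting the variable to 0 disables auto stand-down.
    print(env_number("COMPUTER_USE_IDLE_STOP", 30.0, env={"COMPUTER_USE_IDLE_STOP": "0"}))
```

Passing the mapping explicitly keeps the helper easy to test; in normal use it reads os.environ, which a .env loader would have populated beforehand.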
Known gotchas
MCP connection drops when the agent idles between turns. The stdio connection to realhands can silently die while Claude is thinking/waiting between turns. Fix: issue a screenshot action; it silently reconnects. Importantly, an action that "failed" with Connection closed often still executed on the real machine. Take a screenshot and check the actual screen state before retrying, or you may double-click / double-submit.
Click coordinates must match the screenshot's monitor. If you screenshot monitor=2 and then click without passing monitor=2, the click lands on the primary.
activate_window beats the taskbar. Windows' foreground-lock makes taskbar clicks unreliable (the icon just flashes). activate_window uses AttachThreadInput + z-order toggling + a minimize/restore fallback, so prefer it for app switching.
Don't run with Python 3.13/3.14. Tested on 3.10/3.11 only; the installer warns.
Typing long/Unicode text uses the clipboard. Your clipboard is saved and restored, but anything watching the clipboard will see the pasted text momentarily.
Self-test
    .\.venv\Scripts\python src\screen.py

Captures the screen, prints real vs. sent dimensions and the scale factor, writes
test_capture.png, and runs a coordinate round-trip check (center + both corners).