
Claude KVM is an MCP tool that controls remote desktop environments over VNC. It consists of a thin JS proxy layer (MCP server) and a platform-native Swift VNC daemon running on your macOS system.


TIP

Phantom-WG could be a great alternative for you. Isolate your VNC server within your own network while enjoying self-hosted VPN performance with the extra privacy features you gain along the way.

Live Test Runs

NOTE

Tests run transparently on GitHub Actions, and each step is visible in the CI environment. At the end of every test, whether the integration passes or fails, you'll find screenshots of each step the agent took during the session, along with an .mp4 video recording of the entire session. These recordings and screenshots show how the agent progressed through each stage, how long the task took, and what decisions it made based on the system prompt. You can use them as a reference when crafting your own system prompts or instructions for the MCP server in your own environment.

WARNING

Artifacts attached to these runs may have expired due to GitHub's artifact retention policy. Persistent copies are prepared via the Persist Artifacts workflow and can always be accessed by run ID from the artifacts/ directory on the press-kit branch.

Architecture

graph TB
    subgraph MCP["MCP Client (Claude)"]
        AI["Claude"]
    end

    subgraph Proxy["claude-kvm · MCP Proxy (stdio)"]
        direction TB
        Server["MCP Server<br/><code>index.js</code>"]
        Tools["Tool Definitions<br/><code>tools/index.js</code>"]
        Server --> Tools
    end

    subgraph Daemon["claude-kvm-daemon · Native VNC Client (stdin/stdout)"]
        direction TB
        CMD["Command Handler<br/><i>PC Dispatch</i>"]
        Scale["Display Scaling<br/><i>Scaled ↔ Native</i>"]

        subgraph Screen["Screen"]
            Capture["Frame Capture<br/><i>PNG · Crop · Diff</i>"]
            OCR["OCR Detection<br/><i>Apple Vision</i>"]
        end

        subgraph InputGroup["Input"]
            Mouse["Mouse<br/><i>Click · Drag · Move · Scroll</i>"]
            KB["Keyboard<br/><i>Tap · Combo · Type · Paste</i>"]
        end

        VNC["VNC Bridge<br/><i>LibVNCClient 0.9.15</i>"]

        CMD --> Scale
        Scale --> Capture
        Scale --> Mouse
        Scale --> KB
        Capture -.->|"framebuffer"| VNC
        Mouse -->|"pointer events"| VNC
        KB -->|"key events"| VNC
    end

    subgraph Target["Target Machine"]
        VNC_Server["VNC Server<br/><i>:5900</i>"]
        Desktop["Desktop Environment"]
        VNC_Server --> Desktop
    end

    AI <-->|"stdio<br/>JSON-RPC"| Server
    Server <-->|"stdin/stdout<br/>PC (NDJSON)"| CMD
    VNC <-->|"RFB Protocol<br/>TCP :5900"| VNC_Server

    classDef proxy fill:#1a1a2e,stroke:#16213e,color:#e5e5e5
    classDef daemon fill:#0f3460,stroke:#533483,color:#e5e5e5
    classDef target fill:#1a1a2e,stroke:#e94560,color:#e5e5e5

    class Server,Tools proxy
    class CMD,Scale,VNC,Capture,Mouse,KB daemon
    class VNC_Server,Desktop target

Layers

| Layer | Language | Role | Communication |
|---|---|---|---|
| MCP Proxy | JavaScript (Node.js) | Communicates with Claude over MCP protocol, manages daemon lifecycle | stdio JSON-RPC |
| VNC Daemon | Swift/C (Apple Silicon) | VNC connection, screen capture, mouse/keyboard input injection | stdin/stdout PC (NDJSON) |

PC (Procedure Call) Protocol

Communication between the proxy and daemon uses the PC protocol over NDJSON:

Request:      {"method":"<name>","params":{...},"id":<int|string>}
Response:     {"result":{...},"id":<int|string>}
Error:        {"error":{"code":<int>,"message":"..."},"id":<int|string>}
Notification: {"method":"<name>","params":{...}}
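
As a rough illustration of the framing above, the following sketch writes one JSON object per line and classifies incoming lines into the four message shapes. The helper names (`encodeRequest`, `classifyMessage`, `splitLines`) are hypothetical and are not taken from the project's source.

```javascript
// Minimal sketch of PC-over-NDJSON framing; helper names are hypothetical.

// Serialize a message as a single line of JSON terminated by "\n".
function encodeRequest(method, params, id) {
  const msg = { method };
  if (params !== undefined) msg.params = params;
  if (id !== undefined) msg.id = id; // omitting id yields a notification
  return JSON.stringify(msg) + "\n";
}

// Classify one decoded line according to the four shapes listed above.
function classifyMessage(line) {
  const msg = JSON.parse(line);
  if ("error" in msg) return { kind: "error", msg };
  if ("result" in msg) return { kind: "response", msg };
  if ("method" in msg) {
    return { kind: "id" in msg ? "request" : "notification", msg };
  }
  throw new Error("Unrecognized PC message");
}

// Split a buffered chunk into complete lines plus an incomplete remainder.
function splitLines(buffer) {
  const parts = buffer.split("\n");
  const rest = parts.pop(); // trailing partial line (may be "")
  return { lines: parts.filter((l) => l.trim() !== ""), rest };
}
```

Keeping the remainder from `splitLines` and prepending it to the next chunk is what lets a reader handle messages that arrive split across stdout reads.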

Coordinate Scaling

The VNC server's native resolution is scaled down to fit within --max-dimension (default: 1280px). Claude works more consistently with scaled coordinates — the daemon handles the conversion in the background:

Native:  4220 x 2568  (VNC server framebuffer)
Scaled:  1280 x 779   (what Claude sees and targets)

mouse_click(640, 400) → VNC receives ≈ (2110, 1319)
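
The conversion can be sketched as follows, assuming a single aspect-preserving factor derived from the longer side and nearest-integer rounding. Both assumptions, and the function names `scaledSize` and `toNative`, are illustrative; the daemon's actual rounding may differ.

```javascript
// Sketch of aspect-preserving scaling; rounding behavior is an assumption.

// Compute the scaled size so the longer side fits within maxDimension.
function scaledSize(nativeW, nativeH, maxDimension = 1280) {
  const factor = Math.min(1, maxDimension / Math.max(nativeW, nativeH));
  return {
    width: Math.round(nativeW * factor),
    height: Math.round(nativeH * factor),
    factor,
  };
}

// Map a point in scaled space back to native framebuffer coordinates.
function toNative(x, y, factor) {
  return { x: Math.round(x / factor), y: Math.round(y / factor) };
}

const size = scaledSize(4220, 2568);
console.log(size.width, size.height); // 1280 779
console.log(toNative(320, 200, size.factor)); // { x: 1055, y: 659 }
```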

Screen Strategy

Claude minimizes token cost with a progressive verification approach:

diff_check       →  changeDetected: true/false     ~5ms    (text only, no image)
detect_elements  →  OCR text + bounding boxes      ~50ms   (text only, no image)
cursor_crop      →  crop around cursor              ~50ms   (small image)
screenshot       →  full screen capture             ~200ms  (full image)

detect_elements uses the Apple Vision framework for on-device OCR. It returns text content with bounding-box coordinates in scaled space, which enables precise click targeting without consuming vision tokens.
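
One way to picture this cheapest-first ladder is the sketch below. It assumes a hypothetical async `send(action)` helper that performs one vnc_command action and resolves to its result; the function name `verifyStep` and the returned object shape are illustrative only.

```javascript
// Sketch of a cheapest-first verification ladder.
// `send` is a hypothetical async function: send(action) -> result object.
async function verifyStep(send, expectedText) {
  // 1. Text-only change detection against the saved baseline.
  const diff = await send("diff_check");
  if (!diff.changeDetected) return { stage: "diff_check", ok: false };

  // 2. OCR: look for the expected label without requesting an image.
  const ocr = await send("detect_elements");
  const hit = (ocr.elements || []).find((e) => e.text.includes(expectedText));
  if (hit) return { stage: "detect_elements", ok: true, element: hit };

  // 3. Text was inconclusive: escalate to the small image first. A full
  //    screenshot remains the last resort and is left to the caller.
  const crop = await send("cursor_crop");
  return { stage: "cursor_crop", ok: null, image: crop };
}
```

Each stage returns early when it can settle the question, so image-producing actions are requested only after the text-only checks are exhausted.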


Installation

Requirements

  • macOS (Apple Silicon / aarch64)

  • Node.js (LTS)

Daemon

brew tap ARAS-Workspace/tap
brew install claude-kvm-daemon

NOTE

claude-kvm-daemon is compiled and code-signed via CI (GitHub Actions). The build output is packaged in two formats: a .tar.gz archive for Homebrew distribution and a .dmg disk image for notarization. The DMG is submitted to Apple servers for notarization within the same workflow, and the process can be tracked from the CI logs. The notarized DMG is available as a CI artifact; the archived .tar.gz is also published as a release on the repository. Homebrew installation tracks this release.

MCP Configuration

Create a .mcp.json file in your project directory:

{
  "mcpServers": {
    "claude-kvm": {
      "command": "npx",
      "args": ["-y", "claude-kvm"],
      "env": {
        "VNC_HOST": "192.168.1.100",
        "VNC_PORT": "5900",
        "VNC_USERNAME": "user",
        "VNC_PASSWORD": "pass",
        "CLAUDE_KVM_DAEMON_PATH": "/opt/homebrew/bin/claude-kvm-daemon",
        "CLAUDE_KVM_DAEMON_PARAMETERS": "-v"
      }
    }
  }
}

NOTE

The tool is end-to-end tested via CI: Claude executes tasks over VNC while an independent vision model observes and verifies the results. See the Integration Test for live workflow runs, system prompts, and demo recordings.

Configuration

MCP Proxy (ENV)

| Parameter | Default | Description |
|---|---|---|
| VNC_HOST | 127.0.0.1 | VNC server address |
| VNC_PORT | 5900 | VNC port number |
| VNC_USERNAME | | Username (required for ARD) |
| VNC_PASSWORD | | Password |
| CLAUDE_KVM_DAEMON_PATH | claude-kvm-daemon | Daemon binary path (not needed if already in PATH) |
| CLAUDE_KVM_DAEMON_PARAMETERS | | Additional CLI arguments for the daemon |

Daemon Parameters (CLI)

Additional arguments passed to the daemon via CLAUDE_KVM_DAEMON_PARAMETERS:

"CLAUDE_KVM_DAEMON_PARAMETERS": "--max-dimension 800 -v"

| Parameter | Default | Description |
|---|---|---|
| --max-dimension | 1280 | Maximum display scaling dimension (px) |
| --connect-timeout | | VNC connection timeout (seconds) |
| --bits-per-sample | | Bits per pixel sample |
| --no-reconnect | | Disable automatic reconnection |
| -v, --verbose | | Verbose logging (stderr) |
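
To make the relationship between the two environment variables and the daemon process concrete, here is a sketch of how a proxy could build the command line. It assumes the parameter string is split on whitespace, which may not match the real proxy's parsing; `daemonCommand` and `launchDaemon` are hypothetical names.

```javascript
const { spawn } = require("node:child_process");

// Assumption: parameters are split on whitespace; the real proxy may differ.
function daemonCommand(env = process.env) {
  const command = env.CLAUDE_KVM_DAEMON_PATH || "claude-kvm-daemon";
  const extra = (env.CLAUDE_KVM_DAEMON_PARAMETERS || "").trim();
  return { command, args: extra === "" ? [] : extra.split(/\s+/) };
}

// Launch the daemon with piped stdin/stdout (NDJSON) and inherited stderr,
// so that verbose logs stay out of the protocol stream.
function launchDaemon(env = process.env) {
  const { command, args } = daemonCommand(env);
  return spawn(command, args, { stdio: ["pipe", "pipe", "inherit"] });
}

console.log(daemonCommand({ CLAUDE_KVM_DAEMON_PARAMETERS: "--max-dimension 800 -v" }).args);
// [ '--max-dimension', '800', '-v' ]
```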

Runtime Configuration (PC)

All timing and display parameters are configurable at runtime via the configure method. Use get_timing to inspect current values.

Set timing:

{"method":"configure","params":{"click_hold_ms":80,"key_hold_ms":50}}
{"result":{"detail":"OK — changed: click_hold_ms, key_hold_ms"}}

Change display scaling:

{"method":"configure","params":{"max_dimension":960}}
{"result":{"detail":"OK — changed: max_dimension","scaledWidth":960,"scaledHeight":584}}

Reset to defaults:

{"method":"configure","params":{"reset":true}}
{"result":{"detail":"OK — reset to defaults","timing":{"click_hold_ms":50,"combo_mod_ms":10,"cursor_crop_radius":150,"double_click_gap_ms":50,"drag_min_steps":10,"drag_pixels_per_step":20,"drag_position_ms":30,"drag_press_ms":50,"drag_settle_ms":30,"drag_step_ms":5,"hover_settle_ms":400,"key_hold_ms":30,"max_dimension":1280,"paste_settle_ms":30,"scroll_press_ms":10,"scroll_tick_ms":20,"type_inter_key_ms":20,"type_key_ms":20,"type_shift_ms":10},"scaledWidth":1280,"scaledHeight":779}}

Get current values:

{"method":"get_timing"}
{"result":{"timing":{"click_hold_ms":80,"combo_mod_ms":10,"cursor_crop_radius":150,"double_click_gap_ms":50,"drag_min_steps":10,"drag_pixels_per_step":20,"drag_position_ms":30,"drag_press_ms":50,"drag_settle_ms":30,"drag_step_ms":5,"hover_settle_ms":400,"key_hold_ms":50,"max_dimension":1280,"paste_settle_ms":30,"scroll_press_ms":10,"scroll_tick_ms":20,"type_inter_key_ms":20,"type_key_ms":20,"type_shift_ms":10},"scaledWidth":1280,"scaledHeight":779}}

| Parameter | Default | Description |
|---|---|---|
| max_dimension | 1280 | Max screenshot dimension |
| cursor_crop_radius | 150 | Cursor crop radius (px) |
| click_hold_ms | 50 | Click hold duration |
| double_click_gap_ms | 50 | Double-click gap delay |
| hover_settle_ms | 400 | Hover settle wait |
| drag_position_ms | 30 | Pre-drag position wait |
| drag_press_ms | 50 | Drag press hold threshold |
| drag_step_ms | 5 | Between interpolation pts |
| drag_settle_ms | 30 | Settle before release |
| drag_pixels_per_step | 20 | Point density per pixel |
| drag_min_steps | 10 | Min interpolation steps |
| scroll_press_ms | 10 | Scroll press-release gap |
| scroll_tick_ms | 20 | Inter-tick delay |
| key_hold_ms | 30 | Key hold duration |
| combo_mod_ms | 10 | Modifier settle delay |
| type_key_ms | 20 | Key hold during typing |
| type_inter_key_ms | 20 | Inter-character delay |
| type_shift_ms | 10 | Shift key settle |
| paste_settle_ms | 30 | Post-clipboard write wait |
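
The two drag interpolation parameters are easiest to read together. The sketch below assumes the step count is the larger of drag_min_steps and the drag distance divided by drag_pixels_per_step, with points placed by linear interpolation. That formula is an interpretation of the parameter names, not something the source confirms, and `dragPoints` is a hypothetical helper.

```javascript
// Assumption: steps = max(drag_min_steps, ceil(distance / drag_pixels_per_step)).
function dragPoints(from, to, timing = { drag_pixels_per_step: 20, drag_min_steps: 10 }) {
  const dx = to.x - from.x;
  const dy = to.y - from.y;
  const distance = Math.hypot(dx, dy);
  const steps = Math.max(
    timing.drag_min_steps,
    Math.ceil(distance / timing.drag_pixels_per_step)
  );
  const points = [];
  for (let i = 1; i <= steps; i++) {
    const t = i / steps; // linear interpolation parameter in (0, 1]
    points.push({ x: Math.round(from.x + dx * t), y: Math.round(from.y + dy * t) });
  }
  return points;
}

console.log(dragPoints({ x: 0, y: 0 }, { x: 400, y: 300 }).length); // 25
console.log(dragPoints({ x: 0, y: 0 }, { x: 30, y: 40 }).length);   // 10
```

Under this reading, short drags still emit at least drag_min_steps pointer events, which is the usual reason such a floor exists: many drag targets ignore a press followed by a single jump.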


Tools

All operations are performed through a single vnc_command tool:

Screen

| Action | Parameters | Description |
|---|---|---|
| screenshot | | Full screen PNG capture |
| cursor_crop | | Crop around cursor with crosshair overlay |
| diff_check | | Detect screen changes against baseline |
| set_baseline | | Save current screen as diff reference |

Mouse

| Action | Parameters | Description |
|---|---|---|
| mouse_click | x, y, button? | Click (left, right, or middle) |
| mouse_double_click | x, y | Double click |
| mouse_move | x, y | Move cursor |
| hover | x, y | Move + settle wait |
| nudge | dx, dy | Relative cursor movement |
| mouse_drag | x, y, toX, toY | Drag from start to end |
| scroll | x, y, direction, amount? | Scroll (up, down, left, or right) |

Keyboard

| Action | Parameters | Description |
|---|---|---|
| key_tap | key | Single key press (enter, escape, tab, space, ...) |
| key_combo | key or keys | Modifier combo ("cmd+c" or ["cmd","shift","3"]) |
| key_type | text | Type text character by character |
| paste | text | Paste text via clipboard |
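
Because key_combo accepts either a "+"-joined string or an array, a caller may want to normalize both forms before building a request. The sketch below is one possible client-side helper; `normalizeCombo` is a hypothetical name, and how the daemon itself parses these values is not shown in this document.

```javascript
// Accept either "cmd+c" or ["cmd","shift","3"] and return a list of key names.
// Limitation of this sketch: a literal "+" key cannot be written in string form.
function normalizeCombo(input) {
  const keys = Array.isArray(input) ? input : String(input).split("+");
  const cleaned = keys.map((k) => String(k).trim()).filter((k) => k !== "");
  if (cleaned.length === 0) throw new Error("key_combo needs at least one key");
  return cleaned;
}

console.log(normalizeCombo("cmd+c"));               // [ 'cmd', 'c' ]
console.log(normalizeCombo(["cmd", "shift", "3"])); // [ 'cmd', 'shift', '3' ]
```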

Detection

| Action | Parameters | Description |
|---|---|---|
| detect_elements | | OCR text detection with bounding boxes (Apple Vision) |

Returns text elements with bounding box coordinates in scaled space:

{"method":"detect_elements"}
{"result":{"detail":"13 elements","elements":[{"confidence":1,"h":9,"text":"Finder","w":32,"x":37,"y":6},{"confidence":1,"h":9,"text":"File","w":15,"x":84,"y":6},{"confidence":1,"h":9,"text":"Edit","w":19,"x":112,"y":6},{"confidence":1,"h":9,"text":"View","w":22,"x":143,"y":6},{"confidence":1,"h":11,"text":"Go","w":15,"x":179,"y":6},{"confidence":1,"h":9,"text":"Window","w":35,"x":207,"y":6},{"confidence":1,"h":11,"text":"Help","w":22,"x":255,"y":6},{"confidence":1,"h":11,"text":"8•","w":26,"x":1161,"y":6},{"confidence":1,"h":9,"text":"Fri Feb 20 22:19","w":80,"x":1189,"y":6},{"confidence":1,"h":9,"text":"Assets","w":32,"x":1202,"y":97},{"confidence":1,"h":9,"text":"Passwords.kdbx","w":74,"x":1181,"y":168},{"confidence":1,"h":93,"text":"PHANTOM","w":633,"x":322,"y":477},{"confidence":1,"h":32,"text":"YOUR SERVER, YOUR NETWORK, YOUR PRIVACY","w":629,"x":325,"y":568}],"scaledHeight":717,"scaledWidth":1280}}
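
A result like the one above can be turned into a click target by taking the center of a matching box. This sketch assumes x and y are the top-left corner of the box in scaled space, which is consistent with the menu bar items sitting at y = 6 but is not stated explicitly; `clickTarget` is a hypothetical helper.

```javascript
// Find the first OCR element whose text matches and return its center point.
// Assumption: (x, y) is the top-left corner of the bounding box.
function clickTarget(elements, text) {
  const el = elements.find((e) => e.text === text);
  if (!el) return null;
  return { x: Math.round(el.x + el.w / 2), y: Math.round(el.y + el.h / 2) };
}

const elements = [
  { confidence: 1, h: 9, text: "Finder", w: 32, x: 37, y: 6 },
  { confidence: 1, h: 9, text: "File", w: 15, x: 84, y: 6 },
];
console.log(clickTarget(elements, "File")); // { x: 92, y: 11 }
```

The resulting point is already in scaled space, so it can be passed to mouse_click as is.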

Configuration

| Action | Parameters | Description |
|---|---|---|
| configure | {<params>} | Set timing/display params at runtime |
| configure | {reset: true} | Reset all params to defaults |
| get_timing | | Get current timing + display params |

Control

| Action | Parameters | Description |
|---|---|---|
| wait | ms? | Wait (default 500ms) |
| health | | Connection status + display info |
| shutdown | | Graceful daemon shutdown |


Authentication

Supported VNC authentication methods:

  • VNC Auth — password-based challenge-response (DES)

  • ARD — Apple Remote Desktop (Diffie-Hellman + AES-128-ECB)

macOS is auto-detected via the ARD auth type 30 credential request. When detected, Meta keys are remapped to Super (Command key compatibility).
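
The Meta-to-Super remap can be illustrated with the standard X11 keysym values that RFB key events carry. The sketch below only shows the idea; `remapKeysym` and the `isMacServer` flag are hypothetical, and the daemon's actual implementation is not described here.

```javascript
// Standard X11 keysym values for the left/right Meta and Super keys.
const KEYSYM = {
  Meta_L: 0xffe7,
  Meta_R: 0xffe8,
  Super_L: 0xffeb,
  Super_R: 0xffec,
};

// Sketch: when the server was detected as macOS, send Super instead of Meta.
function remapKeysym(keysym, isMacServer) {
  if (!isMacServer) return keysym;
  if (keysym === KEYSYM.Meta_L) return KEYSYM.Super_L;
  if (keysym === KEYSYM.Meta_R) return KEYSYM.Super_R;
  return keysym; // all other keys pass through unchanged
}
```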


NOTE

Running on a bare-metal Mac? See the Mac M1 Preparation Tricks for VNC hardening, SSH tunneling, and session stability tips.


"Claude" is a trademark of Anthropic, PBC. This project is not affiliated with or endorsed by Anthropic.

Copyright (c) 2026 Riza Emre ARAS — MIT License
