by volkan-m

VNC MCP Server

An advanced Model Context Protocol (MCP) server that lets AI agents see, reason about, and control computers via the VNC (RFB) protocol. It turns a standard VNC connection into a high-level "Vision-Action" interface for AI agents.

📊 System Architecture

graph TD
    subgraph "AI Environment"
        A[AI Agent / Claude Desktop] -->|MCP Protocol| B[VNC MCP Server]
    end

    subgraph "Bridge Layer"
        B -->|Image Processing| C[Sharp / Tesseract.js]
        B -->|RFB Protocol| D[VNC Connection]
    end

    subgraph "Target System"
        D -->|Input Simulation| E[Windows / macOS / Linux]
        E -->|Framebuffer| D
    end

    style B fill:#f00,stroke:#333,stroke-width:4px


🌟 Key Features

👁️ Advanced Vision

  • Regional Screenshot: Precise screen capture of specific areas to save tokens.

  • OCR (Optical Character Recognition): Extract text from any part of the screen using Tesseract.js.

  • Visual Search: Find specific icons, buttons, or images on the screen using template matching.

  • Pixel Analysis: Get exact Hex/RGB colors of any pixel for state verification.

  • Change Detection: Wait for a specific region to update before proceeding.
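Change detection can be pictured as comparing the same region across two captured frames. The sketch below is an illustration only, not the server's implementation: the `Region` type, the RGBA frame layout, and the `capture` callback are all assumptions made for the example.

```typescript
// Hypothetical types for illustration; the server's internal types are not shown in this README.
interface Region {
  x: number;
  y: number;
  width: number;
  height: number;
}

/**
 * Returns true if any byte inside `region` differs between two RGBA frames
 * of identical size. `frameWidth` is the width of the full frame in pixels.
 */
function regionChanged(
  before: Uint8Array,
  after: Uint8Array,
  frameWidth: number,
  region: Region,
): boolean {
  for (let row = region.y; row < region.y + region.height; row++) {
    const start = (row * frameWidth + region.x) * 4; // 4 bytes per RGBA pixel
    const end = start + region.width * 4;
    for (let i = start; i < end; i++) {
      if (before[i] !== after[i]) return true;
    }
  }
  return false;
}

/** Polls `capture` until the region differs from the first frame or the timeout elapses. */
async function waitForChange(
  capture: () => Promise<Uint8Array>,
  frameWidth: number,
  region: Region,
  timeoutMs: number,
  intervalMs = 100,
): Promise<boolean> {
  const baseline = await capture();
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
    if (regionChanged(baseline, await capture(), frameWidth, region)) return true;
  }
  return false;
}
```

Restricting the comparison to a region keeps the check cheap and avoids false positives from unrelated screen activity such as a blinking clock.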

🕹️ Human-Like Control

  • Precise Input: Single clicks, double clicks, and smooth mouse movement.

  • Drag & Drop: Simulate complex dragging operations between coordinates.

  • Scrolling: Directional mouse wheel support (Up/Down/Left/Right).

  • Keyboard Mastery: Type strings and perform complex key combinations (e.g., Ctrl+C, Cmd+Space).
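A key combination such as Ctrl+C ultimately becomes a series of RFB key events: modifiers go down, the key goes down and up, and the modifiers are released in reverse order. The sketch below shows that expansion. The function names are hypothetical, and mapping Cmd to the `Super_L` keysym is an assumption, since the right mapping depends on the VNC server in use.

```typescript
// X11 keysym values, which RFB KeyEvent messages use to identify keys.
const NAMED_KEYSYMS = new Map<string, number>([
  ["ctrl", 0xffe3], // Control_L
  ["shift", 0xffe1], // Shift_L
  ["alt", 0xffe9], // Alt_L
  ["cmd", 0xffeb], // Super_L (assumption: the Cmd mapping depends on the VNC server)
  ["enter", 0xff0d], // Return
  ["tab", 0xff09], // Tab
  ["space", 0x0020], // space
]);
const MODIFIER_NAMES = new Set(["ctrl", "shift", "alt", "cmd"]);

interface KeyEvent {
  keysym: number;
  down: boolean;
}

function toKeysym(part: string): number {
  const name = part.toLowerCase();
  const named = NAMED_KEYSYMS.get(name);
  if (named !== undefined) return named;
  // Printable ASCII characters use their code point as the keysym.
  if (name.length === 1) {
    const code = name.charCodeAt(0);
    if (code >= 0x20 && code <= 0x7e) return code;
  }
  throw new Error(`Unsupported key: "${part}"`);
}

/** Expands a combo like "Ctrl+C" into press events followed by releases in reverse order. */
function comboToKeyEvents(combo: string): KeyEvent[] {
  const parts = combo
    .split("+")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
  if (parts.length === 0) throw new Error("Empty key combination");
  // Everything before the final key must be a modifier.
  for (const part of parts.slice(0, -1)) {
    if (!MODIFIER_NAMES.has(part.toLowerCase())) throw new Error(`Not a modifier: "${part}"`);
  }
  const keysyms = parts.map((part) => toKeysym(part));
  const presses = keysyms.map((keysym) => ({ keysym, down: true }));
  const releases = [...keysyms].reverse().map((keysym) => ({ keysym, down: false }));
  return [...presses, ...releases];
}
```

Letters are lowercased before lookup so that Ctrl+C sends `c` with Control held, rather than an implicit Shift+C.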

⚡ Hybrid Control (VNC + SSH Shell)

  • Background Execution: Run any shell command (bash/cmd/powershell) on the server.

  • 🔗 Remote Execution via SSH: Run commands directly on the VNC host via integrated SSH client.

  • 🛡️ Security Layer: Command filtering with platform-specific allowlists and denylists.

  • Cross-Platform Support: Works on Windows, Linux, and macOS.

  • Seamless Integration: Use the terminal to install software, then use VNC to control apps.

🧠 Intelligence & Health

  • OS Detection & Platform Awareness: Heuristic OS detection (Windows/macOS/Linux) based on screen resolution patterns.

  • Connection Health: Real-time connectivity monitoring with latency measurement and heartbeat status.

  • Clipboard Sync: Full bidirectional clipboard support.

  • Auto-Connect: Configure the server to connect immediately via environment variables.

  • Advanced Sequencing: Execute multiple actions in a single call for faster automation.

🚀 Getting Started

Prerequisites

  • Node.js v18+

  • A running VNC server on the target system.

  • (Optional) SSH service enabled on the target system for remote command execution.

Quick Start

  1. Clone & Install:

    git clone https://github.com/volkan-m/vnc-mcp-server.git
    cd vnc-mcp-server
    npm install
    npm run build
  2. Test Environment (Docker with Chrome & SSH): Run a full Linux desktop with Google Chrome, VNC, and SSH enabled:

    docker run -d \
      -p 5901:5900 \
      -p 2222:22 \
      -v /dev/shm:/dev/shm \
      --name mcp-vnc-chrome \
      -e VNC_PASSWORD=vncpass \
      dorowu/ubuntu-desktop-lxde-vnc:latest

    Connect to VNC at localhost:5901 and SSH at localhost:2222 (mapped to port 22 inside the container).

  3. MCP Client Configuration:

Google Antigravity / Claude Desktop

Add this to your claude_desktop_config.json:

{
  "mcpServers": {
    "vnc": {
      "command": "node",
      "args": ["/absolute/path/to/vnc-mcp-server/dist/index.js"],
      "env": {
        "VNC_HOST": "localhost",
        "VNC_PORT": "5901",
        "VNC_PASSWORD": "vncpass"
      }
    }
  }
}
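On the server side, the `env` block above could be read roughly as follows. This is a minimal sketch, not the project's actual startup code: `readVncConfig` is a hypothetical helper, and falling back to port 5900 (the conventional VNC port) when `VNC_PORT` is unset is an assumption.

```typescript
interface VncEnvConfig {
  host: string;
  port: number;
  password?: string;
}

/**
 * Builds a connection config from environment variables.
 * Returns null when VNC_HOST is unset, meaning no auto-connect is attempted.
 */
function readVncConfig(env: Record<string, string | undefined>): VncEnvConfig | null {
  const host = env["VNC_HOST"];
  if (!host) return null;

  // Assumption: default to the conventional VNC port when VNC_PORT is missing.
  const port = Number(env["VNC_PORT"] ?? "5900");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid VNC_PORT: "${env["VNC_PORT"]}"`);
  }

  const config: VncEnvConfig = { host, port };
  const password = env["VNC_PASSWORD"];
  if (password !== undefined) config.password = password;
  return config;
}
```

In a Node.js process this would typically be called as `readVncConfig(process.env)`. Note that environment values always arrive as strings, which is why `"5901"` is quoted in the JSON configuration.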

🛡️ Security Configuration (Optional)

You can restrict which system commands the AI can execute:

| Variable              | Description                    | Default Denied                     |
|-----------------------|--------------------------------|------------------------------------|
| LINUX_DENIED_COMMANDS | Blocked Linux commands         | rm -rf,rm -f,shred,mkfs,:(){ : …   |
| WIN_DENIED_COMMANDS   | Blocked Windows commands       | del,format,rd,sfc,attrib           |
| MAC_DENIED_COMMANDS   | Blocked macOS commands         | rm -rf,rm -f                       |
| [OS]_ALLOWED_COMMANDS | If set, ONLY these are allowed | (Empty by default)                 |
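To make the interaction between the denylist and allowlist variables concrete, here is one way such a filter could work. It is a sketch under stated assumptions, not the server's matching logic: the README does not say how entries are matched, so the word-boundary substring match for denied entries, the first-token match for allowed entries, and the deny-before-allow ordering are all choices made for this example.

```typescript
/** Splits a comma-separated variable such as LINUX_DENIED_COMMANDS into entries. */
function parseCommandList(value: string | undefined): string[] {
  return (value ?? "")
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

/** Collapses runs of whitespace so "rm   -rf" and "rm -rf" compare equal. */
function normalizeCommand(command: string): string {
  return command.trim().replace(/\s+/g, " ");
}

/**
 * Assumed policy: a command is rejected if it contains any denied entry as a
 * whole-word sequence; otherwise, if an allowlist is set, the first token
 * (the executable) must appear in it.
 */
function isCommandPermitted(command: string, denied: string[], allowed: string[]): boolean {
  const normalized = normalizeCommand(command);
  const padded = ` ${normalized} `;
  if (denied.some((entry) => padded.includes(` ${normalizeCommand(entry)} `))) return false;
  if (allowed.length === 0) return true;
  const executable = normalized.split(" ")[0];
  return allowed.includes(executable);
}
```

String filters like this are easy to bypass with quoting, aliases, or shell expansion, so they are best treated as a guard against accidents rather than a sandbox.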

🔗 Remote Execution via SSH

The server can execute commands directly on the target machine via SSH. When calling vnc_connect, pass an optional ssh object:

{
  "host": "192.168.1.50",
  "ssh": {
    "user": "root",
    "password": "your-password",
    "port": 22
  }
}

If SSH is connected, vnc_system_command automatically routes all commands to the target machine instead of the local server.
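The routing rule described above amounts to preferring the remote executor whenever one exists. A minimal sketch, assuming two hypothetical executor callbacks (one that runs a command locally, one that runs it over the SSH session):

```typescript
// Hypothetical executor signature: takes a shell command, resolves with its output.
type Executor = (command: string) => Promise<string>;

interface Executors {
  local: Executor;
  // Present only while an SSH session is connected (assumption for this sketch).
  remote?: Executor;
}

/** Chooses where a system command should run: remote if SSH is connected, else local. */
function pickExecutor(executors: Executors): { target: "local" | "remote"; run: Executor } {
  if (executors.remote) return { target: "remote", run: executors.remote };
  return { target: "local", run: executors.local };
}
```

Returning the chosen target alongside the executor makes it easy to report to the agent which machine actually ran the command.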

📊 Enhanced Status Monitoring

Screen Info

vnc_get_screen_info returns detailed information about the remote session:

  • Resolution: Current display dimensions

  • OS Hint: Intelligent OS detection (Windows/macOS/Linux based on resolution patterns)

  • SSH Status: Whether SSH connection is active

Connection Health

vnc_get_connection_health provides comprehensive connectivity metrics:

  • VNC Status: Connected/Disconnected state

  • Latency: Real-time latency measurement from screen refresh operations

  • Resolution: Current display dimensions

  • SSH Status: SSH connection state
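Measuring latency from screen refresh operations can be done by timing each refresh and keeping a short rolling window of samples. The following is an illustrative sketch only: `LatencyTracker` and `measure` are hypothetical names, and the window size of 10 is arbitrary.

```typescript
/** Keeps the most recent latency samples and reports their average. */
class LatencyTracker {
  private readonly samples: number[] = [];
  private readonly maxSamples: number;

  constructor(maxSamples = 10) {
    this.maxSamples = maxSamples;
  }

  record(ms: number): void {
    this.samples.push(ms);
    // Drop the oldest sample once the window is full.
    if (this.samples.length > this.maxSamples) this.samples.shift();
  }

  /** Average of the retained samples, or null if nothing has been measured yet. */
  averageMs(): number | null {
    if (this.samples.length === 0) return null;
    const total = this.samples.reduce((sum, ms) => sum + ms, 0);
    return total / this.samples.length;
  }
}

/** Runs an async operation (for example a screen refresh) and records how long it took. */
async function measure<T>(operation: () => Promise<T>, tracker: LatencyTracker): Promise<T> {
  const startedAt = Date.now();
  const result = await operation();
  tracker.record(Date.now() - startedAt);
  return result;
}
```

Averaging over a window smooths out single slow frames, while the bounded size keeps the figure responsive to recent network conditions.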

🎯 Advanced Vision Features

Screenshot with Coordinate Overlay

Use the overlay parameter in vnc_screenshot to add a grid with labeled coordinates (A1, B2, etc.) for easier navigation:

{
  "format": "jpeg",
  "quality": 85,
  "overlay": true
}

This helps AI agents understand and communicate screen positions more accurately.
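Once an agent reports a grid label, it has to be turned back into pixel coordinates for a click. The README does not document the grid geometry, so the sketch below assumes it: letters index columns, numbers index rows starting at 1, cells are square, and the default cell size of 100 pixels is purely illustrative.

```typescript
/**
 * Converts a grid label such as "B2" into the pixel at the center of that cell.
 * Assumptions: letter = column (A is the leftmost), number = row (1 is the top),
 * square cells of `cellSize` pixels.
 */
function gridLabelToPoint(label: string, cellSize = 100): { x: number; y: number } {
  const text = label.trim().toUpperCase();
  const column = text.charCodeAt(0) - "A".charCodeAt(0);
  const rowText = text.slice(1);
  if (!(column >= 0 && column < 26) || !/^[1-9]\d*$/.test(rowText)) {
    throw new Error(`Invalid grid label: "${label}"`);
  }
  const row = Number(rowText) - 1;
  return {
    x: column * cellSize + cellSize / 2,
    y: row * cellSize + cellSize / 2,
  };
}
```

Under these assumptions, `gridLabelToPoint("B2")` yields `{ x: 150, y: 150 }`, the center of the second column and second row.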

OCR with Multiple Languages

The vnc_ocr_region tool supports multiple languages via Tesseract.js. Specify the language code:

{
  "x": 0,
  "y": 0,
  "width": 1920,
  "height": 1080,
  "lang": "eng+deu+fra"
}

Supported languages: eng (English), deu (German), fra (French), spa (Spanish), and many more.
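Before handing a request like the one above to an OCR engine, it can help to split the `lang` string and clip the region to the visible screen. The helpers below are a hypothetical pre-processing sketch, not part of the server's documented behavior. The language pattern is only a loose shape check (three lowercase letters with an optional suffix such as `chi_sim`), not the authoritative list of Tesseract language codes.

```typescript
interface Region {
  x: number;
  y: number;
  width: number;
  height: number;
}

/** Splits "eng+deu+fra" into individual codes and rejects obviously malformed ones. */
function parseOcrLanguages(lang: string): string[] {
  const codes = lang
    .split("+")
    .map((code) => code.trim())
    .filter((code) => code.length > 0);
  if (codes.length === 0) throw new Error("At least one language code is required");
  for (const code of codes) {
    if (!/^[a-z]{3}(_[a-z]+)?$/.test(code)) {
      throw new Error(`Unexpected language code: "${code}"`);
    }
  }
  return codes;
}

/** Intersects a region with the screen; returns null if nothing remains on screen. */
function clampRegion(region: Region, screenWidth: number, screenHeight: number): Region | null {
  const left = Math.max(0, region.x);
  const top = Math.max(0, region.y);
  const right = Math.min(screenWidth, region.x + region.width);
  const bottom = Math.min(screenHeight, region.y + region.height);
  if (right <= left || bottom <= top) return null;
  return { x: left, y: top, width: right - left, height: bottom - top };
}
```

Loading several languages at once generally makes recognition slower, so requesting only the languages expected on screen and the smallest useful region is a reasonable default.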

Batch Actions with vnc_execute_sequence

Execute multiple input actions in a single call for faster automation:

{
  "actions": [
    {"type": "move", "x": 100, "y": 100},
    {"type": "click", "x": 100, "y": 100},
    {"type": "wait", "ms": 500},
    {"type": "type", "text": "Hello World"},
    {"type": "key", "key": "Enter"}
  ]
}

This approach reduces RPC overhead and improves responsiveness for complex automation sequences.
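The five action types in the example map naturally onto a discriminated union that is executed in order. The sketch below illustrates the idea. The `InputDriver` interface and `runSequence` function are hypothetical stand-ins for whatever the server uses internally to send input over the VNC connection.

```typescript
// Action shapes taken from the JSON example above.
type Action =
  | { type: "move"; x: number; y: number }
  | { type: "click"; x: number; y: number }
  | { type: "wait"; ms: number }
  | { type: "type"; text: string }
  | { type: "key"; key: string };

// Hypothetical driver interface; a real implementation would write to the VNC connection.
interface InputDriver {
  move(x: number, y: number): Promise<void>;
  click(x: number, y: number): Promise<void>;
  typeText(text: string): Promise<void>;
  tapKey(key: string): Promise<void>;
  sleep(ms: number): Promise<void>;
}

/** Executes actions strictly in order and resolves with the number completed. */
async function runSequence(actions: Action[], driver: InputDriver): Promise<number> {
  let completed = 0;
  for (const action of actions) {
    switch (action.type) {
      case "move":
        await driver.move(action.x, action.y);
        break;
      case "click":
        await driver.click(action.x, action.y);
        break;
      case "wait":
        await driver.sleep(action.ms);
        break;
      case "type":
        await driver.typeText(action.text);
        break;
      case "key":
        await driver.tapKey(action.key);
        break;
    }
    completed++;
  }
  return completed;
}
```

Awaiting each step before starting the next preserves ordering, which matters when a click must land before the text that follows it is typed.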

🧰 API Reference (Tools)

| Tool                      | Category | Description                                                                           |
|---------------------------|----------|---------------------------------------------------------------------------------------|
| vnc_connect               | Setup    | Connect to VNC (optional: include ssh configuration).                                 |
| vnc_disconnect            | Setup    | Disconnect from VNC server.                                                           |
| vnc_get_screen_info       | Status   | Returns resolution, OS hint (Windows/macOS/Linux), and SSH status.                    |
| vnc_get_connection_health | Status   | Provides heartbeat status, latency measurement, resolution, and SSH connection state. |
| vnc_screenshot            | Vision   | Captures PNG/JPEG of full screen or region with optional coordinate overlay.          |
| vnc_ocr_region            | Vision   | Extracts text from a specific region with OCR (supports multiple languages).          |
| vnc_find_image            | Vision   | Finds a template image on screen with configurable tolerance.                         |
| vnc_wait_for_change       | Vision   | Blocks until a screen region updates within timeout.                                  |
| vnc_get_pixel_color       | Vision   | Returns Hex/RGB color value of a specific pixel.                                      |
| vnc_mouse_move            | Input    | Moves mouse to coordinates.                                                           |
| vnc_mouse_click           | Input    | Performs mouse clicks (Left/Middle/Right buttons).                                    |
| vnc_mouse_double_click    | Input    | Performs a quick double click.                                                        |
| vnc_mouse_drag            | Input    | Drags from point A to point B with button control.                                    |
| vnc_mouse_scroll          | Input    | Directional wheel scrolling (Up/Down/Left/Right).                                     |
| vnc_key_tap               | Input    | Taps keys with modifiers (e.g., C-v, M-Tab, S-Enter).                                 |
| vnc_type_string           | Input    | Types a sequence of characters.                                                       |
| vnc_execute_sequence      | Input    | Executes multiple input actions in a single call for performance.                     |
| vnc_system_command        | System   | Execute shell commands (Local or Remote via SSH).                                     |
| vnc_launch_gui_app        | System   | Launches GUI applications on the remote desktop.                                      |
| vnc_set_clipboard         | System   | Sets remote system clipboard text.                                                    |
| vnc_get_clipboard         | System   | Retrieves last known remote clipboard content.                                        |
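The "configurable tolerance" mentioned for vnc_find_image can be understood through a simple brute-force template match. The sketch below is a generic illustration on grayscale images, not the algorithm the server uses: the `GrayImage` type, the per-pixel absolute-difference criterion, and the first-match-wins scan order are all assumptions.

```typescript
// Hypothetical grayscale image: one byte per pixel, stored row by row.
interface GrayImage {
  width: number;
  height: number;
  data: Uint8Array;
}

/** True if every template pixel is within `tolerance` of the screen pixel beneath it. */
function matchesAt(
  screen: GrayImage,
  template: GrayImage,
  x: number,
  y: number,
  tolerance: number,
): boolean {
  for (let ty = 0; ty < template.height; ty++) {
    for (let tx = 0; tx < template.width; tx++) {
      const screenValue = screen.data[(y + ty) * screen.width + (x + tx)];
      const templateValue = template.data[ty * template.width + tx];
      if (Math.abs(screenValue - templateValue) > tolerance) return false;
    }
  }
  return true;
}

/** Scans left to right, top to bottom, and returns the first matching top-left corner. */
function findTemplate(
  screen: GrayImage,
  template: GrayImage,
  tolerance: number,
): { x: number; y: number } | null {
  for (let y = 0; y + template.height <= screen.height; y++) {
    for (let x = 0; x + template.width <= screen.width; x++) {
      if (matchesAt(screen, template, x, y, tolerance)) return { x, y };
    }
  }
  return null;
}
```

A tolerance of 0 demands an exact match, while larger values absorb compression artifacts such as those introduced by lossy VNC encodings or JPEG screenshots, at the cost of more false positives.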

🛡 License

MIT License. See LICENSE for details.
