Which integrations are available for this server?

Integrates with the GNOME desktop environment to facilitate automated UI interactions and screen state monitoring using native tools and protocols. Provides programmatic access to Linux desktops, allowing AI assistants to perform visual tasks such as clicking, typing, and capturing screen content. Enables AI models to control Ubuntu desktop environments through screenshot analysis, UI element detection via AT-SPI, and simulated mouse and keyboard input.

How do I use Ubuntu Desktop Control MCP?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Ubuntu Desktop Control MCP Open the terminal and run system updates".

That's it! The server will respond to your query, and you can continue using it as needed.

Ubuntu Desktop Control MCP Server

An MCP (Model Context Protocol) server that lets LLMs control your Ubuntu desktop by taking screenshots and sending mouse and keyboard input. This allows AI assistants to interact visually with your desktop applications.

⚡ NEW: Optimized Production Workflow

5x faster, 5x more accurate! Now using the same optimization techniques as Anthropic's Computer Use API:

📸 Smart Screenshots: Auto-downsampled to 1280x720 (5x smaller)
🎯 Numbered Elements: See what's clickable at a glance with overlaid IDs
🤖 AT-SPI Integration: Automatic UI element detection using accessibility API
📐 Percentage Coords: Resolution-agnostic positioning (no more pixel hunting!)
⚡ Workflow Batching: Execute multiple actions in one MCP call
🎪 Element Cache: Direct element interaction - "click element #5"
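To make the downsampling and percentage-coordinate ideas concrete, here is a minimal sketch of the arithmetic involved. It is not the server's code: the function names `fit_within` and `percent_to_pixels` are hypothetical, and the 0-100 percentage range is an assumption about how coordinates are expressed.

```python
def fit_within(width: int, height: int, max_w: int = 1280, max_h: int = 720) -> tuple[int, int]:
    """Scale (width, height) down to fit inside max_w x max_h, keeping aspect ratio."""
    scale = min(max_w / width, max_h / height, 1.0)  # 1.0 cap: never upscale
    return round(width * scale), round(height * scale)


def percent_to_pixels(x_pct: float, y_pct: float, width: int, height: int) -> tuple[int, int]:
    """Map percentage coordinates (assumed 0-100) to a pixel on a width x height screen."""
    if not (0 <= x_pct <= 100 and 0 <= y_pct <= 100):
        raise ValueError("percentages must be between 0 and 100")
    # Clamp so that 100% maps to the last valid pixel, not one past it.
    x = min(int(width * x_pct / 100), width - 1)
    y = min(int(height * y_pct / 100), height - 1)
    return x, y
```

Because the percentage is resolved against whatever screen size is passed in, the same `(50, 50)` request targets the center of both a 1920x1080 desktop and its 1280x720 screenshot.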

Example - Old way (8+ calls, ~15s):

take_screenshot() → analyze → grid overlay → zoom quadrant → find pixel → click → miss

Example - New way (1 call, ~3s):

take_screenshot() → "I see Pinta is element #5" → click_screen(element_id=5) → ✓
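The "click element #5" step relies on remembering where each numbered element was when the screenshot was taken. The sketch below illustrates one way such a cache could work. The `Element` and `ElementCache` names and their fields are hypothetical and are not taken from the server's source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A detected UI element with its bounding box in screen pixels (hypothetical shape)."""
    element_id: int
    label: str
    x: int
    y: int
    width: int
    height: int

    def center(self) -> tuple[int, int]:
        # Click in the middle of the bounding box.
        return self.x + self.width // 2, self.y + self.height // 2


class ElementCache:
    """Keeps the elements from the most recent screenshot, keyed by their overlay ID."""

    def __init__(self) -> None:
        self._elements: dict[int, Element] = {}

    def update(self, elements: list[Element]) -> None:
        # Replace, rather than merge: IDs from an older screenshot are stale.
        self._elements = {e.element_id: e for e in elements}

    def click_point(self, element_id: int) -> tuple[int, int]:
        try:
            return self._elements[element_id].center()
        except KeyError:
            raise KeyError(f"element #{element_id} not in cache; take a new screenshot") from None
```

Replacing the whole cache on each screenshot is a deliberate choice in this sketch: an ID that pointed at a button before a window moved should fail loudly instead of clicking the old location.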

See README.md for full details.

Features

📸 Screenshot Capture: Annotated screenshots with automatic element detection
🔢 Element Detection: AT-SPI + CV fallback for robust UI element identification
🖱️ Smart Clicking: Click by element ID or percentage coordinates
⌨️ Keyboard Control: Type text and press keys/hotkeys
🎯 Mouse Movement: Smooth cursor positioning with animation
🚀 Workflow Batching: Execute multi-step tasks in single MCP call
📊 Diagnostics: Display scaling detection, warnings, and recommendations

Quick Start

1. Prerequisites

Ubuntu Linux (X11 required, Wayland not fully supported)
Python 3.9+

2. Installation

From PyPI (Recommended)

pip install ubuntu-desktop-control

From Source

# Clone repository
git clone https://github.com/charettep/ubuntu-desktop-control-mcp.git
cd ubuntu-desktop-control-mcp

# Install system dependencies (requires sudo)
chmod +x scripts/install.sh
./scripts/install.sh

# Install Python dependencies
pip install -e .

Configuration

Claude Code

Method 1: CLI (Recommended)

claude mcp add --transport stdio ubuntu-desktop-control -- \
  ubuntu-desktop-control

Method 2: Manual Config

Edit ~/.claude/claude_desktop_config.json:

{
  "mcpServers": {
    "ubuntu-desktop-control": {
      "command": "ubuntu-desktop-control",
      "args": []
    }
  }
}

VS Code Insiders

Method 1: MCP Command

Open Command Palette (Ctrl+Shift+P)
Run MCP: Open Workspace Folder Configuration
Add the server configuration below.

Method 2: Manual Config

Create .vscode/mcp.json in your workspace:

{
  "servers": {
    "ubuntu-desktop-control": {
      "type": "stdio",
      "command": "ubuntu-desktop-control",
      "args": []
    }
  }
}

Codex CLI

Method 1: CLI

codex mcp add ubuntu-desktop-control -- \
  ubuntu-desktop-control

Method 2: Manual Config

Edit ~/.config/codex/config.toml:

[mcp_servers.ubuntu-desktop-control]
type = "stdio"
command = "ubuntu-desktop-control"
args = []

Tools

Core Capabilities

Tool	Description
`take_screenshot`	Capture the desktop (optionally per-monitor) with annotated elements.
`click_screen`	Click by element ID or percentage coordinates (supports per-monitor).
`move_mouse`	Move the cursor by element ID or percentage coordinates (supports per-monitor).
`drag_mouse`	Drag the cursor to coordinates while holding a mouse button.
`type_text`	Type text using the keyboard.
`press_key`	Press a specific key (e.g., 'enter', 'esc').
`press_hotkey`	Press a combination of keys simultaneously (e.g., Ctrl+Shift+C).
`get_screen_info`	Get screen dimensions and display server type (X11/Wayland).
`get_display_diagnostics`	Troubleshoot scaling and coordinate mismatches.
`map_GUI_elements_location`	Detect and map UI elements (hitboxes) using Computer Vision.
`convert_screenshot_coordinates`	Convert pixels from a screenshot to logical click coordinates.
`list_prompt_templates`	List available prompt templates (for clients without native prompt support).
`execute_workflow`	Execute a batch of actions (screenshot/click/move/type/wait).
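The idea behind `execute_workflow` is that several actions travel in one request and run in order. The real argument schema is not documented here, so the sketch below assumes a hypothetical shape: each step is a dict with an `"action"` key plus keyword parameters, and execution stops at the first failure. The `run_workflow` function and the handler names are illustrative only.

```python
import time
from typing import Any, Callable


def run_workflow(
    actions: list[dict[str, Any]],
    handlers: dict[str, Callable[..., Any]],
) -> list[dict[str, Any]]:
    """Run actions in order and return one result record per executed step."""
    results: list[dict[str, Any]] = []
    for index, action in enumerate(actions):
        params = dict(action)          # copy so the caller's dict is not mutated
        kind = params.pop("action")
        if kind == "wait":
            time.sleep(params.get("seconds", 0))
            results.append({"index": index, "action": kind, "ok": True})
            continue
        handler = handlers.get(kind)
        if handler is None:
            results.append({"index": index, "action": kind, "ok": False, "error": "unknown action"})
            break                      # later steps likely depend on this one
        try:
            output = handler(**params)
        except Exception as exc:
            results.append({"index": index, "action": kind, "ok": False, "error": str(exc)})
            break
        results.append({"index": index, "action": kind, "ok": True, "output": output})
    return results
```

Stopping at the first failed step is a design choice of this sketch: in a UI sequence, typing into a dialog that never opened is usually worse than reporting the failure and letting the model take a fresh screenshot.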

Prompt Rendering Tools

These tools allow clients without native prompt support (like Codex CLI) to render prompt templates as text.

Tool	Description
`render_prompt_baseline_display_check`	Render the baseline display check prompt.
`render_prompt_capture_full_desktop`	Render the full desktop capture prompt.
`render_prompt_capture_region_for_task`	Render the region capture prompt.
`render_prompt_convert_screenshot_coordinates`	Render the coordinate conversion prompt.
`render_prompt_safe_click`	Render the safe click prompt.
`render_prompt_hover_and_capture`	Render the hover and capture prompt.
`render_prompt_coordinate_mismatch_recovery`	Render the mismatch recovery prompt.
`render_prompt_end_to_end_capture_and_act`	Render the end-to-end workflow prompt.

Prompts

Prompt	Description
`baseline_display_check`	Check display settings and scaling before starting tasks.
`capture_full_desktop`	Capture and summarize the full desktop state.
`capture_region_for_task`	Capture a specific region for detailed inspection.
`safe_click`	Perform a click with safety checks and scaling awareness.
`hover_and_capture`	Hover to reveal UI elements, then capture.
`coordinate_mismatch_recovery`	Diagnose and fix missed clicks.
`end_to_end_capture_and_act`	Plan and execute a full interaction loop.

Configuration & Customization

Environment Variables

The server relies on standard Linux/X11 environment variables to locate and interact with the desktop session.

Variable	Description	Default
`DISPLAY`	X11 display identifier. Required for the server to know which screen to control.	`:0`
`XDG_SESSION_TYPE`	Used to detect if running on X11 or Wayland.	`unknown`
`XAUTHORITY`	Path to X11 authority file. Required if running from a different user context (e.g., sudo, docker) or over SSH.	`~/.Xauthority`
`UDC_FORCE_COORDS`	Force coordinate clicks (disable AT-SPI action clicks).	unset
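As an illustration of how the variables and defaults in the table fit together, the following sketch reads them from an environment mapping. The function name `read_display_env` is hypothetical, and treating any non-empty `UDC_FORCE_COORDS` value as "enabled" is an assumption, since the accepted values are not documented here.

```python
import os
from typing import Mapping


def read_display_env(environ: Mapping[str, str] = os.environ) -> dict[str, object]:
    """Collect the display-related settings, applying the defaults from the table."""
    session = environ.get("XDG_SESSION_TYPE", "unknown").lower()
    return {
        "display": environ.get("DISPLAY", ":0"),
        "session_type": session,
        "xauthority": environ.get("XAUTHORITY", os.path.expanduser("~/.Xauthority")),
        # Assumption: any non-empty value turns forced coordinate clicks on.
        "force_coords": bool(environ.get("UDC_FORCE_COORDS")),
        "is_x11": session == "x11",
    }
```

Passing the mapping as a parameter makes it easy to check what the server would see under a given client configuration, for example `read_display_env({"DISPLAY": ":1", "XDG_SESSION_TYPE": "x11"})`.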

Passing Environment Variables

You can customize these variables in your MCP client configuration.

Claude Desktop (`claude_desktop_config.json`)

{
  "mcpServers": {
    "ubuntu-desktop-control": {
      "command": "ubuntu-desktop-control",
      "args": [],
      "env": {
        "DISPLAY": ":0",
        "XAUTHORITY": "/home/user/.Xauthority"
      }
    }
  }
}

VS Code (`.vscode/mcp.json`)

{
  "servers": {
    "ubuntu-desktop-control": {
      "command": "ubuntu-desktop-control",
      "args": [],
      "env": {
        "DISPLAY": ":0"
      }
    }
  }
}
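If you prefer to script these edits instead of changing the JSON by hand, a small helper along the following lines could add the server entry while leaving other configured servers untouched. This is only a sketch: `add_server` and `update_config_file` are hypothetical names, and the top-level key (`"mcpServers"` or `"servers"`) must match the client you are configuring.

```python
import json
from pathlib import Path


def add_server(config: dict, env: dict[str, str], key: str = "mcpServers") -> dict:
    """Return a copy of config with the ubuntu-desktop-control entry added or replaced."""
    servers = dict(config.get(key, {}))  # copy so the input is left unchanged
    servers["ubuntu-desktop-control"] = {
        "command": "ubuntu-desktop-control",
        "args": [],
        "env": dict(env),
    }
    return {**config, key: servers}


def update_config_file(path: Path, env: dict[str, str], key: str = "mcpServers") -> None:
    """Read the JSON config at path (if any), add the server entry, and write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_server(config, env, key), indent=2) + "\n")
```

For a VS Code workspace, the call would look like `update_config_file(Path(".vscode/mcp.json"), {"DISPLAY": ":0"}, key="servers")`, assuming the `.vscode` directory already exists.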

Display Scaling & Coordinates

If clicks land in the wrong place, you likely have a HiDPI display scaling mismatch (e.g., logical 1920x1080 vs physical 3840x2160).

Solutions:

Auto-scale: Use click_screen(..., auto_scale=True) to let the server handle it.
Diagnostics: Run get_display_diagnostics() to see the scaling factor.
Element IDs: Use take_screenshot(detect_elements=True) and click via element_id or percentage coordinates.
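The mismatch is easiest to see with the numbers from the example above. The sketch below works through the arithmetic for a 1920x1080 logical desktop captured as a 3840x2160 screenshot. The function names are hypothetical and this is not the server's conversion code, only the underlying calculation.

```python
def scale_factor(logical: tuple[int, int], physical: tuple[int, int]) -> tuple[float, float]:
    """Return the horizontal and vertical physical-to-logical scaling factors."""
    return physical[0] / logical[0], physical[1] / logical[1]


def screenshot_to_logical(
    x: int, y: int, logical: tuple[int, int], physical: tuple[int, int]
) -> tuple[int, int]:
    """Convert a pixel position in a physical-resolution screenshot to logical click coordinates."""
    sx, sy = scale_factor(logical, physical)
    return round(x / sx), round(y / sy)


logical_size = (1920, 1080)
physical_size = (3840, 2160)

# A button seen at pixel (3000, 1500) in the screenshot...
click_at = screenshot_to_logical(3000, 1500, logical_size, physical_size)
# ...must be clicked at (1500, 750) in logical coordinates.
```

Sending the raw screenshot position `(3000, 1500)` as a click would fall outside the 1920-pixel-wide logical desktop, which is why unscaled clicks appear to miss.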

Troubleshooting

"Screenshot failed": Ensure gnome-screenshot or scrot is installed (sudo apt install gnome-screenshot).
"PyAutoGUI not installed": Ensure you are using the .venv python.
Wayland Issues: This server requires X11. Check with echo $XDG_SESSION_TYPE. If "wayland", switch to "GNOME on Xorg" at login.
Permission Denied: If you hit X11 permission errors, run xhost +local: to allow local (non-network) clients to connect to the X server.
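Several of the checks above can be run in one go before starting the server. The following is a minimal sketch of such a preflight check, not part of the package. The `preflight` name is hypothetical, and it covers only the screenshot tool, session type, and `DISPLAY` items from the list.

```python
import os
import shutil
from typing import Callable, Mapping, Optional


def preflight(
    environ: Mapping[str, str] = os.environ,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems: list[str] = []
    # The server needs at least one of these screenshot tools on PATH.
    if not any(which(tool) for tool in ("gnome-screenshot", "scrot")):
        problems.append("no screenshot tool found: install gnome-screenshot or scrot")
    if environ.get("XDG_SESSION_TYPE", "unknown").lower() == "wayland":
        problems.append('session is Wayland: switch to "GNOME on Xorg" at login')
    if not environ.get("DISPLAY"):
        problems.append("DISPLAY is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

The `environ` and `which` parameters exist so the check can be exercised with fake values, without changing the real session.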

Security

⚠️ Warning: This server gives LLMs full control over your mouse and keyboard, and visibility of your screen.

Only use with trusted clients.
Be aware screenshots may capture sensitive data.
Automated clicks can be destructive.

License

MIT License
