Provides tools for desktop automation on GNOME, including window management, workspace control, screenshot capture, and retrieval of monitor information.
Enables input injection and screen capture within Wayland-based GNOME sessions, bypassing protocol-level security restrictions through a dedicated shell extension.
Gnome-MCP
Desktop automation for GNOME Wayland via MCP. Take screenshots, manage windows, and inject keyboard/mouse input from AI assistants like Claude Code.
Claude Code ──MCP──▶ gnome-desktop-mcp (Python) ──D-Bus──▶ GNOME Shell Extension
Why
GNOME Wayland blocks external processes from taking screenshots or injecting input. This extension runs inside the compositor, bypassing those restrictions, and exposes a D-Bus API. The MCP server bridges that API to any MCP-compatible client.
Features
30 MCP tools: screenshots, window management, input injection, workspace control
Privacy indicator: top bar icon shows connection status (red = active, grey = idle)
Consent dialog: first-use confirmation before enabling automation
Access gating: master kill switch to disable all automation instantly
Requirements
GNOME Shell 45-49 (Wayland)
Python 3.12+
Installation
Quick install (development)
git clone https://github.com/sbuysse/gnome-mcp.git
cd gnome-mcp
./install.sh
Then log out and back in (required on Wayland), and enable the extension:
gnome-extensions enable desktop-automation@gnomemcp.github.io
MCP server only (from PyPI)
pip install gnome-desktop-mcp
Claude Code Configuration
Add to ~/.claude/settings.json:
{
  "mcpServers": {
    "desktop-automation": {
      "command": "gnome-desktop-mcp"
    }
  }
}
Tools
Screenshots
Tool | Description |
| Full screen capture |
| Capture a specific window |
| Capture a rectangular region |
| Get pixel color at coordinates |
| Remove temp screenshot files |
Windows
Tool | Description |
| List all open windows |
| Get detailed window properties |
| Focus and raise a window |
| Move and resize a window |
| Minimize/restore |
| Maximize/restore |
| Close a window |
| List all workspaces |
| Switch workspace |
Input
Tool | Description |
| Press a single key ("Return", "F5", "a") |
| Key combination ("Ctrl+Alt+t") |
| Type text character by character |
| Move mouse to coordinates |
| Click at coordinates |
| Double-click |
| Press/release mouse button |
| Drag from point A to point B |
| Scroll at coordinates |
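To illustrate the key-combination format shown above ("Ctrl+Alt+t"), here is a minimal sketch of how such a string could be split into modifiers and a final key. The function name `parse_key_combo` and the accepted modifier set are assumptions made for illustration; the extension's actual parsing rules may differ.

```python
# Hypothetical modifier names; the real extension may accept others.
MODIFIERS = {"ctrl", "alt", "shift", "super"}


def parse_key_combo(combo: str) -> tuple[list[str], str]:
    """Split a combo such as "Ctrl+Alt+t" into (modifiers, key)."""
    parts = [part.strip() for part in combo.split("+")]
    if any(not part for part in parts):
        raise ValueError(f"malformed key combination: {combo!r}")
    # Everything before the last "+" is treated as a modifier.
    *mods, key = parts
    unknown = [m for m in mods if m.lower() not in MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifier(s): {unknown}")
    # Modifiers are normalised to lowercase; the key keeps its case.
    return [m.lower() for m in mods], key
```

Under these assumptions, `parse_key_combo("Ctrl+Alt+t")` yields `(["ctrl", "alt"], "t")`, and a bare key such as `"Return"` yields no modifiers. A combination whose final key is itself `+` is not handled by this sketch.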
Utility
Tool | Description |
| Check extension is alive |
| Check/toggle automation |
| List monitors with geometry |
Privacy
Top bar indicator shows when automation is active
Toggle switch to disable all automation instantly
Activity log tracks last 20 method calls (name + timestamp only, no data)
D-Bus access gating: all methods blocked when disabled
Session bus trust model: any local user process can call the API (consistent with GNOME's security model)
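The gating and logging behaviour listed above can be modelled in a few lines. This is a hypothetical Python sketch, not the extension's code (which runs inside GNOME Shell); the class name `AutomationGate` and its methods are invented for illustration. It shows a kill switch that blocks every call when disabled and a log capped at 20 entries that records only the method name and a timestamp.

```python
from collections import deque
from datetime import datetime, timezone


class AutomationGate:
    """Hypothetical model of the kill switch and bounded activity log."""

    def __init__(self, max_entries: int = 20) -> None:
        self.enabled = False  # automation stays off until explicitly enabled
        # A deque with maxlen drops the oldest entry once it is full.
        self._log: deque[tuple[str, datetime]] = deque(maxlen=max_entries)

    def call(self, method_name: str, handler, *args):
        """Run handler only while automation is enabled."""
        if not self.enabled:
            raise PermissionError(f"automation disabled: {method_name} blocked")
        # Only the name and time are recorded; arguments and results are not.
        self._log.append((method_name, datetime.now(timezone.utc)))
        return handler(*args)

    def recent(self) -> list[tuple[str, datetime]]:
        """Return the logged calls, oldest first."""
        return list(self._log)
```

Checking the switch before logging means blocked calls leave no trace in this model; whether the extension logs rejected calls is not stated in this README.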
Architecture
The GNOME Shell extension (desktop-automation@gnomemcp.github.io) runs inside the Wayland compositor. It exports io.github.gnomemcp.DesktopAutomation on the session D-Bus with privileged access to:
Shell.Screenshot: silent screenshots (no permission dialog)
Meta.Window: window management
Clutter.VirtualInputDevice: keyboard/mouse injection
The Python MCP server (gnome-desktop-mcp) translates MCP tool calls into D-Bus method calls via dasbus.
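As a rough illustration of the D-Bus hop, the sketch below builds a `gdbus` command line for the exported interface and runs it from Python. It shells out to `gdbus` instead of using dasbus purely to stay dependency-free, so it is not how the server itself is implemented. The helper names `build_gdbus_call` and `ping` are hypothetical; the bus name, object path, and interface are the ones given in this README.

```python
import subprocess

BUS_NAME = "org.gnome.Shell"
OBJECT_PATH = "/io/github/gnomemcp/DesktopAutomation"
INTERFACE = "io.github.gnomemcp.DesktopAutomation"


def build_gdbus_call(method: str, *args: str) -> list[str]:
    """Return the argv that calls `method` on the extension's interface."""
    return [
        "gdbus", "call", "--session",
        "--dest", BUS_NAME,
        "--object-path", OBJECT_PATH,
        "--method", f"{INTERFACE}.{method}",
        *args,  # extra positional arguments are passed through to gdbus
    ]


def ping(timeout: float = 5.0) -> str:
    """Call Ping and return gdbus's raw output.

    Requires a GNOME session with the extension enabled; raises
    subprocess.CalledProcessError if the call fails.
    """
    result = subprocess.run(
        build_gdbus_call("Ping"),
        capture_output=True, text=True, check=True, timeout=timeout,
    )
    return result.stdout.strip()
```

Calling `ping()` inside a session where the extension is enabled should return whatever `gdbus` prints for the reply; the exact reply format is not documented here.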
Development
# Install in development mode
pip install -e "mcp-server[dev]"
# Run tests
python -m pytest tests/ -v
# Watch extension logs
journalctl /usr/bin/gnome-shell -f
# Test D-Bus directly
gdbus call --session --dest org.gnome.Shell \
  --object-path /io/github/gnomemcp/DesktopAutomation \
  --method io.github.gnomemcp.DesktopAutomation.Ping
License
GPL-3.0