kwin-mcp
Automates KDE Plasma desktop environments on Wayland, providing tools for launching applications, mouse/keyboard/touch input, clipboard access, accessibility tree inspection, screenshot capture, and window management via isolated virtual sessions or live desktop connection.
Interacts with Wayland compositors (specifically KWin) for GUI automation, enabling input injection (mouse, keyboard, touch) via libei, virtual display management, and real-time desktop observation without affecting the host session.
Model Context Protocol server for Linux desktop GUI automation on KDE Plasma 6 Wayland
A Model Context Protocol (MCP) server that enables AI agents (Claude Code, Cursor, and other MCP clients) to launch, interact with, and observe any Wayland application in a fully isolated virtual KWin session -- without affecting the user's desktop. It also supports live desktop automation by connecting to an existing KWin session (real desktop or container) for collaborative workflows. With 30 MCP tools covering mouse, keyboard, touch, clipboard, accessibility tree inspection, screenshot capture, and window management, kwin-mcp provides everything needed for end-to-end GUI testing and desktop automation on Linux.
Why kwin-mcp?
Isolated sessions -- Each session runs in its own dbus-run-session + kwin_wayland --virtual sandbox. Your host desktop is never affected.
Live session support -- Connect to a real KDE Plasma desktop or a KWin instance inside a container (e.g. systemd-nspawn) for collaborative "share my screen" workflows.
No screenshots required for interaction -- The AT-SPI2 accessibility tree gives the AI agent structured widget data (roles, names, coordinates, states, available actions), so it can interact with UI elements without relying solely on vision.
Zero authorization prompts -- Uses KWin's private EIS (Emulated Input Server) D-Bus interface directly, bypassing the XDG RemoteDesktop portal. No user confirmation dialogs.
Works with any Wayland app -- Anything that runs on KDE Plasma 6 Wayland works: Qt, GTK, Electron, and more. Input is injected via the standard libei protocol.
Full input coverage -- Mouse, keyboard, multi-touch, and clipboard -- all injected through the isolated session for complete desktop automation.
Use Cases
Automated GUI Testing
Run end-to-end GUI tests for KDE/Qt/GTK applications in headless isolated sessions. kwin-mcp launches each app in its own virtual KWin compositor, interacts via mouse, keyboard, and touch input, then verifies results through screenshots and the accessibility tree -- all without a physical display.
AI-Driven Desktop Automation
Let AI agents like Claude Code autonomously operate desktop applications. The agent reads the accessibility tree to understand the UI, performs actions through 30 MCP tools, and observes the results via screenshots -- creating a complete feedback loop for any Wayland application.
Live Desktop Collaboration
Connect to your real desktop session and let Claude observe and interact with what you see. Use session_connect or pass --default-live-session to make live mode the default. Also supports attaching to KWin running inside containers (e.g. systemd-nspawn) for isolated agent desktops.
Headless GUI Testing in CI/CD
Integrate Linux desktop GUI testing into CI/CD pipelines. kwin-mcp's virtual sessions require no X11 or physical display server, making it suitable for headless environments like GitHub Actions or GitLab CI runners on Linux.
Kiosk and Embedded Device Automation
Automate kiosk interfaces and embedded Linux desktops running KDE Plasma or a bare KWin Wayland compositor. Use session_start for isolated virtual testing of kiosk UIs, or session_connect to attach directly to a live kiosk or embedded device session for real-time automation and diagnostics.
Quick Start
Requires KDE Plasma 6 on Wayland. See System Requirements for details.
1. Install
# Using uv (recommended)
uv tool install kwin-mcp
# Or using pip
pip install kwin-mcp
2. Configure Claude Code
Add to your project's .mcp.json:
{
  "mcpServers": {
    "kwin-mcp": {
      "command": "uvx",
      "args": ["kwin-mcp"]
    }
  }
}
3. Use it
Ask Claude Code to launch and interact with any GUI application:
Start a KWin session, launch kcalc, and press the buttons to calculate 2 + 3.
Claude Code will autonomously start an isolated session, launch the app, read the accessibility tree to find buttons, click them, and take a screenshot to verify the result.
Configuration
Recommended: install as a plugin
The fastest way to wire kwin-mcp into your editor is to install one of the bundled plugins. Each plugin auto-registers the MCP server and ships the kwin-desktop-automation skill, which teaches the agent which tool to call when (session-mode selection, the observe → act → verify loop, US-QWERTY vs Unicode typing, AT-SPI2 surface-local coordinates, and other platform pitfalls).
Claude Code -- install the plugin from the marketplace:
/plugin marketplace add isac322/kwin-mcp
/plugin install kwin-mcp@kwin-mcp
OpenCode -- add the npm plugin to your opencode.json:
{
  "$schema": "https://opencode.ai/config.json",
  "plugin": ["@isac322/kwin-mcp-opencode"]
}
For the full integration guide (manual fallback, customising the skill, troubleshooting), see docs/ai-agent-integration.md.
Claude Code
Add to your project's .mcp.json:
{
  "mcpServers": {
    "kwin-mcp": {
      "command": "uvx",
      "args": ["kwin-mcp"]
    }
  }
}
Or if installed globally:
{
  "mcpServers": {
    "kwin-mcp": {
      "command": "kwin-mcp"
    }
  }
}
Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "kwin-mcp": {
      "command": "uvx",
      "args": ["kwin-mcp"]
    }
  }
}
Running Directly
# As an installed script
kwin-mcp
# As a Python module
python -m kwin_mcp
# Interactive CLI (REPL for rapid testing)
kwin-mcp-cli
# Live session mode (default to real desktop instead of virtual)
kwin-mcp --default-live-session
kwin-mcp-cli --default-live-session
Available Tools
Session Management (3 tools)
Tool | Parameters | Description |
session_start | | Start an isolated KWin Wayland session, optionally launching an app. Set |
session_connect | | Connect to an existing KWin session (real desktop or container). Defaults to |
| (none) | Stop the session and clean up. For virtual sessions: terminates KWin and all apps. For live sessions: disconnects without killing KWin or pre-existing apps. |
Observation (3 tools)
Tool | Parameters | Description |
screenshot | | Capture a screenshot of the virtual display (saved as PNG, returns file path) |
accessibility_tree | | Get the AT-SPI2 widget tree with roles, names, states, and coordinates. Use |
find_ui_elements | | Search for UI elements by name, role, or description (case-insensitive). Optionally filter by AT-SPI2 states (e.g. |
Mouse Input (6 tools)
Tool | Parameters | Description |
| | Click at coordinates. Supports left/right/middle, single/double/triple click, modifier keys (e.g. |
| | Move the cursor (hover) to coordinates without clicking |
| | Scroll at coordinates. |
| | Drag from one point to another with smooth interpolation. Supports custom |
| | Press a mouse button at coordinates without releasing. Use with |
| | Release a previously pressed mouse button at coordinates |
Keyboard Input (5 tools)
Tool | Parameters | Description |
keyboard_type | | Type a string of text character by character (US QWERTY layout) |
keyboard_type_unicode | | Type arbitrary Unicode text (Korean, CJK, etc.) via wtype or wl-copy + Ctrl+V |
| | Press a key or key combination (e.g., |
| | Press and hold a key without releasing. Useful for holding modifiers across multiple actions (e.g., hold Ctrl while clicking items). |
| | Release a previously held key |
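The US QWERTY restriction on the first tool in this table follows from how text is turned into key events: each character has to be mapped to a physical evdev keycode, plus Shift where needed. The following is a minimal sketch of that idea, not kwin-mcp's actual mapping. The keycode values are the standard Linux evdev ones; the names UNSHIFTED and char_to_keys are hypothetical, and only letters, digits, and space are covered.

```python
# Linux evdev keycodes (values from linux/input-event-codes.h).
KEY_LEFTSHIFT = 42
KEY_SPACE = 57

# Letter rows of a US QWERTY keyboard with the keycode of the first key
# in each row: KEY_Q = 16, KEY_A = 30, KEY_Z = 44.
_ROWS = (("qwertyuiop", 16), ("asdfghjkl", 30), ("zxcvbnm", 44))

# Hypothetical lookup table: character -> keycode typed without Shift.
UNSHIFTED = {" ": KEY_SPACE}
for row, first_code in _ROWS:
    for offset, char in enumerate(row):
        UNSHIFTED[char] = first_code + offset
# Digit row: KEY_1 = 2 ... KEY_9 = 10, KEY_0 = 11.
for offset, char in enumerate("1234567890"):
    UNSHIFTED[char] = 2 + offset


def char_to_keys(char: str) -> list[int]:
    """Return the evdev keycodes to press, in order, to type `char`."""
    if char in UNSHIFTED:
        return [UNSHIFTED[char]]
    if char.isupper() and char.lower() in UNSHIFTED:
        # Uppercase letters are the same physical key with Shift held.
        return [KEY_LEFTSHIFT, UNSHIFTED[char.lower()]]
    raise ValueError(f"{char!r} has no US QWERTY mapping in this sketch")
```

Characters outside the table (Korean, CJK, and so on) raise an error here, which is the case keyboard_type_unicode exists for.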
Touch Input (4 tools)
Tool | Parameters | Description |
|
| | Tap at coordinates. Use |
| | Swipe from one point to another with configurable duration |
| | Two-finger pinch gesture. |
| | Multi-finger swipe gesture (2-5 fingers) for system gestures like workspace switching |
Clipboard (2 tools)
Tool | Parameters | Description |
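clipboard_get | (none) | Read the current clipboard text content. Requires enable_clipboard=true in session_start and wl-clipboard installed. |
clipboard_set | | Set the clipboard text content. Same requirements as clipboard_get. |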
Window Management (3 tools)
Tool | Parameters | Description |
launch_app | | Launch an application inside the running session. Returns PID and log path. |
list_windows | (none) | List all accessible application windows with per-window titles and active/focused state markers via AT-SPI2 |
focus_window | | Focus a window by application name (case-insensitive match) |
UI Polling (1 tool)
Tool | Parameters | Description |
wait_for_element | | Poll the accessibility tree until an element matching the query and/or states appears or timeout expires. Use |
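The polling behaviour described in this table can be pictured as a loop that re-runs a lookup until it succeeds or a deadline passes. This is a minimal sketch of that pattern under stated assumptions: poll_until, probe, timeout_s, and interval_s are hypothetical names, and probe stands in for whatever accessibility-tree query the server performs.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(
    probe: Callable[[], Optional[T]],
    timeout_s: float = 5.0,
    interval_s: float = 0.1,
) -> Optional[T]:
    """Call `probe` repeatedly until it returns a non-None value or time runs out."""
    # Use a monotonic clock so wall-clock adjustments cannot shorten the wait.
    deadline = time.monotonic() + timeout_s
    while True:
        result = probe()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            return None  # timed out without a match
        time.sleep(interval_s)
```

The probe always runs at least once, so an element that is already present is returned even with a zero timeout.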
Advanced (3 tools)
Tool | Parameters | Description |
dbus_call | | Call any D-Bus method in the isolated session. Useful for controlling KWin scripting, app-specific D-Bus APIs, and system services. |
read_app_log | | Read stdout/stderr output of a launched app by PID. Set |
wayland_info | | List Wayland protocols available in the session. Useful for verifying protocol access (e.g., |
Frame capture: Many action tools accept an optional screenshot_after_ms parameter (e.g., [0, 50, 100, 200, 500]) that captures screenshots at specified delays (in milliseconds) after the action completes. This is useful for observing transient UI states like hover effects, click animations, and menu transitions without extra MCP round-trips. Frame capture uses the fast KWin ScreenShot2 D-Bus interface (~30-70ms per frame).
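To make the timing semantics concrete, here is a rough sketch of how a list of delays could be turned into a burst of captures, each scheduled relative to the moment the action finished. It is an illustration only: capture_frames and its capture, clock, and sleep parameters are hypothetical, and capture stands in for the real ScreenShot2 call.

```python
import time
from typing import Callable, Sequence


def capture_frames(
    capture: Callable[[], str],
    delays_ms: Sequence[int],
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> list[tuple[int, str]]:
    """Capture one frame per delay, measuring every delay from the same start."""
    start = clock()  # treated as "the action just completed"
    frames = []
    for delay_ms in sorted(delays_ms):
        # Sleep only for the time still missing; if a previous capture was
        # slow and the target has already passed, capture immediately.
        remaining = start + delay_ms / 1000 - clock()
        if remaining > 0:
            sleep(remaining)
        frames.append((delay_ms, capture()))
    return frames
```

Measuring every delay from a shared start time, rather than sleeping between frames, keeps capture latency from accumulating across the burst.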
How It Works
Claude Code / AI Agent
|
| MCP (stdio)
v
kwin-mcp server (30 tools) kwin-mcp-cli (interactive REPL)
| |
+--- both delegate to AutomationEngine (core.py) ---+
|
|-- session_start (virtual) ---> dbus-run-session
| |-- at-spi-bus-launcher
| +-- kwin_wayland --virtual
| +-- [your app]
|
|-- session_connect (live) ----> existing KWin (real desktop / container)
|
|-- screenshot ---------------> KWin ScreenShot2 D-Bus (spectacle fallback)
|
|-- accessibility_tree -------> AT-SPI2 (via PyGObject)
|-- find_ui_elements ---------> AT-SPI2 (via PyGObject)
|-- wait_for_element ----------> AT-SPI2 (polling)
|
|-- mouse_* ------------------> KWin EIS D-Bus --> libei
|-- keyboard_* ---------------> KWin EIS D-Bus --> libei
|-- touch_* ------------------> KWin EIS D-Bus --> libei
| +-- screenshot_after_ms -> KWin ScreenShot2 D-Bus (fast frame capture)
|
|-- keyboard_type_unicode ----> wtype / wl-copy + Ctrl+V
|-- clipboard_* --------------> wl-copy / wl-paste (wl-clipboard)
|
|-- launch_app / list_windows / focus_window
| |-- subprocess spawn
| +-- AT-SPI2 (via PyGObject)
|
|-- dbus_call -----------------> dbus-send (generic D-Bus)
|-- read_app_log --------------> log file read
+-- wayland_info --------------> wayland-info
Triple Isolation (+ Optional Home Isolation)
kwin-mcp provides three layers of isolation from the host desktop:
D-Bus isolation -- dbus-run-session creates a private session bus. The isolated session's services (KWin, AT-SPI2, portals) are invisible to the host.
Display isolation -- kwin_wayland --virtual creates its own Wayland compositor with a virtual framebuffer. No windows appear on the host display.
Input isolation -- Input events are injected through KWin's EIS interface into the isolated compositor only. The host desktop receives no input from kwin-mcp.
Home directory isolation (optional) -- When isolate_home=true is set in session_start, a temporary HOME directory is created with isolated XDG directories (XDG_CONFIG_HOME, XDG_DATA_HOME, XDG_CACHE_HOME, XDG_STATE_HOME). Apps in the session cannot read or modify host user settings (e.g. ~/.config/kdeglobals), improving test reproducibility and safety. XDG_RUNTIME_DIR is intentionally not isolated because the Wayland socket resides there.
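The layers above can be sketched as an environment plus a command line. The snippet below is an approximation under stated assumptions, not the server's code: build_isolated_env and SESSION_COMMAND are hypothetical names, the XDG paths follow the usual defaults under the temporary home, and the real session also starts at-spi-bus-launcher, which is left out here.

```python
from pathlib import Path

# D-Bus and display isolation: a private session bus wrapping a virtual compositor.
SESSION_COMMAND = ["dbus-run-session", "--", "kwin_wayland", "--virtual"]


def build_isolated_env(base_env: dict[str, str], home: Path) -> dict[str, str]:
    """Return a copy of `base_env` with HOME and the XDG base dirs under `home`."""
    env = dict(base_env)  # never mutate the caller's environment
    env["HOME"] = str(home)
    env["XDG_CONFIG_HOME"] = str(home / ".config")
    env["XDG_DATA_HOME"] = str(home / ".local" / "share")
    env["XDG_CACHE_HOME"] = str(home / ".cache")
    env["XDG_STATE_HOME"] = str(home / ".local" / "state")
    # XDG_RUNTIME_DIR is deliberately left alone: the Wayland socket lives there.
    return env
```

A caller would typically pass a directory from tempfile.mkdtemp() as home and hand the result to subprocess.Popen(SESSION_COMMAND, env=...).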
Input Injection
Mouse, keyboard, and touch events are injected through KWin's private org.kde.KWin.EIS.RemoteDesktop D-Bus interface. This returns a libei file descriptor that allows low-level input emulation without requiring the XDG RemoteDesktop portal (which would show a user authorization dialog). The connection uses:
Absolute pointer positioning for precise coordinate-based interaction
evdev keycodes with full US QWERTY mapping for keyboard input
Smooth drag interpolation (10+ intermediate steps) for realistic drag operations
EIS touch emulation for multi-touch gestures (tap, swipe, pinch, multi-finger swipe)
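As an illustration of the smooth drag interpolation mentioned in the list above, the following sketch generates evenly spaced intermediate pointer positions between two absolute coordinates. It assumes simple linear interpolation; interpolate_drag and its steps parameter are hypothetical names, and kwin-mcp's actual easing may differ.

```python
def interpolate_drag(
    start: tuple[int, int],
    end: tuple[int, int],
    steps: int = 10,
) -> list[tuple[int, int]]:
    """Return `steps` absolute pointer positions leading from `start` to `end`.

    The start point itself is not included (the button is pressed there);
    the final position is always exactly `end`.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    (x0, y0), (x1, y1) = start, end
    return [
        (round(x0 + (x1 - x0) * i / steps), round(y0 + (y1 - y0) * i / steps))
        for i in range(1, steps + 1)
    ]
```

Sending one pointer-motion event per returned position, between the button press and release, gives applications a stream of motion events instead of a single jump.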
Screenshot Capture
The screenshot tool captures via the KWin org.kde.KWin.ScreenShot2 D-Bus interface (~30-70ms per frame), with spectacle CLI as a fallback. For action tools with the screenshot_after_ms parameter, the same D-Bus interface is used for fast burst capture. Raw ARGB pixel data is read from a pipe and converted to PNG using Pillow.
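Before raw pixel data can be saved as a PNG, its channel order usually has to match what the image library expects. The sketch below shows that step in plain Python, assuming the data is 32-bit ARGB stored little-endian (bytes in memory ordered B, G, R, A) with a row stride that may include padding. Both the layout and the function name bgra_to_rgba are assumptions for illustration, and premultiplied alpha is not handled.

```python
def bgra_to_rgba(data: bytes, width: int, height: int, stride: int) -> bytes:
    """Reorder little-endian ARGB32 rows (B, G, R, A) into tightly packed RGBA."""
    row_bytes = width * 4
    if width < 1 or height < 1:
        raise ValueError("width and height must be positive")
    if stride < row_bytes or len(data) < stride * (height - 1) + row_bytes:
        raise ValueError("buffer is too small for the given geometry")
    out = bytearray(row_bytes * height)
    for y in range(height):
        # Slice off any per-row padding implied by the stride.
        row = data[y * stride : y * stride + row_bytes]
        dst = y * row_bytes
        out[dst + 0 : dst + row_bytes : 4] = row[2::4]  # R
        out[dst + 1 : dst + row_bytes : 4] = row[1::4]  # G
        out[dst + 2 : dst + row_bytes : 4] = row[0::4]  # B
        out[dst + 3 : dst + row_bytes : 4] = row[3::4]  # A
    return bytes(out)
```

With Pillow installed, the result could then be wrapped with Image.frombytes("RGBA", (width, height), rgba) and saved as a PNG.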
Accessibility Tree
The AT-SPI2 accessibility bus within the isolated session is queried via PyGObject (gi.repository.Atspi). This provides a structured tree of all UI widgets with their roles (button, text field, menu item, etc.), names, states (focused, enabled, visible, etc.), screen coordinates, and available actions (click, toggle, etc.).
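To show what searching such a tree looks like without requiring a running AT-SPI2 bus, here is a small sketch over a simplified in-memory model. The Widget class and find_widgets function are hypothetical stand-ins, not the Atspi API; the case-insensitive match and state filter mirror the behaviour described for find_ui_elements.

```python
from dataclasses import dataclass, field


@dataclass
class Widget:
    """Simplified stand-in for an accessibility node (not the Atspi type)."""
    role: str
    name: str = ""
    states: frozenset[str] = frozenset()
    children: list["Widget"] = field(default_factory=list)


def find_widgets(
    root: Widget,
    query: str,
    required_states: frozenset[str] = frozenset(),
) -> list[Widget]:
    """Depth-first, case-insensitive search over widget names and roles."""
    needle = query.casefold()
    matches = []
    stack = [root]
    while stack:
        node = stack.pop()
        text_hit = needle in node.name.casefold() or needle in node.role.casefold()
        if text_hit and required_states <= node.states:
            matches.append(node)
        # Push children reversed so they are visited in document order.
        stack.extend(reversed(node.children))
    return matches
```

For example, find_widgets(root, "ok", frozenset({"enabled"})) would return only enabled widgets whose name or role contains "ok".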
System Requirements
Requirement | Details |
OS | Linux with KDE Plasma 6 (Wayland session) |
Python | 3.12 or later |
KWin | |
libei | Usually bundled with KWin 6.x (EIS input emulation) |
spectacle | KDE screenshot tool (CLI mode) |
AT-SPI2 | |
PyGObject | GObject introspection Python bindings |
D-Bus | |
Optional dependencies:
Package | Required for |
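wl-clipboard | clipboard_get, clipboard_set, keyboard_type_unicode |
wtype | keyboard_type_unicode |
wayland-utils | wayland_info |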
Installing System Dependencies
# Arch Linux
sudo pacman -S kwin spectacle at-spi2-core python-gobject dbus-python-common
# Optional: for clipboard and Unicode input
sudo pacman -S wl-clipboard wtype wayland-utils
# Fedora
sudo dnf install kwin-wayland spectacle at-spi2-core python3-gobject dbus-python
# Optional: for clipboard and Unicode input
sudo dnf install wl-clipboard wtype wayland-utils
# openSUSE
sudo zypper install kwin6 spectacle at-spi2-core python3-gobject python3-dbus-python
# Optional: for clipboard and Unicode input
sudo zypper install wl-clipboard wtype wayland-utils
# Debian / Ubuntu
sudo apt install kwin-wayland spectacle at-spi2-core python3-gi gir1.2-atspi-2.0 python3-dbus
# Optional: for clipboard and Unicode input
sudo apt install wl-clipboard wtype wayland-utils
Installation
Using uv (recommended)
uv tool install kwin-mcp
Using pip
pip install kwin-mcp
From source
git clone https://github.com/isac322/kwin-mcp.git
cd kwin-mcp
uv sync
uv run kwin-mcp
Limitations
US QWERTY keyboard layout only -- keyboard_type supports US QWERTY only. For non-ASCII text (Korean, CJK, etc.), use keyboard_type_unicode, which requires wtype or wl-clipboard installed.
KDE Plasma 6+ required -- Older KDE versions or other Wayland compositors (GNOME, Sway) are not supported.
AT-SPI2 availability varies -- Some applications may not fully expose their widget tree via AT-SPI2.
Touch input is EIS-emulated -- Touch events are emulated through KWin's EIS interface, not from a real touchscreen device. Most applications handle emulated touch correctly, but some may behave differently from physical touch.
Clipboard requires opt-in -- Clipboard tools (clipboard_get, clipboard_set) are disabled by default because wl-copy can hang in isolated sessions. Enable with enable_clipboard=true in session_start, and ensure wl-clipboard is installed.
QMenu (native context menus) may not appear in AT-SPI2 -- Qt's AT-SPI2 bridge has incomplete support for popup menus on Wayland. Context menus may not be visible in accessibility_tree or find_ui_elements. Workaround: use screenshot to visually locate menu items and click by coordinates.
Screen edge triggers do not work with EIS input -- Auto-hide panels and layer-shell trigger strips rely on Wayland surface input routing, which may not respond to EIS-injected pointer events. Workaround: use dbus_call with KWin scripting or keyboard shortcuts instead.
AT-SPI2 coordinates are surface-local, not screen-global -- Wayland clients do not know their global screen position (by design). Coordinates returned by find_ui_elements and accessibility_tree are relative to the window's top-left corner, not the virtual screen. For single-window scenarios this is usually fine; for multi-window layouts, combine with screenshot for absolute positioning.
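The last limitation in this list comes down to a coordinate offset. As a rough sketch of the arithmetic, assuming the caller has already worked out where the window sits on the virtual screen (for example from a screenshot), a surface-local element rectangle can be turned into a screen-global click point as follows. Rect, click_point, and window_origin are hypothetical names, not part of kwin-mcp.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Element bounds relative to the window's top-left corner."""
    x: int
    y: int
    width: int
    height: int


def click_point(
    element: Rect,
    window_origin: tuple[int, int] = (0, 0),
) -> tuple[int, int]:
    """Return the screen-global centre of a surface-local element rectangle."""
    origin_x, origin_y = window_origin
    return (
        origin_x + element.x + element.width // 2,
        origin_y + element.y + element.height // 2,
    )
```

With the default origin of (0, 0) this reduces to the single-window case, where surface-local and screen coordinates are assumed to coincide.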
Contributing
Contributions are welcome! See CONTRIBUTING.md for development setup, code style guidelines, and the pull request process.
git clone https://github.com/isac322/kwin-mcp.git
cd kwin-mcp
uv sync
uv run ruff check src/
uv run ruff format --check src/
uv run ty check src/
License