Provides tools for computer use automation on Linux systems, enabling screenshot capture, mouse control (clicking, dragging, scrolling), and keyboard input via X11.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Computer Use MCP take a screenshot and tell me what you see on my desktop".
That's it! The server will respond to your query, and you can continue using it as needed.
computer-use-mcp
MCP server for computer use automation. Provides screenshot, click, keyboard, mouse, and drag-and-drop tools. Supports Linux (X11) and Windows (10/11).
System Dependencies
Linux
sudo apt install xdotool scrot x11-xserver-utils
xdotool: mouse/keyboard automation
scrot: screenshots
xrandr (from x11-xserver-utils): screen resolution
Windows
No external dependencies required. Uses built-in PowerShell with .NET Framework:
PowerShell — comes pre-installed on Windows 10/11
user32.dll — native mouse/keyboard input via SendInput P/Invoke
System.Drawing — screenshot capture
System.Windows.Forms — screen resolution detection
Requirements:
Windows 10 or 11
.NET Framework (pre-installed)
Desktop session must be active (screen unlocked)
Setup
pnpm install
pnpm build
Usage
With Claude Code
Add to ~/.claude/settings.json (Linux) or %USERPROFILE%\.claude\settings.json (Windows):
{
  "mcpServers": {
    "computer-use": {
      "command": "node",
      "args": ["/path/to/computer-use-mcp/dist/index.js"]
    }
  }
}
Development
pnpm dev
Tools
| Tool | Description |
| | Capture full screen or a region. Returns optimized JPEG base64. |
| | Click at (x, y) with left/right/middle button. |
| | Double-click at (x, y). |
| type_text | Type text at current cursor position. |
| | Press key combinations (e.g. |
| | Move cursor without clicking. |
| | Scroll up/down at a position. |
| | Drag and drop from one position to another. |
| | Get screen resolution. |
| | Get current cursor position. |
| | Wait N milliseconds between actions. |
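An MCP client invokes one of these tools by sending a JSON-RPC 2.0 `tools/call` request. The following is a minimal sketch of building such a request, not the server's own code. `buildToolCall` is a hypothetical helper; `type_text` is a tool name mentioned in this README, while the `text` argument key is an assumption about its input schema.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wrap a tool name and its arguments in a request object.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// The "text" key is an assumption; check the schema the server reports via tools/list.
const request = buildToolCall(1, "type_text", { text: "hello" });
console.log(JSON.stringify(request));
```

In practice an MCP client such as Claude Code builds and sends these requests for you, so this is mainly useful when debugging the server by hand.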
Platform Details
Linux
Uses xdotool for mouse/keyboard, scrot for screenshots, xrandr for screen info
Requires X11 display (Wayland not supported)
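To illustrate how a click might map onto xdotool, here is a small sketch that builds the argument list for a move-then-click command. `clickArgs` and `BUTTON_CODES` are hypothetical names, and this is not necessarily how the server assembles its commands; the `mousemove` and `click` subcommands and the 1/2/3 button numbering are standard xdotool behaviour.

```typescript
type MouseButton = "left" | "middle" | "right";

// xdotool numbers mouse buttons: 1 = left, 2 = middle, 3 = right.
const BUTTON_CODES: Record<MouseButton, number> = { left: 1, middle: 2, right: 3 };

// Hypothetical helper: arguments for "move to (x, y), then click".
function clickArgs(x: number, y: number, button: MouseButton = "left"): string[] {
  return [
    "mousemove", String(Math.round(x)), String(Math.round(y)),
    "click", String(BUTTON_CODES[button]),
  ];
}

console.log(["xdotool", ...clickArgs(640, 360, "right")].join(" "));
// prints: xdotool mousemove 640 360 click 3
```

Passing the arguments as an array (for example to `execFile` from `node:child_process`) rather than as one shell string avoids quoting problems.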
Windows
Uses PowerShell with inline C# (Add-Type) calling user32.dll SendInput
~200-500ms overhead per operation due to PowerShell startup and Add-Type compilation
Captures primary monitor only
typeText uses KEYEVENTF_UNICODE for full Unicode support
delay_ms parameter on type_text is ignored (SendInput sends all chars at once)
DPI-aware: calls SetProcessDPIAware before coordinate/screenshot operations
Notes
Screenshots are automatically resized if wider than 1920px and compressed to JPEG quality 80
A 100ms delay is applied after each action to avoid race conditions
All actions are logged to stderr in JSON format
Platform is auto-detected at startup via process.platform
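Three of the notes above can be sketched as small pure helpers. This is an illustration under stated assumptions, not the server's implementation: all function names are hypothetical, the aspect-ratio-preserving resize rule is an assumption (the notes only give the 1920px threshold), and the log field names are invented for the example.

```typescript
interface Size { width: number; height: number }

const MAX_WIDTH = 1920; // threshold from the notes

// Scale down to MAX_WIDTH, keeping the aspect ratio (assumed behaviour).
function targetSize(size: Size): Size {
  if (size.width <= MAX_WIDTH) return size;
  const scale = MAX_WIDTH / size.width;
  return { width: MAX_WIDTH, height: Math.round(size.height * scale) };
}

// Serialize one action as a single JSON line; field names are assumptions.
function formatLog(
  action: string,
  params: Record<string, unknown>,
  now: Date = new Date()
): string {
  return JSON.stringify({ timestamp: now.toISOString(), action, params });
}

type Backend = "linux-x11" | "windows";

// Map a process.platform value to a backend. Node.js reports Windows as "win32".
function selectBackend(platform: string): Backend {
  switch (platform) {
    case "linux": return "linux-x11";
    case "win32": return "windows";
    default: throw new Error(`Unsupported platform: ${platform}`);
  }
}

console.log(targetSize({ width: 3840, height: 2160 })); // { width: 1920, height: 1080 }
console.error(formatLog("click", { x: 100, y: 200 })); // console.error writes to stderr
console.log(selectBackend("linux")); // linux-x11
```

Logging to stderr matters for a server launched over stdio, because stdout carries the MCP protocol messages. In a real entry point, `selectBackend` would be called with `process.platform`.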