macos-desktop-control
Provides tools for controlling Android emulators via adb including device listing, screenshot, tap, swipe, text input, key events, opening URLs, and installing APKs.
Provides tools for controlling iOS simulators including device listing, boot/shutdown, screenshot, tap, swipe, text input, opening URLs, and installing apps.
Xcode is required for iOS simulator control; the server uses Xcode's simctl command to manage and interact with iOS simulators.
MCP server for native macOS desktop automation — screen, mouse, keyboard, window management, and mobile simulators.
No Docker. No virtual display. Controls your actual Mac desktop. AI operates in the background or the foreground — you choose.
What's New in v3.1
Smart screenshot compression — screenshots are now compressed by default to prevent API "Input too long" errors on high-DPI displays (Retina, 4K).
Preset | Max Width | Quality | Format | Typical Size |
none | original | 100 | PNG | 4-15 MB |
| 2048 px | 85 | JPEG | 300-500 KB |
medium | 1280 px | 70 | JPEG | 100-400 KB |
| 800 px | 50 | JPEG | 30-150 KB |
Default is medium. Agent picks the level based on the task — or uses none for pixel-perfect work.
Tile mode — when full resolution is needed, split a screenshot into a grid. Agent fetches tiles one at a time, each small enough for the API.
New tool: screenshot_tile — fetch individual tiles from a tiled screenshot.
Compression also works on sim_screenshot and emu_screenshot.
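To make tile mode more concrete, here is a minimal sketch of how a capture could be divided into a grid. The helper name `tileRects`, the row-major tile order, and the rectangle shape are assumptions for illustration; the server's actual tiling logic may differ.

```javascript
// Hypothetical helper: split a width x height image into rows x cols tiles.
// Tile order (row-major, index 0 at the top-left) is an assumption.
function tileRects(width, height, rows, cols) {
  if (![width, height, rows, cols].every((n) => Number.isInteger(n) && n > 0)) {
    throw new RangeError("width, height, rows and cols must be positive integers");
  }
  const tiles = [];
  for (let r = 0; r < rows; r++) {
    for (let c = 0; c < cols; c++) {
      // Integer boundaries so neighbouring tiles never overlap or leave gaps.
      const x0 = Math.floor((c * width) / cols);
      const x1 = Math.floor(((c + 1) * width) / cols);
      const y0 = Math.floor((r * height) / rows);
      const y1 = Math.floor(((r + 1) * height) / rows);
      tiles.push({ index: r * cols + c, x: x0, y: y0, width: x1 - x0, height: y1 - y0 });
    }
  }
  return tiles;
}

// A 3024 x 1964 capture split 2 x 2 gives four 1512 x 982 tiles.
console.log(tileRects(3024, 1964, 2, 2));
```

Each tile can then be cropped and compressed on its own, which keeps every response small even when the full image is large.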
v3.0
Two operation modes. 30 tools (up from 13). Optional iOS/Android simulator control.
Mode | How It Works | User Experience |
Foreground | cliclick + AppleScript (same as v2) | You watch the AI operate your screen |
Background | CGEvent API via JXA bridge | AI works in a target window — your focus stays untouched |
Add target: { app: "Safari" } to any supported tool. Coordinates become window-relative. The AI never steals your foreground.
Quick Start
# 1. Install cliclick
brew install cliclick
# 2. Clone and install
git clone https://github.com/d-wwei/macos-desktop-control.git
cd macos-desktop-control
npm install
# 3. Add to your MCP client (example: Claude Code)
claude mcp add macos-desktop-control -- node /path/to/macos-desktop-control/src/index.js
Grant Accessibility permission to your terminal: System Settings → Privacy & Security → Accessibility.
Features
Foreground Mode (default)
All original v2 capabilities, unchanged.
Screen capture — full screen, region, or specific display; with compression presets and tile mode
Mouse — click (left/right/double/triple), move, drag, scroll, with modifier keys
Keyboard — three typing modes (keystroke, cliclick, direct IME bypass), any key combo via AppleScript key codes
Window management — list windows, focus by app/title, open apps
System — run macOS Shortcuts workflows
Focus protection — app parameter auto-refocuses the target before each action
Background Mode (target parameter)
Add target: { app: "AppName", title?: "WindowTitle" } to operate without stealing focus.
Tool | Background Behavior |
| Captures the target window via screencapture -l<windowId> |
| Sends CGEvent mouse events directly to the target PID |
| Pastes text via CGEvent Cmd+V to the target PID (saves/restores clipboard) |
| Sends CGEvent keyboard events to the target PID |
| Sends CGEvent scroll wheel events to the target PID |
| Flash technique: briefly activates target → drags → restores your foreground app |
| Launches via |
| Returns CGWindowID + PID for each window (used internally for targeting) |
When target is set, x/y coordinates are window-relative — (0,0) is the top-left corner of the target window. The server converts to screen-absolute coordinates internally.
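The target lookup and coordinate conversion described above can be sketched as follows. The record fields (`app`, `title`, `windowId`, `pid`, `bounds`), the substring match on the title, and both function names are assumptions made for this example, not the server's actual code.

```javascript
// Hypothetical window records, shaped like what a window listing might return.
function findTargetWindow(windows, target) {
  return windows.find(
    (w) =>
      w.app === target.app &&
      (target.title === undefined || w.title.includes(target.title))
  );
}

// Convert window-relative (x, y) to screen-absolute using the window bounds.
function toScreenPoint(win, x, y) {
  const { bounds } = win;
  if (x < 0 || y < 0 || x >= bounds.width || y >= bounds.height) {
    throw new RangeError(`(${x}, ${y}) is outside the target window`);
  }
  return { x: bounds.x + x, y: bounds.y + y };
}

const windows = [
  { app: "Safari", title: "GitHub", windowId: 101, pid: 4242,
    bounds: { x: 300, y: 120, width: 1200, height: 800 } },
];
const win = findTargetWindow(windows, { app: "Safari", title: "GitHub" });
console.log(toScreenPoint(win, 100, 200)); // { x: 400, y: 320 }
```

Rejecting points outside the bounds is a design choice of this sketch; it avoids posting an event that would land on a different window.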
iOS Simulator (requires Xcode)
Tools register automatically when xcrun simctl is detected.
Tool | Function |
| List simulators and their status |
| Start or stop a simulator |
| Capture at native device resolution |
| Tap at iOS-space coordinates (auto-mapped to Simulator window) |
| Swipe gesture with duration control |
| Type text into the simulator |
| Open a URL on the simulator |
| Install a .app bundle |
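As a rough illustration of how several of these operations map onto simctl, the sketch below builds argument lists for `xcrun`. The action names and option fields are hypothetical, and the server may invoke simctl differently. Tap, swipe, and text input are left out because the sketch covers only operations that simctl exposes as subcommands.

```javascript
// Build argument lists for `xcrun`; running them is left to the caller,
// e.g. child_process.execFile("xcrun", args).
// Action names and option fields are hypothetical.
function simctlArgs(action, opts = {}) {
  const device = opts.udid ?? "booted"; // simctl accepts "booted" as a device alias
  switch (action) {
    case "list":
      return ["simctl", "list", "devices", "--json"];
    case "boot":
      if (!opts.udid) throw new Error("boot needs an explicit device UDID");
      return ["simctl", "boot", opts.udid];
    case "shutdown":
      return ["simctl", "shutdown", device];
    case "screenshot":
      return ["simctl", "io", device, "screenshot", opts.path];
    case "open_url":
      return ["simctl", "openurl", device, opts.url];
    case "install":
      return ["simctl", "install", device, opts.appPath];
    default:
      throw new Error(`Unsupported action: ${action}`);
  }
}

console.log(simctlArgs("open_url", { url: "https://example.com" }).join(" "));
// simctl openurl booted https://example.com
```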
Android Emulator (requires adb)
Tools register automatically when adb is detected. All operations are fully background — adb never steals focus.
Tool | Function |
| List connected devices/emulators |
| Capture via |
| Tap at device coordinates |
| Swipe with duration control |
| Type text |
| Send keyevent (HOME, BACK, ENTER, etc.) |
| Open a URL via intent |
| Install an APK |
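The operations in this table correspond to well-known adb invocations. The sketch below builds the argument lists; the action names and option fields are hypothetical, and the commands the server actually issues may differ.

```javascript
// Build argument lists for `adb`. Action names and option fields are hypothetical.
function adbArgs(action, opts = {}) {
  const prefix = opts.serial ? ["-s", opts.serial] : []; // pick a device when several are attached
  switch (action) {
    case "list":
      return ["devices", "-l"];
    case "screenshot":
      return [...prefix, "exec-out", "screencap", "-p"]; // PNG bytes on stdout
    case "tap":
      return [...prefix, "shell", "input", "tap", String(opts.x), String(opts.y)];
    case "swipe":
      return [...prefix, "shell", "input", "swipe",
        String(opts.x1), String(opts.y1), String(opts.x2), String(opts.y2),
        String(opts.durationMs ?? 300)];
    case "text":
      // `input text` treats %s as a space; other shell metacharacters would
      // need extra escaping that this sketch does not attempt.
      return [...prefix, "shell", "input", "text", opts.text.replace(/ /g, "%s")];
    case "key": {
      const name = opts.key.toUpperCase();
      return [...prefix, "shell", "input", "keyevent",
        name.startsWith("KEYCODE_") ? name : `KEYCODE_${name}`];
    }
    case "open_url":
      return [...prefix, "shell", "am", "start",
        "-a", "android.intent.action.VIEW", "-d", opts.url];
    case "install":
      return [...prefix, "install", "-r", opts.apkPath];
    default:
      throw new Error(`Unsupported action: ${action}`);
  }
}

console.log(adbArgs("key", { key: "home" }).join(" "));
// shell input keyevent KEYCODE_HOME
```

Because every call goes through the adb process rather than the emulator window, none of these commands needs the emulator to be in the foreground.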
Usage Examples
Background screenshot of a specific app
{ "target": { "app": "Safari" } }
Captures Safari's window even if it's behind other windows. Your foreground stays untouched.
Background click in a window
{ "x": 100, "y": 200, "target": { "app": "Safari", "title": "GitHub" } }
Clicks at position (100, 200) relative to the Safari window titled "GitHub". No focus change.
Background text input
{ "text": "hello world", "target": { "app": "Notes" } }
Types into Notes via clipboard paste (CGEvent Cmd+V). Clipboard is saved and restored.
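The save, paste, restore sequence for background text input can be sketched like this. The `clipboard` object and the `paste` callback are hypothetical stand-ins; on macOS the clipboard could be backed by the pbpaste and pbcopy commands, and `paste` would post Cmd+V to the target process.

```javascript
// Run `paste` with `text` on the clipboard, then put the old contents back.
// `clipboard` is a hypothetical object with read() and write(value).
function pasteWithClipboardRestore(clipboard, text, paste) {
  const saved = clipboard.read();
  try {
    clipboard.write(text);
    paste(); // e.g. post Cmd+V to the target process
  } finally {
    clipboard.write(saved); // restore even if the paste step throws
  }
}

// In-memory stand-in so the sketch runs anywhere.
const fakeClipboard = {
  value: "user data",
  read() { return this.value; },
  write(v) { this.value = v; },
};
let pasted = null;
pasteWithClipboardRestore(fakeClipboard, "hello world", () => {
  pasted = fakeClipboard.read();
});
console.log(pasted, "|", fakeClipboard.value); // hello world | user data
```

A real implementation would probably need a short delay before restoring, since the target app reads the clipboard some time after the key event is posted.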
Compressed screenshot (default behavior in v3.1)
{ "target": { "app": "Chrome" } }
Returns a 1280px-wide JPEG (~150KB) instead of a raw PNG (~5MB). Works out of the box.
High-res screenshot with no compression
{ "target": { "app": "Chrome" }, "compression": "none" }
Returns the raw PNG, same as v3.0 behavior.
Custom compression
{ "target": { "app": "Chrome" }, "compression": "low", "maxWidth": 1920, "quality": 90 }
Explicit maxWidth/quality/format override the preset values.
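One way to picture the override rule is a merge in which explicit fields win over the preset. The sketch below is an assumption about the shape of that logic, not the server's code. It includes only the `none` and `medium` presets, using the values stated in the v3.1 notes; the function name `resolveCompression` is hypothetical.

```javascript
// Hypothetical preset table; only the presets whose values are stated
// unambiguously in the release notes are included.
const PRESETS = {
  none: { maxWidth: null, quality: 100, format: "png" }, // original size
  medium: { maxWidth: 1280, quality: 70, format: "jpeg" }, // documented default
};

// Merge a preset with explicit overrides; explicit fields win.
function resolveCompression(args = {}) {
  const { compression = "medium", maxWidth, quality, format } = args;
  const preset = PRESETS[compression];
  if (!preset) throw new Error(`Unknown compression preset: ${compression}`);
  return {
    maxWidth: maxWidth ?? preset.maxWidth,
    quality: quality ?? preset.quality,
    format: format ?? preset.format,
  };
}

console.log(resolveCompression({ maxWidth: 1920, quality: 90 }));
// { maxWidth: 1920, quality: 90, format: 'jpeg' }
```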
Tile mode for full-resolution inspection
{ "target": { "app": "Chrome" }, "tile": { "rows": 2, "cols": 2 } }
Returns a manifest with tile metadata. Then fetch individual tiles:
{ "id": "tiles-1711929600000-abc123", "index": 0, "compression": "medium" }
Focus-safe foreground operation
{ "text": "hello", "app": "TextEdit", "mode": "direct" }
Writes text directly via AppleScript, bypassing the input method entirely.
Prerequisites
macOS (tested on Sequoia 15.x and Tahoe 26.x)
Node.js 18+
cliclick: brew install cliclick
Accessibility permission for your terminal app
Optional: Xcode (for iOS simulator tools)
Optional: Android SDK with adb (for Android emulator tools)
Client Configuration
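Uses stdio transport. The configuration is nearly identical across MCP clients: the file location varies, and VS Code uses a top-level "servers" key instead of "mcpServers".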
# Project scope
claude mcp add macos-desktop-control -- node /path/to/macos-desktop-control/src/index.js
# Global scope
claude mcp add macos-desktop-control -s user -- node /path/to/macos-desktop-control/src/index.js
~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "macos-desktop-control": {
      "command": "node",
      "args": ["/path/to/macos-desktop-control/src/index.js"]
    }
  }
}
.codex/mcp.json:
{
  "mcpServers": {
    "macos-desktop-control": {
      "command": "node",
      "args": ["/path/to/macos-desktop-control/src/index.js"]
    }
  }
}
~/.gemini/settings.json:
{
  "mcpServers": {
    "macos-desktop-control": {
      "command": "node",
      "args": ["/path/to/macos-desktop-control/src/index.js"]
    }
  }
}
.cursor/mcp.json:
{
  "mcpServers": {
    "macos-desktop-control": {
      "command": "node",
      "args": ["/path/to/macos-desktop-control/src/index.js"]
    }
  }
}
.vscode/mcp.json:
{
  "servers": {
    "macos-desktop-control": {
      "command": "node",
      "args": ["/path/to/macos-desktop-control/src/index.js"]
    }
  }
}
Architecture
                       ┌─────────────────────────────────┐
                       │  MCP Server (stdio transport)   │
                       └──────────┬──────────────────────┘
                                  │
           ┌──────────────────────┼──────────────────────┐
           │                      │                      │
    Foreground Mode        Background Mode        Simulator Mode
           │                      │                      │
┌──────────┴──────────┐   ┌───────┴────────┐    ┌────────┴────────┐
│ cliclick (mouse)    │   │ CGEvent API    │    │ xcrun simctl    │
│ osascript (keyboard)│   │ via JXA bridge │    │ (iOS)           │
│ screencapture       │   │ CGEventPost-   │    │                 │
│ shortcuts CLI       │   │ ToPid(pid)     │    │ adb             │
└─────────────────────┘   │ screencapture  │    │ (Android)       │
                          │ -l<windowId>   │    └─────────────────┘
                          └────────────────┘
Background mode internals:
CGWindowListCopyWindowInfo via JXA enumerates windows with CGWindowID, PID, and bounds
Window-relative coordinates are converted to screen-absolute using bounds
CGEventPostToPid sends mouse/keyboard/scroll events directly to the target process
screencapture -l<windowId> captures a specific window without requiring focus
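The window capture step can be illustrated with a small helper that assembles arguments for the macOS screencapture command. The function name and the `shadow` option are hypothetical; the `-x`, `-o`, and `-l` flags are standard screencapture options, but whether the server passes `-x` or `-o` is an assumption.

```javascript
// Build arguments for macOS `screencapture` to grab one window by CGWindowID.
// The function name and option names are hypothetical.
function windowCaptureArgs(windowId, outPath, { shadow = false } = {}) {
  if (!Number.isInteger(windowId) || windowId <= 0) {
    throw new TypeError("windowId must be a positive integer");
  }
  const args = ["-x"];          // -x: do not play the capture sound
  if (!shadow) args.push("-o"); // -o: omit the window shadow
  args.push(`-l${windowId}`);   // -l<windowId>: capture that window only
  args.push(outPath);
  return args;
}

console.log(windowCaptureArgs(101, "/tmp/window.png").join(" "));
// -x -o -l101 /tmp/window.png
```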
Compared to Alternatives
Solution | Platform | Background Mode | Simulator Support | Real Desktop |
This project | macOS | Yes (CGEvent) | iOS + Android | Yes |
Anthropic Computer Use | Linux | No | No | No (virtual) |
MCPControl | Windows | No | No | Yes |
Playwright MCP | Cross-platform | Partial | No | Browser only |
PyAutoGUI MCP servers | Cross-platform | No | No | Yes |
Why macOS-native
Background operation — CGEvent API posts events to a target PID without touching focus. PyAutoGUI and cliclick both require the window to be foreground.
Focus-stealing prevention — app parameter + ensureAppFocus() handles the approval-dialog problem that all MCP clients share.
IME bypass — direct mode writes text through AppleScript, skipping the input method entirely. PyAutoGUI's typewrite only handles ASCII.
Simulator integration — iOS and Android simulators controlled through the same MCP interface. No separate tools needed.
Lightweight — cliclick (one brew package) + built-in macOS tools. No Python runtime, no ONNX, no heavy dependencies.
When to choose a cross-platform solution
You need Windows or Linux support
You need OCR-based element detection
Background operation is not a requirement for your workflow
Update Management
This project integrates update-kit for update orchestration with policy control, verification, and rollback.
Check for updates:
npx update-kit check --cwd /path/to/macos-desktop-control --json
Apply an update (git pull + syntax verification):
npx update-kit apply --cwd /path/to/macos-desktop-control
Roll back if something goes wrong:
npx update-kit rollback --cwd /path/to/macos-desktop-control
Configuration lives in update.config.json. State and audit logs are stored in .update-kit/ (gitignored).
License
MIT