Electron Native MCP Server

by aj47
# Implementation Summary

## Overview

Successfully implemented a complete Model Context Protocol (MCP) server for debugging and automating native Electron applications on macOS.

## What Was Built

### Core Components

1. **MCP Server** (`src/server.ts`, `src/index.ts`)
   - Built using `@modelcontextprotocol/sdk` v1.21.0
   - Stdio transport for communication
   - 24 registered tools across 4 categories

2. **CDP Client** (`src/lib/cdp-client.ts`)
   - Chrome DevTools Protocol client using `chrome-remote-interface`
   - Manages connections to Electron apps
   - Supports DOM inspection, JavaScript execution, screenshots

3. **Accessibility Manager** (`src/lib/accessibility.ts`)
   - macOS UI automation using `robotjs`
   - Mouse control (move, click, drag)
   - Keyboard control (type, press keys)
   - Screenshot capture

4. **Permissions Manager** (`src/lib/permissions.ts`)
   - Checks and manages macOS system permissions
   - Uses `node-mac-permissions`
   - Provides user-friendly setup instructions

### Tool Categories

#### 1. DOM Inspection Tools (7 tools)

- `list_electron_targets` - Discover Electron windows
- `connect_to_electron_target` - Establish CDP connection
- `get_dom_tree` - Retrieve DOM structure
- `query_selector` - Find elements by CSS selector
- `query_selector_all` - Find all matching elements
- `execute_javascript` - Run code in Electron context
- `take_electron_screenshot` - Capture Electron window

#### 2. UI Automation Tools (9 tools)

- `get_mouse_position` - Get cursor position
- `move_mouse` - Move cursor
- `click` - Click (left/right/middle)
- `double_click` - Double click
- `drag` - Drag and drop
- `type_text` - Type text
- `press_key` - Press keys with modifiers
- `take_screenshot` - Capture screen/region
- `get_screen_size` - Get display dimensions

#### 3. Global Hotkey Tools (4 tools)

- `trigger_hotkey` - Custom keyboard shortcuts
- `trigger_common_macos_hotkey` - System hotkeys (Spotlight, Mission Control, etc.)
- `simulate_app_shortcut` - App-specific shortcuts
- `send_key_sequence` - Complex key sequences

#### 4. Permission Tools (4 tools)

- `check_permission` - Check single permission
- `check_all_permissions` - Check all permissions
- `get_permission_instructions` - Get setup guide
- `request_permission` - Request permission

### Example Electron App

Created a test application (`examples/test-electron-app/`) with:

- Interactive UI elements (buttons, inputs)
- Event logging
- Styled interface
- Pre-configured for debugging (`--inspect=9222`)

## Technical Stack

### Dependencies

**Core:**

- `@modelcontextprotocol/sdk` ^1.21.0 - MCP protocol implementation
- `zod` ^3.23.8 - Schema validation

**Electron Debugging:**

- `chrome-remote-interface` ^0.33.2 - CDP client

**macOS Automation:**

- `robotjs` ^0.6.0 - Native UI automation
- `node-mac-permissions` ^2.3.0 - Permission management

**Development:**

- `typescript` ^5.7.2
- `tsx` ^4.19.2
- `eslint` ^8.57.1
- `prettier` ^3.4.2

### Architecture

```
electron-native-mcp/
├── src/
│   ├── index.ts              # Entry point (stdio transport)
│   ├── server.ts             # MCP server configuration
│   ├── types/
│   │   └── index.ts          # TypeScript type definitions
│   ├── lib/
│   │   ├── cdp-client.ts     # Chrome DevTools Protocol client
│   │   ├── accessibility.ts  # macOS UI automation
│   │   └── permissions.ts    # Permission management
│   └── tools/
│       ├── dom/              # DOM inspection tools
│       ├── ui/               # UI automation tools
│       ├── hotkey/           # Hotkey tools
│       └── permissions/      # Permission tools
├── examples/
│   └── test-electron-app/    # Test Electron application
├── dist/                     # Compiled JavaScript
├── package.json
├── tsconfig.json
├── README.md
├── USAGE.md
└── IMPLEMENTATION.md
```

## Key Features

### 1. DOM Inspection via CDP

- Connect to any Electron app running with the `--inspect` flag
- Query DOM using CSS selectors
- Execute arbitrary JavaScript
- Take screenshots of web content

### 2. Native UI Automation

- Control mouse and keyboard
- Works with any macOS application
- Pixel-perfect positioning
- Support for all mouse buttons and keyboard modifiers

### 3. Global Hotkeys

- Trigger system-wide keyboard shortcuts
- Pre-configured common macOS hotkeys
- Custom key combinations
- Complex key sequences

### 4. Permission Management

- Check permission status
- Request permissions programmatically
- User-friendly setup instructions
- Supports all required macOS permissions

## Implementation Challenges & Solutions

### Challenge 1: nut.js Unavailability

**Problem:** nut.js is no longer publicly available on npm

**Solution:** Switched to robotjs for UI automation

### Challenge 2: MCP SDK API Changes

**Problem:** Tool registration API differs from documentation

**Solution:** Used `registerTool()` method with proper schema structure

### Challenge 3: TypeScript Type Compatibility

**Problem:** ToolResult type didn't match MCP SDK expectations

**Solution:** Added index signature `[x: string]: unknown` to ToolResult interface

### Challenge 4: Permission Management

**Problem:** macOS requires multiple system permissions

**Solution:** Created comprehensive permission manager with instructions

## Testing

### Manual Testing Steps

1. **Build the server:**

   ```bash
   npm install
   npm run build
   ```

2. **Run the test Electron app:**

   ```bash
   cd examples/test-electron-app
   npm install
   npm start
   ```

3. **Test DOM inspection:**
   - List targets
   - Connect to test app
   - Query selectors
   - Execute JavaScript

4. **Test UI automation:**
   - Check permissions
   - Move mouse
   - Click buttons
   - Type text

5. **Test hotkeys:**
   - Trigger Spotlight
   - Trigger custom shortcuts

## Future Enhancements

### Potential Improvements

1. **Enhanced Image Recognition**
   - Add OCR capabilities
   - Template matching for UI elements

2. **Recording & Playback**
   - Record user actions
   - Replay automation sequences

3. **Better Error Handling**
   - More descriptive error messages
   - Automatic retry logic

4. **Performance Optimization**
   - Connection pooling for CDP
   - Caching of DOM queries

5. **Additional Platforms**
   - Windows support
   - Linux support

6. **Advanced CDP Features**
   - Network interception
   - Console log capture
   - Performance profiling

## Known Limitations

1. **macOS Only:** Currently supports only macOS because of the permission system and robotjs
2. **RobotJS Dependencies:** Requires native compilation (Xcode Command Line Tools)
3. **CDP Port:** Assumes default port 9222 (configurable)
4. **Screenshot Format:** Limited to PNG/JPEG for Electron, PNG for native
5. **Permission Prompts:** Some permissions require manual user action

## Conclusion

Successfully delivered a fully functional MCP server that meets all requirements:

- ✅ See native Electron app DOM (via CDP)
- ✅ Click buttons in native Electron app (via robotjs)
- ✅ Trigger global hotkeys on macOS (via robotjs)

The server provides 24 tools across 4 categories, comprehensive documentation, and a test application for validation.

## Files Created

- Core implementation: 13 TypeScript files
- Configuration: 5 files (package.json, tsconfig.json, eslint, prettier, gitignore)
- Documentation: 3 files (README.md, USAGE.md, IMPLEMENTATION.md)
- Example app: 4 files (package.json, main.js, index.html, README.md)

**Total:** 25 files, ~2500 lines of code
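
As a rough illustration of two points from this summary, the sketch below shows what a `ToolResult` interface with the index signature from Challenge 3 might look like, together with the tool inventory from the Tool Categories section. Only the index signature `[x: string]: unknown` and the tool names come from this document. The `content` and `isError` fields, and the helpers `textResult`, `countTools`, and `categoryOf`, are assumptions made for illustration and may not match the project's actual source.

```typescript
// Sketch only: the index signature is taken from Challenge 3. The other
// fields are assumed for illustration and may differ from the real type.
interface ToolResult {
  [x: string]: unknown;
  content: Array<{ type: "text"; text: string }>;
  isError?: boolean;
}

// Hypothetical helper that wraps a plain string in a ToolResult.
function textResult(text: string, isError = false): ToolResult {
  return { content: [{ type: "text", text }], isError };
}

// Tool names as listed under "Tool Categories", grouped by category.
const TOOL_CATEGORIES: Record<string, readonly string[]> = {
  dom: [
    "list_electron_targets", "connect_to_electron_target", "get_dom_tree",
    "query_selector", "query_selector_all", "execute_javascript",
    "take_electron_screenshot",
  ],
  ui: [
    "get_mouse_position", "move_mouse", "click", "double_click", "drag",
    "type_text", "press_key", "take_screenshot", "get_screen_size",
  ],
  hotkey: [
    "trigger_hotkey", "trigger_common_macos_hotkey",
    "simulate_app_shortcut", "send_key_sequence",
  ],
  permissions: [
    "check_permission", "check_all_permissions",
    "get_permission_instructions", "request_permission",
  ],
};

// Sum the number of tool names across all categories.
function countTools(categories: Record<string, readonly string[]>): number {
  return Object.keys(categories).reduce(
    (sum, key) => sum + categories[key].length,
    0,
  );
}

// Hypothetical lookup: return the category that lists the given tool name,
// or undefined when the name is not in the inventory.
function categoryOf(tool: string): string | undefined {
  for (const key of Object.keys(TOOL_CATEGORIES)) {
    if (TOOL_CATEGORIES[key].indexOf(tool) !== -1) {
      return key;
    }
  }
  return undefined;
}
```

With this inventory, `countTools(TOOL_CATEGORIES)` returns 24, which matches the total stated in the Conclusion, and `categoryOf("press_key")` returns `"ui"`.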
