Provides comprehensive Windows system control, including file operations, process management, window control, screenshots, clipboard access, and PowerShell/CMD execution, with optional mouse/keyboard automation and browser control.
A Model Context Protocol server that provides desktop automation using RobotJS together with screen capture, enabling LLMs to move the mouse, send keyboard input, and capture screenshots of the desktop environment.
Enables AI-driven testing and automation of Tauri desktop applications through natural language, allowing users to interact with UI elements, capture screenshots, execute commands, and test application flows without manual clicking or complex scripts.
Enables AI-powered web browsing automation using Google's Gemini 2.5 Computer Use API. Allows agents to navigate websites, click buttons, fill forms, and extract information through natural language commands with real-time progress tracking.