Offers a community support channel through Discord for users to get help with the MCP server.
Hosts the project repository on GitHub where users can access code, documentation, and demonstration resources.
Uses ONNXRuntime for efficient machine learning model execution to power OCR capabilities in the MCP server.
Utilizes pytest for testing the MCP server components and functionality.
Built with Python libraries like PyAutoGUI for computer control functionality, with installation and execution through Python package management.
Computer Control MCP
MCP server that provides computer control capabilities such as mouse, keyboard, and OCR, using PyAutoGUI, RapidOCR, and ONNXRuntime. Similar to 'computer-use' by Anthropic, with zero external dependencies.
Only tested on Windows. Should work on other platforms.
Quick Usage (MCP Setup Using uvx)
Note:
OR install globally with pip:
Then run the server with:
Features
Control mouse movements and clicks
Type text at the current cursor position
Take screenshots of the entire screen or specific windows with optional saving to downloads directory
Extract text from screenshots using OCR (Optical Character Recognition)
List and activate windows
Press keyboard keys
Drag and drop operations
Available Tools
Mouse Control
click_screen(x: int, y: int): Click at specified screen coordinates
move_mouse(x: int, y: int): Move mouse cursor to specified coordinates
drag_mouse(from_x: int, from_y: int, to_x: int, to_y: int, duration: float = 0.5): Drag mouse from one position to another
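MCP clients invoke tools like these through JSON-RPC 2.0 `tools/call` requests. The sketch below shows roughly what such a request could look like for the mouse tools. The helper `build_tool_call` is hypothetical and not part of this project; in practice an MCP client library would build and send these messages for you.

```python
import json
from itertools import count

# JSON-RPC ids only need to be unique within one session.
_request_ids = count(1)


def build_tool_call(name: str, **arguments) -> dict:
    """Build an MCP `tools/call` JSON-RPC 2.0 request for a single tool.

    Hypothetical helper for illustration only; it does not send anything.
    """
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names follow the tool signatures listed above.
click = build_tool_call("click_screen", x=640, y=360)
drag = build_tool_call(
    "drag_mouse", from_x=100, from_y=100, to_x=400, to_y=300, duration=0.5
)

print(json.dumps(click, indent=2))
print(json.dumps(drag, indent=2))
```

The coordinates are arbitrary example values; calling get_screen_size() first is one way to keep them within the visible screen.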
Keyboard Control
type_text(text: str): Type the specified text at current cursor position
press_key(key: str): Press a specified keyboard key
Screen and Window Management
take_screenshot(title_pattern: str = None, use_regex: bool = False, threshold: int = 60, with_ocr_text_and_coords: bool = False, scale_percent_for_ocr: int = 100, save_to_downloads: bool = False): Capture screen or window with optional OCR
get_screen_size(): Get current screen resolution
list_windows(): List all open windows
activate_window(title_pattern: str, use_regex: bool = False, threshold: int = 60): Bring specified window to foreground
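The `title_pattern`, `use_regex`, and `threshold` parameters suggest that windows are selected either by regular expression or by a fuzzy title match with a minimum score. The sketch below illustrates that idea only. It assumes a 0-100 similarity score and uses the standard library's `difflib` as a stand-in; the server's actual matching algorithm and scoring may differ, and `match_windows` is a hypothetical name.

```python
import re
from difflib import SequenceMatcher


def match_windows(titles, title_pattern, use_regex=False, threshold=60):
    """Return the window titles that match, best match first.

    Assumption: difflib's similarity ratio (scaled to 0-100) stands in for
    whatever fuzzy scorer the server really uses.
    """
    if use_regex:
        pattern = re.compile(title_pattern, re.IGNORECASE)
        return [title for title in titles if pattern.search(title)]

    scored = []
    for title in titles:
        ratio = SequenceMatcher(None, title_pattern.lower(), title.lower()).ratio()
        score = 100 * ratio
        if score >= threshold:
            scored.append((score, title))
    # Highest-scoring titles come first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored]


titles = ["Untitled - Notepad", "Calculator", "README.md - Visual Studio Code"]

fuzzy = match_windows(titles, "untitled notepad")
strict = match_windows(titles, "notepad", threshold=60)
loose = match_windows(titles, "notepad", threshold=50)
by_regex = match_windows(titles, r"code$", use_regex=True)

print(fuzzy, strict, loose, by_regex)
```

With this stand-in scorer, a short pattern such as "notepad" scores below the default threshold of 60 against a longer title, so lowering `threshold` or switching to `use_regex=True` is how a caller would loosen or tighten the match.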
Development
Setting up the Development Environment
Running Tests
API Reference
See the API Reference for detailed information about the available functions and classes.
License
MIT
For more information or help