Offers a community support channel through Discord for users to get help with the MCP server.
Hosts the project repository on GitHub where users can access code, documentation, and demonstration resources.
Uses ONNXRuntime for efficient machine learning model execution to power OCR capabilities in the MCP server.
Utilizes pytest for testing the MCP server components and functionality.
Built with Python libraries such as PyAutoGUI for computer control, and installed and run through standard Python packaging tools.
Computer Control MCP
MCP server that provides computer control capabilities such as mouse, keyboard, and OCR, using PyAutoGUI, RapidOCR, and ONNXRuntime. Similar to 'computer-use' by Anthropic, with zero external dependencies.

Quick Usage (MCP Setup Using uvx)
Note:
OR install globally with pip:
Then run the server with:
Features
Control mouse movements and clicks
Type text at the current cursor position
Take screenshots of the entire screen or specific windows with optional saving to downloads directory
Extract text from screenshots using OCR (Optical Character Recognition)
List and activate windows
Press keyboard keys
Drag and drop operations
Enhanced screenshot capture for GPU-accelerated windows (Windows only)
Note on GPU-accelerated Windows
Traditional screenshot methods such as GDI/PrintWindow fail to capture GPU-accelerated windows and produce black images instead. This affects games, media players, Electron apps, browsers with GPU acceleration, streaming software, and CAD tools. To capture these windows, use Windows Graphics Capture (WGC), either through the use_wgc flag of the take_screenshot tool or through the COMPUTER_CONTROL_MCP_WGC_PATTERNS environment variable.
Configuration
Custom Screenshot Directory
By default, screenshots are saved to the OS downloads directory. You can customize this by setting the COMPUTER_CONTROL_MCP_SCREENSHOT_DIR environment variable:
Or set it system-wide:
If the specified directory doesn't exist, the server will fall back to the default downloads directory.
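The fallback rule above can be sketched in a few lines of Python. This is an illustrative sketch, not the server's actual code: the helper name resolve_screenshot_dir is hypothetical, and the default location ~/Downloads is an assumption about what "OS downloads directory" means on a given system.

```python
import os
from pathlib import Path

ENV_VAR = "COMPUTER_CONTROL_MCP_SCREENSHOT_DIR"


def resolve_screenshot_dir(default_dir=None):
    """Return the directory screenshots would be saved to (hypothetical helper)."""
    # Assumption: the OS downloads directory is ~/Downloads unless overridden.
    fallback = Path(default_dir) if default_dir is not None else Path.home() / "Downloads"

    custom = os.environ.get(ENV_VAR)
    # Use the custom directory only if it is set and actually exists.
    if custom and Path(custom).is_dir():
        return Path(custom)

    # Otherwise fall back to the default downloads directory.
    return fallback
```

With the variable unset, or pointing at a path that does not exist, the function returns the fallback directory.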
Automatic WGC for Specific Windows
You can configure the system to automatically use Windows Graphics Capture (WGC) for specific windows by setting the COMPUTER_CONTROL_MCP_WGC_PATTERNS environment variable. This variable should contain comma-separated patterns that match window titles:
Or set it system-wide:
When this variable is set, any window whose title contains any of the specified patterns will automatically use WGC for screenshot capture, eliminating black screens for GPU-accelerated applications.
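The matching rule described above (comma-separated patterns, substring match against the window title) might look like the following sketch. The function names are hypothetical, the sample patterns are illustrative only, and the comparison here is case-sensitive; the server's real handling of case and whitespace may differ.

```python
def parse_wgc_patterns(raw):
    """Split a comma-separated pattern string, dropping blanks (hypothetical helper)."""
    return [p.strip() for p in (raw or "").split(",") if p.strip()]


def should_use_wgc(window_title, patterns):
    """Return True if the title contains any pattern.

    Assumption: plain case-sensitive substring matching.
    """
    return any(p in window_title for p in patterns)


# Illustrative values, as they might appear in COMPUTER_CONTROL_MCP_WGC_PATTERNS.
patterns = parse_wgc_patterns("obs, discord, game")
print(should_use_wgc("discord - general", patterns))
print(should_use_wgc("Notepad", patterns))
```

In this sketch an unset or empty variable yields no patterns, so no window is switched to WGC automatically.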
Available Tools
Mouse Control
click_screen(x: int, y: int): Click at specified screen coordinates
move_mouse(x: int, y: int): Move mouse cursor to specified coordinates
drag_mouse(from_x: int, from_y: int, to_x: int, to_y: int, duration: float = 0.5): Drag mouse from one position to another
mouse_down(button: str = "left"): Hold down a mouse button ('left', 'right', 'middle')
mouse_up(button: str = "left"): Release a mouse button ('left', 'right', 'middle')
Keyboard Control
type_text(text: str): Type the specified text at current cursor position
press_key(key: str): Press a specified keyboard key
key_down(key: str): Hold down a specific keyboard key until released
key_up(key: str): Release a specific keyboard key
press_keys(keys: Union[str, List[Union[str, List[str]]]]): Press keyboard keys (supports single keys, sequences, and combinations)
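The press_keys type signature accepts three shapes of input. The sketch below shows one plausible reading of it, under stated assumptions: a bare string is a single key, a flat list is a sequence of keys pressed one after another, and a nested list is a combination pressed together. The normalize_keys helper is hypothetical and is not part of the server; it only illustrates how the three shapes could be reduced to a uniform list of steps.

```python
from typing import List, Union

Keys = Union[str, List[Union[str, List[str]]]]


def normalize_keys(keys: Keys) -> List[List[str]]:
    """Turn any accepted input shape into a list of steps.

    Each step is a list of keys assumed to be pressed together.
    """
    if isinstance(keys, str):
        # Single key, e.g. "enter"
        return [[keys]]

    steps: List[List[str]] = []
    for item in keys:
        if isinstance(item, str):
            # Element of a sequence: pressed on its own
            steps.append([item])
        else:
            # Nested list: treated as a combination, e.g. ["ctrl", "c"]
            steps.append(list(item))
    return steps


print(normalize_keys("enter"))
print(normalize_keys(["a", "b"]))
print(normalize_keys([["ctrl", "c"], "enter"]))
```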
Screen and Window Management
take_screenshot(title_pattern: str = None, use_regex: bool = False, threshold: int = 60, scale_percent_for_ocr: int = None, save_to_downloads: bool = False, use_wgc: bool = False): Capture screen or window
take_screenshot_with_ocr(title_pattern: str = None, use_regex: bool = False, threshold: int = 10, scale_percent_for_ocr: int = None, save_to_downloads: bool = False): Extract and return text with coordinates using OCR from screen or window
get_screen_size(): Get current screen resolution
list_windows(): List all open windows
activate_window(title_pattern: str, use_regex: bool = False, threshold: int = 60): Bring specified window to foreground
wait_milliseconds(milliseconds: int): Wait for a specified number of milliseconds
Development
Setting up the Development Environment
Running Tests
API Reference
See the API Reference for detailed information about the available functions and classes.
License
MIT