Skip to main content
Glama

Computer Control MCP

by AB498

Computer Control MCP

MCP server that provides computer control capabilities, like mouse, keyboard, OCR, etc. using PyAutoGUI, RapidOCR, ONNXRuntime. Similar to 'computer-use' by Anthropic. With Zero External Dependencies.

  • Only tested on Windows. Should work on other platforms.

MCP Computer Control Demo

Quick Usage (MCP Setup Using uvx)

Note: Running uvx computer-control-mcp@latest for the first time will download python dependencies (around 70MB) which may take some time. Recommended to run this in a terminal before using it as MCP. Subsequent runs will be instant.

{ "mcpServers": { "computer-control-mcp": { "command": "uvx", "args": ["computer-control-mcp@latest"] } } }

OR install globally with pip:

pip install computer-control-mcp

Then run the server with:

computer-control-mcp # instead of uvx computer-control-mcp, so you can use the latest version, also you can `uv cache clean` to clear the cache and `uvx` again to use latest version.

Features

  • Control mouse movements and clicks
  • Type text at the current cursor position
  • Take screenshots of the entire screen or specific windows with optional saving to downloads directory
  • Extract text from screenshots using OCR (Optical Character Recognition)
  • List and activate windows
  • Press keyboard keys
  • Drag and drop operations

Available Tools

Mouse Control

  • click_screen(x: int, y: int): Click at specified screen coordinates
  • move_mouse(x: int, y: int): Move mouse cursor to specified coordinates
  • drag_mouse(from_x: int, from_y: int, to_x: int, to_y: int, duration: float = 0.5): Drag mouse from one position to another

Keyboard Control

  • type_text(text: str): Type the specified text at current cursor position
  • press_key(key: str): Press a specified keyboard key

Screen and Window Management

  • take_screenshot(title_pattern: str = None, use_regex: bool = False, threshold: int = 60, with_ocr_text_and_coords: bool = False, scale_percent_for_ocr: int = 100, save_to_downloads: bool = False): Capture screen or window with optional OCR
  • get_screen_size(): Get current screen resolution
  • list_windows(): List all open windows
  • activate_window(title_pattern: str, use_regex: bool = False, threshold: int = 60): Bring specified window to foreground

Development

Setting up the Development Environment

# Clone the repository git clone https://github.com/AB498/computer-control-mcp.git cd computer-control-mcp # Install in development mode pip install -e . # Start server python -m computer_control_mcp.core

Running Tests

python -m pytest

API Reference

See the API Reference for detailed information about the available functions and classes.

License

MIT

For more information or help

-
security - not tested
A
license - permissive license
-
quality - not tested

local-only server

The server can only run on the client's local machine because it depends on local resources.

MCP server that provides computer control capabilities including mouse movements, keyboard actions, screenshot capture with OCR, and window management through a unified API.

  1. MCP server that provides computer control capabilities, like mouse, keyboard, OCR, etc. using PyAutoGUI, RapidOCR, ONNXRuntime. Similar to 'computer-use' by Anthropic. With Zero External Dependencies.
    1. Quick Usage (MCP Setup Using uvx)
      1. Features
        1. Available Tools
          1. Mouse Control
          2. Keyboard Control
          3. Screen and Window Management
        2. Development
          1. Setting up the Development Environment
          2. Running Tests
        3. API Reference
          1. License
            1. For more information or help

              Related MCP Servers

              • -
                security
                A
                license
                -
                quality
                An MCP server that bridges AI agents with GUI automation capabilities, allowing them to control mouse, keyboard, windows, and take screenshots to interact with desktop applications.
                Last updated -
                7
                Python
                MIT License
                • Apple
                • Linux
              • -
                security
                A
                license
                -
                quality
                Provides automated GUI testing and control capabilities through an MCP server that enables mouse movements, keyboard input, screen captures, and image recognition across Windows, macOS, and Linux.
                Last updated -
                25
                Python
                MIT License
                • Apple
                • Linux
              • A
                security
                A
                license
                A
                quality
                An MCP server providing web development tools such as screen capturing capabilities that let AI agents take and work with screenshots of the user's screen.
                Last updated -
                2
                437
                15
                MIT License
                • Apple
              • -
                security
                F
                license
                -
                quality
                An MCP server that allows users to interact with their browser through natural language commands, enabling actions like getting page content as markdown, modifying page styles, and searching browser history.
                Last updated -
                1
                TypeScript

              View all related MCP servers

              MCP directory API

              We provide all the information about MCP servers via our MCP API.

              curl -X GET 'https://glama.ai/api/mcp/v1/servers/AB498/computer-control-mcp'

              If you have feedback or need assistance with the MCP directory API, please join our Discord server