Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@mcp-autogui-multinode Take a screenshot and tell me what applications are currently open."
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
About
An MCP and HTTP server wrapper for PyAutoGUI, enabling LLMs to control your mouse and keyboard.
Architecture
The service supports two deployment architectures:
Architecture 1: LLM -> MCP -> TOOL (Remote Tool Service)
This architecture separates the MCP server from the tool service, allowing the MCP server to connect to a remote tool service via HTTP.
Characteristics:
MCP server uses client-based tools (register_computer_tools_with_client)
MCP server forwards requests to remote tool service via HTTP
Tool service performs actual computer control operations
Suitable for distributed deployments where MCP server and tool service run on different machines
Requires endpoint parameter in MCP tool calls
Architecture 2: LLM -> MCP (Direct Tools)
This architecture uses direct tools where the MCP server directly performs computer control operations.
Characteristics:
MCP server uses direct tools (register_computer_tools)
MCP server directly executes computer control operations
No separate tool service required
Suitable for local deployments where everything runs on the same machine
No endpoint parameter needed in MCP tool calls
Features
Dual Protocol Support: HTTP REST API and MCP (Model Context Protocol)
API Key Authentication: Optional API key authentication for service-to-service communication
Multiple MCP Transports: Support for both HTTP and stdio (Standard Input/Output) transport modes
Mouse Control: Move, click, drag, scroll operations
Keyboard Control: Press keys, type text, key combinations
Screenshot: Capture screen and get base64-encoded images
Screen Info: Get cursor position and screen resolution
Configuration Management: Pydantic Settings with environment variable support
Auto Documentation: Swagger UI for HTTP API
Flexible Deployment: Run HTTP server or MCP server independently
Request Tracing: Request ID middleware for request tracking
Structured Logging: Loguru-based logging with request ID integration
Remote MCP Support: Optional HTTP client for remote tool server integration
Quick Start
Prerequisites
Python >= 3.12
uv package manager (recommended)
Installation
Clone the repository:
Install dependencies based on your deployment scenario:
Local Full Development
For local development with all features (GUI control + testing):
Deploy MCP Server Only
For deploying MCP server that connects to remote tool service (no GUI dependencies needed):
Deploy Tool Service Only
For deploying HTTP tool service that performs actual computer control (requires GUI):
Running the Service
The service supports two independent servers:
1. Run Tool Service (HTTP API)
Starts the HTTP API server for computer control:
2. Run MCP Server
Starts the MCP server that can connect to remote tool services. The server supports two transport modes:
HTTP Transport Mode:
stdio Transport Mode (default):
After starting, you can access:
HTTP API Documentation: http://localhost:8000/docs
Health Check: http://localhost:8000/health
MCP Endpoint: http://localhost:8001/mcp (if using HTTP transport)
API Endpoints
Base Endpoints
GET / - Root path, returns API information
GET /health - Health check endpoint
Computer Control Endpoints
All computer control actions are available at:
POST /api/computer/{action} - Execute a computer control action
GET /api/computer/actions - List all available actions
Available Actions
| Action | Description | Parameters |
| --- | --- | --- |
| MoveMouse | Move mouse cursor | |
| ClickMouse | Click mouse button | |
| PressMouse | Press mouse button (hold) | |
| ReleaseMouse | Release mouse button | |
| DragMouse | Drag mouse from source to target | |
| Scroll | Scroll mouse wheel | |
| PressKey | Press keyboard key(s) | |
| TypeText | Type text (uses clipboard) | |
| Wait | Wait for duration | |
| TakeScreenshot | Capture screen | (no parameters) |
| GetCursorPosition | Get mouse position | (no parameters) |
| GetScreenSize | Get screen resolution | (no parameters) |
Example API Usage
Move Mouse
Click Mouse
Take Screenshot
Get Cursor Position
Note: If API_KEY_ENABLED=false, the X-API-Key header is optional. If API_KEY_ENABLED=true, the header is required for all requests except health checks and documentation endpoints.
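As a rough illustration of the request shape described in this section, the sketch below builds (but does not send) a POST request to /api/computer/{action} with the X-API-Key header, using only the Python standard library. The base URL http://localhost:8000 is taken from the documentation links above. The helper name build_action_request and the JSON body fields x and y are assumptions made for illustration; check /docs or GET /api/computer/actions for the real parameter names.

```python
import json
import urllib.request


def build_action_request(action, params=None, api_key=None,
                         base_url="http://localhost:8000"):
    """Build a POST request for /api/computer/{action} without sending it."""
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # Required when API_KEY_ENABLED=true; optional otherwise.
        headers["X-API-Key"] = api_key
    body = json.dumps(params or {}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/computer/{action}",
        data=body,
        headers=headers,
        method="POST",
    )


# The "x" and "y" field names are hypothetical placeholders.
req = build_action_request(
    "MoveMouse", {"x": 100, "y": 200}, api_key="your-secret-api-key-here"
)

# To actually send it (requires the tool service to be running):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to inspect the URL and headers before anything touches the mouse or keyboard.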
MCP Tools
The service also exposes all computer control operations as MCP tools. When running the MCP server, you can use these tools through any MCP-compatible client.
Available MCP Tools
All HTTP API actions are available as MCP tools. The MCP tool names use snake_case, while the HTTP API uses PascalCase:
move_mouse - Move mouse cursor (HTTP: MoveMouse)
click_mouse - Click mouse button (HTTP: ClickMouse)
press_mouse - Press mouse button (HTTP: PressMouse)
release_mouse - Release mouse button (HTTP: ReleaseMouse)
drag_mouse - Drag mouse (HTTP: DragMouse)
scroll - Scroll mouse wheel (HTTP: Scroll)
press_key - Press keyboard key (HTTP: PressKey)
type_text - Type text (HTTP: TypeText)
wait - Wait for duration (HTTP: Wait)
take_screenshot - Take screenshot (HTTP: TakeScreenshot)
get_cursor_position - Get cursor position (HTTP: GetCursorPosition)
get_screen_size - Get screen size (HTTP: GetScreenSize)
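The snake_case to PascalCase correspondence in the list above is regular enough to express as a pair of small conversion helpers. The function names below are hypothetical and not part of the project; this is only a sketch of the naming rule.

```python
import re


def tool_name_to_action(tool_name: str) -> str:
    """Convert an MCP tool name (snake_case) to an HTTP action name (PascalCase)."""
    return "".join(part.capitalize() for part in tool_name.split("_"))


def action_to_tool_name(action: str) -> str:
    """Convert an HTTP action name (PascalCase) to an MCP tool name (snake_case)."""
    # Insert an underscore before every capital letter except the first one.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", action).lower()


print(tool_name_to_action("get_cursor_position"))  # GetCursorPosition
print(action_to_tool_name("TakeScreenshot"))       # take_screenshot
```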
MCP Transport Modes
The MCP server supports two transport modes:
stdio (default): Standard input/output transport
Used for local communication via stdin/stdout
Suitable for direct integration with MCP clients
Start with:
python mcp_local.py stdio
http: HTTP-based transport with stateless mode
Used for remote communication over HTTP
Suitable for service-to-service communication
Start with:
python mcp_local.py http
Accessible at:
http://localhost:8001/mcp
MCP Tool Registration Modes
The service supports two modes of MCP tool registration:
Direct Tools (register_computer_tools): Tools that directly call the local computer control implementation. No endpoint parameter required.
Used in mcp_local.py for local MCP server
Tools execute computer control actions directly
Client-based Tools (register_computer_tools_with_client): Tools that use an HTTP client to call a remote tool server. Requires an endpoint parameter.
Used in mcp_server/register.py for remote MCP server
Tools forward requests to a remote tool service via HTTP
The local MCP server (mcp_local.py) uses direct tools by default. The remote MCP server uses client-based tools.
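To make the difference between the two registration modes concrete, here is a minimal sketch that is not the project's code. It uses a plain dictionary as a stand-in tool registry, and the names register_direct, register_with_client, local_move_mouse, and fake_send are all hypothetical. The point it illustrates is only that a direct tool runs the action itself, while a client-based tool takes an extra endpoint argument and forwards the call.

```python
from typing import Callable, Dict, List, Tuple

# A sender takes (endpoint, action, params) and returns the remote response.
Sender = Callable[[str, str, dict], dict]


def local_move_mouse(x: int, y: int) -> dict:
    # Stand-in for the local computer control implementation.
    return {"action": "MoveMouse", "x": x, "y": y, "executed": "locally"}


def register_direct(tools: Dict[str, Callable]) -> None:
    """Direct mode: the tool calls the local implementation, no endpoint."""
    tools["move_mouse"] = local_move_mouse


def register_with_client(tools: Dict[str, Callable], send: Sender) -> None:
    """Client-based mode: the tool forwards the call to a remote endpoint."""
    def move_mouse(endpoint: str, x: int, y: int) -> dict:
        return send(endpoint, "MoveMouse", {"x": x, "y": y})

    tools["move_mouse"] = move_mouse


# A fake sender that records calls instead of doing HTTP.
sent: List[Tuple[str, str, dict]] = []


def fake_send(endpoint: str, action: str, params: dict) -> dict:
    sent.append((endpoint, action, params))
    return {"action": action, "executed": "remotely"}


direct_tools: Dict[str, Callable] = {}
register_direct(direct_tools)
direct_result = direct_tools["move_mouse"](x=10, y=20)

client_tools: Dict[str, Callable] = {}
register_with_client(client_tools, fake_send)
remote_result = client_tools["move_mouse"](
    endpoint="http://localhost:8000", x=10, y=20
)
```

In a real deployment the sender would issue an HTTP request to the tool service; swapping it for a recording function keeps the sketch runnable without a server.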
Code Style
Use type hints for all function parameters and return types
Follow PEP 8 style guidelines
Use descriptive docstrings for all public functions
Keep functions focused and single-purpose
Security Considerations
Warning: This service provides direct control over your computer's mouse and keyboard. Use with caution:
Only run on trusted networks
Restrict CORS origins in production (currently allows all origins)
Enable API Key Authentication: Set API_KEY_ENABLED=true and configure a strong API_KEY in production
Be aware of the security implications of remote computer control
API Key Authentication
The service supports optional API key authentication for securing service-to-service communication:
Enable Authentication: Set API_KEY_ENABLED=true in your .env file
Set API Key: Configure API_KEY=your-secret-api-key-here in your .env file
Pass API Key in Requests: Include the API key in request headers:
X-API-Key: your-secret-api-key-here (recommended)
Authorization: Bearer your-secret-api-key-here (alternative)
Excluded Paths (no authentication required):
/health - Health check endpoint
/docs - API documentation
/openapi.json - OpenAPI schema
/redoc - Alternative API documentation
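The rules above (two accepted headers, a list of excluded paths, and an on/off switch) can be sketched as a pure function. This is an illustration under assumptions, not the service's middleware: the function names are hypothetical, exact path matching is assumed for the excluded paths, and the constant-time comparison via hmac.compare_digest is a common practice rather than something the source confirms.

```python
import hmac
from typing import Mapping, Optional

EXCLUDED_PATHS = ("/health", "/docs", "/openapi.json", "/redoc")


def extract_api_key(headers: Mapping[str, str]) -> Optional[str]:
    """Return the key from X-API-Key, else from Authorization: Bearer, else None."""
    lowered = {name.lower(): value for name, value in headers.items()}
    key = lowered.get("x-api-key")
    if key:
        return key
    scheme, _, token = lowered.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return token.strip()
    return None


def is_authorized(path: str, headers: Mapping[str, str],
                  expected_key: str, enabled: bool = True) -> bool:
    """Decide whether a request may proceed under the rules listed above."""
    if not enabled or path in EXCLUDED_PATHS:
        return True
    supplied = extract_api_key(headers)
    if supplied is None:
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(supplied.encode("utf-8"),
                               expected_key.encode("utf-8"))


SECRET = "your-secret-api-key-here"
print(is_authorized("/api/computer/Wait", {"X-API-Key": SECRET}, SECRET))
print(is_authorized("/api/computer/Wait", {}, SECRET))
```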
MCP Client Usage with API Key:
Logging
The service uses Loguru for structured logging with the following features:
Request ID Tracking: Each request gets a unique ID that appears in all log entries
Environment-aware: Console output in development, file logging in production
Structured Format: Includes timestamp, level, request ID, module, function, and line number
Log files are stored in the logs/ directory:
app_YYYY-MM-DD.log: General application logs
error_YYYY-MM-DD.log: Error logs only
In development mode, logs are only output to the console. In production mode, logs are written to both console and files.
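The idea of a request ID that appears in every log entry can be illustrated without extra dependencies. The service itself uses Loguru; the sketch below deliberately swaps in the standard library logging module together with contextvars, so it shows the technique rather than the project's actual configuration. All names in it (request_id_var, RequestIdFilter, make_logger, handle_request) are hypothetical.

```python
import contextvars
import io
import logging
import uuid

# Holds the ID of the request currently being handled ("-" outside a request).
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Attach the current request ID to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        return True


def make_logger(stream) -> logging.Logger:
    logger = logging.getLogger("request_id_demo")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.addFilter(RequestIdFilter())
    # Timestamp, level, request ID, module, function, and line number.
    handler.setFormatter(logging.Formatter(
        "%(asctime)s | %(levelname)s | %(request_id)s | "
        "%(module)s:%(funcName)s:%(lineno)d | %(message)s"
    ))
    logger.addHandler(handler)
    return logger


def handle_request(logger: logging.Logger) -> str:
    """Simulate one request: set an ID, log twice, then restore the context."""
    token = request_id_var.set(uuid.uuid4().hex)
    try:
        logger.info("request started")
        logger.info("request finished")
        return request_id_var.get()
    finally:
        request_id_var.reset(token)


buffer = io.StringIO()
log = make_logger(buffer)
rid = handle_request(log)
print(buffer.getvalue())
```

A context variable is used instead of a global so that concurrent requests each see their own ID.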
Testing
Run tests using pytest:
The test suite includes:
test_local_mcp_client.py: Tests for local MCP server with HTTP transport (direct tools)
test_stdio_mcp_client.py: Tests for local MCP server with stdio transport (direct tools)
test_mcp_client.py: Tests for remote MCP server with client-based tools (requires endpoint parameter)
Troubleshooting
Port Already in Use
If you get a port already in use error:
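One way to confirm which port is occupied before restarting is a short check like the one below. It is a generic standard-library sketch, not a command from this project; the helper name port_in_use is hypothetical, and ports 8000 and 8001 are the ones mentioned earlier for the HTTP API and the MCP endpoint.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something is accepting TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


for candidate in (8000, 8001):
    state = "in use" if port_in_use(candidate) else "free"
    print(f"port {candidate}: {state}")
```

If a port is reported as in use, either stop the process holding it or start the server on a different port.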
MCP Connection Issues
For HTTP transport, ensure the MCP server is running and accessible:
For stdio transport, ensure the MCP server is started with stdio mode:
API Key Authentication Issues
If you're getting authentication errors:
Verify API_KEY_ENABLED is set correctly in .env
Check that API_KEY matches between client and server
Ensure the API key is passed in the X-API-Key header or Authorization: Bearer <key> header
Check that the request path is not in the excluded paths list
Screenshot Issues
If screenshot functionality fails:
Check Python version compatibility (requires Python >= 3.12)
Verify display permissions on macOS/Linux
Ensure PyAutoGUI and its dependencies are properly installed
License
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.