Windows MCP Server
Comprehensive Windows automation MCP server for AI agents
Full control over Windows desktop applications with 25+ tools: screenshots, OCR, mouse/keyboard control, window management, process control, clipboard operations, and more.
Features
Screen Capture
Full screen screenshots
Window-specific capture
Region capture
OCR (Optical Character Recognition)
Full screen text extraction
Region-based OCR
Powered by Tesseract
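As an illustration of how region-based OCR can be assembled from the listed dependencies (mss, Pillow, pytesseract), here is a minimal sketch. It is not the server's actual implementation; the function names `region_to_monitor` and `ocr_region` are hypothetical, and it assumes Tesseract is installed and reachable.

```python
def region_to_monitor(left, top, width, height):
    """Build the region dict that mss.grab() expects, rejecting empty regions."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    return {
        "left": int(left),
        "top": int(top),
        "width": int(width),
        "height": int(height),
    }


def ocr_region(left, top, width, height):
    """Capture a screen region and return the text Tesseract finds in it."""
    # Imported lazily so region_to_monitor works without these packages.
    import mss
    import pytesseract
    from PIL import Image

    with mss.mss() as sct:
        shot = sct.grab(region_to_monitor(left, top, width, height))

    # mss returns BGRA pixels; convert them to an RGB Pillow image.
    image = Image.frombytes("RGB", shot.size, shot.bgra, "raw", "BGRX")
    return pytesseract.image_to_string(image)
```

Calling `ocr_region(0, 0, 800, 200)` would read text from an 800x200 strip at the top-left of the primary screen.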
Mouse Control
Click (left/right/middle)
Double-click
Drag and drop
Mouse movement with duration
Scroll (up/down)
Get mouse position
Keyboard Control
Type text with configurable speed
Press individual keys
Execute hotkey combinations (Ctrl+C, Alt+F4, etc.)
Full keyboard shortcut support
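A hotkey string such as Ctrl+C or Alt+F4 has to be split into individual key names before it can be passed to `pyautogui.hotkey`. The sketch below shows one way to do that; `parse_hotkey` and `send_hotkey` are hypothetical helpers, not the server's own functions.

```python
def parse_hotkey(combo):
    """Split a combination like 'Ctrl+Shift+P' into lowercase key names."""
    keys = [part.strip().lower() for part in combo.split("+")]
    if not all(keys):
        # An empty part means a stray or trailing '+', e.g. 'Ctrl+'.
        raise ValueError(f"malformed hotkey: {combo!r}")
    return keys


def send_hotkey(combo):
    """Press the keys of the combination in order, then release them."""
    # Imported lazily because pyautogui needs a desktop session to load.
    import pyautogui

    pyautogui.hotkey(*parse_hotkey(combo))
```

For example, `parse_hotkey("Alt+F4")` yields `["alt", "f4"]`, which matches the key names PyAutoGUI uses.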
Clipboard
Copy text to clipboard
Paste/read clipboard content
Seamless clipboard integration
Window Management
List all open windows
Focus/activate windows
Close windows
Minimize/maximize/restore
Resize windows
Move windows
Get window details (position, size, state)
Process Management
List running processes with PIDs
Filter processes by name
Kill processes by PID
Memory usage monitoring
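The process features above map naturally onto psutil. The following is a rough sketch under that assumption, with hypothetical helper names (`list_processes`, `filter_by_name`); the server's real output format may differ.

```python
def filter_by_name(processes, name):
    """Keep process records whose name contains `name`, case-insensitively."""
    needle = name.lower()
    return [p for p in processes if needle in (p.get("name") or "").lower()]


def list_processes():
    """Return pid, name, and resident memory in MB for each running process."""
    # Imported lazily so filter_by_name can be used without psutil.
    import psutil

    records = []
    for proc in psutil.process_iter(["pid", "name", "memory_info"]):
        mem = proc.info["memory_info"]  # None when access is denied
        records.append({
            "pid": proc.info["pid"],
            "name": proc.info["name"],
            "memory_mb": round(mem.rss / (1024 * 1024), 1) if mem else None,
        })
    return records
```

A typical use would be `filter_by_name(list_processes(), "notepad")`, after which a PID from the result could be passed to `psutil.Process(pid).terminate()`.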
Installation
Prerequisites
Python 3.10+ installed
Tesseract OCR for text recognition (Windows installer: https://github.com/UB-Mannheim/tesseract/wiki):
Install to default location or add to PATH
Verify:
tesseract --version
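The same check can be done from Python. This is a minimal sketch, not part of the server: `find_tesseract` is a hypothetical helper, and the default path is an assumption based on the usual install directory of the Windows installer.

```python
import shutil
from pathlib import Path

# Assumed default install path; adjust it if your installation differs.
DEFAULT_TESSERACT = Path(r"C:\Program Files\Tesseract-OCR\tesseract.exe")


def find_tesseract(default=DEFAULT_TESSERACT):
    """Return the path to the tesseract executable, or None if not found."""
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    if Path(default).is_file():
        return str(default)
    return None


if __name__ == "__main__":
    path = find_tesseract()
    print(path or "Tesseract not found: install it or add it to PATH")
```

If the executable is found but is not on PATH, pytesseract can be pointed at it by setting `pytesseract.pytesseract.tesseract_cmd` to the returned path.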
Install Package
Option 1: Install from PyPI (Recommended)
Option 2: Install from GitHub
Option 3: Install from source
Configuration
VS Code with GitHub Copilot
After installing via pip, add to your MCP configuration (%APPDATA%\Code\User\mcp.json):
Or install from VS Code MCP Extensions:
Open VS Code
Press Ctrl+Shift+P
Type "MCP: Install Server"
Search for "Windows Automation Inspector"
Click Install
Claude Desktop
After installing via pip, add to %APPDATA%\Claude\claude_desktop_config.json:
Other MCP Clients
The server uses STDIO transport and works with any MCP-compatible client that supports stdio.
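For readers writing their own client: with the stdio transport, the client launches the server as a subprocess and exchanges JSON-RPC 2.0 messages over stdin and stdout, one message per line. The sketch below only illustrates that framing; `encode_request` and `decode_message` are hypothetical helpers, and in practice the `mcp` SDK handles this for you.

```python
import json


def encode_request(request_id, method, params=None):
    """Serialize a JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    # json.dumps escapes newlines inside strings, so the output is one line.
    return json.dumps(message, separators=(",", ":")) + "\n"


def decode_message(line):
    """Parse one line read from the server's stdout."""
    return json.loads(line)


if __name__ == "__main__":
    # Ask the server which tools it exposes, then call one by name.
    print(encode_request(1, "tools/list"), end="")
    print(encode_request(2, "tools/call",
                         {"name": "list_windows", "arguments": {}}), end="")
```

A real session begins with an `initialize` handshake before any `tools/list` or `tools/call` request is sent.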
Usage Examples
Capture Screenshot
OCR Text Extraction
Automate UI Interactions
Keyboard Automation
Window Management
Process Control
Available Tools
Tool | Description |
| Capture full screen screenshot |
| Capture specific window by title |
| List all open windows with details |
| Extract text from full screen |
| Extract text from specified region |
| Click at coordinates (left/right/middle) |
| Double-click at coordinates |
| Drag from start to end coordinates |
| Type text at current position |
| Press keyboard key or shortcut |
| Execute hotkey combination |
| Copy text to clipboard |
| Get clipboard content |
| Get current mouse position |
| Move mouse to position |
| Scroll up/down |
| List running processes with PIDs |
| Terminate process by PID |
| Activate window |
| Close window by title |
| Minimize window |
| Maximize window |
| Restore window |
| Resize window |
| Move window position |
Security Considerations
WARNING: This server has powerful system control capabilities including:
Mouse and keyboard control
Process termination
Clipboard access
Screen capture
Only use in trusted environments where you control the MCP client.
Recommended Security Practices
Restrict Usage: Only enable when actively needed
Review Logs: Monitor all automated actions
Sandbox Testing: Test in isolated environments first
Access Control: Limit who can access the MCP client
PyAutoGUI Failsafe: The server disables PyAutoGUI's failsafe for automation, so moving the mouse to a screen corner will not abort a runaway action. Be cautious.
Troubleshooting
Tesseract Not Found
Solution: Install Tesseract OCR from https://github.com/UB-Mannheim/tesseract/wiki
Permission Errors
Solution: Run VS Code or your MCP client as Administrator; process control features may need elevated privileges.
Module Not Found
Solution: Reinstall dependencies by running pip install -e . from the repository root
Window Not Found
Solution: Pass a distinctive part of the window title (partial matching), and run list_windows first to check the exact title.
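To see why a partial title usually works, here is a small sketch of case-insensitive substring matching. It illustrates the idea only and is not the server's actual matching logic; `match_titles` is a hypothetical helper.

```python
def match_titles(titles, query):
    """Return the window titles that contain `query`, ignoring case."""
    needle = query.strip().lower()
    if not needle:
        # An empty query would match every window, so treat it as no match.
        return []
    return [title for title in titles if needle in title.lower()]


if __name__ == "__main__":
    # pygetwindow is one of the listed dependencies; it needs a desktop session.
    import pygetwindow

    print(match_titles(pygetwindow.getAllTitles(), "notepad"))
```

If the query matches more than one window, a longer and more specific substring narrows the result.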
Development
Project Structure
Dependencies
mcp: Model Context Protocol SDK
mss: Cross-platform screen capture
Pillow: Image processing
pyautogui: Mouse and keyboard automation
pygetwindow: Window management
pyperclip: Clipboard operations
pytesseract: OCR text extraction
psutil: Process management
License
MIT License - see LICENSE file
Contributing
Contributions welcome! Please:
Fork the repository
Create a feature branch
Make your changes
Submit a pull request
Links
Repository: https://github.com/RandyNorthrup/win32-mcp-server
Issues: https://github.com/RandyNorthrup/win32-mcp-server/issues
MCP Documentation: https://modelcontextprotocol.io/
Support
For bugs and feature requests, please use GitHub Issues.
Made for Windows automation and AI agents