Enables interaction with Android devices through ADB and the Accessibility API, allowing app navigation, UI interactions (click, swipe, type), and automated QA testing without requiring computer-vision pipelines.
Android-MCP is a lightweight, open-source bridge between AI agents and Android devices. Running as an MCP server, it lets large-language-model agents perform real-world tasks such as app navigation, UI interaction, and automated QA testing without relying on traditional computer-vision pipelines or preprogrammed scripts.
https://github.com/user-attachments/assets/cf9a5e4e-b69f-46d4-8487-0f61a7a86d67
✨ Key Features
- Native Android Integration
  Interact with UI elements via ADB and the Android Accessibility API: launch apps, tap, swipe, input text, and read view hierarchies.
- Bring Your Own LLM (Vision Optional)
  Works with any language model; no fine-tuned CV model or OCR pipeline is required.
- Rich Toolset for Mobile Automation
  Pre-built tools for gestures, keystrokes, capture, and device state.
- Real-Time Interaction
  Typical latency between actions (e.g., two taps) ranges from 2 to 5 s, depending on device specs and load.
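To make the ADB side of these features concrete, here is a minimal sketch of how tap, swipe, and text input map onto the standard `adb shell input` commands. It is not taken from the Android-MCP source; the helper names (`tap_command`, `swipe_command`, `text_command`, `run`) are hypothetical, and the project itself may drive the device through a different layer.

```python
import subprocess
from typing import List


def tap_command(x: int, y: int) -> List[str]:
    """Build the argv for a single tap at screen coordinates (x, y)."""
    return ["adb", "shell", "input", "tap", str(x), str(y)]


def swipe_command(x1: int, y1: int, x2: int, y2: int,
                  duration_ms: int = 300) -> List[str]:
    """Build the argv for a swipe from (x1, y1) to (x2, y2)."""
    return ["adb", "shell", "input", "swipe",
            str(x1), str(y1), str(x2), str(y2), str(duration_ms)]


def text_command(text: str) -> List[str]:
    """Build the argv for typing text into the focused field.

    `input text` does not accept literal spaces, so they are encoded as %s.
    Shell metacharacters (quotes, &, ;) are NOT escaped in this sketch.
    """
    return ["adb", "shell", "input", "text", text.replace(" ", "%s")]


def run(command: List[str]) -> None:
    """Execute a command built above; requires adb on PATH and a device."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Only print the commands so the sketch runs without a device attached.
    print(tap_command(540, 1200))
    print(swipe_command(540, 1600, 540, 400))
    print(text_command("hello world"))
```

Building the argument list separately from executing it keeps the commands easy to log and inspect before anything touches the device.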
Supported Operating Systems
- Android 10+
Installation
Prerequisites
- Python 3.10+
- Android Studio
🏁 Getting Started
- Clone the repository
- Install dependencies
- Connect to the MCP server
Add the following JSON (replace {{PATH}} placeholders) to your client config:
For Claude Desktop, save as %APPDATA%/Claude/claude_desktop_config.json.
- Enable ADB & authorize your device
- Restart Claude Desktop
Open Claude Desktop; “Android-MCP” should now appear as an integration.
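If the client config already lists other servers, the Android-MCP entry has to be merged in rather than pasted over the file. The sketch below shows one way to do that, assuming the `mcpServers` top-level key that Claude Desktop uses for local servers. The function name `merge_server_entry` and the launch command in the example entry are hypothetical placeholders; use the entry given in the project's own instructions and keep {{PATH}} pointing at your clone.

```python
import copy
import json
from typing import Any, Dict


def merge_server_entry(config: Dict[str, Any], name: str,
                       entry: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `config` with `entry` registered under mcpServers[name].

    Existing servers are preserved and the input dict is left unmodified.
    """
    merged = copy.deepcopy(config)
    servers = merged.setdefault("mcpServers", {})
    servers[name] = copy.deepcopy(entry)
    return merged


if __name__ == "__main__":
    # Hypothetical existing config with one unrelated server.
    existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
    # Placeholder entry: the command and args are assumptions, not the
    # project's documented values.
    android_entry = {"command": "python", "args": ["{{PATH}}/main.py"]}
    print(json.dumps(merge_server_entry(existing, "android-mcp", android_entry), indent=2))
```

The printed JSON can then be saved to the config path given above.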
For troubleshooting tips (log locations, common ADB issues), see the MCP docs.
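A common ADB issue is a device that is connected but not yet authorized. As a rough illustration, the sketch below parses the text printed by `adb devices` and reports which serials are ready. The function names and the sample output are illustrative assumptions, not part of Android-MCP.

```python
from typing import Dict, List

# Illustrative output in the format printed by `adb devices`.
SAMPLE_OUTPUT = """List of devices attached
emulator-5554\tdevice
R58M123ABC\tunauthorized

"""


def parse_adb_devices(output: str) -> Dict[str, str]:
    """Map each device serial to its reported state (device, unauthorized, offline)."""
    devices: Dict[str, str] = {}
    for line in output.splitlines():
        line = line.strip()
        # Skip blanks, the header, and "* daemon ..." startup messages.
        if not line or line.startswith("*") or line.startswith("List of devices"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            devices[parts[0]] = parts[1]
    return devices


def ready_devices(output: str) -> List[str]:
    """Return serials whose state is 'device', i.e. connected and authorized."""
    return [serial for serial, state in parse_adb_devices(output).items()
            if state == "device"]


if __name__ == "__main__":
    print(parse_adb_devices(SAMPLE_OUTPUT))
    print(ready_devices(SAMPLE_OUTPUT))
```

A serial in the `unauthorized` state usually means the USB debugging prompt on the device has not been accepted yet.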
🛠️ MCP Tools
Claude can access the following tools to interact with the Android device:
- State-Tool: Capture the state of the device as a combined snapshot of active apps and interactive UI elements.
- Click-Tool: Click on the screen at the given coordinates.
- Long-Click-Tool: Perform a long click on the screen at the given coordinates.
- Type-Tool: Type text at the specified coordinates (optionally clearing existing text).
- Swipe-Tool: Swipe from one location to another.
- Drag-Tool: Drag from one point to another.
- Press-Tool: Press keys on the device (Back, Volume Up, etc.).
- Wait-Tool: Pause for a defined duration.
- Notification-Tool: Access the notifications shown on the device.
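Because the tools work from coordinates and a list of interactive UI elements rather than from image recognition, an agent needs a way to turn a view hierarchy into tap targets. The sketch below illustrates the idea on XML in the format produced by `uiautomator dump`, where each `node` carries `clickable` and `bounds="[x1,y1][x2,y2]"` attributes. This is an assumption about one possible data source, not a description of how State-Tool is implemented; `clickable_centers` and the sample hierarchy are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Bounds look like "[x1,y1][x2,y2]" in uiautomator dumps.
BOUNDS_RE = re.compile(r"\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]")

SAMPLE_HIERARCHY = """<hierarchy rotation="0">
  <node text="" content-desc="" clickable="false" bounds="[0,0][1080,2400]">
    <node text="Settings" content-desc="" clickable="true" bounds="[100,200][300,400]" />
    <node text="" content-desc="Search" clickable="true" bounds="[900,100][1000,200]" />
  </node>
</hierarchy>"""


def clickable_centers(xml_text: str) -> List[Tuple[str, int, int]]:
    """Return (label, center_x, center_y) for every clickable node."""
    root = ET.fromstring(xml_text)
    results: List[Tuple[str, int, int]] = []
    for node in root.iter("node"):
        if node.get("clickable") != "true":
            continue
        match = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if match is None:
            continue  # Skip nodes with missing or malformed bounds.
        x1, y1, x2, y2 = map(int, match.groups())
        # Prefer visible text, then the accessibility description.
        label = node.get("text") or node.get("content-desc") or ""
        results.append((label, (x1 + x2) // 2, (y1 + y2) // 2))
    return results


if __name__ == "__main__":
    for label, x, y in clickable_centers(SAMPLE_HIERARCHY):
        print(f"{label}: tap at ({x}, {y})")
```

The center of each bounding box is a reasonable default target to pass to a coordinate-based tool such as Click-Tool.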
⚠️ Caution
Android-MCP can execute arbitrary UI actions on your mobile device. Use it in controlled environments (emulators, test devices) when running untrusted prompts or agents.
🪪 License
This project is licensed under the MIT License. See LICENSE for details.
🤝 Contributing
Contributions are welcome! Please read CONTRIBUTING for dev setup and PR guidelines.
Made with ❤️ by Jeomon George