What can you do with this server?

This server provides an MCP interface to control and automate Android devices (real or emulated) via ADB and UIAutomator, enabling AI agents to see the screen, read UI elements, and perform interactions, with no device-side app required.

Device Management
* List connected devices (list_targets): serial numbers, state, and model
* Get detailed device info (status): state, system properties, and battery level

Screen Reading
* Capture a screenshot (screenshot): returns the current screen as a PNG image
* Describe the screen (describe_screen): parses the live UIAutomator UI tree into a flat list of elements with labels and exact tap coordinates (x, y)

Input & Interaction
* Tap at specific screen coordinates (tap)
* Swipe from one coordinate to another with optional duration (swipe)
* Long press at a coordinate with optional duration (long_press)
* Type arbitrary text including spaces, special characters, and unicode (type_text)

Key Events
* Press named keys like home, back, enter, or recents (press_key, press_home, press_back)

App & URL Launch
* Launch an app by package name or friendly name (launch_app)
* Open any HTTP/HTTPS URL in the device's default browser (open_url)

All tools accept an optional serial parameter to target a specific device when multiple are connected. describe_screen returns exact element coordinates that feed directly into tap, enabling deterministic interactions without OCR guessing.

Which integrations are available for this server?

androir-mcp integrates with physical Android devices and emulators through adb, giving AI agents tools for screen capture, UI tree inspection, input events (tap, swipe, type), app launching, and device status.

How do I use androir-mcp?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@androir-mcp what's on the screen?"

That's it! The server will respond to your query, and you can continue using it as needed.

androir-mcp

by benasbarciauskas


TypeScript

Local

androir-mcp

Drive real Android devices from any AI agent — see the screen, tap, swipe, type, over adb.

An independent, standalone Model Context Protocol server that gives an AI agent a clean, safe automation surface for physical Android devices and emulators: capture the screen, read the live UI tree, and drive input — backed entirely by adb + uiautomator, no native code and no device-side app.


NOTE

Requires adb (Android platform-tools) and an Android device with USB debugging enabled (or an emulator). The server runs locally, holds no credentials, and talks only to the device(s) adb already sees.

✨ What is androir-mcp?

androir-mcp is a standalone MCP server for automating Android over adb. It is its own project — not a copy or port of anything — exposing a small, conventional screen-read + input tool set that drops into any MCP-based agent loop.

Everything is backed by adb and uiautomator: there is no native code and nothing to install on the device. The agent shells out to adb (always as an argv array — never a shell string) to capture screenshots, dump the UI hierarchy, and send input events.
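As a rough illustration of the argv-only approach, the sketch below builds argument arrays for three of the adb commands listed in the Tools table. The helper names (`adbArgv`, `screenshotArgv`, `dumpUiArgv`, `tapArgv`) are hypothetical and are not taken from the project's source.

```typescript
// Hypothetical helper: prefix a device command with `-s <serial>` and keep
// every token as its own argv entry, so no host shell ever parses the command.
function adbArgv(serial: string, ...command: string[]): string[] {
  return ["-s", serial, ...command];
}

// Screenshot: adb -s S exec-out screencap -p
const screenshotArgv = (serial: string): string[] =>
  adbArgv(serial, "exec-out", "screencap", "-p");

// UI hierarchy dump: adb -s S exec-out uiautomator dump /dev/tty
const dumpUiArgv = (serial: string): string[] =>
  adbArgv(serial, "exec-out", "uiautomator", "dump", "/dev/tty");

// Tap: adb -s S shell input tap X Y (coordinates rounded to whole pixels)
const tapArgv = (serial: string, x: number, y: number): string[] =>
  adbArgv(serial, "shell", "input", "tap", String(Math.round(x)), String(Math.round(y)));
```

An array like this can be handed to Node's `child_process.execFile("adb", argv)`, which runs the binary directly rather than through a host shell.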

The key advantage on Android is that describe_screen returns the real UI tree — exact element bounds and text straight from uiautomator — rather than guessing from OCR. Tap coordinates come back as element centers and feed straight into tap, so an agent can read a screen and act on it deterministically.
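The sketch below shows one way a uiautomator dump could be turned into labels with center tap points. It is an illustration, not the project's parser: the `ScreenElement` shape and function names are assumptions, and it relies on the dump's `bounds="[x1,y1][x2,y2]"` attribute format and on `>` being escaped inside attribute values.

```typescript
// Hypothetical element shape; the server's actual return type may differ.
interface ScreenElement {
  label: string;
  x: number; // center x in device pixels
  y: number; // center y in device pixels
}

const NAMED_ENTITIES: Record<string, string> = {
  amp: "&", lt: "<", gt: ">", quot: '"', apos: "'",
};

// Decode the predefined XML entities and numeric character references.
function decodeEntities(s: string): string {
  return s.replace(/&(#x[0-9a-fA-F]+|#\d+|amp|lt|gt|quot|apos);/g, (whole: string, ref: string) => {
    if (ref.charAt(0) === "#") {
      const hex = ref.charAt(1) === "x";
      const cp = parseInt(ref.slice(hex ? 2 : 1), hex ? 16 : 10);
      return cp <= 0x10ffff ? String.fromCodePoint(cp) : whole;
    }
    return NAMED_ENTITIES[ref] ?? whole;
  });
}

// Read one attribute from a raw <node ...> tag; a missing attribute yields "".
function attr(node: string, name: string): string {
  const m = new RegExp(`\\s${name}="([^"]*)"`).exec(node);
  return decodeEntities(m?.[1] ?? "");
}

// Flatten the dump: one entry per node that has parseable bounds and a label.
function parseUiDump(xml: string): ScreenElement[] {
  const elements: ScreenElement[] = [];
  const nodeRe = /<node\b[^>]*>/g;
  let match: RegExpExecArray | null;
  while ((match = nodeRe.exec(xml)) !== null) {
    const node = match[0];
    const b = /^\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]$/.exec(attr(node, "bounds"));
    if (!b) continue; // tolerate nodes with missing or malformed bounds
    const x1 = Number(b[1]), y1 = Number(b[2]), x2 = Number(b[3]), y2 = Number(b[4]);
    // Label preference follows the feature list: text, then content-desc, then class.
    const label = attr(node, "text") || attr(node, "content-desc") || attr(node, "class");
    if (!label) continue;
    elements.push({ label, x: Math.round((x1 + x2) / 2), y: Math.round((y1 + y2) / 2) });
  }
  return elements;
}
```

Scanning tags with a regular expression instead of a strict XML parser is a deliberate simplification here: a truncated dump still yields the nodes that did arrive.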


🚀 Features

📸 screenshot — capture the device screen as a PNG (signature-validated).
🌳 describe_screen — parse the uiautomator UI tree into a flat list of labels with center tap coordinates (text / content-desc / class, entity-decoded).
👆 tap / swipe / long_press — coordinate input in device pixels.
⌨️ type_text — type arbitrary text safely (spaces, metacharacters, and unicode all type verbatim).
🔘 press_key / press_home / press_back — named key events.
📱 launch_app — launch by package name or a friendly name (resolved from the installed package list, cached per device).
🔗 open_url — open an http(s) URL in the default browser.
🔎 list_targets / status — enumerate connected devices and read device state, properties, and battery.
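For the "signature-validated" screenshot, a minimal sketch of what such a check could look like is shown below. It compares the first eight bytes against the standard PNG signature; the function name `looksLikePng` is hypothetical, and the project's actual validation may do more.

```typescript
// The fixed 8-byte signature that starts every PNG file.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];

// Returns true only if the buffer is long enough and starts with the signature.
function looksLikePng(bytes: Uint8Array): boolean {
  return (
    bytes.length >= PNG_SIGNATURE.length &&
    PNG_SIGNATURE.every((expected, i) => bytes[i] === expected)
  );
}
```

A check like this lets a tool reject empty or non-image output before returning it to the model as image content.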

Safety by construction:

🛡️ argv-only adb — every command is an argument array, never a shell string, and any value handed to the device shell is single-quoted for it, so there is no shell-injection surface.
✅ strict serial validation — serials are checked against [A-Za-z0-9.:_-] (≤ 128 chars) before reaching any subprocess.
⏱️ per-call timeout (default 30 s) with process-group kill on timeout.
🤫 scrubbed errors — concise messages only; no raw adb stderr (which can leak serials/paths) and no host stack traces reach the model.
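The first two guarantees can be sketched as follows. The character class and length limit come from the list above; the function names and the exact quoting routine are assumptions, using the usual POSIX-shell technique of closing the quote, emitting an escaped quote, and reopening it.

```typescript
// Serial allow-list from the safety notes: [A-Za-z0-9.:_-], 1 to 128 characters.
const SERIAL_RE = /^[A-Za-z0-9.:_-]{1,128}$/;

function isValidSerial(serial: string): boolean {
  return SERIAL_RE.test(serial);
}

// Single-quote a value for the device shell. Inside single quotes nothing is
// special, so the only character needing care is the single quote itself,
// which becomes '\'' (close quote, escaped quote, reopen quote).
function quoteForDeviceShell(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}
```

For example, a `type_text` call could pass `quoteForDeviceShell(text.replace(/ /g, "%s"))` as the final argv entry of `input text`, matching the space-to-`%s` note in the Tools table. This sketch does not handle text that already contains a literal `%s`.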

🛠️ Tools

| Tool | adb implementation | Returns |
| --- | --- | --- |
| `list_targets` | `adb devices -l` | serials + model + state |
| `status` | `adb -s S get-state` (+ props, battery) | device/offline/unauthorized + info |
| `screenshot` | `adb -s S exec-out screencap -p` | PNG image content |
| `describe_screen` | `adb -s S exec-out uiautomator dump /dev/tty` (fallback: dump to `/sdcard` then `exec-out cat`) → parse XML | element list: label / center tap (x,y) |
| `tap` (x,y) | `adb -s S shell input tap X Y` | confirmation |
| `swipe` (x1,y1,x2,y2,dur_ms?) | `adb -s S shell input swipe X1 Y1 X2 Y2 DUR` | confirmation |
| `long_press` (x,y,dur_ms?) | `input swipe X Y X Y DUR` (same point) | confirmation |
| `type_text` (text) | `adb -s S shell input text '<quoted>'` (space→`%s`, single-quoted for the device shell) | confirmation |
| `press_key` (key) | `adb -s S shell input keyevent <KEYCODE>` (names: home→3, back→4, enter→66, recents→187) | confirmation |
| `press_home` | `input keyevent 3` | confirmation |
| `press_back` | `input keyevent 4` | confirmation |
| `launch_app` (name or pkg) | resolve name→package (from `pm list packages`), then `monkey -p PKG -c android.intent.category.LAUNCHER 1` | confirmation |
| `open_url` (url) | `am start -a android.intent.action.VIEW -d '<url>'` (http/https only) | confirmation |
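A few rows of the table translate into small pure functions. The sketch below covers the key-name lookup, the long press expressed as a same-point swipe, and the http/https gate for `open_url`. All names are hypothetical, the 1000 ms default duration is an assumption, and the URL check is a coarse scheme test rather than full URL validation.

```typescript
// Key names and Android keycodes exactly as listed in the table.
const KEYCODES = new Map<string, number>([
  ["home", 3],
  ["back", 4],
  ["enter", 66],
  ["recents", 187],
]);

// press_key: map a friendly name to `input keyevent <KEYCODE>` arguments.
function keyeventArgs(key: string): string[] {
  const code = KEYCODES.get(key.toLowerCase());
  if (code === undefined) throw new Error(`unknown key: ${key}`);
  return ["shell", "input", "keyevent", String(code)];
}

// long_press: a swipe whose start and end points are identical.
// The default duration is an assumption, not taken from the project.
function longPressArgs(x: number, y: number, durationMs: number = 1000): string[] {
  const sx = String(Math.round(x));
  const sy = String(Math.round(y));
  return ["shell", "input", "swipe", sx, sy, sx, sy, String(Math.round(durationMs))];
}

// open_url: accept only http:// and https:// schemes, with no whitespace.
function isHttpUrl(candidate: string): boolean {
  return /^https?:\/\/\S+$/i.test(candidate);
}
```

Using a `Map` for the lookup avoids accidental hits on inherited object properties such as `constructor` when the key name comes from model output.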

All tools take an optional serial; it defaults to the single connected device and errors if the choice is ambiguous. Coordinates are in device pixels, so describe_screen tap points feed straight into tap with no translation.
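The default-serial rule can be sketched as a parse of `adb devices -l` output followed by a resolution step. The `Target` shape, function names, and error messages below are assumptions; only the three states named in the table are recognized, and lines in other states are skipped.

```typescript
// Hypothetical shape for one line of `adb devices -l` output.
interface Target {
  serial: string;
  state: string;
  model?: string;
}

// Parse lines such as:
//   emulator-5554   device product:sdk model:Pixel_7 device:emu transport_id:1
// The "List of devices attached" header and blank lines do not match.
function parseDevices(output: string): Target[] {
  const targets: Target[] = [];
  for (const line of output.split(/\r?\n/)) {
    const m = /^(\S+)\s+(device|offline|unauthorized)\b(.*)$/.exec(line);
    if (!m) continue;
    const model = /\bmodel:(\S+)/.exec(m[3] ?? "");
    targets.push({ serial: m[1], state: m[2], model: model?.[1] });
  }
  return targets;
}

// Use the requested serial if it is connected; otherwise default to the one
// ready device and fail when there is none or when the choice is ambiguous.
function resolveSerial(targets: Target[], requested?: string): string {
  if (requested !== undefined) {
    if (!targets.some((t) => t.serial === requested)) {
      throw new Error("requested device is not connected");
    }
    return requested;
  }
  const ready = targets.filter((t) => t.state === "device");
  if (ready.length === 0) throw new Error("no device connected");
  if (ready.length > 1) throw new Error("multiple devices connected; pass a serial");
  return ready[0].serial;
}
```

Failing on ambiguity rather than picking the first device means an agent never sends input to a device the user did not intend.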

🏁 Getting started

Prerequisites

adb (Android platform-tools) on your PATH.
An Android device with USB debugging enabled, or a running emulator. Confirm it's visible:
```
adb devices
```

Build

```
npm install
npm run build
```

Attach to an MCP client

Add the built server to your MCP client config:

```
{
  "mcpServers": {
    "androir": {
      "command": "node",
      "args": ["/path/to/androir-mcp/dist/index.js"]
    }
  }
}
```

Verify

With a device connected, run the self-check (it lists targets, takes a screenshot, and dumps the UI tree):

```
npm run selfcheck
```

It prints PASS when the three core tools work end-to-end against a real device.

🤖 Use with an AI agent

Once androir is attached to your MCP client, give the agent a goal and let it read the screen and act:

You: Open the Settings app, go to Wi-Fi, and tell me which network is connected.
Agent: calls launch_app("settings") → describe_screen() (reads the labels + tap coordinates) → tap(x, y) on "Wi-Fi" → describe_screen() again → reports the connected network.

Because describe_screen returns exact element bounds and text, the agent taps real coordinates rather than guessing from a screenshot.
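The "find the label, then tap its coordinates" step in that loop might look like the sketch below on the agent side. The `ScreenElement` shape and `findByLabel` are hypothetical; the actual describe_screen output format is not specified here.

```typescript
// Assumed shape of one describe_screen entry: a label plus its center point.
interface ScreenElement {
  label: string;
  x: number;
  y: number;
}

// Prefer an exact (case-insensitive) label match, then fall back to a
// substring match, so "Wi-Fi" still resolves against "Wi-Fi, Connected".
function findByLabel(elements: ScreenElement[], wanted: string): ScreenElement | undefined {
  const needle = wanted.trim().toLowerCase();
  return (
    elements.find((e) => e.label.trim().toLowerCase() === needle) ??
    elements.find((e) => e.label.toLowerCase().includes(needle))
  );
}
```

The returned `x` and `y` can be passed unchanged to `tap`, since both tools work in device pixels.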

🗺️ Status & roadmap

Core tools — screenshot, describe_screen, tap, swipe, long_press, type_text, press_key/home/back, launch_app, open_url, list_targets, status
uiautomator XML parsing → labels + center tap coordinates (entity-decoded, malformed-XML tolerant)
Device-shell-safe input escaping (no shell injection) + strict serial validation
Unit tests for the XML parser and the input-escaping logic
Pre-push secret-scan hook
Recording / replay skills
Multi-device parallel control
Emulator-specific paths
Published npm package / npx bin

This README and roadmap will be updated as the project progresses.

📄 License

Apache-2.0.
