Server Configuration
Environment variables that configure the server. Both are optional.
| Name | Required | Description | Default |
|---|---|---|---|
| ADB_ALLOW_SHELL | No | Whether the server allows execution of arbitrary ADB shell commands. Example value: 'true'. | |
| ADB_EXECUTION_MODE | No | The execution mode for ADB commands. Example value: 'unrestricted'. | |
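
As a rough illustration of how a launcher script might read the two variables above, here is a minimal sketch. The helper name `read_adb_settings`, the set of strings treated as "true", and the choice to treat an unset `ADB_ALLOW_SHELL` as disabled are all assumptions made for this example; the server's own parsing rules and defaults are not documented here.

```python
import os

# Assumption: which spellings count as "true" is not documented by the server.
_TRUTHY = {"1", "true", "yes", "on"}


def read_adb_settings(env=None):
    """Return the two documented settings as a plain dict (hypothetical helper)."""
    env = os.environ if env is None else env
    # Assumption: an unset or empty value means shell commands are not allowed.
    allow_shell = env.get("ADB_ALLOW_SHELL", "").strip().lower() in _TRUTHY
    # No default is documented, so an unset variable is reported as None.
    execution_mode = env.get("ADB_EXECUTION_MODE") or None
    return {"allow_shell": allow_shell, "execution_mode": execution_mode}


if __name__ == "__main__":
    example = {"ADB_ALLOW_SHELL": "true", "ADB_EXECUTION_MODE": "unrestricted"}
    print(read_adb_settings(example))
```

Passing a dict instead of reading `os.environ` directly keeps the helper easy to check without changing the real environment.
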
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | {"listChanged": true} |
| prompts | {"listChanged": false} |
| resources | {"subscribe": false, "listChanged": false} |
| experimental | {} |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| get_ui_hierarchy | Returns the current Android screen as an XML UI hierarchy. Use this when you need to locate element coordinates, read text, or find resource IDs to interact with. Do not call this after every action — only call it when you actually need to read screen content. To check if the screen changed after an action, use snapshot_ui before the action and detect_ui_change after. |
| tap_screen | Performs a tap gesture at the given pixel coordinates on the Android screen. |
| swipe_screen | Performs a swipe gesture on the Android screen from (x1,y1) to (x2,y2) over the given duration. |
| type_text | Types text into the currently focused Android input field. Spaces are encoded automatically. |
| press_key | Sends a key event to the Android device using the given integer keycode. |
| take_screenshot | Captures a PNG screenshot of the current Android device screen and returns its width, height, and base64-encoded image data. |
| list_devices | Lists all Android devices currently visible to ADB, with their serial numbers, connection state, and model names. |
| launch_app | Launches an Android app by its component name (package/activity). |
| execute_adb_command | Runs an ADB command and returns its output. The command is parsed safely via shlex and never passed to a system shell. |
| snapshot_ui | Takes a lightweight snapshot of the current UI state and returns a short token. Use this before performing an action (tap, swipe, launch, key press) when you only need to confirm the screen changed afterward — not read its content. Pass the returned token to detect_ui_change as baseline_token. This avoids loading the full XML hierarchy into context unnecessarily. Do not use this when you need to read or interact with screen elements — use get_ui_hierarchy for that. |
| detect_ui_change | Polls the UI hierarchy after an action and returns when the screen content changes or the timeout is reached. Returns changed status and elapsed time. By default, omits the XML hierarchy for efficiency — set return_hierarchy=True to receive the full hierarchy. For efficient change detection: call snapshot_ui before the action, perform the action, then call detect_ui_change with baseline_token. Only use without baseline_token when you need to wait for a slow transition (loading screens, animations). Do not use to read the current screen state — use get_ui_hierarchy for that. |
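
The snapshot-then-detect workflow described for snapshot_ui and detect_ui_change can be sketched as follows. This is only an illustration under stated assumptions: `call_tool` stands in for whatever function your MCP client uses to invoke a tool, the `x`/`y` argument names for tap_screen are guesses, and the `token` and `changed` result fields are assumed shapes. Only the tool names and the `baseline_token` parameter come from the table above.

```python
def tap_and_confirm(call_tool, x, y):
    """Tap at (x, y) and report whether the screen changed afterward.

    `call_tool(name, arguments)` is a hypothetical client callable that
    invokes a server tool and returns its result as a dict.
    """
    # 1. Take the lightweight baseline before acting.
    snapshot = call_tool("snapshot_ui", {})
    token = snapshot["token"]  # assumed result field name

    # 2. Perform the action (argument names are assumptions).
    call_tool("tap_screen", {"x": x, "y": y})

    # 3. Wait for a change relative to the baseline; the XML hierarchy
    #    is omitted by default, which keeps the response small.
    result = call_tool("detect_ui_change", {"baseline_token": token})
    return bool(result.get("changed"))  # assumed result field name


if __name__ == "__main__":
    # Stub client so the sketch runs without a device or server.
    def fake_call_tool(name, arguments):
        if name == "snapshot_ui":
            return {"token": "t1"}
        if name == "detect_ui_change":
            return {"changed": arguments.get("baseline_token") == "t1"}
        return {}

    print(tap_and_confirm(fake_call_tool, 100, 200))
```

If the change check succeeds and the new screen content is actually needed, a follow-up get_ui_hierarchy call is the place to read it.
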
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |