Server Configuration
Environment variables used to configure the server. All of them are optional.
| Name | Required | Description | Default |
|---|---|---|---|
| DEBUG | No | Enable debug logging | |
| ADB_PATH | No | Path to adb binary | |
| MAX_RETRIES | No | Auto-retry count | |
| ALLOWED_DEVICES | No | Comma-separated device IDs | |
| COMMAND_TIMEOUT_MS | No | ADB command timeout, in milliseconds | |
| ALLOW_DESTRUCTIVE_OPS | No | Allow destructive operations such as uninstall | |
| MAX_ACTIONS_PER_MINUTE | No | Rate limit per device | |
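
The variables above could be read along the following lines. This is a minimal sketch, not the server's actual loader: `load_config`, `_parse_bool`, and `_parse_int` are hypothetical names, the set of accepted boolean spellings is an assumption, and unset variables are left as `None` because the table documents no defaults.

```python
import os
from typing import Mapping, Optional


def _parse_bool(raw: Optional[str]) -> Optional[bool]:
    """Return None when unset. Assumption: 1/true/yes/on (any case) mean True."""
    if raw is None or raw.strip() == "":
        return None
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def _parse_int(raw: Optional[str]) -> Optional[int]:
    """Return None when unset; raise ValueError on non-integer input."""
    if raw is None or raw.strip() == "":
        return None
    return int(raw.strip())


def load_config(env: Mapping[str, str] = os.environ) -> dict:
    """Collect the documented variables; unset ones stay None (no defaults assumed)."""
    devices_raw = env.get("ALLOWED_DEVICES")
    devices = None
    if devices_raw is not None and devices_raw.strip():
        # Comma-separated device IDs; surrounding whitespace is dropped.
        devices = [d.strip() for d in devices_raw.split(",") if d.strip()]
    return {
        "debug": _parse_bool(env.get("DEBUG")),
        "adb_path": env.get("ADB_PATH") or None,
        "max_retries": _parse_int(env.get("MAX_RETRIES")),
        "allowed_devices": devices,
        "command_timeout_ms": _parse_int(env.get("COMMAND_TIMEOUT_MS")),
        "allow_destructive_ops": _parse_bool(env.get("ALLOW_DESTRUCTIVE_OPS")),
        "max_actions_per_minute": _parse_int(env.get("MAX_ACTIONS_PER_MINUTE")),
    }
```

Passing a plain dictionary instead of `os.environ` keeps the sketch easy to test without touching the real environment.
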
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | `{"listChanged": true}` |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| list_devices | List all connected Android devices with their status and model information |
| get_device_info | Get detailed information about a connected Android device including model, Android version, SDK version, screen size, brand, and manufacturer |
| get_screen_size | Get the screen resolution (width x height in pixels) of the connected Android device |
| health_check | Check the health status of the MCP server and ADB connectivity. Returns server uptime, ADB availability, connected device count, and configuration summary. |
| tap | Tap at specific coordinates on the Android screen. Coordinates can be absolute pixels or normalized (0-1 range, will be scaled to screen size). |
| swipe | Swipe from one point to another on the Android screen. Useful for scrolling, swiping between pages, or drag gestures. |
| long_press | Long press at specific coordinates on the Android screen. Useful for context menus, drag initiation, or selection. |
| double_tap | Double tap at specific coordinates on the Android screen. Useful for zooming or selecting text. |
| input_text | Type text into the currently focused input field on the Android device. The field must already be focused (tapped). |
| press_key | Press a hardware/software key on the Android device. Common keycodes: HOME, BACK, MENU, ENTER, VOLUME_UP, VOLUME_DOWN, POWER, DEL, TAB, SPACE, APP_SWITCH, SEARCH, CAMERA. |
| list_apps | List installed applications on the Android device. By default shows third-party apps only. |
| open_app | Launch an Android application by its package name (e.g. com.google.android.youtube). The app must be installed. |
| close_app | Force-stop an Android application by its package name. |
| install_apk | Install an APK file onto the Android device from a local file path. |
| uninstall_app | Uninstall an application from the Android device. Requires allowDestructiveOps to be enabled in server configuration (presumably via the ALLOW_DESTRUCTIVE_OPS environment variable). |
| get_current_app | Get the package name of the currently focused/foreground Android application. |
| list_files | List files and directories at a given path on the Android device. Returns file names, types, sizes, and permissions. |
| pull_file | Download a file from the Android device to the local machine. |
| push_file | Upload a file from the local machine to the Android device. |
| get_ui_tree | Capture the current UI hierarchy from the Android screen. Returns a structured representation of all visible UI elements with their properties, text, bounds, and states. This is the primary way to understand what is currently on screen. |
| find_element | Find a UI element on the Android screen matching the given selector criteria. Returns the first matching element with its properties and bounds. Use this to locate buttons, text fields, or any UI component before interacting with it. |
| click_element | Find a UI element matching the selector and tap its center. This is the preferred way to interact with UI elements rather than using raw coordinates. |
| wait_for_element | Wait for a UI element to appear on screen within a timeout period. Useful after navigation, loading screens, or animations. Polls the UI tree at regular intervals until the element is found. |
| assert_element_exists | Check whether a UI element exists on the current screen. Returns true/false without throwing an error. Useful for test assertions and conditional logic. |
| capture_screenshot | Capture a screenshot of the current Android device screen. Returns a base64-encoded PNG image that can be displayed or analyzed visually. Use this to see what is currently on screen. |
| analyze_screen | Capture a screenshot and analyze the screen content. Returns both the visual screenshot and a text summary of the UI tree for comprehensive screen understanding. Use this when you need to understand the full screen context. |
| detect_elements_visually | Detect interactive elements on the screen using a combination of screenshots and UI tree analysis. Returns a list of all clickable/focusable elements with their coordinates and descriptions. Use this when you need to find elements to interact with. |
| compare_screenshots | Compare the current screen with the previously captured screenshot to detect changes. Returns the percentage of pixels that changed and whether the screen has significantly changed. Useful for verifying that an action had an effect. |
| smart_click | Intelligently click an element using multiple strategies: UIAutomator → Accessibility → Vision → Coordinates fallback. This is the most reliable way to click elements. |
| run_test_scenario | Execute an ordered test scenario with multiple steps and assertions. Each step performs an action and optionally verifies the result. Returns detailed pass/fail results per step. |
| start_recording | Start recording actions to create a macro. All subsequent tool calls will be recorded. Use stop_recording to save. |
| stop_recording | Stop the active recording and save it to disk. |
| replay_recording | Replay a previously recorded action sequence. |
| list_recordings | List all saved action recordings/macros. |
| get_device_state | Get the current tracked state of a device including current app, screen history, and recent actions. Useful for understanding context. |
| get_metrics | Get performance metrics and action history for the MCP server. Shows per-tool latency, success/failure counts. |
| list_plugins | List all loaded MCP server plugins and their versions. |
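
Two behaviors described in the table lend themselves to a short illustration: scaling normalized coordinates to pixels (as `tap` accepts) and polling until an element appears (as `wait_for_element` does). The sketch below is an assumption about how such logic might look, not the server's implementation. `to_pixels` and `wait_for` are hypothetical helpers, and the rule that a pair counts as normalized only when both values fall in the 0-1 range is a guess.

```python
import time
from typing import Callable, Optional, Tuple, TypeVar

T = TypeVar("T")


def to_pixels(x: float, y: float, width: int, height: int) -> Tuple[int, int]:
    """Scale normalized (0-1) coordinates to pixels; pass absolute ones through.

    Assumption: a pair is normalized only when both values lie in [0, 1].
    Absolute pixel (1, 1) is therefore ambiguous and treated as normalized.
    """
    if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
        # Clamp so that 1.0 maps to the last valid pixel, not one past it.
        return (min(int(x * width), width - 1), min(int(y * height), height - 1))
    return (int(x), int(y))


def wait_for(
    probe: Callable[[], Optional[T]],
    timeout_s: float,
    interval_s: float = 0.5,
) -> Optional[T]:
    """Call probe() at regular intervals until it returns a non-None value.

    Returns the probe's result, or None if the timeout expires first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = probe()
        if result is not None:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval_s)
```

In a real client, `probe` would wrap a call such as `find_element` and return the matched element or `None`. Injecting it as a callable keeps the polling loop independent of any particular transport.
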
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |