android-ui

Control Android device interfaces by performing tap, swipe, text input, key press, and app launch actions for UI automation and testing.

Instructions

Perform various UI interaction operations on an Android device.

Args:
  • ctx: MCP Context.
  • serial: Device serial number.
  • action: The UI action to perform.
  • x: X coordinate (for tap).
  • y: Y coordinate (for tap).
  • start_x: Starting X coordinate (for swipe).
  • start_y: Starting Y coordinate (for swipe).
  • end_x: Ending X coordinate (for swipe).
  • end_y: Ending Y coordinate (for swipe).
  • duration_ms: Duration of the swipe in milliseconds (default: 300).
  • text: Text to input (for input_text).
  • keycode: Android keycode to press (for press_key).
  • package: Package name (for start_intent).
  • activity: Activity name (for start_intent).
  • extras: Optional intent extras (for start_intent).

Returns: A string message indicating the result of the operation.
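To illustrate how these arguments combine per action, the sketch below builds one example argument set for each action and wraps it in an MCP JSON-RPC "tools/call" request. The device serial, coordinates, text, package, and activity values are placeholders, and the helper name build_tool_call is hypothetical; how a given MCP client actually sends the request is outside the scope of this page.

```python
import json


def build_tool_call(arguments: dict, request_id: int = 1) -> str:
    """Wrap tool arguments in an MCP JSON-RPC 'tools/call' request (hypothetical helper)."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "android-ui", "arguments": arguments},
    }
    return json.dumps(request)


SERIAL = "emulator-5554"  # placeholder device serial

# One example argument set per action; every value is illustrative only.
EXAMPLE_ARGUMENTS = {
    "tap": {"serial": SERIAL, "action": "tap", "x": 540, "y": 1200},
    "swipe": {
        "serial": SERIAL,
        "action": "swipe",
        "start_x": 540,
        "start_y": 1600,
        "end_x": 540,
        "end_y": 400,
        "duration_ms": 500,  # optional; the tool defaults to 300
    },
    "input_text": {"serial": SERIAL, "action": "input_text", "text": "hello"},
    # 4 is Android's KEYCODE_BACK
    "press_key": {"serial": SERIAL, "action": "press_key", "keycode": 4},
    "start_intent": {
        "serial": SERIAL,
        "action": "start_intent",
        "package": "com.example.app",  # placeholder package
        "activity": ".MainActivity",   # placeholder; expected name format is an assumption
    },
}

if __name__ == "__main__":
    for request_id, arguments in enumerate(EXAMPLE_ARGUMENTS.values(), start=1):
        print(build_tool_call(arguments, request_id))
```

Only serial and action appear in every argument set; the remaining keys are the ones the chosen action needs.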

Input Schema

Name         Required  Description  Default
serial       Yes
action       Yes
x            No
y            No
start_x      No
start_y      No
end_x        No
end_y        No
duration_ms  No                     300
text         No
keycode      No
package      No
activity     No
extras       No
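For readers who want a typed view of this schema, here is a minimal sketch that restates it as a Python TypedDict. The class names _RequiredArgs and AndroidUIArgs are hypothetical and not part of the project; the field types follow the handler signature shown in the Implementation Reference, and the action values follow the UIAction enum.

```python
from typing import Literal, TypedDict


class _RequiredArgs(TypedDict):
    """Fields marked 'Yes' in the Required column."""

    serial: str
    action: Literal["tap", "swipe", "input_text", "press_key", "start_intent"]


class AndroidUIArgs(_RequiredArgs, total=False):
    """Fields marked 'No' in the Required column (hypothetical type)."""

    x: int
    y: int
    start_x: int
    start_y: int
    end_x: int
    end_y: int
    duration_ms: int  # the tool defaults this to 300 when omitted
    text: str
    keycode: int
    package: str
    activity: str
    extras: dict[str, str]


# Placeholder values; a type checker accepts this because only
# 'serial' and 'action' are mandatory keys.
tap_args: AndroidUIArgs = {"serial": "emulator-5554", "action": "tap", "x": 100, "y": 200}
```

The type only captures which keys are optional at the schema level; the per-action requirements (for example, x and y for tap) are enforced by the handler at runtime.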

Implementation Reference

  • The primary handler function 'android_ui' is registered with @mcp.tool(name='android-ui'). It checks that the arguments each action requires are present, returns an error message if any are missing, and otherwise dispatches to a helper: _tap_impl, _swipe_impl, _input_text_impl, _press_key_impl, or start_intent.
    @mcp.tool(name="android-ui")
    async def android_ui(  # pylint: disable=too-many-arguments
        ctx: Context,
        serial: str,
        action: UIAction,
        x: int | None = None,
        y: int | None = None,
        start_x: int | None = None,
        start_y: int | None = None,
        end_x: int | None = None,
        end_y: int | None = None,
        duration_ms: int = 300,  # Default for swipe
        text: str | None = None,
        keycode: int | None = None,
        package: str | None = None,
        activity: str | None = None,
        extras: dict[str, str] | None = None,
    ) -> str:
        """
        Perform various UI interaction operations on an Android device.

        Args:
            ctx: MCP Context.
            serial: Device serial number.
            action: The UI action to perform.
            x: X coordinate (for tap).
            y: Y coordinate (for tap).
            start_x: Starting X coordinate (for swipe).
            start_y: Starting Y coordinate (for swipe).
            end_x: Ending X coordinate (for swipe).
            end_y: Ending Y coordinate (for swipe).
            duration_ms: Duration of the swipe in milliseconds (default: 300).
            text: Text to input (for input_text).
            keycode: Android keycode to press (for press_key).
            package: Package name (for start_intent).
            activity: Activity name (for start_intent).
            extras: Optional intent extras (for start_intent).

        Returns:
            A string message indicating the result of the operation.
        """
        if action == UIAction.TAP:
            if x is None or y is None:
                msg = "Error: 'x' and 'y' coordinates are required for tap action."
                await ctx.error(msg)
                return msg
            return await _tap_impl(serial=serial, x=x, y=y, ctx=ctx)

        if action == UIAction.SWIPE:
            if start_x is None or start_y is None or end_x is None or end_y is None:
                msg = "Error: 'start_x', 'start_y', 'end_x', and 'end_y' are required for swipe action."
                await ctx.error(msg)
                return msg
            # duration_ms has a default, so no explicit None check needed if we pass it through
            return await _swipe_impl(
                serial=serial,
                start_x=start_x,
                start_y=start_y,
                end_x=end_x,
                end_y=end_y,
                ctx=ctx,
                duration_ms=duration_ms,
            )

        if action == UIAction.INPUT_TEXT:
            if text is None:
                msg = "Error: 'text' is required for input_text action."
                await ctx.error(msg)
                return msg
            return await _input_text_impl(serial=serial, text=text, ctx=ctx)

        if action == UIAction.PRESS_KEY:
            if keycode is None:
                msg = "Error: 'keycode' is required for press_key action."
                await ctx.error(msg)
                return msg
            return await _press_key_impl(serial=serial, keycode=keycode, ctx=ctx)

        if action == UIAction.START_INTENT:
            if package is None or activity is None:
                msg = "Error: 'package' and 'activity' are required for start_intent action."
                await ctx.error(msg)
                return msg
            # extras is optional
            device_manager = get_device_manager()
            return await start_intent(
                ctx=ctx,
                serial=serial,
                package=package,
                activity=activity,
                device_manager=device_manager,
                extras=extras,
            )

        # Should not be reached if action is a valid UIAction member
        unhandled_action_msg = f"Error: Unhandled UI action '{action}'."
        logger.error(unhandled_action_msg)
        await ctx.error(unhandled_action_msg)
        return unhandled_action_msg
  • The UIAction enum defines the actions supported by the 'android-ui' tool: TAP, SWIPE, INPUT_TEXT, PRESS_KEY, START_INTENT.
    class UIAction(Enum):
        """Actions available for UI automation."""

        TAP = "tap"
        SWIPE = "swipe"
        INPUT_TEXT = "input_text"
        PRESS_KEY = "press_key"
        START_INTENT = "start_intent"
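The per-action argument checks in the handler can also be read as a lookup from action to required parameter names. The following is a self-contained sketch of that idea, not the project's implementation: REQUIRED_PARAMS and missing_params are hypothetical names, and the enum is repeated here only so the example runs on its own.

```python
from enum import Enum


class UIAction(Enum):
    """Actions available for UI automation (copied from the reference above)."""

    TAP = "tap"
    SWIPE = "swipe"
    INPUT_TEXT = "input_text"
    PRESS_KEY = "press_key"
    START_INTENT = "start_intent"


# Hypothetical table mirroring the None checks in the handler.
REQUIRED_PARAMS = {
    UIAction.TAP: ("x", "y"),
    UIAction.SWIPE: ("start_x", "start_y", "end_x", "end_y"),
    UIAction.INPUT_TEXT: ("text",),
    UIAction.PRESS_KEY: ("keycode",),
    UIAction.START_INTENT: ("package", "activity"),
}


def missing_params(action: UIAction, params: dict) -> list[str]:
    """Return the required parameter names that are absent or None for this action."""
    return [name for name in REQUIRED_PARAMS[action] if params.get(name) is None]


if __name__ == "__main__":
    print(missing_params(UIAction.TAP, {"x": 10}))
    print(missing_params(UIAction.START_INTENT, {"package": "com.example.app"}))
```

As in the handler, the check compares against None rather than truthiness, so a coordinate of 0 counts as provided.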

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/hyperb1iss/droidmind'
