tap_element
Tap on Android UI elements using text or resource ID for reliable interaction during app testing and development.
Instructions
Tap on an element by text or resource ID. More reliable than raw coordinates.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
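| text | No | Text to look for; matched case-insensitively as a substring of an element's text or content description. Checked before `resource_id`. | None |
| resource_id | No | Resource ID to look for; matched as a substring (e.g. `button_submit` matches `com.app:id/button_submit`). Used only when `text` is empty. | None |
| device_serial | No | Device serial forwarded to the ADB helper functions. | None |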
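All three parameters are optional, but the handler listed under Implementation Reference only performs a lookup when `text` or `resource_id` is non-empty, and it checks `text` first. The sketch below mirrors that branch order so the precedence is easy to see. `resolve_selector` is a hypothetical name used for illustration only; it is not part of the server.

```python
from typing import Optional, Tuple


def resolve_selector(
    text: Optional[str] = None,
    resource_id: Optional[str] = None,
) -> Optional[Tuple[str, str]]:
    """Return (kind, value) for the lookup the handler would attempt, or None.

    Hypothetical helper: it only mirrors the `if text: ... elif resource_id: ...`
    branch order of the tap_element handler.
    """
    if text:  # checked first, so it wins when both arguments are given
        return ("text", text)
    if resource_id:
        return ("resource_id", resource_id)
    return None  # neither given (or both empty): nothing to look up


print(resolve_selector(text="Submit"))
print(resolve_selector(resource_id="button_submit"))
print(resolve_selector(text="Submit", resource_id="button_submit"))
print(resolve_selector())
```

Because the handler uses `elif`, a `text` lookup that finds nothing does not fall back to `resource_id`; in that case the call reports the element as not found.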
Implementation Reference
- src/adb_mcp_server/server.py:399-423 (handler): The primary handler function for the 'tap_element' tool. It locates a UI element through the helper functions, computes its center coordinates, and taps there with ADB's `input tap` command. If `text` is given it is tried first (partial match, first hit wins); `resource_id` is consulted only when `text` is empty. If no element with known bounds is found, the handler returns an "Element not found" message instead of tapping.

  ```python
  @mcp.tool()
  def tap_element(
      text: str | None = None,
      resource_id: str | None = None,
      device_serial: str | None = None
  ) -> str:
      """
      Tap on an element by text or resource ID.
      More reliable than raw coordinates.
      """
      element = None

      if text:
          elements = find_element_by_text(text, partial_match=True, device_serial=device_serial)
          if elements:
              element = elements[0]
      elif resource_id:
          element = find_element_by_id(resource_id, device_serial=device_serial)

      if not element or 'center' not in element:
          return f"Element not found: text='{text}', resource_id='{resource_id}'"

      x, y = element['center']['x'], element['center']['y']
      run_adb(["shell", "input", "tap", str(x), str(y)], device_serial)
      return f"Tapped element at ({x}, {y})"
  ```
- src/adb_mcp_server/server.py:310-352 (helper): Helper function used by tap_element to find UI elements by text content or content description. It scans the UI hierarchy XML node by node and extracts bounds and center coordinates for every match.

  ```python
  @mcp.tool()
  def find_element_by_text(
      text: str,
      partial_match: bool = True,
      device_serial: str | None = None
  ) -> list[dict]:
      """
      Find UI elements containing specific text.
      Returns element details including tap coordinates.
      """
      xml = get_ui_hierarchy(device_serial)
      elements = []

      for match in re.finditer(r'<node[^>]*>', xml):
          node = match.group()
          text_match = re.search(r'text="([^"]*)"', node)
          desc_match = re.search(r'content-desc="([^"]*)"', node)

          found_text = text_match.group(1) if text_match else ""
          found_desc = desc_match.group(1) if desc_match else ""

          # Check if text matches
          matches = False
          if partial_match:
              matches = text.lower() in found_text.lower() or text.lower() in found_desc.lower()
          else:
              matches = text == found_text or text == found_desc

          if matches:
              element = {'text': found_text, 'content_desc': found_desc}
              bounds_match = re.search(r'bounds="\[(\d+),(\d+)\]\[(\d+),(\d+)\]"', node)
              if bounds_match:
                  x1, y1 = int(bounds_match.group(1)), int(bounds_match.group(2))
                  x2, y2 = int(bounds_match.group(3)), int(bounds_match.group(4))
                  element['center'] = {'x': (x1 + x2) // 2, 'y': (y1 + y2) // 2}
                  element['bounds'] = {'x1': x1, 'y1': y1, 'x2': x2, 'y2': y2}
              elements.append(element)

      return elements
  ```
- src/adb_mcp_server/server.py:354-386 (helper): Helper function used by tap_element to find a UI element by partial resource ID match. It scans the UI hierarchy XML and returns the first matching node, with coordinates when the node carries bounds.

  ```python
  @mcp.tool()
  def find_element_by_id(
      resource_id: str,
      device_serial: str | None = None
  ) -> dict | None:
      """
      Find a UI element by its resource ID.
      Returns element details including tap coordinates.
      """
      xml = get_ui_hierarchy(device_serial)

      # Match partial resource ID (e.g., "button_submit" matches "com.app:id/button_submit")
      for match in re.finditer(r'<node[^>]*>', xml):
          node = match.group()
          id_match = re.search(r'resource-id="([^"]*)"', node)

          if id_match and resource_id in id_match.group(1):
              element = {'resource_id': id_match.group(1)}

              text_match = re.search(r'text="([^"]*)"', node)
              if text_match:
                  element['text'] = text_match.group(1)

              bounds_match = re.search(r'bounds="\[(\d+),(\d+)\]\[(\d+),(\d+)\]"', node)
              if bounds_match:
                  x1, y1 = int(bounds_match.group(1)), int(bounds_match.group(2))
                  x2, y2 = int(bounds_match.group(3)), int(bounds_match.group(4))
                  element['center'] = {'x': (x1 + x2) // 2, 'y': (y1 + y2) // 2}
                  element['bounds'] = {'x1': x1, 'y1': y1, 'x2': x2, 'y2': y2}

              return element

      return None
  ```
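The lookup-then-tap flow above (find a node, take the midpoint of its `bounds`, tap there) can be tried without a device. The following is a minimal sketch, not the server's code: it swaps the regex scan for the standard library's `xml.etree.ElementTree`, which also unescapes entities such as `&amp;` in attribute values. `SAMPLE_XML` is a hand-written stand-in for a hierarchy dump, and `center_of`, `find_centers`, and `tap_argv` are hypothetical names. How the server's `run_adb` builds its command line is not shown in the source, so the `adb -s <serial>` prefix here is an assumption.

```python
import re
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple

# Hypothetical, hand-written sample in the shape of a UI hierarchy dump.
SAMPLE_XML = """<hierarchy rotation="0">
  <node index="0" text="" resource-id="com.app:id/root" content-desc="" bounds="[0,0][1080,1920]">
    <node index="0" text="Terms &amp; Conditions" resource-id="com.app:id/terms" content-desc="" bounds="[100,200][300,400]" />
    <node index="1" text="" resource-id="com.app:id/button_submit" content-desc="Submit" bounds="[40,1700][1040,1820]" />
  </node>
</hierarchy>"""

BOUNDS_RE = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")


def center_of(bounds: str) -> Optional[Tuple[int, int]]:
    """Midpoint of a "[x1,y1][x2,y2]" bounds string, or None if malformed."""
    m = BOUNDS_RE.fullmatch(bounds)
    if m is None:
        return None
    x1, y1, x2, y2 = map(int, m.groups())
    return ((x1 + x2) // 2, (y1 + y2) // 2)


def find_centers(xml: str, query: str) -> List[Tuple[int, int]]:
    """Centers of nodes whose text or content-desc contains query (case-insensitive)."""
    needle = query.lower()
    centers = []
    for node in ET.fromstring(xml).iter("node"):
        haystacks = (node.get("text", ""), node.get("content-desc", ""))
        if any(needle in h.lower() for h in haystacks):
            center = center_of(node.get("bounds", ""))
            if center is not None:  # skip nodes without usable bounds
                centers.append(center)
    return centers


def tap_argv(x: int, y: int, device_serial: Optional[str] = None) -> List[str]:
    """Argument vector for an `input tap`; the `-s` prefix is an assumption."""
    argv = ["adb"]
    if device_serial:
        argv += ["-s", device_serial]
    return argv + ["shell", "input", "tap", str(x), str(y)]


for cx, cy in find_centers(SAMPLE_XML, "submit"):
    print(tap_argv(cx, cy, "emulator-5554"))
```

With the regex-based helpers, attribute values are compared in their escaped form, so a query containing `&` may only match if it is written as `&amp;`. Parsing the dump as XML avoids that, at the cost of failing outright on a malformed dump.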