tap_element
Tap Android UI elements by resource ID, text, or content description to automate touch interactions during testing or control workflows.
Instructions
Tap a UI element by its resource-id, text, or content-desc. Finds the element in the UI tree and taps its center.
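The "taps its center" step can be sketched as follows. This is a minimal illustration, not the server's actual code: it assumes element bounds arrive in the `[x1,y1][x2,y2]` string form that `uiautomator dump` emits and that the tap is sent with `adb shell input tap`. The names `centerOfBounds` and `tapArgs` are hypothetical.

```typescript
// Hypothetical helpers; the server's real adb wrapper is not shown in this document.
interface Point {
  x: number;
  y: number;
}

// Parse a uiautomator-style bounds string such as "[0,100][200,300]"
// and return the midpoint of the rectangle, rounded to whole pixels.
function centerOfBounds(bounds: string): Point {
  const m = /^\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]$/.exec(bounds);
  if (!m) {
    throw new Error(`Unrecognized bounds: ${bounds}`);
  }
  const x1 = Number(m[1]);
  const y1 = Number(m[2]);
  const x2 = Number(m[3]);
  const y2 = Number(m[4]);
  return { x: Math.round((x1 + x2) / 2), y: Math.round((y1 + y2) / 2) };
}

// Build the argument list for `adb [-s <device>] shell input tap <x> <y>`.
// The device selector is omitted when no device ID is given.
function tapArgs(p: Point, deviceId?: string): string[] {
  const target = deviceId ? ["-s", deviceId] : [];
  return [...target, "shell", "input", "tap", String(p.x), String(p.y)];
}

const tapPoint = centerOfBounds("[0,100][200,300]");
const tapCommand = tapArgs(tapPoint, "emulator-5554");
console.log(tapPoint, tapCommand.join(" "));
```

The resulting array could be passed to a process spawner such as Node's `child_process.execFile("adb", tapCommand)`; whether the server does this is an assumption.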
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
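| by | Yes | How to find the element: `resource-id`, `text`, or `content-desc` | |
| value | Yes | Value to match. An exact match is accepted in every mode; `resource-id` also matches IDs ending in `:id/<value>`, and `text` and `content-desc` also match case-insensitive substrings | |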
| device_id | No | Device ID (optional if only one device) | |
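
As an illustration of the schema, the arguments can be written as a TypeScript type with a small runtime check. The handler's own schema restricts `by` to `resource-id`, `text`, or `content-desc`; the type and function names below (`TapElementArgs`, `isTapElementArgs`) are hypothetical, and the server itself validates with zod rather than a hand-written guard.

```typescript
// Argument shape mirroring the input schema; names are illustrative only.
type TapBy = "resource-id" | "text" | "content-desc";

interface TapElementArgs {
  by: TapBy;
  value: string;
  device_id?: string;
}

const TAP_BY_VALUES: readonly string[] = ["resource-id", "text", "content-desc"];

// Hand-written type guard: returns true only when the required fields are
// present with the right types and the optional device_id, if set, is a string.
function isTapElementArgs(input: unknown): input is TapElementArgs {
  if (typeof input !== "object" || input === null) {
    return false;
  }
  const o = input as Record<string, unknown>;
  const by = o["by"];
  const value = o["value"];
  const deviceId = o["device_id"];
  return (
    typeof by === "string" &&
    TAP_BY_VALUES.indexOf(by) !== -1 &&
    typeof value === "string" &&
    (deviceId === undefined || typeof deviceId === "string")
  );
}

const validArgs: TapElementArgs = { by: "text", value: "Sign in" };
const invalidArgs: unknown = { by: "xpath", value: "//button" };
console.log(isTapElementArgs(validArgs), isTapElementArgs(invalidArgs));
```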
Implementation Reference
- src/index.ts:174-220 (handler): Implementation of the 'tap_element' tool handler, which fetches the UI tree, finds the target element, and performs a tap operation.

      server.tool(
        "tap_element",
        "Tap a UI element by its resource-id, text, or content-desc. Finds the element in the UI tree and taps its center.",
        {
          by: z
            .enum(["resource-id", "text", "content-desc"])
            .describe("How to find the element"),
          value: z.string().describe("Value to match"),
          device_id: z.string().optional().describe("Device ID (optional if only one device)"),
        },
        async ({ by, value, device_id }) => {
          const elements = await adb.getUiTree(device_id);
          const finder: Record<string, (el: (typeof elements)[0]) => boolean> = {
            "resource-id": (el) =>
              el.resourceId === value || el.resourceId.endsWith(`:id/${value}`),
            text: (el) =>
              el.text === value || el.text.toLowerCase().includes(value.toLowerCase()),
            "content-desc": (el) =>
              el.contentDesc === value ||
              el.contentDesc.toLowerCase().includes(value.toLowerCase()),
          };
          const el = elements.find(finder[by]);
          if (!el) {
            return {
              content: [
                {
                  type: "text",
                  text: `Element not found: ${by}="${value}". Available elements:\n${elements
                    .filter((e) => e.clickable)
                    .map((e) => ` id:${e.resourceId} text:"${e.text}" desc:"${e.contentDesc}"`)
                    .join("\n")}`,
                },
              ],
              isError: true,
            };
          }
          await adb.tap(el.center.x, el.center.y, device_id);
          return {
            content: [
              {
                type: "text",
                text: `Tapped "${by}=${value}" at (${el.center.x}, ${el.center.y}) [${el.className.split(".").pop()}]`,
              },
            ],
          };
        },
      );
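To make the matching rules easier to see in isolation, here is a self-contained sketch that reuses the handler's three predicates on invented sample data. The `SampleElement` type, the `matchesElement` function, and the sample IDs are assumptions for illustration; only the comparison logic is taken from the handler above.

```typescript
// Minimal element shape; field names follow the handler, sample data is invented.
interface SampleElement {
  resourceId: string;
  text: string;
  contentDesc: string;
}

type MatchBy = "resource-id" | "text" | "content-desc";

// Same comparisons as the handler: an exact match, or a suffix match for
// resource IDs, or a case-insensitive substring match for text and content-desc.
function matchesElement(el: SampleElement, by: MatchBy, value: string): boolean {
  const needle = value.toLowerCase();
  switch (by) {
    case "resource-id":
      return el.resourceId === value || el.resourceId.endsWith(`:id/${value}`);
    case "text":
      return el.text === value || el.text.toLowerCase().includes(needle);
    case "content-desc":
      return el.contentDesc === value || el.contentDesc.toLowerCase().includes(needle);
  }
}

const sampleTree: SampleElement[] = [
  { resourceId: "com.example:id/login_help", text: "Login help", contentDesc: "" },
  { resourceId: "com.example:id/login", text: "Login", contentDesc: "Sign in button" },
];

// Array.prototype.find returns the first element, in list order, that satisfies the predicate.
const foundByText = sampleTree.find((el) => matchesElement(el, "text", "Login"));
const foundById = sampleTree.find((el) => matchesElement(el, "resource-id", "login"));
console.log(foundByText?.resourceId, foundById?.resourceId);
```

In this sample, searching by text for "Login" selects the "Login help" element: it comes first in the list and satisfies the substring rule, even though a later element matches exactly. Searching by `resource-id` is stricter, since only a full ID or an `:id/<value>` suffix matches.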