tapOn
Tap on mobile app elements by text or ID to automate testing and interactions for Android and iOS devices.
Instructions
Tap supporting text or resourceId
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| containerElementId | No | Container element ID to restrict the search within | |
| action | Yes | Action to perform on the element | |
| text | No | Text to tap on | |
| id | No | Element ID to tap on | |
| platform | Yes | Platform of the device | |
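
To make the table concrete, here is a minimal sketch of two argument objects a client might send. The `TapOnArgs` type below is hand-written to mirror the Zod schema listed under Implementation Reference; it is not the project's own type definition. The text and resource IDs are placeholder values, not taken from a real app.

```typescript
// Hand-written mirror of the tapOn input schema (an assumption, not the project's type).
type TapOnArgs = {
  containerElementId?: string;
  action: "tap" | "doubleTap" | "longPress" | "focus";
  text?: string;
  id?: string;
  platform: "android" | "ios";
};

// Tap an element by its visible text.
const tapByText: TapOnArgs = {
  action: "tap",
  text: "Sign in", // placeholder text
  platform: "android",
};

// Long-press an element by ID, restricting the search to a container.
const longPressById: TapOnArgs = {
  action: "longPress",
  id: "com.example.app:id/submit_button", // placeholder resource ID
  containerElementId: "com.example.app:id/login_form", // placeholder container ID
  platform: "android",
};
```

Only `action` and `platform` are required. `text`, `id`, and `containerElementId` are optional in the schema as shown.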
Implementation Reference
- src/server/interactionTools.ts:740-745 (registration): Registration of the 'tapOn' tool in the ToolRegistry, linking name, description, schema, and handler function.

  ```typescript
    "tapOn",
    "Tap supporting text or resourceId",
    tapOnSchema,
    tapOnHandler,
    true // Supports progress notifications
  );
  ```
- src/server/interactionTools.ts:239-253 (handler): The main handler function for the 'tapOn' tool. It instantiates TapOnElement, calls its execute method with the parsed arguments, and formats the response.

  ```typescript
  const tapOnHandler = async (device: BootedDevice, args: TapOnArgs, progress?: ProgressCallback) => {
    const tapOnTextCommand = new TapOnElement(device);
    const result = await tapOnTextCommand.execute({
      containerElementId: args.containerElementId,
      text: args.text,
      elementId: args.id,
      action: args.action,
    }, progress);

    return createJSONToolResponse({
      message: `Tapped on element`,
      observation: result.observation,
      ...result
    });
  };
  ```
- Zod schema defining the input parameters for the 'tapOn' tool: optional containerElementId, text, and id, plus required action and platform.

  ```typescript
  export const tapOnSchema = z.object({
    containerElementId: z.string().optional().describe("Container element ID to restrict the search within"),
    action: z.enum(["tap", "doubleTap", "longPress", "focus"]).describe("Action to perform on the element"),
    text: z.string().optional().describe("Text to tap on"),
    id: z.string().optional().describe("Element ID to tap on"),
    platform: z.enum(["android", "ios"]).describe("Platform of the device")
  });
  ```
- Core execution logic in TapOnElement.execute(). Finds the element by text or ID, calculates the tap point, handles the focus check, and performs a platform-specific tap/doubleTap/longPress via ADB or Axe, all wrapped in observedInteraction.

  ```typescript
  async execute(options: TapOnElementOptions, progress?: ProgressCallback): Promise<TapOnElementResult> {
    if (!options.action) {
      return this.createErrorResult(options.action, "tap on action is required");
    }

    try {
      // Tap on the calculated point using observedChange
      return await this.observedInteraction(
        async (observeResult: ObserveResult) => {
          const viewHierarchy = observeResult.viewHierarchy;
          if (!viewHierarchy) {
            return { success: false, error: "Unable to get view hierarchy, cannot tap on element" };
          }

          const element = await this.findElementToTap(options, viewHierarchy);
          const tapPoint = this.elementUtils.getElementCenter(element);

          if (options.action === "focus") {
            // Check if element is already focused
            const isFocused = this.elementUtils.isElementFocused(element);
            if (isFocused) {
              logger.info(`Element is already focused, no action needed`);
              return {
                success: true,
                element: element,
                wasAlreadyFocused: true,
                focusChanged: false,
                x: tapPoint.x,
                y: tapPoint.y
              };
            }
            // if not, change action to tap
            options.action = "tap";
          }

          // Platform-specific tap execution
          switch (this.device.platform) {
            case "android":
              await this.executeAndroidTap(options.action, tapPoint.x, tapPoint.y);
              break;
            case "ios":
              await this.executeiOSTap(options.action, tapPoint.x, tapPoint.y);
              break;
            default:
              throw new ActionableError(`Unsupported platform: ${this.device.platform}`);
          }

          return {
            success: true,
            action: options.action,
            element,
          };
        },
        {
          queryOptions: {
            text: options.text,
            elementId: options.elementId,
            containerElementId: options.containerElementId
          },
          changeExpected: false,
          timeoutMs: 800, // Reduce timeout for faster execution
          progress
        }
      );
    } catch (error) {
      throw new ActionableError(`Failed to perform tap on element: ${error}`);
    }
  }
  ```
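
The focus handling in `execute()` reduces to a small decision: a `focus` request does nothing when the element already has focus, and falls back to a plain tap otherwise. The sketch below isolates that rule as a pure function. `resolveGesture`, `TapAction`, and `Gesture` are hypothetical names introduced for illustration; they do not appear in the project source.

```typescript
// Actions accepted by the tool, per the schema's enum.
type TapAction = "tap" | "doubleTap" | "longPress" | "focus";

// Gestures that are actually sent to the device ("focus" is never sent as-is).
type Gesture = Exclude<TapAction, "focus">;

// Hypothetical helper: returns the gesture to perform,
// or null when the element is already focused and nothing needs to happen.
function resolveGesture(action: TapAction, isFocused: boolean): Gesture | null {
  if (action !== "focus") {
    return action; // tap, doubleTap, and longPress pass through unchanged
  }
  return isFocused ? null : "tap"; // focus becomes a tap only when needed
}

const alreadyFocused = resolveGesture("focus", true);   // null
const needsTap = resolveGesture("focus", false);        // "tap"
const passThrough = resolveGesture("longPress", true);  // "longPress"
```

Unlike the original method, which reassigns `options.action`, this version returns the resolved gesture and leaves the caller's options untouched. That is a design choice of the sketch, not a description of the project's code.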