tapOn
Perform a tap, double tap, long press, or focus action on a mobile UI element located by its text or element ID, optionally restricting the search to a container element. Supports Android and iOS for mobile automation.
Instructions
Tap supporting text or resourceId
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| action | Yes | Action to perform on the element: `tap`, `doubleTap`, `longPress`, or `focus` | |
| containerElementId | No | Container element ID to restrict the search within | |
| id | No | Element ID to tap on | |
| platform | Yes | Platform of the device: `android` or `ios` | |
| text | No | Text to tap on | |
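
As an illustration of the table above, here is a minimal sketch of an argument object and a small pre-flight check. The `TapOnArgs` shape is inferred from the table, and `validateTapOnArgs` is a hypothetical helper that is not part of the project. The "text or id" rule it enforces is taken from the comment in `TapOnElementOptions` ("one of these must be provided"). How the real tool reports a missing selector is not shown in the source.

```typescript
// Argument shape inferred from the input schema table (an assumption,
// not the project's own type definition).
type TapAction = "tap" | "doubleTap" | "longPress" | "focus";
type Platform = "android" | "ios";

interface TapOnArgs {
  action: TapAction;
  platform: Platform;
  text?: string;
  id?: string;
  containerElementId?: string;
}

// Hypothetical helper: returns a list of problems, empty when the
// arguments carry at least one selector (text or id).
function validateTapOnArgs(args: TapOnArgs): string[] {
  const problems: string[] = [];
  if (!args.text && !args.id) {
    problems.push("either text or id must be provided");
  }
  return problems;
}

// A call that selects the element by its visible text.
const example: TapOnArgs = { action: "tap", platform: "android", text: "Sign in" };

// A call with no selector at all.
const missingSelector: TapOnArgs = { action: "longPress", platform: "ios" };

console.log(validateTapOnArgs(example));         // []
console.log(validateTapOnArgs(missingSelector)); // ["either text or id must be provided"]
```

The `action` and `platform` values are constrained by the union types, so an unsupported value fails at compile time rather than at run time.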
Implementation Reference
- src/server/interactionTools.ts:740-744 (registration) Registers the 'tapOn' MCP tool using ToolRegistry, linking the schema and handler function.

  ```typescript
  "tapOn",
  "Tap supporting text or resourceId",
  tapOnSchema,
  tapOnHandler,
  true // Supports progress notifications
  ```
- Zod schema validating input parameters for the tapOn tool: containerElementId, action, text, id, platform.

  ```typescript
  export const tapOnSchema = z.object({
    containerElementId: z.string().optional().describe("Container element ID to restrict the search within"),
    action: z.enum(["tap", "doubleTap", "longPress", "focus"]).describe("Action to perform on the element"),
    text: z.string().optional().describe("Text to tap on"),
    id: z.string().optional().describe("Element ID to tap on"),
    platform: z.enum(["android", "ios"]).describe("Platform of the device")
  });
  ```
- src/server/interactionTools.ts:239-253 (handler) Wrapper handler for the tapOn tool that instantiates TapOnElement, calls its execute method, and formats the JSON response.

  ```typescript
  const tapOnHandler = async (device: BootedDevice, args: TapOnArgs, progress?: ProgressCallback) => {
    const tapOnTextCommand = new TapOnElement(device);
    const result = await tapOnTextCommand.execute({
      containerElementId: args.containerElementId,
      text: args.text,
      elementId: args.id,
      action: args.action,
    }, progress);

    return createJSONToolResponse({
      message: `Tapped on element`,
      observation: result.observation,
      ...result
    });
  };
  ```
- Primary handler logic in TapOnElement.execute(): observes the screen, finds the element by text or ID, calculates the center tap point, handles the focus check, and executes the platform-specific tap, doubleTap, or longPress using ADB or Axe.

  ```typescript
  async execute(options: TapOnElementOptions, progress?: ProgressCallback): Promise<TapOnElementResult> {
    if (!options.action) {
      return this.createErrorResult(options.action, "tap on action is required");
    }

    try {
      // Tap on the calculated point using observedChange
      return await this.observedInteraction(
        async (observeResult: ObserveResult) => {
          const viewHierarchy = observeResult.viewHierarchy;
          if (!viewHierarchy) {
            return { success: false, error: "Unable to get view hierarchy, cannot tap on element" };
          }

          const element = await this.findElementToTap(options, viewHierarchy);
          const tapPoint = this.elementUtils.getElementCenter(element);

          if (options.action === "focus") {
            // Check if element is already focused
            const isFocused = this.elementUtils.isElementFocused(element);
            if (isFocused) {
              logger.info(`Element is already focused, no action needed`);
              return {
                success: true,
                element: element,
                wasAlreadyFocused: true,
                focusChanged: false,
                x: tapPoint.x,
                y: tapPoint.y
              };
            }
            // if not, change action to tap
            options.action = "tap";
          }

          // Platform-specific tap execution
          switch (this.device.platform) {
            case "android":
              await this.executeAndroidTap(options.action, tapPoint.x, tapPoint.y);
              break;
            case "ios":
              await this.executeiOSTap(options.action, tapPoint.x, tapPoint.y);
              break;
            default:
              throw new ActionableError(`Unsupported platform: ${this.device.platform}`);
          }

          return {
            success: true,
            action: options.action,
            element,
          };
        },
        {
          queryOptions: {
            text: options.text,
            elementId: options.elementId,
            containerElementId: options.containerElementId
          },
          changeExpected: false,
          timeoutMs: 800, // Reduce timeout for faster execution
          progress
        }
      );
    } catch (error) {
      throw new ActionableError(`Failed to perform tap on element: ${error}`);
    }
  }
  ```
- TypeScript interface for TapOnElement options used in the execute method.

  ```typescript
  export interface TapOnElementOptions {
    // Element selection - one of these must be provided
    text?: string;
    elementId?: string;

    // Container to restrict search
    containerElementId?: string;

    // Action to perform
    action: "tap" | "doubleTap" | "longPress" | "focus";
  }
  ```
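
Two steps in `execute()` above are easy to misread in prose: computing the tap point and resolving a `focus` request. The following is a rough sketch of both, not the project's code. `Bounds`, `centerOf`, and `resolveGesture` are hypothetical names, and the `left/top/right/bottom` bounds shape is an assumption, since the source does not show how `getElementCenter` or the element type are defined.

```typescript
// Assumed bounds shape; the project's real element type is not shown.
interface Bounds { left: number; top: number; right: number; bottom: number; }
interface Point { x: number; y: number; }

// Midpoint of a rectangle, rounded down to whole pixels.
function centerOf(b: Bounds): Point {
  return {
    x: Math.floor((b.left + b.right) / 2),
    y: Math.floor((b.top + b.bottom) / 2),
  };
}

type RequestedAction = "tap" | "doubleTap" | "longPress" | "focus";
type Gesture = Exclude<RequestedAction, "focus">;

// Mirrors the focus branch in execute(): a focus request on an element
// that is already focused needs no gesture (null); otherwise it is
// downgraded to a plain tap. Other actions pass through unchanged.
function resolveGesture(action: RequestedAction, isFocused: boolean): Gesture | null {
  if (action === "focus") {
    return isFocused ? null : "tap";
  }
  return action;
}

const point = centerOf({ left: 100, top: 200, right: 300, bottom: 260 });
console.log(point);                              // { x: 200, y: 230 }
console.log(resolveGesture("focus", true));      // null
console.log(resolveGesture("focus", false));     // "tap"
console.log(resolveGesture("longPress", false)); // "longPress"
```

Unlike the original, which reassigns `options.action` in place, this sketch returns the resolved gesture as a value so that the caller's options object is left untouched.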