Glama

tapOn

Perform actions like tap, double tap, long press, or focus on mobile UI elements using text or element IDs within specified containers. Supports Android and iOS platforms for mobile automation.

Instructions

Tap supporting text or resourceId

Input Schema

Name                Required  Description                                          Default
action              Yes       Action to perform on the element
containerElementId  No        Container element ID to restrict the search within
id                  No        Element ID to tap on
platform            Yes       Platform of the device
text                No        Text to tap on
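
As a rough illustration of the parameters above, a call to this tool could be wrapped in a standard MCP `tools/call` JSON-RPC request. This is a sketch only: the `TapOnArguments` type and the `buildTapOnRequest` helper are hypothetical names introduced here, and only the field names and allowed values come from the schema shown in the implementation reference.

```typescript
// Hypothetical type mirroring the input schema; not taken from the project source.
type TapOnArguments = {
  action: "tap" | "doubleTap" | "longPress" | "focus";
  platform: "android" | "ios";
  text?: string;
  id?: string;
  containerElementId?: string;
};

// Hypothetical helper: wraps the arguments in an MCP `tools/call` JSON-RPC request.
function buildTapOnRequest(requestId: number, args: TapOnArguments) {
  return {
    jsonrpc: "2.0" as const,
    id: requestId,
    method: "tools/call" as const,
    params: { name: "tapOn" as const, arguments: args },
  };
}

// Example: long-press the element labelled "Settings" on an Android device.
const request = buildTapOnRequest(1, {
  action: "longPress",
  platform: "android",
  text: "Settings",
});
console.log(JSON.stringify(request, null, 2));
```

Both `text` and `id` are optional in the schema, so a caller would normally supply at least one of them to identify the target element.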

Implementation Reference

  • Registers the 'tapOn' MCP tool using ToolRegistry, linking the schema and handler function.
    "tapOn",
    "Tap supporting text or resourceId",
    tapOnSchema,
    tapOnHandler,
    true // Supports progress notifications
  • Zod schema validating input parameters for the tapOn tool: containerElementId, action, text, id, platform.
    export const tapOnSchema = z.object({
      containerElementId: z.string().optional().describe("Container element ID to restrict the search within"),
      action: z.enum(["tap", "doubleTap", "longPress", "focus"]).describe("Action to perform on the element"),
      text: z.string().optional().describe("Text to tap on"),
      id: z.string().optional().describe("Element ID to tap on"),
      platform: z.enum(["android", "ios"]).describe("Platform of the device")
    });
  • Wrapper handler for tapOn tool that instantiates TapOnElement and calls its execute method, then formats the JSON response.
    const tapOnHandler = async (device: BootedDevice, args: TapOnArgs, progress?: ProgressCallback) => {
      const tapOnTextCommand = new TapOnElement(device);
      const result = await tapOnTextCommand.execute({
        containerElementId: args.containerElementId,
        text: args.text,
        elementId: args.id,
        action: args.action,
      }, progress);

      return createJSONToolResponse({
        message: `Tapped on element`,
        observation: result.observation,
        ...result
      });
    };
  • Primary handler logic in TapOnElement.execute(): observes screen, finds element by text/id, calculates center tap point, handles focus check, executes platform-specific tap/doubleTap/longPress using ADB or Axe.
    async execute(options: TapOnElementOptions, progress?: ProgressCallback): Promise<TapOnElementResult> {
      if (!options.action) {
        return this.createErrorResult(options.action, "tap on action is required");
      }

      try {
        // Tap on the calculated point using observedChange
        return await this.observedInteraction(
          async (observeResult: ObserveResult) => {
            const viewHierarchy = observeResult.viewHierarchy;
            if (!viewHierarchy) {
              return { success: false, error: "Unable to get view hierarchy, cannot tap on element" };
            }

            const element = await this.findElementToTap(options, viewHierarchy);
            const tapPoint = this.elementUtils.getElementCenter(element);

            if (options.action === "focus") {
              // Check if element is already focused
              const isFocused = this.elementUtils.isElementFocused(element);
              if (isFocused) {
                logger.info(`Element is already focused, no action needed`);
                return {
                  success: true,
                  element: element,
                  wasAlreadyFocused: true,
                  focusChanged: false,
                  x: tapPoint.x,
                  y: tapPoint.y
                };
              }
              // if not, change action to tap
              options.action = "tap";
            }

            // Platform-specific tap execution
            switch (this.device.platform) {
              case "android":
                await this.executeAndroidTap(options.action, tapPoint.x, tapPoint.y);
                break;
              case "ios":
                await this.executeiOSTap(options.action, tapPoint.x, tapPoint.y);
                break;
              default:
                throw new ActionableError(`Unsupported platform: ${this.device.platform}`);
            }

            return {
              success: true,
              action: options.action,
              element,
            };
          },
          {
            queryOptions: {
              text: options.text,
              elementId: options.elementId,
              containerElementId: options.containerElementId
            },
            changeExpected: false,
            timeoutMs: 800, // Reduce timeout for faster execution
            progress
          }
        );
      } catch (error) {
        throw new ActionableError(`Failed to perform tap on element: ${error}`);
      }
    }
  • TypeScript interface for TapOnElement options used in the execute method.
    export interface TapOnElementOptions {
      // Element selection - one of these must be provided
      text?: string;
      elementId?: string;

      // Container to restrict search
      containerElementId?: string;

      // Action to perform
      action: "tap" | "doubleTap" | "longPress" | "focus";
    }
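
The flow described in the list above can be condensed into a small dependency-free sketch: check that a selector is present (the schema leaves both `text` and `id` optional, while the options interface comment says one of them must be provided), compute the center of the element, and fall back from `focus` to `tap` when the element is not yet focused. The `Bounds` shape and the helpers `hasSelector`, `centerOf`, and `resolveAction` are hypothetical names for illustration and may not match the project's `elementUtils`.

```typescript
type Action = "tap" | "doubleTap" | "longPress" | "focus";

// Assumed bounds shape (left/top/right/bottom in pixels); the real element type may differ.
interface Bounds {
  left: number;
  top: number;
  right: number;
  bottom: number;
}

// At least one selector must be a non-empty string.
function hasSelector(opts: { text?: string; elementId?: string }): boolean {
  return Boolean(opts.text) || Boolean(opts.elementId);
}

// Midpoint of the bounding box, rounded to whole pixels.
function centerOf(b: Bounds): { x: number; y: number } {
  return {
    x: Math.round((b.left + b.right) / 2),
    y: Math.round((b.top + b.bottom) / 2),
  };
}

// "focus" resolves to null (nothing to do) when the element is already focused,
// and to a plain "tap" otherwise; every other action passes through unchanged.
function resolveAction(action: Action, isFocused: boolean): Exclude<Action, "focus"> | null {
  if (action !== "focus") return action;
  return isFocused ? null : "tap";
}

const point = centerOf({ left: 100, top: 200, right: 300, bottom: 260 });
console.log(point, resolveAction("focus", false), hasSelector({ text: "OK" }));
```

Keeping these steps as pure functions makes them easy to test without a device; the platform-specific tap itself (ADB or Axe) would still happen in the real handler.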

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/zillow/auto-mobile'
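
The same request could be issued from TypeScript. This is a minimal sketch assuming a runtime with a global `fetch` (for example Node.js 18 or later). The helpers `serverUrl` and `fetchServerInfo` are hypothetical names, the owner/name split of the path is an assumption based on the URL above, and the response shape is not documented here, so it is left as `unknown`.

```typescript
const API_BASE = "https://glama.ai/api/mcp/v1/servers";

// Hypothetical helper: builds the endpoint URL, assuming the path is <owner>/<name>.
function serverUrl(owner: string, name: string): string {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(name)}`;
}

// Hypothetical helper: performs the GET and returns the parsed JSON body.
async function fetchServerInfo(owner: string, name: string): Promise<unknown> {
  const response = await fetch(serverUrl(owner, name));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

// Usage (performs a network call):
// fetchServerInfo("zillow", "auto-mobile").then(console.log);
```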

If you have feedback or need assistance with the MCP directory API, please join our Discord server.