by zillow

swipeOnElement

Perform swipe gestures on mobile app elements for automated testing. Specify direction, duration, and platform to simulate user interactions in Android and iOS applications.

Instructions

Swipe on a specific element

Input Schema

Name                Required  Description                                         Default
containerElementId  No        Container element ID to restrict the search within
elementId           Yes       ID of the element to swipe on
direction           Yes       Direction to swipe
duration            Yes       Duration of the swipe in milliseconds
platform            Yes       Platform of the device

Implementation Reference

  • Registration of the swipeOnElement tool in the ToolRegistry with name, description, schema, and handler.
    ToolRegistry.registerDeviceAware(
      "swipeOnElement",
      "Swipe on a specific element",
      swipeOnElementSchema,
      swipeOnElementHandler,
      true // Supports progress notifications
    );
  • Zod schema definition for swipeOnElement tool arguments.
    export const swipeOnElementSchema = z.object({
      containerElementId: z.string().optional().describe("Container element ID to restrict the search within"),
      elementId: z.string().describe("ID of the element to swipe on"),
      direction: z.enum(["up", "down", "left", "right"]).describe("Direction to swipe"),
      duration: z.number().describe("Duration of the swipe in milliseconds"),
      platform: z.enum(["android", "ios"]).describe("Platform of the device")
    });
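As an illustration, arguments that satisfy the schema above might look like the following. This is a minimal sketch: the type is a plain TypeScript mirror of the Zod schema written for this example (it is not part of the server's code), the resource ID and duration are invented values, and `isValidArgs` is a hypothetical hand-written stand-in for what `swipeOnElementSchema` checks.

```typescript
// Plain TypeScript mirror of swipeOnElementSchema, written for illustration only.
type SwipeDirection = "up" | "down" | "left" | "right";
type Platform = "android" | "ios";

type SwipeOnElementArgsExample = {
  containerElementId?: string; // optional: restricts the element search
  elementId: string;
  direction: SwipeDirection;
  duration: number; // milliseconds
  platform: Platform;
};

// Hypothetical values: the resource ID below is invented for this example.
const exampleArgs: SwipeOnElementArgsExample = {
  elementId: "com.example.app:id/photo_carousel",
  direction: "left",
  duration: 300,
  platform: "android",
};

// Hand-written check of the same constraints the schema expresses.
function isValidArgs(value: unknown): boolean {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const v = value as Record<string, unknown>;
  const elementId = v.elementId;
  const direction = v.direction;
  const duration = v.duration;
  const platform = v.platform;
  const container = v.containerElementId;
  const directions = ["up", "down", "left", "right"];
  const platforms = ["android", "ios"];
  return (
    typeof elementId === "string" &&
    typeof direction === "string" && directions.indexOf(direction) !== -1 &&
    typeof duration === "number" &&
    typeof platform === "string" && platforms.indexOf(platform) !== -1 &&
    (container === undefined || typeof container === "string")
  );
}
```

Omitting `containerElementId` is valid because the schema marks it optional; the other four fields are required.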
  • Main handler function for the swipeOnElement tool. Validates the element ID, finds the element by resource ID in the observed view hierarchy, instantiates the SwipeOnElement class, and executes the swipe.
    const swipeOnElementHandler = async (device: BootedDevice, args: SwipeOnElementArgs, progress?: ProgressCallback) => {
      try {
        // Validate element ID format
        const elementId = args.elementId;
    
        // Check if the elementId looks like coordinate bounds instead of a resource ID
        const boundsPattern = /^\[\d+,\d+\]\[\d+,\d+\]$/;
        if (boundsPattern.test(elementId)) {
          throw new ActionableError(
            `Invalid element ID: "${elementId}" appears to be coordinate bounds. ` +
            `Please provide a proper Android resource ID (e.g., "com.example.app:id/button") ` +
            `or use swipeOnScreen with coordinates instead.`
          );
        }
    
        // Check if elementId is empty or just whitespace
        if (!elementId || elementId.trim().length === 0) {
          throw new ActionableError("Element ID cannot be empty. Please provide a valid Android resource ID.");
        }
    
        const observeScreen = new ObserveScreen(device);
        const swipeOnElement = new SwipeOnElement(device);
    
        // First observe to find the element
        const observeResult = await observeScreen.execute();
        if (!observeResult.viewHierarchy) {
          throw new ActionableError("Could not get view hierarchy to find element for swipe.");
        }
    
        const element = elementUtils.findElementByResourceId(
          observeResult.viewHierarchy,
          elementId,
          args.containerElementId, // Search within the specific container
          true // partial match
        );
    
        if (!element) {
          // Provide helpful suggestions
          const allResourceIds: string[] = [];
          const rootNodes = elementUtils.extractRootNodes(observeResult.viewHierarchy);
    
          for (const rootNode of rootNodes) {
            elementUtils.traverseNode(rootNode, (node: any) => {
              const nodeProperties = elementUtils.extractNodeProperties(node);
              if (nodeProperties["resource-id"] && nodeProperties["resource-id"].trim()) {
                allResourceIds.push(nodeProperties["resource-id"]);
              }
            });
          }
    
          const uniqueResourceIds = [...new Set(allResourceIds)].slice(0, 5); // Show first 5 unique IDs
          const suggestion = uniqueResourceIds.length > 0
            ? ` Available resource IDs include: ${uniqueResourceIds.join(", ")}`
            : " No elements with resource IDs found on current screen.";
    
          throw new ActionableError(
            `Element not found with ID "${elementId}".${suggestion} ` +
            `Use the 'observe' command to see the current view hierarchy and find valid element IDs.`
          );
        }
    
        const result = await swipeOnElement.execute(
          element,
          args.direction,
          { duration: args.duration ?? 100 },
          progress
        );
    
        return createJSONToolResponse({
          message: `Swiped ${args.direction} on element with ID "${elementId}"`,
          observation: result.observation
        });
      } catch (error) {
        throw new ActionableError(`Failed to swipe on element: ${error}`);
      }
    };
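The two pre-flight checks in the handler above, and the way it builds its "not found" suggestion, can be isolated as small pure functions. The sketch below is a standalone illustration; `checkElementId` and `suggestResourceIds` are hypothetical names that do not appear in the server's code, and the returned messages are shortened.

```typescript
// Returns a short reason when the ID is unusable, or null when it looks usable.
function checkElementId(elementId: string): string | null {
  // Same pattern as the handler: matches bounds strings such as "[0,100][1080,400]".
  const boundsPattern = /^\[\d+,\d+\]\[\d+,\d+\]$/;
  if (boundsPattern.test(elementId)) {
    return "looks like coordinate bounds; use swipeOnScreen with coordinates instead";
  }
  if (elementId.trim().length === 0) {
    return "element ID cannot be empty";
  }
  return null;
}

// Returns the first `limit` distinct, non-blank resource IDs in encounter order,
// mirroring how the handler picks the IDs it lists in its error message.
function suggestResourceIds(allIds: string[], limit: number = 5): string[] {
  const nonBlank = allIds.filter((id) => id.trim().length > 0);
  return nonBlank
    .filter((id, index) => nonBlank.indexOf(id) === index)
    .slice(0, limit);
}
```

A bounds-shaped string is rejected before any screen observation happens, so that failure is cheap; the suggestion list is only built after the view hierarchy has been fetched and the lookup has failed.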
  • Core SwipeOnElement class extending BaseVisualChange, containing the execute method that performs the actual swipe gesture using calculated coordinates within element bounds.
    export class SwipeOnElement extends BaseVisualChange {
      private executeGesture: ExecuteGesture;
      private elementUtils: ElementUtils;
    
      constructor(device: BootedDevice, adb: AdbUtils | null = null, axe: Axe | null = null) {
        super(device, adb, axe);
        this.executeGesture = new ExecuteGesture(device, adb);
        this.elementUtils = new ElementUtils();
      }
    
      /**
       * Swipe on a specific element in a given direction
       * @param element - The element to swipe on
       * @param direction - Direction to swipe ('up', 'down', 'left', 'right')
       * @param options - Additional gesture options
       * @param progress - Optional progress callback
       * @returns Result of the swipe operation
       */
      async execute(
        element: Element,
        direction: "up" | "down" | "left" | "right",
        options: GestureOptions = {},
        progress?: ProgressCallback
      ): Promise<SwipeResult> {
        logger.info(`[SwipeOnElement] Starting swipe: direction=${direction}, platform=${this.device.platform}`);
        logger.info(`[SwipeOnElement] Element bounds: ${JSON.stringify(element.bounds)}`);
        logger.info(`[SwipeOnElement] Options: ${JSON.stringify(options)}`);
    
        return this.observedInteraction(
          async () => {
            logger.info(`[SwipeOnElement] In observedInteraction callback`);
    
            const { startX, startY, endX, endY } = this.elementUtils.getSwipeWithinBounds(
              direction,
              element.bounds
            );
    
            logger.info(`[SwipeOnElement] Raw swipe coordinates: start=(${startX}, ${startY}), end=(${endX}, ${endY})`);
    
            const flooredStartX = Math.floor(startX);
            const flooredStartY = Math.floor(startY);
            const flooredEndX = Math.floor(endX);
            const flooredEndY = Math.floor(endY);
    
            logger.info(`[SwipeOnElement] Floored swipe coordinates: start=(${flooredStartX}, ${flooredStartY}), end=(${flooredEndX}, ${flooredEndY})`);
    
            try {
              const result = await this.executeGesture.swipe(
                flooredStartX,
                flooredStartY,
                flooredEndX,
                flooredEndY,
                options
              );
              logger.info(`[SwipeOnElement] Swipe completed successfully: ${JSON.stringify(result)}`);
              return result;
            } catch (error) {
              logger.error(`[SwipeOnElement] Swipe execution failed: ${error}`);
              throw error;
            }
          },
          {
            changeExpected: false,
            timeoutMs: 500,
            progress
          }
        );
      }
    }
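The listing does not show `getSwipeWithinBounds`, so the following is only a guess at the kind of calculation it performs, followed by the same flooring step that `execute` applies. Everything here is an assumption made for illustration: the `Bounds` shape (`left`, `top`, `right`, `bottom`), the 25% inset from the element's edges, and the convention that a swipe "up" moves the finger from the lower part of the element to the upper part.

```typescript
// Assumed shape of element bounds; the real Element type is not shown in the listing.
interface Bounds { left: number; top: number; right: number; bottom: number; }
interface SwipeCoords { startX: number; startY: number; endX: number; endY: number; }

// Hypothetical stand-in for elementUtils.getSwipeWithinBounds: swipe along the
// element's center line, staying `inset` (a fraction of its size) inside each edge.
function swipeWithinBounds(
  direction: "up" | "down" | "left" | "right",
  bounds: Bounds,
  inset: number = 0.25
): SwipeCoords {
  const width = bounds.right - bounds.left;
  const height = bounds.bottom - bounds.top;
  const centerX = (bounds.left + bounds.right) / 2;
  const centerY = (bounds.top + bounds.bottom) / 2;
  const nearX = bounds.left + width * inset;   // point near the left edge
  const farX = bounds.right - width * inset;   // point near the right edge
  const nearY = bounds.top + height * inset;   // point near the top edge
  const farY = bounds.bottom - height * inset; // point near the bottom edge

  switch (direction) {
    case "up":
      return { startX: centerX, startY: farY, endX: centerX, endY: nearY };
    case "down":
      return { startX: centerX, startY: nearY, endX: centerX, endY: farY };
    case "left":
      return { startX: farX, startY: centerY, endX: nearX, endY: centerY };
    case "right":
      return { startX: nearX, startY: centerY, endX: farX, endY: centerY };
  }
}

// Same rounding step as in execute(): gestures are issued with integer pixel coordinates.
function floorCoords(c: SwipeCoords): SwipeCoords {
  return {
    startX: Math.floor(c.startX),
    startY: Math.floor(c.startY),
    endX: Math.floor(c.endX),
    endY: Math.floor(c.endY),
  };
}
```

Keeping both endpoints inside the element's bounds means the gesture should start and end on the target element rather than on a neighbor, which is presumably why the class derives its coordinates from `element.bounds` instead of from screen dimensions.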

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It states the action ('swipe') but doesn't disclose behavioral traits such as whether it requires prior element identification, what happens on failure (e.g., if element not found), or any side effects (e.g., UI changes). This is a significant gap for a mutation tool with zero annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence with zero waste. It's front-loaded and appropriately sized for its content, though it could be more informative. Every word earns its place by stating the core action concisely.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations, no output schema, and a mutation tool with 5 parameters, the description is incomplete. It lacks behavioral context (e.g., success/failure outcomes), usage guidelines, and doesn't compensate for the absence of structured fields, making it inadequate for reliable agent invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 5 parameters with descriptions and enums. The description adds no meaning beyond what the schema provides, such as explaining parameter interactions or usage examples. Baseline 3 is appropriate when schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Swipe on a specific element' states a clear verb ('swipe') and target ('specific element'), but it's vague about what constitutes an 'element' (e.g., UI component) and doesn't distinguish it from sibling tools like 'swipe' or 'swipeOnScreen'. It provides basic purpose but lacks specificity for differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like 'swipe' or 'swipeOnScreen'. The description implies usage for swiping on elements, but there's no explicit context, exclusions, or prerequisites mentioned, leaving the agent to infer based on tool names alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/zillow/auto-mobile'
