zillow
by zillow

swipeOnScreen

Perform automated screen swipes in specified directions for mobile app testing on Android and iOS devices.

Instructions

Swipe on screen in a specific direction

Input Schema

Name                 Required  Description                                         Default
direction            Yes       Direction to swipe
includeSystemInsets  Yes       Whether to include system inset areas in the swipe
duration             Yes       Duration of the swipe in milliseconds
platform             Yes       Platform of the device
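
To make the input schema concrete, here is a minimal sketch of a well-formed argument object and a hand-written check that approximates the schema's rules. The names `SwipeOnScreenArgs` fields come from the table; `isSwipeOnScreenArgs` and `exampleArgs` are hypothetical helpers introduced only for illustration, and the server itself validates with the Zod schema shown in the Implementation Reference rather than with this function.

```typescript
// Hypothetical stand-in for the tool's Zod schema, written as a plain type
// guard so the example runs without external dependencies.
type Direction = "up" | "down" | "left" | "right";
type Platform = "android" | "ios";

interface SwipeOnScreenArgs {
  direction: Direction;
  includeSystemInsets: boolean;
  duration: number; // milliseconds
  platform: Platform;
}

const DIRECTIONS: readonly string[] = ["up", "down", "left", "right"];
const PLATFORMS: readonly string[] = ["android", "ios"];

function isSwipeOnScreenArgs(value: unknown): value is SwipeOnScreenArgs {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.direction === "string" && DIRECTIONS.includes(v.direction) &&
    typeof v.includeSystemInsets === "boolean" &&
    typeof v.duration === "number" && !Number.isNaN(v.duration) &&
    typeof v.platform === "string" && PLATFORMS.includes(v.platform)
  );
}

// All four fields are marked as required in the table above.
const exampleArgs = {
  direction: "up",
  includeSystemInsets: false,
  duration: 300,
  platform: "android"
};
```

A client could run a check like this before calling the tool, so that a missing field or an unsupported direction fails locally instead of on the device.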

Implementation Reference

  • The main handler function registered for the 'swipeOnScreen' tool. It instantiates the SwipeOnScreen class and calls its execute method with the parsed arguments.
    const swipeOnScreenHandler = async (device: BootedDevice, args: SwipeOnScreenArgs, progress?: ProgressCallback) => {
      try {
        const swipeOnScreen = new SwipeOnScreen(device);
    
        const result = await swipeOnScreen.execute(
          args.direction,
          {
            duration: args.duration ?? 100,
            includeSystemInsets: args.includeSystemInsets
          },
          progress
        );
    
        return createJSONToolResponse({
          message: `Swiped ${args.direction} on screen${args.includeSystemInsets ? " including navigation areas" : ""}`,
          observation: result.observation,
          ...result
        });
      } catch (error) {
        throw new ActionableError(`Failed to swipe on screen: ${error}`);
      }
    };
  • Zod schema defining the input parameters for the swipeOnScreen tool.
    export const swipeOnScreenSchema = z.object({
      direction: z.enum(["up", "down", "left", "right"]).describe("Direction to swipe"),
      includeSystemInsets: z.boolean().describe("Whether to include system inset areas in the swipe"),
      duration: z.number().describe("Duration of the swipe in milliseconds"),
      platform: z.enum(["android", "ios"]).describe("Platform of the device")
    });
  • Registration of the swipeOnScreen tool in the ToolRegistry.
    ToolRegistry.registerDeviceAware(
      "swipeOnScreen",
      "Swipe on screen in a specific direction",
      swipeOnScreenSchema,
      swipeOnScreenHandler,
      true // Supports progress notifications
    );
  • Core implementation class containing the execute method that performs the swipe gesture, respecting screen size and insets for both Android and iOS.
    export class SwipeOnScreen extends BaseVisualChange {
      private executeGesture: ExecuteGesture;
      private elementUtils: ElementUtils;
    
      constructor(device: BootedDevice, adb: AdbUtils | null = null, axe: Axe | null = null) {
        super(device, adb, axe);
        this.executeGesture = new ExecuteGesture(device, adb);
        this.elementUtils = new ElementUtils();
      }
    
      /**
       * Swipe on screen in a given direction
       * @param observeResult - Previous ObserveResult
       * @param direction - Direction to swipe ('up', 'down', 'left', 'right')
       * @param options - Additional gesture options
       * @param progress - Optional progress callback
       * @returns Result of the swipe operation
       */
      async executeAndroid(
        observeResult: ObserveResult,
        direction: "up" | "down" | "left" | "right",
        options: GestureOptions = {},
        progress?: ProgressCallback
      ): Promise<SwipeResult> {
        logger.info(`[SwipeOnScreen] In observedInteraction callback`);
    
        if (!observeResult.screenSize) {
          logger.error(`[SwipeOnScreen] No screen size available in observeResult`);
          throw new ActionableError("Could not determine screen size");
        }
    
        const screenWidth = observeResult.screenSize.width;
        const screenHeight = observeResult.screenSize.height;
        const insets = observeResult.systemInsets || { top: 0, right: 0, bottom: 0, left: 0 };
    
        logger.info(`[SwipeOnScreen] Screen dimensions: ${screenWidth}x${screenHeight}`);
        logger.info(`[SwipeOnScreen] System insets: ${JSON.stringify(insets)}`);
    
        // Calculate the bounds based on system insets
        const bounds = (options.includeSystemInsets === true)
          ? {
            left: 0,
            top: 0,
            right: screenWidth,
            bottom: screenHeight
          }
          : {
            left: insets.left,
            top: insets.top,
            right: screenWidth - insets.right,
            bottom: screenHeight - insets.bottom
          };
    
        logger.info(`[SwipeOnScreen] Calculated bounds: ${JSON.stringify(bounds)}`);
    
        const { startX, startY, endX, endY } = this.elementUtils.getSwipeWithinBounds(
          direction,
          bounds
        );
    
        logger.info(`[SwipeOnScreen] Raw swipe coordinates: start=(${startX}, ${startY}), end=(${endX}, ${endY})`);
    
        const flooredStartX = Math.floor(startX);
        const flooredStartY = Math.floor(startY);
        const flooredEndX = Math.floor(endX);
        const flooredEndY = Math.floor(endY);
    
        logger.info(`[SwipeOnScreen] Floored swipe coordinates: start=(${flooredStartX}, ${flooredStartY}), end=(${flooredEndX}, ${flooredEndY})`);
    
        try {
          const result = await this.executeGesture.swipe(
            flooredStartX,
            flooredStartY,
            flooredEndX,
            flooredEndY,
            options
          );
          logger.info(`[SwipeOnScreen] Swipe completed successfully: ${JSON.stringify(result)}`);
          return result;
        } catch (error) {
          logger.error(`[SwipeOnScreen] Swipe execution failed: ${error}`);
          throw error;
        }
      }
      /**
       * Swipe on screen in a given direction
       * @param observeResult - Previous ObserveResult
       * @param direction - Direction to swipe ('up', 'down', 'left', 'right')
       * @param options - Additional gesture options
       * @param progress - Optional progress callback
       * @returns Result of the swipe operation
       */
      async executeiOS(
        observeResult: ObserveResult,
        direction: "up" | "down" | "left" | "right",
        options: GestureOptions = {},
        progress?: ProgressCallback
      ): Promise<SwipeResult> {
        logger.info(`[SwipeOnScreen] In observedInteraction callback for iOS`);
    
        if (!observeResult.screenSize) {
          logger.error(`[SwipeOnScreen] No screen size available in observeResult`);
          throw new ActionableError("Could not determine screen size");
        }
    
        const screenWidth = observeResult.screenSize.width;
        const screenHeight = observeResult.screenSize.height;
        const insets = observeResult.systemInsets || { top: 0, right: 0, bottom: 0, left: 0 };
    
        logger.info(`[SwipeOnScreen] Screen dimensions: ${screenWidth}x${screenHeight}`);
        logger.info(`[SwipeOnScreen] System insets: ${JSON.stringify(insets)}`);
    
        // Calculate the bounds based on system insets
        const bounds = (options.includeSystemInsets === true)
          ? {
            left: 0,
            top: 0,
            right: screenWidth,
            bottom: screenHeight
          }
          : {
            left: insets.left,
            top: insets.top,
            right: screenWidth - insets.right,
            bottom: screenHeight - insets.bottom
          };
    
        logger.info(`[SwipeOnScreen] Calculated bounds: ${JSON.stringify(bounds)}`);
    
        const { startX, startY, endX, endY } = this.elementUtils.getSwipeWithinBounds(
          direction,
          bounds
        );
    
        logger.info(`[SwipeOnScreen] Raw swipe coordinates: start=(${startX}, ${startY}), end=(${endX}, ${endY})`);
    
        // Ensure coordinates are bounded by screen size and always positive
        const boundedStartX = Math.max(0, Math.min(Math.floor(startX), screenWidth - 1));
        const boundedStartY = Math.max(0, Math.min(Math.floor(startY), screenHeight - 1));
        const boundedEndX = Math.max(0, Math.min(Math.floor(endX), screenWidth - 1));
        const boundedEndY = Math.max(0, Math.min(Math.floor(endY), screenHeight - 1));
    
        logger.info(`[SwipeOnScreen] Bounded swipe coordinates: start=(${boundedStartX}, ${boundedStartY}), end=(${boundedEndX}, ${boundedEndY})`);
    
        try {
          const result = await this.axe.swipe(
            boundedStartX,
            boundedStartY,
            boundedEndX,
            boundedEndY
          );
          logger.info(`[SwipeOnScreen] Swipe completed successfully: ${JSON.stringify(result)}`);
          return result;
        } catch (error) {
          logger.error(`[SwipeOnScreen] Swipe execution failed: ${error}`);
          throw error;
        }
      }
      /**
       * Swipe on screen in a given direction
       * @param direction - Direction to swipe ('up', 'down', 'left', 'right')
       * @param options - Additional gesture options
       * @param progress - Optional progress callback
       * @returns Result of the swipe operation
       */
      async execute(
        direction: "up" | "down" | "left" | "right",
        options: GestureOptions = {},
        progress?: ProgressCallback
      ): Promise<SwipeResult> {
        logger.info(`[SwipeOnScreen] Starting swipe: direction=${direction}, platform=${this.device.platform}`);
        logger.info(`[SwipeOnScreen] Options: ${JSON.stringify(options)}`);
    
        return this.observedInteraction(
          async (observeResult: ObserveResult) => {
            logger.info(`[SwipeOnScreen] In observedInteraction callback`);
    
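            switch (this.device.platform) {
              case "android":
                return this.executeAndroid(observeResult, direction, options, progress);
              case "ios":
                return this.executeiOS(observeResult, direction, options, progress);
              default:
                // Fail loudly instead of silently returning undefined for an unknown platform
                throw new ActionableError(`Unsupported platform: ${this.device.platform}`);
            }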
          }, {
            changeExpected: false,
            timeoutMs: 500,
            progress
          }
        );
      }
    }
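
The bounds and coordinate handling in the SwipeOnScreen class above can be sketched in isolation. `computeSwipeBounds` and `clampToScreen` mirror the logic visible in `executeAndroid` and `executeiOS`; `swipeWithinBounds` is a hypothetical stand-in for `ElementUtils.getSwipeWithinBounds`, whose implementation is not shown here, and its 25%/75% margins are an assumption made purely for illustration.

```typescript
type Direction = "up" | "down" | "left" | "right";
interface Size { width: number; height: number; }
interface Insets { top: number; right: number; bottom: number; left: number; }
interface Bounds { left: number; top: number; right: number; bottom: number; }
interface SwipeCoords { startX: number; startY: number; endX: number; endY: number; }

// Mirrors the source: use the full screen only when includeSystemInsets is
// strictly true; otherwise shrink the area by the system insets.
function computeSwipeBounds(screen: Size, insets: Insets, includeSystemInsets?: boolean): Bounds {
  return includeSystemInsets === true
    ? { left: 0, top: 0, right: screen.width, bottom: screen.height }
    : {
        left: insets.left,
        top: insets.top,
        right: screen.width - insets.right,
        bottom: screen.height - insets.bottom
      };
}

// ASSUMPTION: the real getSwipeWithinBounds is not shown in the source.
// This version swipes along the center line between 25% and 75% of the bounds.
function swipeWithinBounds(direction: Direction, b: Bounds): SwipeCoords {
  const width = b.right - b.left;
  const height = b.bottom - b.top;
  const centerX = b.left + width / 2;
  const centerY = b.top + height / 2;
  const nearX = b.left + width * 0.25;
  const farX = b.left + width * 0.75;
  const nearY = b.top + height * 0.25;
  const farY = b.top + height * 0.75;
  switch (direction) {
    case "up":    return { startX: centerX, startY: farY, endX: centerX, endY: nearY };
    case "down":  return { startX: centerX, startY: nearY, endX: centerX, endY: farY };
    case "left":  return { startX: farX, startY: centerY, endX: nearX, endY: centerY };
    case "right": return { startX: nearX, startY: centerY, endX: farX, endY: centerY };
  }
}

// Mirrors the iOS path: floor the value, then clamp it into [0, max - 1].
function clampToScreen(value: number, max: number): number {
  return Math.max(0, Math.min(Math.floor(value), max - 1));
}

// Example values (invented): a 1080x2400 screen with a status bar and nav bar.
const screen: Size = { width: 1080, height: 2400 };
const insets: Insets = { top: 100, right: 0, bottom: 120, left: 0 };
const safeBounds = computeSwipeBounds(screen, insets, false);
const fullBounds = computeSwipeBounds(screen, insets, true);
const upSwipe = swipeWithinBounds("up", safeBounds);
```

In the source, the Android path only floors the coordinates, while the iOS path also clamps them to the screen, which is why the clamp is kept as a separate helper in this sketch.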
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions swiping in a direction but doesn't explain effects like whether it simulates user input, requires device permissions, has side effects (e.g., changing app state), or handles errors. For a tool with 4 required parameters and no annotations, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence with zero waste. It's front-loaded with the core action ('Swipe on screen') and specifies the key constraint ('in a specific direction'), making it easy to parse quickly. Every word earns its place without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of a 4-parameter tool with no annotations and no output schema, the description is incomplete. It doesn't cover behavioral aspects like what the tool returns, error conditions, or prerequisites (e.g., device state). For a tool that likely interacts with device interfaces, more context is needed to guide effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 100%, with clear descriptions for all parameters (e.g., 'direction' as direction to swipe, 'duration' in milliseconds). The description adds no additional meaning beyond the schema, such as explaining how parameters interact or default behaviors. Since the schema does the heavy lifting, the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Swipe on screen in a specific direction' clearly states the action (swipe) and target (screen), but it's vague about the scope and context. It doesn't specify whether this is for UI testing, device control, or another use case, and it doesn't distinguish itself from sibling tools like 'swipe' or 'swipeOnElement'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. With sibling tools like 'swipe' and 'swipeOnElement' available, there's no indication of how this tool differs—such as whether it's for general screen swipes, specific to certain apps, or requires particular device states. This leaves the agent without clear usage context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/zillow/auto-mobile'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.