swipe
Scroll in a specified direction and speed within a container element on Android or iOS devices, while optionally searching for a specific element or text. Supports up, down, left, and right movements.
Instructions
Unified scroll command supporting direction and speed (no index support due to reliability)
Input Schema
TableJSON Schema
| Name | Required | Description | Default |
|---|---|---|---|
| containerElementId | Yes | Element ID to scroll until visible | |
| direction | Yes | Scroll direction | |
| lookFor | No | What we're searching for while scrolling | |
| platform | Yes | Platform of the device | |
| speed | No | Scroll speed |
Implementation Reference
- src/server/interactionTools.ts:762-768 (registration)Registration of the 'swipe' tool in the ToolRegistry, using the scrollSchema for input validation and scrollHandler as the execution function.ToolRegistry.registerDeviceAware( "shake", "Shake the device", shakeSchema, shakeHandler, true // Supports progress notifications
- src/server/interactionTools.ts:430-545 (handler)The scrollHandler function, which serves as the handler for the 'swipe' tool. It performs scrolling/swiping on a container element, optionally until a target element is found, by using SwipeOnElement internally.const scrollHandler = async (device: BootedDevice, args: ScrollArgs, progress?: ProgressCallback) => { // Element-specific scrolling const observeScreen = new ObserveScreen(device); const swipe = new SwipeOnElement(device); const observeResult = await observeScreen.execute(); if (!observeResult.viewHierarchy) { throw new ActionableError("Could not get view hierarchy for element scrolling"); } // Find the element by resource ID const element = elementUtils.findElementByResourceId( observeResult.viewHierarchy, args.containerElementId, args.containerElementId, true // partial match ); if (!element) { throw new ActionableError(`Container element not found with ID: ${args.containerElementId}`); } const containerElement = element; if (!args.lookFor) { const duration = elementUtils.getSwipeDurationFromSpeed(args.speed); const result = await swipe.execute( containerElement, elementUtils.getSwipeDirectionForScroll(args.direction), { duration: duration, easing: "accelerateDecelerate", fingers: 1, randomize: false, lift: true, pressure: 1 }, progress ); return createJSONToolResponse({ message: `Scrolled ${args.direction} within element ${args.containerElementId}`, observation: result.observation }); } else if (!args.lookFor.text && !args.lookFor.elementId) { throw new ActionableError("Either text or element id must be specified to look for something in a scrollable list."); } else { let lastObservation = await observeScreen.execute(); if (!lastObservation.viewHierarchy || !lastObservation.screenSize) { throw new Error("Failed to get initial observation for scrolling until visible."); } const direction = args.direction; const maxTime = 120000; // args.lookFor.maxTime ?? 120000; const startTime = Date.now(); let foundElement = null; while (Date.now() - startTime < maxTime) { // Re-observe the screen to get current state lastObservation = await observeScreen.execute(); if (!lastObservation.viewHierarchy) { throw new Error("Lost observation during scroll until visible."); } // Check if target element is now visible if (args.lookFor.text) { foundElement = elementUtils.findElementByText( lastObservation.viewHierarchy, args.lookFor.text, args.containerElementId, // Search within the specific container true, // fuzzy match false // case-sensitive ); } else if (args.lookFor.elementId) { foundElement = elementUtils.findElementByResourceId( lastObservation.viewHierarchy, args.lookFor.elementId, args.containerElementId, // Search within the specific container true // partial match ); } if (foundElement) { logger.info(`Found element after scrolling for ${Date.now() - startTime}ms.`); break; } // Use the specific container element to swipe, not any scrollable element const result = await swipe.execute( containerElement, elementUtils.getSwipeDirectionForScroll(direction), { duration: 600 }, progress ); // Update observation from swipe result if (result.observation && result.observation.viewHierarchy) { lastObservation = result.observation; } else { throw new Error("Lost observation after swipe during scroll until visible."); } } if (!foundElement) { const target = args.lookFor.text ? `text "${args.lookFor.text}"` : `element with id "${args.lookFor.elementId}"`; throw new ActionableError(`${target} not found after scrolling for ${maxTime}ms.`); } const target = args.lookFor.text ? `text "${args.lookFor.text}"` : `element with id "${args.lookFor.elementId}"`; return createJSONToolResponse({ message: `Scrolled until ${target} became visible`, found: !!foundElement, observation: lastObservation }); } };
- The scrollSchema defines the input parameters for the 'swipe' tool, including containerElementId, direction, optional lookFor target, speed, and platform.export const scrollSchema = z.object({ containerElementId: z.string().describe("Element ID to scroll until visible"), direction: z.enum(["up", "down", "left", "right"]).describe("Scroll direction"), lookFor: z.object({ elementId: z.string().optional().describe("ID of the element to look for while scrolling"), text: z.string().optional().describe("Optional text to look for while scrolling"), maxTime: z.number().optional().describe("Maximum amount of time to spend scrolling, (default 10 seconds)") }).optional().describe("What we're searching for while scrolling"), speed: z.enum(["slow", "normal", "fast"]).optional().describe("Scroll speed"), platform: z.enum(["android", "ios"]).describe("Platform of the device") });