input
Perform touch gestures (tap, double-tap, long press, swipe) and text/keyboard inputs to automate Android and iOS devices.
Instructions
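Input actions. tap, double_tap, and long_press take coordinates (x, y) or an element selector (text, resourceId, label, or index). swipe takes a direction or start and end coordinates (x1, y1, x2, y2). text types the given text. key presses the named key.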
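As a rough illustration of the actions listed above, the following sketch builds one argument object per action. The parameter names are taken from the input schema; the concrete values (coordinates, strings, the resource ID, and the `"up"` direction) are assumptions chosen for the example, not values confirmed by the tool.

```python
import json

# Example argument objects for the `input` tool, one or two per action.
# Parameter names come from the schema; every value below is illustrative.
EXAMPLE_CALLS = [
    {"action": "tap", "x": 540, "y": 1200},                   # tap by coordinates
    {"action": "tap", "text": "Sign in"},                     # tap by element text
    {"action": "double_tap", "resourceId": "com.example:id/photo"},  # Android only
    {"action": "long_press", "label": "Message cell"},        # iOS only
    {"action": "swipe", "direction": "up"},                   # "up" is an assumed value
    {"action": "swipe", "x1": 540, "y1": 1600, "x2": 540, "y2": 400},  # custom swipe
    {"action": "text", "text": "hello world"},                # type text
    {"action": "key", "key": "BACK"},                         # press a key
]

# Print each call as the JSON object a client would send as tool arguments.
for call in EXAMPLE_CALLS:
    print(json.dumps(call))
```

Note that `text` plays two roles: it selects an element for tap-style actions and carries the string to type for the text action.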
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| action | Yes | Action to perform: tap, double_tap, long_press, swipe, text, or key | |
| x | No | X coordinate | |
| y | No | Y coordinate | |
| x1 | No | Start X (for custom swipe) | |
| y1 | No | Start Y (for custom swipe) | |
| x2 | No | End X (for custom swipe) | |
| y2 | No | End Y (for custom swipe) | |
| text | No | Element text (tap) or text to type (text action) | |
| resourceId | No | Find element by resource ID (Android only) | |
| label | No | Find element by accessibility label (iOS only) | |
| index | No | Tap element by index from ui_tree output (Android only) | |
| key | No | Key name: BACK, HOME, ENTER, TAB, DELETE, etc. | |
| direction | No | Swipe direction | |
| duration | No | Duration in ms (long_press and swipe) | 1000 (long_press), 300 (swipe) |
| interval | No | Delay between taps in ms (double_tap only) | 100 |
| targetPid | No | Desktop only: PID of target process | |
| hints | No | Return hints about what changed after the action; set to false to disable | true |
| platform | No | Target platform. If not specified, uses the active target. | |
| deviceId | No | Target device ID for multi-device setups. If omitted, uses the active device. | |
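The table can be read as a small set of rules about which parameters go with which action. The sketch below is a hypothetical client-side helper, `build_input_args`, that checks those rules before a call is sent and writes out the documented defaults explicitly. The helper and its validation logic are assumptions made for illustration; the tool itself may validate differently and presumably applies its own defaults.

```python
from typing import Any

# Defaults quoted in the schema table.
DEFAULT_DURATION_MS = {"long_press": 1000, "swipe": 300}
DEFAULT_INTERVAL_MS = 100
SWIPE_COORDS = ("x1", "y1", "x2", "y2")
ELEMENT_SELECTORS = ("text", "resourceId", "label", "index")


def build_input_args(action: str, **params: Any) -> dict[str, Any]:
    """Hypothetical helper: validate targeting and spell out documented defaults."""
    args: dict[str, Any] = {"action": action, **params}

    if action in ("tap", "double_tap", "long_press"):
        # Tap-style actions need coordinates or an element selector.
        has_coords = "x" in args and "y" in args
        has_selector = any(name in args for name in ELEMENT_SELECTORS)
        if not (has_coords or has_selector):
            raise ValueError(f"{action} needs x/y or one of {ELEMENT_SELECTORS}")
    elif action == "swipe":
        # A swipe is either directional or fully specified by four coordinates.
        has_custom = all(name in args for name in SWIPE_COORDS)
        if not ("direction" in args or has_custom):
            raise ValueError("swipe needs a direction or x1, y1, x2, y2")
    elif action == "text":
        if "text" not in args:
            raise ValueError("text action needs the text to type")
    elif action == "key":
        if "key" not in args:
            raise ValueError("key action needs a key name")
    else:
        raise ValueError(f"unknown action: {action}")

    # Fill in documented defaults only where the caller gave no value.
    if action in DEFAULT_DURATION_MS:
        args.setdefault("duration", DEFAULT_DURATION_MS[action])
    if action == "double_tap":
        args.setdefault("interval", DEFAULT_INTERVAL_MS)
    args.setdefault("hints", True)
    return args
```

For example, `build_input_args("long_press", x=10, y=20)` returns the coordinates plus `duration` set to 1000 and `hints` set to `True`, while `build_input_args("swipe", x1=1)` raises `ValueError` because the custom swipe is incomplete.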