input
Tap, double-tap, long-press, swipe, type text, or press keys on Android, iOS, desktop, or browser targets, locating elements by coordinates, element text, resource ID, or accessibility label.
Instructions
Input actions. tap/double_tap/long_press: target by coordinates (x, y) or by text, resourceId, label, or index. swipe: give a direction or start/end coordinates (x1, y1, x2, y2). text: type the given text. key: press the named key.
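As a rough illustration of the action/argument pairings described above, the sketch below builds one argument object per action. The field names come from the Input Schema table; the coordinate values, the resource ID, and the idea that arguments travel as a JSON object are assumptions made for the example, not something this page confirms.

```python
import json

# Hypothetical argument payloads for the `input` tool. Field names follow the
# schema table; all concrete values are made up for illustration.
tap_by_coords = {"action": "tap", "x": 540, "y": 1200}
tap_by_text = {"action": "tap", "text": "Sign in"}
long_press_by_id = {
    "action": "long_press",
    "resourceId": "com.example:id/item",  # assumed ID, Android only
    "duration": 1500,                      # ms, overrides the 1000 ms default
}
swipe_custom = {"action": "swipe", "x1": 540, "y1": 1600, "x2": 540, "y2": 400}
type_text = {"action": "text", "text": "hello world"}
press_key = {"action": "key", "key": "ENTER"}

EXAMPLES = [
    tap_by_coords,
    tap_by_text,
    long_press_by_id,
    swipe_custom,
    type_text,
    press_key,
]


def to_json(args: dict) -> str:
    """Serialize one argument object, assuming the client sends JSON."""
    return json.dumps(args, sort_keys=True)


if __name__ == "__main__":
    for example in EXAMPLES:
        print(to_json(example))
```

Note that `text` plays two roles: with a tap it selects an element by its visible text, and with the `text` action it is the string to type.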
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| action | Yes | Action to perform: tap, double_tap, long_press, swipe, text, or key | |
| x | No | X coordinate | |
| y | No | Y coordinate | |
| x1 | No | Start X (for custom swipe) | |
| y1 | No | Start Y (for custom swipe) | |
| x2 | No | End X (for custom swipe) | |
| y2 | No | End Y (for custom swipe) | |
| text | No | Element text (tap) or text to type (text action) | |
| resourceId | No | Find element by resource ID (Android only) | |
| label | No | Find element by accessibility label (iOS only) | |
| index | No | Tap element by index from ui_tree output (Android only) | |
| key | No | Key name: BACK, HOME, ENTER, TAB, DELETE, etc. | |
| direction | No | Swipe direction | |
| duration | No | Duration in ms (long_press or swipe) | 1000 (long_press), 300 (swipe) |
| interval | No | Delay between taps in ms for double_tap | 100 |
| targetPid | No | Desktop only: PID of target process | |
| hints | No | Return hints about what changed after the action; set false to disable | true |
| platform | No | Target platform. If not specified, uses the active target. | |
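The table leaves the per-action requirements implicit, so a client may want to check arguments before sending them. The following is a minimal sketch of such a check, not the tool's own validation: `validate_input_args` is a hypothetical helper, the rules are inferred from the Instructions text, and the defaults simply mirror the values documented in the table (the tool presumably applies them on its own when they are omitted).

```python
from typing import Any, Dict

ACTIONS = {"tap", "double_tap", "long_press", "swipe", "text", "key"}
TAP_LIKE = {"tap", "double_tap", "long_press"}
SELECTORS = ("text", "resourceId", "label", "index")


def validate_input_args(args: Dict[str, Any]) -> Dict[str, Any]:
    """Check one argument object and return a copy with documented defaults.

    Hypothetical client-side helper; rules are inferred from the docs.
    """
    action = args.get("action")
    if action not in ACTIONS:
        raise ValueError(f"unknown or missing action: {action!r}")

    if action in TAP_LIKE:
        has_coords = "x" in args and "y" in args
        has_selector = any(name in args for name in SELECTORS)
        if not (has_coords or has_selector):
            raise ValueError(f"{action} needs x/y or one of {SELECTORS}")
    elif action == "swipe":
        has_custom = all(name in args for name in ("x1", "y1", "x2", "y2"))
        if "direction" not in args and not has_custom:
            raise ValueError("swipe needs direction or x1/y1/x2/y2")
    elif action == "text":
        if "text" not in args:
            raise ValueError("text action needs text")
    elif action == "key":
        if "key" not in args:
            raise ValueError("key action needs key")

    out = dict(args)  # never mutate the caller's dict
    if action == "long_press":
        out.setdefault("duration", 1000)  # ms, documented default
    elif action == "swipe":
        out.setdefault("duration", 300)   # ms, documented default
    elif action == "double_tap":
        out.setdefault("interval", 100)   # ms between taps
    out.setdefault("hints", True)
    return out
```

For example, `validate_input_args({"action": "swipe", "direction": "up"})` would fill in `duration` and `hints`; the value `"up"` is itself an assumption, since the accepted direction names are not listed here.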