run_task
Automate device UI tasks by specifying an end goal; the agent handles app launching, screen interactions, and multi-step workflows. Returns a task ID for status polling.
Instructions
Perform UI automation on a device. The AI agent automatically handles app launching, screen interactions (tapping, swiping, typing, navigating), and multi-step workflows.
Use this tool for:
- Opening apps and performing tasks: 'open Chrome and search for weather'
- In-app interactions: tapping, swiping, typing, navigating
- Content engagement: liking, commenting, sharing, following, subscribing
- Information gathering: reading screen content, searching, extracting data
- Web browsing: visiting URLs, filling forms, clicking links
- Messaging: sending messages, replying, forwarding
Do NOT use this tool just to take a screenshot; use take_screenshot instead.
Provide only the end goal in the instruction parameter (e.g., 'open WeChat and send hello to Zhang San'), not low-level steps. The call returns immediately with a task ID. Use get_task to poll for status and results. After the task completes, call take_screenshot to verify the result.
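The start, poll, verify flow described above can be sketched as follows. This is a minimal illustration, not the server's client API: `call_tool` is a hypothetical callable supplied by whatever MCP client you use, and the response field names (`task_id`, `status`), the terminal status values, and the argument names passed to get_task and take_screenshot are all assumptions that the text above does not specify.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical client hook: (tool_name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]

# Assumed terminal states; the tool description does not list real values.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def run_and_wait(
    call_tool: ToolCaller,
    device_id: str,
    instruction: str,
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Start a task, poll until it reaches a terminal state, then take a screenshot."""
    # run_task returns immediately; only the end goal goes into `instruction`.
    started = call_tool(
        "run_task", {"device_id": device_id, "instruction": instruction}
    )
    task_id = started["task_id"]  # assumed field name

    deadline = time.monotonic() + timeout
    while True:
        # Assumed argument name for get_task.
        task = call_tool("get_task", {"task_id": task_id})
        if task.get("status") in TERMINAL_STATUSES:
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
        sleep(poll_interval)

    # Verify the outcome visually once the task has finished.
    screenshot = call_tool("take_screenshot", {"device_id": device_id})
    return {"task": task, "screenshot": screenshot}
```

Passing `call_tool` and `sleep` in as parameters keeps the sketch independent of any particular client library and lets the loop be exercised with a stub.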
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| device_id | Yes | The device ID to execute the task on | |
| instruction | Yes | The task to perform. Describe only the end goal (e.g. 'open WeChat and send a message to Zhang San'), not specific operation steps. | |
| model_alias | No | Model alias to use (e.g. 'balanced'). Use list_model_aliases to see available options. | |
| max_steps | No | Maximum number of AI decision steps. Increase for complex multi-step tasks. | 30 |
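As an illustration of the schema above, the arguments for a run_task call could be assembled like this. The helper name `build_run_task_args` is hypothetical, and the client-side checks (non-empty strings, a positive integer for `max_steps`) are assumptions for the sketch; the server's actual validation rules are not documented here. Optional fields are omitted when unset so that the server's defaults apply.

```python
from typing import Any, Dict, Optional


def build_run_task_args(
    device_id: str,
    instruction: str,
    model_alias: Optional[str] = None,
    max_steps: Optional[int] = None,
) -> Dict[str, Any]:
    """Build the argument dict for run_task, leaving out unset optional fields."""
    if not device_id.strip():
        raise ValueError("device_id is required")
    if not instruction.strip():
        raise ValueError("instruction is required")

    args: Dict[str, Any] = {"device_id": device_id, "instruction": instruction}

    if model_alias is not None:
        # Valid aliases come from list_model_aliases; not checked here.
        args["model_alias"] = model_alias

    if max_steps is not None:
        # Assumed constraint: a positive integer.
        if not isinstance(max_steps, int) or max_steps < 1:
            raise ValueError("max_steps must be a positive integer")
        args["max_steps"] = max_steps

    return args
```

For example, `build_run_task_args("dev-1", "open WeChat and send a message to Zhang San", max_steps=60)` raises the step budget for a longer workflow while leaving `model_alias` to the server default.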