What can you do with this server?

The agent-device server enables AI agents to automate real mobile, TV, and desktop apps through inspection, interaction, evidence capture, and workflow replay across iOS, Android, macOS, tvOS, and Linux platforms.

Device & App Management: List devices (devices), boot devices (boot), list/install/reinstall apps (apps, install, reinstall, install-from-source), open apps/deep links/URLs (open), close apps or sessions (close), manage sessions (session), and check foreground app state (appstate).
UI Inspection: Capture accessibility snapshots (snapshot), diff snapshots (diff), find elements by text/label/role/id (find), get element text/attributes (get), and assert UI state (is).
UI Interaction: Tap/click (click, press), long press (longpress), fill text fields (fill), type into focused fields (type), focus inputs (focus), scroll (scroll), swipe (swipe), run structured gestures like pan/fling/pinch/rotate (gesture), navigate back (back), go home (home), open app switcher (app-switcher), rotate orientation (rotate), handle alerts (alert), manage keyboard (keyboard), read/write clipboard (clipboard), wait for conditions (wait), change OS settings/permissions (settings), push notifications (push), and trigger app-defined events (trigger-app-event).
Evidence & Diagnostics: Capture screenshots (screenshot), record video (record), collect logs (logs), monitor network traffic (network), gather performance metrics (perf), and start/stop traces (trace).
Workflow Automation: Replay recorded .ad scripts (replay), run multiple scripts with retry and JUnit reporting (test), and batch multiple commands in a single request (batch).
React Native / Metro: Prepare Metro runtime, reload React Native apps (metro), and dismiss React Native overlays (react-native).

Supports native, Expo, Flutter, and React Native apps on simulators, emulators, and physical devices.

Which integrations are available for this server?

Automates Android emulators and devices, enabling app opening, UI accessibility tree inspection, element interaction (fill, tap), and evidence collection (screenshots, videos, logs, CPU/memory/perf).
Enables testing of Expo apps on iOS and Android, including opening Expo Go or dev clients, UI snapshotting, element interaction, and gathering evidence such as screenshots, videos, logs, and performance traces.
Automates iOS simulators and devices, enabling app opening, UI accessibility tree inspection, element interaction, and evidence collection (screenshots, videos, logs, network, performance).
Automates Linux desktop applications, enabling UI inspection via accessibility snapshots, element interaction, and evidence collection (screenshots, logs, performance).
Automates macOS desktop applications, enabling UI inspection via accessibility snapshots, element interaction, and evidence collection (screenshots, logs, performance).

agent-device

Official

by callstack

TypeScript

Remote

Mobile app verification for AI agents.

A device automation CLI for real apps on iOS, Android, TV, web, and desktop. Agents get token-efficient snapshots, semantic refs, and evidence captured only when needed.

agent-device lets coding agents open apps, inspect the current UI, interact with visible elements, and collect debugging evidence through one CLI. Use it when an agent needs to verify what actually happens on a device, not just reason about code.

The CLI is the agent's hands, eyes, and evidence collector; it is not the brain. Your coding agent or QA harness reads the task, interprets the current screen, chooses the next command, and decides whether the result satisfies the test. agent-device keeps that loop grounded in structured accessibility data, deterministic actions, and files an engineer can inspect.
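
The split between "brain" and "hands" described above can be sketched as a small decision function. This is a hypothetical illustration, not part of agent-device: the `UiElement` and `FillGoal` shapes and the `chooseNextCommand` name are assumptions made for this example. Only the `fill <ref> <text>` and `snapshot -i` argument forms are taken from the Quick Start in this README.

```typescript
// Hypothetical shapes for illustration; agent-device does not export these.
interface UiElement {
  ref: string;   // e.g. "@e3"
  role: string;  // e.g. "text-field"
  label: string; // e.g. "Email"
}

interface FillGoal {
  label: string; // label of the field the agent wants to fill
  text: string;  // text to enter
}

// The "brain" step: choose the next CLI arguments from the current screen.
// Returns arguments for `agent-device fill <ref> <text>` when the field is
// visible, otherwise arguments for a fresh `agent-device snapshot -i`.
function chooseNextCommand(elements: UiElement[], goal: FillGoal): string[] {
  const target = elements.find(
    (el) => el.role === "text-field" && el.label === goal.label,
  );
  return target ? ["fill", target.ref, goal.text] : ["snapshot", "-i"];
}

const currentElements: UiElement[] = [
  { ref: "@e1", role: "heading", label: "Settings" },
  { ref: "@e2", role: "button", label: "Sign In" },
  { ref: "@e3", role: "text-field", label: "Email" },
];

// Field is on screen: the agent fills it.
const nextArgs = chooseNextCommand(currentElements, {
  label: "Email",
  text: "test@example.com",
});

// Field is not on screen: the agent asks for a fresh snapshot instead.
const fallbackArgs = chooseNextCommand(currentElements, {
  label: "Password",
  text: "secret",
});
```

The function only decides; running the chosen arguments through the CLI and judging the result stays with the agent or QA harness.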

If you know Vercel's agent-browser, agent-device is the same idea for mobile, TV, and desktop apps. Minimal --platform web support reuses agent-browser when a browser session needs to fit into the same command/session/replay loop.

It works with native iOS and Android apps, plus apps built with Expo, Flutter, and React Native, as long as the target can run on a supported device, simulator, emulator, or desktop environment.

agent-device demo showing Codex using agent-device to create a new contact in the iOS Contacts app from a simple prompt

Capabilities

Inspect real app UI through structured accessibility snapshots, interactive refs like @e3, selectors, and React Native component trees.
Interact by opening apps, tapping, typing, scrolling, performing gestures, waiting, asserting state, handling alerts, and closing sessions.
Capture evidence with screenshots, videos, logs, traces, network traffic, audio-level probes for browser and host-rendered simulator/emulator audio, performance samples, crash context, and React profiles.
Replay workflows by recording .ad scripts for local runs, CI, repeatable e2e checks, and strict Maestro YAML export when a flow needs to run in Maestro.
Run across platforms with iOS Simulator automation, Android Emulator automation, physical devices, tvOS, Android TV, macOS, Linux, and desktop app automation, so agents can see and feel the app they work on.

Use Cases

Verify mobile changes on real devices, simulators, and emulators before review or merge.
Give AI coding agents a real app feedback loop while they implement features.
Debug regressions with screenshots, logs, traces, network/audio evidence, and crash context.
Profile performance issues with CPU/memory samples and React render profiles when needed.
Turn exploratory app interactions into replayable e2e checks for CI.
Use one agent workflow across native iOS, Android, Expo, Flutter, React Native, TV, and desktop apps.

Sketch showing agent-device as the live app verification layer in the agentic development loop

Quick Start

Install the CLI:

npm install -g agent-device@latest
agent-device doctor
agent-device --version
agent-device help workflow

Run agent-device doctor yourself after installation to check local setup before handing the CLI to an agent. The installed CLI help is the source of truth for agents. Start with agent-device help workflow, then follow the topic-specific help when a task needs dogfooding, debugging, replay, or React Native profiling.

The CLI needs Node.js 22+ (web automation requires Node.js 24+). Other prerequisites depend on the target platform: Xcode for iOS/tvOS/macOS targets, Android SDK + ADB for Android, and macOS Accessibility permission for desktop automation. See Installation for platform setup.
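
The Node.js minimums above can be checked before handing the CLI to an agent. The following is a minimal sketch, not something agent-device ships: `MIN_NODE_MAJOR`, `nodeMajor`, and `nodeVersionOk` are hypothetical names, and the only facts used are the 22+ and 24+ minimums stated in this README. `agent-device doctor` remains the supported way to check local setup.

```typescript
// Minimum Node.js major versions stated in the prerequisites.
const MIN_NODE_MAJOR = { default: 22, web: 24 } as const;

// Extract the major version from a string such as "v22.11.0" or "24.1.0".
function nodeMajor(version: string): number {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) throw new Error(`Unrecognised Node.js version: ${version}`);
  return Number(match[1]);
}

// True when the given Node.js version satisfies the stated minimum
// for the chosen kind of automation.
function nodeVersionOk(
  version: string,
  target: "default" | "web" = "default",
): boolean {
  return nodeMajor(version) >= MIN_NODE_MAJOR[target];
}

const okForDevices = nodeVersionOk("v22.11.0");   // true
const okForWeb = nodeVersionOk("v22.11.0", "web"); // false
```

In a Node.js script, passing `process.version` as the first argument checks the running interpreter.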

Try the basic loop:

# Find an app.
agent-device apps --platform ios
agent-device apps --platform android

# Start a session.
agent-device open SampleApp --platform ios

# Inspect the current screen. -i returns interactive elements only.
agent-device snapshot -i
# @e1 [heading] "Settings"
# @e2 [button] "Sign In"
# @e3 [text-field] "Email"

# Act, capture evidence, and close.
agent-device fill @e3 "test@example.com"
agent-device screenshot ./artifacts/settings.png
agent-device close

Snapshots assign refs like @e1, @e2, and @e3 to elements on the current screen. Refs from the latest snapshot are immediately actionable; after scrolling or changing screens, take a fresh snapshot.
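
A harness that wraps the CLI may want refs as structured data. The sketch below assumes each snapshot row looks like the sample in the Quick Start (`@e3 [text-field] "Email"`, without the leading `#` used there for shell comments); the real output may carry more fields or a different layout, so treat the pattern as an assumption. `SnapshotRef`, `SNAPSHOT_ROW`, and `parseSnapshotRows` are hypothetical names, not agent-device exports.

```typescript
// Hypothetical shape for one snapshot row; not an agent-device export.
interface SnapshotRef {
  ref: string;   // e.g. "@e3"
  role: string;  // e.g. "text-field"
  label: string; // e.g. "Email"
}

// Matches rows shaped like: @e3 [text-field] "Email"
const SNAPSHOT_ROW = /^(@e\d+) \[([^\]]+)\] "(.*)"$/;

// Parse snapshot text into refs, skipping lines that do not match
// the assumed row shape.
function parseSnapshotRows(output: string): SnapshotRef[] {
  const rows: SnapshotRef[] = [];
  for (const line of output.split("\n")) {
    const match = SNAPSHOT_ROW.exec(line.trim());
    if (match) {
      const [, ref = "", role = "", label = ""] = match;
      rows.push({ ref, role, label });
    }
  }
  return rows;
}

const sampleOutput = [
  '@e1 [heading] "Settings"',
  '@e2 [button] "Sign In"',
  '@e3 [text-field] "Email"',
].join("\n");

const parsedRows = parseSnapshotRows(sampleOutput);
const emailRef = parsedRows.find((row) => row.label === "Email")?.ref;
```

Because refs belong to a single snapshot, a harness built this way should re-parse after every fresh snapshot instead of caching refs across screens.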

Snapshots come from the app's accessibility tree, so high-quality labels, roles, and test IDs make agent runs far more reliable. Use screenshots and videos as evidence or visual fallback, but prefer refs and selectors for actions and assertions whenever the UI exposes enough structure.

Next Steps

Set up your agent: run the CLI from Cursor, Codex, Claude Code, Windsurf, or another agent terminal. For skills, rules, direct MCP tools, and client-specific setup, see AI Agent Setup.
Try the sample app: clone the repo and run the bundled Expo fixture when you want a guided first dogfood run with screenshots, replay, and performance evidence. See Quick Start.
Go deeper: use Commands, Replay & E2E, and Debugging & Profiling for production workflows.

Where To Run agent-device

Path	Best for	Start with
Local	Exploration, debugging, and development loops on simulators, emulators, physical devices, macOS apps, and Linux desktop targets.	Follow the Quick Start.
CI/CD	Automated PR and merge validation with replay scripts and captured artifacts.	Try the EAS workflow template. GitHub Actions template coming soon.
Cloud / remote execution	Linux runners, managed devices, and remote execution.	Use Agent Device Cloud, see Commands for remote profiles, or contact Callstack for team-scale QA.

How It Works

agent-device runs session-aware commands through platform backends: XCTest for iOS and tvOS, ADB plus the Android snapshot helper for Android, a local helper for macOS desktop automation, and AT-SPI for Linux desktop targets.
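
The backend routing described above can be written down as a lookup table. This is an illustrative sketch only: the `TargetKind` keys are labels chosen for this example and are not confirmed `--platform` values, and `BACKEND_BY_TARGET`, `backendFor`, and `targetsSharing` are hypothetical names. The backend descriptions are taken from the paragraph above.

```typescript
// Illustrative target labels; not confirmed --platform values.
type TargetKind = "ios" | "tvos" | "android" | "macos" | "linux";

// Backend per target, as described in this README.
const BACKEND_BY_TARGET: Record<TargetKind, string> = {
  ios: "XCTest",
  tvos: "XCTest",
  android: "ADB + Android snapshot helper",
  macos: "local macOS helper",
  linux: "AT-SPI",
};

// Look up the backend that serves a given target.
function backendFor(target: TargetKind): string {
  return BACKEND_BY_TARGET[target];
}

// List the targets that share a backend, e.g. iOS and tvOS both use XCTest.
function targetsSharing(backend: string): TargetKind[] {
  return (Object.keys(BACKEND_BY_TARGET) as TargetKind[]).filter(
    (target) => BACKEND_BY_TARGET[target] === backend,
  );
}

const linuxBackend = backendFor("linux");
const xctestTargets = targetsSharing("XCTest");
```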

Node consumers can use the typed client and public subpaths for bridge integrations. agent-device/android-adb exposes the Android ADB provider contract, logcat/clipboard/keyboard/app helpers, and port reverse management.

FAQ

What is agent-device?

agent-device is a device automation CLI for AI mobile app testing. It lets AI agents verify real apps on iOS, Android, TV, desktop, simulators, emulators, and physical devices.

Does it work with React Native, Expo, Flutter, and native apps?

Yes. agent-device works with native iOS and Android apps, Expo apps, Flutter apps, React Native apps, TV apps, and desktop apps that run on supported targets.

How is it different from Appium, Detox, or Maestro?

Appium, Detox, and Maestro are traditional mobile automation frameworks. agent-device is optimized for AI agents that need to inspect app state, interact semantically, capture evidence, debug, profile, and turn useful explorations into replayable checks.

Used By

Used by teams and developers at Callstack, JPMorgan Chase, Expensify, Shopify, Kindred, Total Wine & More, LegendList, HerLyfe, App & Flow, and more.

Contributing

See CONTRIBUTING.md.

Made at Callstack

agent-device is open source and MIT licensed. Visit agent-device.dev, try the EAS workflow template, read the docs, or contact us at hello@callstack.com.
