Claude Mobile

FL001.md•9.23 KiB

# FL001: Feature List — Improvements

Status: complete (11/12 shipped, 1 deferred)
Created: 2026-02-09
Context: Full project audit by kotlin-diagnostics + Plan agents, 12 ideas total

---

## Shipped in v2.10.0

- ~~6. Permission Management~~ -> `grant_permission`, `revoke_permission`, `reset_permissions`
- ~~11. Unit Tests~~ -> vitest, 73 tests (ui-parser + image)
- ~~12. Annotate Screenshot~~ -> `annotate_screenshot` with bounding boxes + numbered labels

## Shipped in v2.11.0

- ~~1. waitForElement~~ -> `wait_for_element` with polling + timeout
- ~~2. Async ADB/simctl~~ -> `execAsync`, `execRawAsync`, `getUiHierarchyAsync`, `screenshotRawAsync` (non-blocking hot paths)
- ~~3. Batch Commands~~ -> `batch_commands` tool for multi-step automation in single round-trip
- ~~4. Assertions~~ -> `assert_visible`, `assert_not_exists` tools
- ~~7. Fix iOS longPress + Swipe~~ -> proper WDA Actions API for longPress, dynamic screen center for swipe
- ~~8. Structured Error Classification~~ -> `src/errors.ts` with typed errors (`DeviceNotFoundError`, `AdbNotInstalledError`, etc.)
- ~~9. Race Condition Fix~~ -> WDA launch mutex via `launchPromises` map, per-platform `cachedElementsMap`
- ~~10. WebView Inspection~~ -> `get_webview` tool via Chrome DevTools Protocol

## Deferred

- 5. Screen Recording (video/GIF) — deferred by user decision

---

## 1. waitForElement / Smart Retry Logic

**Priority:** P0 (critical)
**Effort:** ~2-3h
**Files:** `src/index.ts`, `src/device-manager.ts`, `src/adb/client.ts`

New tool `wait_for_element` with polling + timeout. Currently the only way to wait is a manual `wait(ms)`, which is unreliable and slow.

**Why it matters:**
- #1 cause of flaky tests in mobile automation
- Appium has implicit/explicit waits, Maestro has auto-retries, Detox has idle sync — we have nothing
- Every user testing an app will hit "element not found" due to animations/loading
- ~100 lines of code, massive reliability improvement

**Implementation sketch:**
```typescript
// New tool: wait_for_element
// Params: text/resourceId/className, timeout (default 5000ms), interval (default 500ms)
// Logic: poll getUiHierarchy + findByText/findByResourceId until match or timeout
// Return: found element or timeout error
```

Optional: add `waitBefore` parameter to `tap`/`find_element` for inline waiting.
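
The polling logic in the sketch above could look roughly like this. It is a minimal sketch, not the shipped implementation: `waitForElement`, `WaitOptions`, and the injected `getUi` callback (standing in for `getUiHierarchy` plus parsing) are hypothetical names, and the `UiElement` shape is reduced to two fields for illustration.

```typescript
// Minimal element shape; the project's real UiElement type may differ.
interface UiElement {
  text?: string;
  resourceId?: string;
}

interface WaitOptions {
  timeoutMs?: number; // default 5000, as in the sketch above
  intervalMs?: number; // default 500
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Polls `getUi` until `matches` accepts an element or the timeout elapses.
// `getUi` is a hypothetical stand-in for getUiHierarchy + parsing.
async function waitForElement(
  getUi: () => Promise<UiElement[]>,
  matches: (el: UiElement) => boolean,
  { timeoutMs = 5000, intervalMs = 500 }: WaitOptions = {},
): Promise<UiElement> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const found = (await getUi()).find(matches);
    if (found) return found;
    // Give up if the next poll would start after the deadline.
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`wait_for_element: no match within ${timeoutMs}ms`);
    }
    await sleep(intervalMs);
  }
}
```

Injecting `getUi` keeps the loop platform-agnostic, so the same function could serve Android, iOS, and Aurora clients and be unit-tested without a device.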

---

## 2. Async ADB/simctl instead of execSync

**Priority:** P0 (critical)
**Effort:** ~3-4h
**Files:** `src/adb/client.ts`, `src/ios/client.ts`, `src/aurora/client.ts`, `src/device-manager.ts`

Replace `execSync` with `execFile`/`spawn` promises in ADB and simctl clients.

**Why it matters:**
- Every ADB call blocks the Node.js event loop: `screencap` (500ms-2s), `uiautomator dump` (1-3s)
- During this time the MCP server is dead — no parallel request handling
- Claude Code often makes parallel tool calls; the second call waits needlessly for the first to finish
- `AdbClient` already has an `execAsync` method; it is just unused
- `uiautomator dump /sdcard/ui.xml` + `cat /sdcard/ui.xml` = 2 sequential calls; can be replaced with a single `exec-out uiautomator dump /dev/fd/1` call for up to 2x speed

**Risk:** Need to audit all callers for sync assumptions. Tests help (73 already exist).
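
A non-blocking replacement for `execSync` can be built on Node's `execFile` via `util.promisify`. The sketch below is an assumption about shape only: `runAsync` is a hypothetical helper name (the existing `execAsync` on `AdbClient` may look different), and the timeout and buffer values are illustrative.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const execFileAsync = promisify(execFile);

// Runs a binary without blocking the event loop and resolves with its stdout.
// The timeout and maxBuffer values are illustrative, not taken from the project.
async function runAsync(
  file: string,
  args: string[],
  timeoutMs = 30000,
): Promise<string> {
  const { stdout } = await execFileAsync(file, args, {
    timeout: timeoutMs,
    maxBuffer: 16 * 1024 * 1024, // UI dumps can be large
    encoding: "utf8",
  });
  return stdout;
}

// Hypothetical call site:
//   const xml = await runAsync("adb", ["-s", deviceId, "shell", "uiautomator", "dump"]);
```

Because `execFile` passes arguments as an array rather than through a shell, this form also avoids quoting issues with device IDs or text input.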

---

## 3. Batch Commands API

**Priority:** P1 (high)
**Effort:** ~2h
**Files:** `src/index.ts`

New tool `batch_commands` — array of commands executed in a single MCP round-trip.

**Why it matters:**
- Each MCP tool call = full round-trip through stdio (200-500ms overhead)
- Typical scenario "launch -> wait -> type -> tap" = 4 round-trips = 1-2s pure overhead
- With batch: one round-trip, commands execute sequentially on server, array of results returned
- Maestro does this with declarative YAML flows; we offer the same via MCP
- Claude can generate batch arrays, reducing tool calls and speeding up automation 2-4x

**Implementation sketch:**
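```typescript
const batchCommandsTool = {
  name: "batch_commands",
  inputSchema: {
    type: "object",
    properties: {
      // Each item names a tool and its arguments, e.g. { name: "tap", arguments: { x: 100, y: 200 } }
      commands: { type: "array", items: { type: "object" } },
      stopOnError: { type: "boolean", default: true },
    },
    required: ["commands"],
  },
};
```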
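
On the server side, the sequential execution with `stopOnError` might be handled along these lines. This is a sketch under assumptions: `runBatch`, `BatchResult`, and the `handlers` map of tool name to handler are hypothetical and not taken from `src/index.ts`.

```typescript
interface BatchCommand {
  name: string;
  arguments?: Record<string, unknown>;
}

interface BatchResult {
  name: string;
  ok: boolean;
  result?: unknown;
  error?: string;
}

type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

// Executes commands one after another and records one result per command.
// With stopOnError (default true) the first failure ends the batch early.
async function runBatch(
  commands: BatchCommand[],
  handlers: Map<string, ToolHandler>,
  stopOnError = true,
): Promise<BatchResult[]> {
  const results: BatchResult[] = [];
  for (const cmd of commands) {
    try {
      const handler = handlers.get(cmd.name);
      if (!handler) throw new Error(`Unknown tool: ${cmd.name}`);
      const result = await handler(cmd.arguments ?? {});
      results.push({ name: cmd.name, ok: true, result });
    } catch (err) {
      const error = err instanceof Error ? err.message : String(err);
      results.push({ name: cmd.name, ok: false, error });
      if (stopOnError) break;
    }
  }
  return results;
}
```

Returning the partial results on failure lets the caller see which step broke without replaying the whole batch.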

---

## 4. Assertions / Element Verification

**Priority:** P1 (high)
**Effort:** ~2h
**Files:** `src/index.ts`, `src/adb/ui-parser.ts`

Tools: `assert_visible`, `assert_text`, `assert_not_exists` — UI state checks with pass/fail responses.

**Why it matters:**
- Currently Claude verifies UI via `screenshot` -> visual analysis (~1000 tokens per screenshot)
- `assert_visible("Login button")` -> `{ passed: true, element: {...} }` — 0 tokens for screenshot, instant
- All competitors have assertions: Appium, Maestro, Detox
- Without assertions: interaction tool. With assertions: test framework. Different market entirely
- Implementation: wrapper over `findByText` + `getUiHierarchy` with boolean response
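
The "wrapper over `findByText`" idea can be sketched as below. The `findByText` here is a simplified stand-in (case-insensitive substring match) rather than the project's parser, and `AssertionResult` plus the reduced `UiElement` shape are assumptions for illustration.

```typescript
interface UiElement {
  text: string;
  bounds: { left: number; top: number; right: number; bottom: number };
}

interface AssertionResult {
  passed: boolean;
  element?: UiElement;
  message: string;
}

// Case-insensitive substring match; a stand-in for the project's findByText.
function findByText(elements: UiElement[], text: string): UiElement | undefined {
  const needle = text.toLowerCase();
  return elements.find((el) => el.text.toLowerCase().includes(needle));
}

// Passes when at least one element contains the text.
function assertVisible(elements: UiElement[], text: string): AssertionResult {
  const element = findByText(elements, text);
  return element
    ? { passed: true, element, message: `Found "${text}"` }
    : { passed: false, message: `No element with text "${text}"` };
}

// Passes when no element contains the text.
function assertNotExists(elements: UiElement[], text: string): AssertionResult {
  const element = findByText(elements, text);
  return element
    ? { passed: false, element, message: `Unexpected element "${text}"` }
    : { passed: true, message: `No element with text "${text}"` };
}
```

Returning a result object instead of throwing keeps a failed assertion distinguishable from a tool error such as a disconnected device.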

---

## 5. Screen Recording (video/GIF)

**Priority:** P2 (medium)
**Effort:** ~3h
**Files:** `src/adb/client.ts`, `src/ios/client.ts`, `src/device-manager.ts`, `src/index.ts`

Tools: `start_recording` / `stop_recording` -> returns video/GIF file.

**Why it matters:**
- Android already supports `adb shell screenrecord`, iOS has `simctl io recordVideo`
- On automation failure, user sees only text "tap failed" — with recording they see what actually happened
- chrome-extension (claude-in-chrome) already has `gif_creator` — mobile part doesn't. Parity feature
- CI/CD: test recording -> pipeline artifact -> visual verification. Without this, CI is incomplete
- Appium and Maestro both support recording — must-have for serious tooling

---

## 7. Fix iOS longPress (bug) + Hardcoded Swipe Coordinates

**Priority:** P0 (critical bug)
**Effort:** ~30min
**Files:** `src/device-manager.ts` (line ~341), `src/ios/client.ts` (lines 179-180)

**Bugs:**
1. `longPress` on iOS calls `client.tap(x, y)` instead of proper long press via WDA Actions API (`pointerDown` + `pause` + `pointerUp`)
2. Swipe uses hardcoded `centerX=200, centerY=400`; on an iPad Pro the swipe therefore starts in the upper-left part of the screen instead of the center

**Fix for longPress:**
```typescript
// Use WDA Actions API:
// pointerDown at (x,y) -> pause(duration) -> pointerUp
```
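
As an illustration of the sequence above, a long press can be expressed as a W3C WebDriver Actions payload, which would typically be sent as the body of a `POST /session/{sessionId}/actions` request. The helper name `buildLongPressActions`, the pointer id `finger1`, and the 1000 ms default are assumptions; the sketch adds an initial `pointerMove` to position the pointer before pressing.

```typescript
interface ActionStep {
  type: "pointerMove" | "pointerDown" | "pause" | "pointerUp";
  duration?: number;
  x?: number;
  y?: number;
  button?: number;
}

interface PointerSource {
  type: "pointer";
  id: string;
  parameters: { pointerType: "touch" };
  actions: ActionStep[];
}

// Builds a W3C Actions payload: move to (x, y), press, hold, release.
function buildLongPressActions(
  x: number,
  y: number,
  durationMs = 1000,
): { actions: PointerSource[] } {
  return {
    actions: [
      {
        type: "pointer",
        id: "finger1", // arbitrary input source id
        parameters: { pointerType: "touch" },
        actions: [
          { type: "pointerMove", duration: 0, x, y },
          { type: "pointerDown", button: 0 },
          { type: "pause", duration: durationMs },
          { type: "pointerUp", button: 0 },
        ],
      },
    ],
  };
}
```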

**Fix for swipe:**
```typescript
// Get screen size via WDA or simctl, use actual center:
const size = await this.getScreenSize();
const centerX = size.width / 2;
const centerY = size.height / 2;
```

---

## 8. Structured Error Classification

**Priority:** P1 (high)
**Effort:** ~2-3h
**Files:** `src/adb/client.ts`, `src/ios/client.ts`, `src/aurora/client.ts`, new `src/errors.ts`

Replace generic `Error("ADB command failed")` with typed errors: `DeviceNotFoundError`, `AdbNotInstalledError`, `PermissionDeniedError`, `DeviceOfflineError`.

**Why it matters:**
- Currently any ADB error shows `ADB command failed: ...stderr dump...` — Claude can't distinguish "device disconnected" from "ADB not installed"
- With typed errors, Claude can auto-suggest fixes: "Device offline — try `adb reconnect`"
- Enables retry logic (#1): retry makes sense for `DeviceOfflineError` but not for `AdbNotInstalledError`
- Aurora client silently returns `[]` when `audb` isn't installed — user thinks "no devices" instead of "install audb"
- DX metric: time from error to resolution drops from minutes to seconds
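
A possible shape for `src/errors.ts` is sketched below. The class names come from the list above; the `MobileError` base class, the `hint` field, `classifyAdbError`, and `isRetryable` are hypothetical, and the stderr patterns are assumptions based on commonly seen adb output that would need checking against real failures.

```typescript
class MobileError extends Error {
  constructor(message: string, readonly hint?: string) {
    super(message);
    // Keeps instanceof working even when compiled to ES5.
    Object.setPrototypeOf(this, new.target.prototype);
    this.name = new.target.name;
  }
}
class AdbNotInstalledError extends MobileError {}
class DeviceNotFoundError extends MobileError {}
class DeviceOfflineError extends MobileError {}
class PermissionDeniedError extends MobileError {}

// Maps a failed adb invocation to a typed error.
// `code` is the Node error code of the spawn failure, if any.
function classifyAdbError(stderr: string, code?: string): MobileError {
  if (code === "ENOENT") {
    return new AdbNotInstalledError("adb binary not found", "Install Android platform-tools");
  }
  if (/device offline/i.test(stderr)) {
    return new DeviceOfflineError("Device offline", "Try `adb reconnect`");
  }
  if (/no devices\/emulators found|device '.*' not found/i.test(stderr)) {
    return new DeviceNotFoundError("No matching device", "Check `adb devices`");
  }
  if (/permission denied/i.test(stderr)) {
    return new PermissionDeniedError("Permission denied");
  }
  return new MobileError(`ADB command failed: ${stderr.trim()}`);
}

// Retrying only makes sense for transient failures.
function isRetryable(err: Error): boolean {
  return err instanceof DeviceOfflineError;
}
```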

---

## 9. Race Condition Fix in WDA Manager + Global cachedElements

**Priority:** P1 (high)
**Effort:** ~1-2h
**Files:** `src/ios/wda/wda-manager.ts`, `src/index.ts` (line ~665)

**Bugs:**
1. Two parallel iOS requests = two parallel `xcodebuild` launches = port conflict + double 2-min WDA build
2. `cachedElements` is a global variable. If Claude does `get_ui` on Android then `tap(index=3)` on iOS, it taps an Android UI element. Data corruption.

**Fix:**
```typescript
// WDA: Map<deviceId, Promise<WDAClient>> as launch mutex
private launchPromises = new Map<string, Promise<WDAClient>>();

// Cache: Map<deviceId, UiElement[]> instead of global variable
private elementCaches = new Map<string, UiElement[]>();
```

Without this fix, multi-device scenarios are impossible.
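
The launch mutex can be illustrated in isolation as follows. This is a sketch of the promise-sharing idea only: `LaunchMutex` and the injected `start` callback are hypothetical, and the real `wda-manager.ts` presumably wires this into its own launch code.

```typescript
// Deduplicates concurrent launches per device: callers that arrive while a
// launch is in flight share the same promise instead of starting another.
class LaunchMutex<T> {
  private launchPromises = new Map<string, Promise<T>>();

  launch(deviceId: string, start: () => Promise<T>): Promise<T> {
    const pending = this.launchPromises.get(deviceId);
    if (pending) return pending;

    const promise = start().catch((err) => {
      // Drop failed launches so a later call can retry.
      this.launchPromises.delete(deviceId);
      throw err;
    });
    this.launchPromises.set(deviceId, promise);
    return promise;
  }
}
```

Storing the promise (rather than the resolved client) is what closes the race: the second caller sees the entry as soon as the first launch starts, not only once it finishes.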

---

## 10. WebView Inspection

**Priority:** P2 (medium)
**Effort:** ~4-5h
**Files:** `src/adb/client.ts`, `src/device-manager.ts`, `src/index.ts`, new `src/adb/webview.ts`

Switch context into WebView via Chrome DevTools Protocol (ADB port-forward).

**Why it matters:**
- 80%+ of mobile apps use WebView (OAuth login, in-app browser, hybrid apps)
- Currently WebView is a black box for claude-in-mobile: `get_ui` returns a single "WebView" element without inner structure
- Android supports `adb forward tcp:9222 localabstract:chrome_devtools_remote` -> access via CDP
- Appium solves this with context switching (`NATIVE_APP` <-> `WEBVIEW_com.app`). Maestro doesn't — this is their weak spot
- If we implement WebView inspection, we beat Maestro on a key scenario

**Implementation:**
1. `adb forward` to expose CDP
2. HTTP requests to CDP endpoint for DOM inspection
3. New tool `get_webview` or parameter on `get_ui`
4. ~200 lines
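
For step 2, the DevTools HTTP endpoint `GET /json` returns a JSON array of debuggable targets. The sketch below filters that response down to inspectable pages; `parsePageTargets` and the `CdpTarget` interface (a subset of the returned fields) are hypothetical names, and fetching the body from the forwarded port is left to the caller.

```typescript
// Subset of the fields returned by the DevTools HTTP endpoint GET /json.
interface CdpTarget {
  id: string;
  type: string;
  title: string;
  url: string;
  webSocketDebuggerUrl?: string;
}

// Keeps only page targets that expose a WebSocket debugger URL.
function parsePageTargets(body: string): CdpTarget[] {
  const parsed: unknown = JSON.parse(body);
  if (!Array.isArray(parsed)) return [];
  return (parsed as CdpTarget[]).filter(
    (t) => t.type === "page" && typeof t.webSocketDebuggerUrl === "string",
  );
}
```

After the `adb forward` step, the body would come from `http://localhost:9222/json`. One caveat to verify on a device: app-hosted WebViews are commonly exposed under a socket name of the form `webview_devtools_remote_<pid>` rather than `chrome_devtools_remote`, so the socket name may need to be discovered first.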

---

## Suggested Implementation Order

| Phase | Items | Rationale |
|-------|-------|-----------|
| Phase 1 (quick wins) | #7 iOS bugs, #9 race conditions | Bug fixes, ~1.5-2.5h total |
| Phase 2 (reliability) | #1 waitForElement, #8 error classification | Core reliability |
| Phase 3 (performance) | #2 async ADB, #3 batch commands | Speed improvements |
| Phase 4 (capabilities) | #4 assertions, #5 screen recording | New capabilities |
| Phase 5 (advanced) | #10 WebView inspection | Competitive advantage |
