podium-mcp

One baton. Every instrument.

A single MCP stdio endpoint with 43 tools for iOS-simulator control, native UI automation, end-to-end flows, trustworthy assertions, React Native debugging, and WebView DOM + network inspection — one connection instead of half a dozen servers.


One prompt → podium drives Safari live → types the URL → explores the profile → opens a repo. Footage captured on a live iPhone 16 Pro simulator.


A podium is where a maestro stands — one place to conduct the whole orchestra. This MCP server unifies six capability sets behind a single stdio endpoint:

  • Device & app management: simctl, with graceful adb detection.

  • Native UI inspection & gestures: routed through idb/mobilecli with a Maestro fallback (no per-gesture JVM spin-up).

  • End-to-end flows & batch automation: declarative Maestro flows, ordered action batches, and an engineer→QA flow exporter.

  • Trustworthy assertions: an oracle ladder (WebView-DOM › native a11y › Maestro) that returns falsifiable, evidenced verdicts and fails closed.

  • WebView DOM + network: resolve WKWebView DOM to tap coordinates, evaluate JS, drive navigation, and capture in-page HTTP traffic as JSON/HAR.

  • React Native debugging: Metro console logs, network requests, and in-app state over CDP, plus host/simulator crash reports.

Rather than wiring several MCP servers into every client config, podium-mcp exposes everything behind one connection, with a shared execFile layer (no shell), consistent structured errors, automatic retry around Maestro's iOS-driver flakiness, and a single health-check tool to confirm what's available on the host.

What's new in v0.2.0

  • Oracle ladder + trustworthy verdicts: assert_visible / assert_text / assert_not_visible / wait_for_element and validate_flow verify state through WebView-DOM › native a11y › Maestro, returning evidenced results that fail closed instead of guessing "looks ok".

  • Batch & export: run_steps runs an ordered action batch in one call via the native backend; export_flow turns that batch into a reusable Maestro flow (the engineer→QA bridge).

  • WebView network capture: webview_network records in-WebView fetch/XHR traffic and exports redacted JSON or HAR 1.2 (the network path metro_network can't see for WebView-hosted apps).

  • Deeper RN introspection: metro_network (CDP Network domain) and metro_state (read the in-app Redux store) join metro_logs.

  • Native-first gesture backend: idb/mobilecli cut tap_on ~14.7 s → ~0.6 s and inspect_screen ~8.9 s → ~0.9 s, with a Maestro fallback that preserves app state.

  • Reliability hardening: explicit per-command timeouts with a timedOut flag, timestamped recordings + a duration watchdog, native-backend re-probe TTL, exact bundle-id matching, and transparent iOS-simulator scope.

  • Registry-ready: server.json manifest + OIDC publish to the official MCP Registry, test-gated before every publish.

Why

Driving a React Native app end-to-end usually means juggling several MCP servers — one for device/app control, one for UI flows, one for Metro/debugger logs, another for WebView inspection — each with its own config entry, quirks, and failure modes. podium-mcp collapses that into one server with:

  • a single execFile-based command runner (no shell — arguments are passed verbatim),

  • consistent structured errors (a tool never crashes the server),

  • automatic retry around Maestro's known iOS-driver flakiness,

  • graceful degradation when a toolchain (e.g. adb) is absent,

  • evidenced verdicts so an agent knows when a flow actually worked.
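
The first two points above can be illustrated with a minimal sketch of a no-shell runner that reports failures as data instead of throwing. It assumes Node's execFileSync for brevity; runCommand and the RunResult shape are hypothetical names and are not taken from lib/exec.ts.

```typescript
import { execFileSync } from "node:child_process";

// Hypothetical result shape; the real runner may expose different fields.
interface RunResult {
  ok: boolean;
  stdout: string;
  stderr: string;
  exitCode: number | null;
  timedOut: boolean;
}

// Fields Node attaches to the error thrown on non-zero exit, timeout, or spawn failure.
interface ExecError {
  status?: number | null;
  code?: string;
  stdout?: string | Buffer;
  stderr?: string | Buffer;
}

function runCommand(file: string, args: string[], timeoutMs = 30_000): RunResult {
  try {
    // No shell is involved: each array element reaches the child as one argv entry.
    const stdout = execFileSync(file, args, {
      timeout: timeoutMs,
      encoding: "utf8",
      stdio: ["ignore", "pipe", "pipe"],
    });
    return { ok: true, stdout, stderr: "", exitCode: 0, timedOut: false };
  } catch (err) {
    const e = err as ExecError;
    // Failures become a structured result; nothing propagates to the caller.
    return {
      ok: false,
      stdout: String(e.stdout ?? ""),
      stderr: String(e.stderr ?? ""),
      exitCode: e.status ?? null,
      timedOut: e.code === "ETIMEDOUT",
    };
  }
}

// Shell metacharacters stay literal because no shell ever parses them.
const echoed = runCommand(process.execPath, [
  "-e",
  "process.stdout.write(process.argv[1])",
  "$(whoami); ls",
]);
console.log(echoed.stdout);
```

A missing binary (for example adb on a host without the Android SDK) takes the same catch path and comes back with ok: false, which is one way the graceful-degradation behavior could be built on top of such a runner.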

Requirements

  • macOS with Xcode command-line tools (xcrun, simctl)

  • Node.js ≥ 22 (uses native fetch and WebSocket; .npmrc sets engine-strict=true)

  • mobilecli — bundled automatically as an npm dependency; the default native gesture + WebView backend (no separate install)

  • (optional) idb (idb + idb_companion) — preferred native gesture backend when both are present; auto-detected

  • (optional) Maestro on PATH (or at ~/.maestro/bin) — the run_flow engine and the gesture fallback path

  • (optional) a running Metro bundler for the metro_* debugging tools

  • (optional) Android SDK + adb — adb paths are detection-only and degrade gracefully when absent

Platform scope: podium's automation targets the iOS Simulator. Android devices are detected (device_list, podium_health) but not yet automatable — every adb-backed path returns an informative result instead of failing.

Install

No manual config — one-time marketplace setup, then install:

/plugin marketplace add github:hoainho/podium-mcp
/plugin install podium-mcp@podium

The plugin auto-starts the MCP server (all 43 tools) and ships four skills:

| Skill | Invoke | What it does |
| --- | --- | --- |
| Device info | /podium-mcp:device-info <UDID> [<BUNDLE_ID>] | Health check, screen size, orientation, app list |
| E2E flow | /podium-mcp:e2e <UDID> <BUNDLE_ID> [path or description] | Run or author a Maestro flow |
| Bug repro | /podium-mcp:bug-repro <UDID> <BUNDLE_ID> <description> | Video + logs + crash evidence capture |
| RN debug | /podium-mcp:rn-debug [UDID] [logs\|apps\|crash\|all] | Metro logs, connected apps, crash reports |

npx (zero install)

{
  "mcpServers": {
    "podium": { "command": "npx", "args": ["-y", "podium-mcp"] }
  }
}

Manual (from source)

git clone git@github.com:hoainho/podium-mcp.git
cd podium-mcp
npm install
npm run build

Usage

Register the built server with any MCP client. Claude Code (.mcp.json):

{
  "mcpServers": {
    "podium": {
      "type": "stdio",
      "command": "node",
      "args": ["/absolute/path/to/podium-mcp/dist/index.js"]
    }
  }
}

Quick manual smoke test over raw stdio (lists the 43 registered tools):

printf '%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke","version":"0"}}}' \
  '{"jsonrpc":"2.0","method":"notifications/initialized"}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/list"}' | node dist/index.js

Always call podium_health first to confirm which toolchain is available on the host.

Quick start (order of use)

  1. podium_health: confirm xcrun / maestro / native backend availability.

  2. device_list: pick a booted simulator udid.

  3. Read state: app_list, app_state, screen_size, orientation_get.

  4. Drive the device: app_launch, then tap_on / input_text / swipe / press_key, plus set_location and orientation_set. Batch several with run_steps.

  5. Author & verify: inspect_screen to discover elements, run_flow for declarative checks, then assert_visible / validate_flow for an evidenced verdict.

  6. Inspect WebViews: webview_inspect → tap coordinates, webview_eval, webview_navigate, webview_network.

  7. Capture & debug: screenshot / record_start / record_stop; metro_logs / metro_network / metro_state; crash_list / crash_get.
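
As a rough illustration of that order on the wire, the sketch below builds newline-delimited JSON-RPC tools/call requests that could follow the initialize handshake from the smoke test. The udid, bundleId, and text values are placeholders, and toJsonRpcLines is a hypothetical helper; the argument names come from the tool tables in this README.

```typescript
// Placeholder identifiers for illustration only.
const udid = "00000000-0000-0000-0000-000000000000";
const bundleId = "com.example.app";

interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

// Health first, then pick a device, drive the app, and finish with an evidenced check.
const calls: ToolCall[] = [
  { name: "podium_health", arguments: {} },
  { name: "device_list", arguments: {} },
  { name: "app_launch", arguments: { udid, bundleId } },
  { name: "tap_on", arguments: { udid, bundleId, text: "Sign in" } },
  { name: "assert_visible", arguments: { udid, text: "Welcome" } },
];

// Hypothetical helper: one JSON-RPC request per line, ids continuing after the handshake.
function toJsonRpcLines(toolCalls: ToolCall[], firstId = 3): string[] {
  return toolCalls.map((call, i) =>
    JSON.stringify({ jsonrpc: "2.0", id: firstId + i, method: "tools/call", params: call }),
  );
}

for (const line of toJsonRpcLines(calls)) console.log(line);
```

Piping these lines into node dist/index.js after the initialize and notifications/initialized messages would exercise the same path an MCP client uses.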

The 43 tools

Every tool returns structured JSON and never throws — failures come back as MCP tool errors. See docs/tool-catalog.md for the authoritative per-parameter reference.

Health & toolchain (1)

| Tool | Key params | Backing engine | Behavior |
| --- | --- | --- | --- |
| podium_health |  | which probes | Never fails; reports toolchain { xcrun, maestro, adb }, native backend, and platform: ios-simulator |

Device & simulator (6)

| Tool | Key params | Backing engine | Behavior |
| --- | --- | --- | --- |
| device_list |  | simctl list -j + adb devices | Merged iOS inventory; adb absent → android: { available: false } (detection-only) |
| device_boot | udid | simctl boot | Idempotent: already-booted → alreadyBooted: true; waits up to 30 s |
| screen_size | udid | simctl io screenshot + sips | { widthPx, heightPx } (real pixels) |
| orientation_get | udid | native query → screenshot heuristic | { orientation, basis } (exact when native) |
| set_location | udid, latitude, longitude | simctl location set | Codifies the QA geo-spinner fix |
| open_url | udid, url | simctl openurl | Deep links + https:// |

Apps (6)

| Tool | Key params | Backing engine | Behavior |
| --- | --- | --- | --- |
| app_install | udid, path (.app/.zip) | simctl install | Structured tool error |
| app_launch | udid, bundleId | simctl launch | Explicit 30 s timeout (cold RN launches no longer mis-report failure) |
| app_terminate | udid, bundleId | simctl terminate | Structured tool error |
| app_uninstall | udid, bundleId | simctl uninstall | Structured tool error |
| app_list | udid | simctl listapps + plutil | { count, apps: [{ bundleId, name, type }] } |
| app_state | udid, bundleId | simctl listapps + launchctl | { installed, running }; exact bundle-id match |

Capture (3)

| Tool | Key params | Backing engine | Behavior |
| --- | --- | --- | --- |
| screenshot | udid, saveTo? | simctl io screenshot | Returns path + byteSize (no base64 bloat) |
| record_start | udid, saveTo? (.mp4) | detached simctl io recordVideo | { ok, path, pid }; timestamped path + duration watchdog (PODIUM_MAX_RECORDING_MS); one per udid |
| record_stop | udid | SIGINT recorder + flush | { ok, path, sizeBytes } |

UI inspection & gestures (8)

| Tool | Key params | Backing engine | Behavior |
| --- | --- | --- | --- |
| inspect_screen | udid, compact? | native flat AX list → maestro hierarchy | compact:true (default) returns only meaningful nodes |
| tap_on | udid, bundleId, text\|id\|x+y, double?, long? | native tap → Maestro fallback | text/id resolved via the element list; reports backend |
| input_text | udid, bundleId, text, submit? | native → Maestro fallback | reports backend |
| swipe | udid, bundleId, direction, start/end? | native → Maestro fallback | %/pixel overrides resolved vs logical screen size |
| press_key | udid, bundleId, key | native → Maestro fallback | back/power/tab are Android-only |
| orientation_set | udid, bundleId, value | native → Maestro fallback | PORTRAIT / LANDSCAPE_LEFT / LANDSCAPE_RIGHT / UPSIDE_DOWN |
| tap_with_fallback | udid, x, y, maxRetries?, offsetStep? | native tap + before/after oracle | For WebGL/Canvas overlays; no blind walk (offsetStep opt-in) |
| notification_bar_clear | udid, bundleId? | native tap + oracle | Dismisses the RN debug notification bar |

Flows & batch automation (4)

| Tool | Key params | Backing engine | Behavior |
| --- | --- | --- | --- |
| run_steps | udid, bundleId, steps[] | native backend (idb/mobilecli) | Ordered action batch in one call; per-step results |
| run_flow | udid + exactly one of yaml/files/dir(+tags), env? | maestro test | Exactly-one-of validated before exec; per-step pass/fail |
| export_flow | steps[], output path | flow generator | Exports a run_steps batch to a reusable Maestro flow (engineer→QA bridge) |
| cheat_sheet |  | bundled assets/maestro-cheat-sheet.yaml | Fully offline Maestro syntax reference |
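
The "exactly one of yaml/files/dir" rule for run_flow can be checked before anything is executed. The sketch below is one way to write such a check; the RunFlowInput shape and validateRunFlowInput are hypothetical, and tying tags to dir is an assumption read from the dir(+tags) notation rather than confirmed behavior.

```typescript
// Hypothetical input shape based on the run_flow row above.
interface RunFlowInput {
  udid: string;
  yaml?: string;
  files?: string[];
  dir?: string;
  tags?: string[];
}

type Validation =
  | { ok: true; source: "yaml" | "files" | "dir" }
  | { ok: false; error: string };

function validateRunFlowInput(input: RunFlowInput): Validation {
  // Count how many of the mutually exclusive sources were supplied.
  const provided = (["yaml", "files", "dir"] as const).filter(
    (key) => input[key] !== undefined,
  );
  if (provided.length !== 1) {
    return {
      ok: false,
      error: `expected exactly one of yaml/files/dir, got ${provided.length}`,
    };
  }
  // Assumption: tags only make sense when filtering flows inside a directory.
  if (input.tags !== undefined && provided[0] !== "dir") {
    return { ok: false, error: "tags only apply together with dir" };
  }
  return { ok: true, source: provided[0] };
}

console.log(validateRunFlowInput({ udid: "SIM-1", dir: "flows", tags: ["smoke"] }));
console.log(validateRunFlowInput({ udid: "SIM-1", yaml: "appId: x", dir: "flows" }));
```

Rejecting ambiguous input up front keeps the error structured and cheap, since no Maestro process is started for a call that could never succeed.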

Assertions & verdicts — the oracle ladder (5)

| Tool | Key params | Backing engine | Behavior |
| --- | --- | --- | --- |
| assert_visible | udid, text\|id, … | oracle ladder (WebView-DOM › a11y › Maestro) | Evidenced pass/fail; reports which oracle proved it |
| assert_text | udid, text | oracle ladder | by-text shorthand for assert_visible |
| assert_not_visible | udid, text\|id | oracle ladder | Fails closed: if absence can't be verified, it fails |
| wait_for_element | udid, text\|id, timeoutMs? | oracle ladder (polling) | Polls until visible or times out |
| validate_flow | udid, flow + assertions | oracle ladder + flow run | Trustworthy, falsifiable verdict on whether a just-built flow works |

WebView DOM & network (4)

| Tool | Key params | Backing engine | Behavior |
| --- | --- | --- | --- |
| webview_inspect | udid, selector?, webviewId?, max? | mobilecli (CDP) | Resolves a CSS selector to DOM elements with absolute tapX/tapY |
| webview_eval | udid, expression, webviewId? | mobilecli (CDP) | Runs JS in the page context; gated by PODIUM_DISABLE_WEBVIEW_EVAL=1 |
| webview_navigate | udid, action (goto/back/forward/reload), url? | mobilecli (CDP) | Drives WebView navigation |
| webview_network | udid, durationMs?, format (json/har)?, saveTo?, redact?, includeResources? | CDP + in-page fetch/XHR shim + Resource Timing | Captures in-WebView HTTP traffic; exports redacted JSON or HAR 1.2 |

React Native debugging — Metro CDP (4)

| Tool | Key params | Backing engine | Behavior |
| --- | --- | --- | --- |
| metro_apps | port? (8081) | GET http://localhost:<port>/json | Differentiated errors (timeout vs not-running vs other) |
| metro_logs | wsUrl?/port?, durationMs?, maxLogs? | WebSocket + CDP Runtime.enable | Auto-discovers first app when URL omitted |
| metro_network | wsUrl?/port?, durationMs?, maxEntries? | CDP Network.enable | Requests (url/method/status/mimeType/ts) |
| metro_state | expression?/wsUrl?/port?, timeoutMs? | CDP Runtime.evaluate | Reads in-app state (default: globally-exposed Redux store) |

Crash diagnostics (2)

| Tool | Key params | Backing engine | Behavior |
| --- | --- | --- | --- |
| crash_list | processName?, sinceHours?, udid? | host + sim DiagnosticReports | Newest-first; tagged source: host \| simulator |
| crash_get | id, udid? | same | Path-traversal-safe (basename only); truncates honestly |
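
The "basename only" guard on crash_get can be illustrated in a few lines. This is a sketch under assumptions: resolveReportPath and the reports directory are hypothetical, and the real handler may apply further checks.

```typescript
import * as path from "node:path";

// Hypothetical helper: map a caller-supplied id to a file inside reportsDir,
// discarding any directory components so the id cannot escape that directory.
function resolveReportPath(reportsDir: string, id: string): string | null {
  const name = path.basename(id);
  // Reject ids that reduce to nothing or to a directory reference.
  if (name === "" || name === "." || name === "..") return null;
  return path.join(reportsDir, name);
}

console.log(resolveReportPath("/tmp/reports", "MyApp-2026-01-01.ips"));
console.log(resolveReportPath("/tmp/reports", "../../etc/passwd"));
console.log(resolveReportPath("/tmp/reports", ".."));
```

The second call resolves to a file named passwd inside the reports directory rather than the system file, and the third is rejected outright.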

The oracle ladder — trustworthy assertions

"It works" is operationalized as a falsifiable, evidenced verdict — never "looks ok". Assertions and validate_flow resolve visibility through a three-rung ladder, using the strongest available signal:

  1. WebView DOM: when an inspectable WKWebView is present, query the real DOM.

  2. Native accessibility: the native AX element set (via idb/mobilecli).

  3. Maestro: assertVisible/assertNotVisible as the fallback.

assert_not_visible fails closed: if absence can't be positively verified (e.g. a WebView is unreadable), it reports failure rather than a false pass. Every verdict names the oracle that produced it, so an agent can weight its confidence.
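
The ladder and its fail-closed rule can be sketched as follows. This is an illustration, not the contents of lib/oracle.ts: the Oracle and Verdict shapes are hypothetical, and the rule that the first rung with a definitive answer decides is an assumption about how "strongest available signal" is resolved.

```typescript
type Observation = "present" | "absent" | "unknown";

// Hypothetical rung: returns "unknown" when it cannot read the screen at all.
interface Oracle {
  name: string;
  observe: (target: string) => Observation;
}

interface Verdict {
  pass: boolean;
  oracle: string | null;
  evidence: string;
}

// Walk the rungs strongest-first; the first definitive answer decides.
function firstDefinitive(
  ladder: Oracle[],
  target: string,
): { oracle: string; seen: Observation } | null {
  for (const rung of ladder) {
    const seen = rung.observe(target);
    if (seen !== "unknown") return { oracle: rung.name, seen };
  }
  return null;
}

function assertVisible(ladder: Oracle[], target: string): Verdict {
  const hit = firstDefinitive(ladder, target);
  if (hit === null) {
    return { pass: false, oracle: null, evidence: `no oracle could observe "${target}"` };
  }
  return { pass: hit.seen === "present", oracle: hit.oracle, evidence: `${hit.oracle} reported ${hit.seen}` };
}

function assertNotVisible(ladder: Oracle[], target: string): Verdict {
  const hit = firstDefinitive(ladder, target);
  // Fail closed: silence from every rung is not proof of absence.
  if (hit === null) {
    return { pass: false, oracle: null, evidence: `absence of "${target}" could not be verified` };
  }
  return { pass: hit.seen === "absent", oracle: hit.oracle, evidence: `${hit.oracle} reported ${hit.seen}` };
}

// Stub oracles standing in for the three rungs.
const unreadableWebView: Oracle = { name: "webview-dom", observe: () => "unknown" };
const nativeA11y: Oracle = {
  name: "native-a11y",
  observe: (t) => (t === "Checkout" ? "present" : "absent"),
};
const maestro: Oracle = { name: "maestro", observe: () => "unknown" };

console.log(assertVisible([unreadableWebView, nativeA11y, maestro], "Checkout"));
console.log(assertNotVisible([unreadableWebView], "Spinner"));
```

In the last call only an unreadable WebView is available, so the verdict is a failure with oracle: null instead of a pass by default.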

Native-first gesture backend

Imperative gestures (tap_on, input_text, swipe, press_key, orientation_set, run_steps) and inspect_screen route through the fastest available backend, probed once and cached (with a short negative-cache TTL so a backend that starts after launch is picked up):

  1. idb — when both idb and idb_companion are installed (native, fastest).

  2. mobilecli — the bundled npm dependency (prebuilt Go binary). Default; no install.

  3. Maestro fallback — when no native backend resolves, or for actions it can't express (double/long-press, UPSIDE_DOWN). The gesture generates a minimal flow with launchApp: { stopApp: false }, foregrounding the app without restarting so state is preserved.

Each result reports the backend it used. Set PODIUM_DISABLE_NATIVE=1 to force Maestro. Eliminating the per-gesture JVM spin-up cut tap_on ~14.7 s → ~0.6 s and inspect_screen ~8.9 s → ~0.9 s on an iPhone 16 Pro simulator. Run npm run benchmark for a full pass/fail sweep.
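
The "probed once and cached, with a short negative-cache TTL" behavior can be sketched as below. createBackendResolver, the 5-second TTL, and the injected clock are all hypothetical choices made for illustration; lib/native.ts may be structured differently.

```typescript
type Backend = "idb" | "mobilecli";

// Hypothetical factory: a hit is cached for the life of the process, while a
// miss is remembered only for negativeTtlMs so a late-starting backend is found.
function createBackendResolver(
  probe: () => Backend | null,
  negativeTtlMs = 5_000,
  now: () => number = Date.now,
): () => Backend | null {
  let cached: Backend | null = null;
  let negativeUntil = 0;

  return () => {
    if (cached !== null) return cached; // positive result: never re-probed
    if (now() < negativeUntil) return null; // recent miss: skip the expensive probe
    const found = probe();
    if (found !== null) {
      cached = found;
    } else {
      negativeUntil = now() + negativeTtlMs;
    }
    return found;
  };
}

// Demo with a fake clock: the backend only becomes available after the first probe.
let demoClock = 0;
let demoAvailable = false;
const resolveBackend = createBackendResolver(
  () => (demoAvailable ? "mobilecli" : null),
  5_000,
  () => demoClock,
);
console.log(resolveBackend()); // miss, remembered until the TTL expires
demoAvailable = true;
demoClock = 6_000;
console.log(resolveBackend()); // TTL expired, so the probe runs again and succeeds
```

Caching only the miss for a short window keeps repeated gestures fast while still letting the server recover without a restart.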

Maestro flakiness retry: when the fallback runs, its iOS driver intermittently fails with Failed to connect to 127.0.0.1:<port>. Flows retry up to 2× with 2 s / 5 s backoff and report the retries count; a persistent failure returns the raw output with remediation hints.
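
The retry policy described above (up to two retries with 2 s and 5 s backoff, only for the connection error) might be sketched like this. runWithRetry, the transient-error pattern, and the blocking sleep are hypothetical; a real server would await a timer instead of blocking.

```typescript
interface AttemptResult {
  ok: boolean;
  output: string;
}

interface FlowResult extends AttemptResult {
  retries: number;
}

// Assumption: the transient failure is recognized by this message in the output.
const TRANSIENT = /Failed to connect to 127\.0\.0\.1:\d+/;

// Blocking sleep so the sketch stays synchronous and easy to follow.
function sleepSync(ms: number): void {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

function runWithRetry(
  attempt: () => AttemptResult,
  backoffsMs: number[] = [2_000, 5_000],
  sleep: (ms: number) => void = sleepSync,
): FlowResult {
  let retries = 0;
  for (;;) {
    const result = attempt();
    const transient = !result.ok && TRANSIENT.test(result.output);
    // Stop on success, on a non-transient failure, or when the backoffs are used up.
    if (!transient || retries >= backoffsMs.length) return { ...result, retries };
    sleep(backoffsMs[retries]);
    retries += 1;
  }
}

// Demo: two flaky attempts followed by a success, with sleeping stubbed out.
const demoOutputs = [
  { ok: false, output: "Failed to connect to 127.0.0.1:7001" },
  { ok: false, output: "Failed to connect to 127.0.0.1:7001" },
  { ok: true, output: "Flow passed" },
];
let demoIndex = 0;
console.log(runWithRetry(() => demoOutputs[demoIndex++], [2_000, 5_000], () => {}));
```

Because the retry count is part of the result, a caller can tell a clean pass from one that needed the flaky driver to settle.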

WebView & RN network introspection

Two distinct network layers, two tools:

  • metro_network captures requests on the RN/Hermes target via the CDP Network domain — the right tool for a native RN app's own fetch.

  • webview_network captures traffic inside a WKWebView: it injects a fetch/XHR recorder (rich — method/status/headers/body for calls after capture starts) and reads the browser's Performance Resource Timing buffer (includeResources, default on) — every request since navigation, including pre-capture ones (URL/timing/size). The merge yields a near-complete request list, exported as redacted JSON or HAR 1.2.

For an RN shell that hosts its UI in a WebView, the app's API calls run in the web layer — so metro_network sees nothing and webview_network is the tool to reach for. WebView tools require WKWebView.isInspectable = true (default in debug/staging builds; off in production); when none is found they return an actionable error.
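
To make the webview_network merge concrete, here is a rough sketch that combines recorder entries with Resource Timing entries. Matching the two sources by URL is an assumption, and mergeCapture plus the entry shapes are hypothetical; only name, startTime, duration, and transferSize are standard PerformanceResourceTiming fields.

```typescript
// Rich entry from the injected fetch/XHR recorder (calls made after capture starts).
interface RecordedCall {
  url: string;
  method: string;
  status: number;
}

// Subset of the browser's PerformanceResourceTiming entry.
interface ResourceTiming {
  name: string; // the resource URL
  startTime: number;
  duration: number;
  transferSize: number;
}

interface MergedEntry {
  url: string;
  method?: string;
  status?: number;
  startTime?: number;
  durationMs?: number;
  transferSize?: number;
  source: "recorder" | "resource-timing" | "both";
}

function mergeCapture(recorded: RecordedCall[], timings: ResourceTiming[]): MergedEntry[] {
  const remaining = [...timings];
  const merged: MergedEntry[] = [];

  for (const call of recorded) {
    // Pair each recorded call with the first unused timing entry for the same URL.
    const idx = remaining.findIndex((t) => t.name === call.url);
    const timing = idx >= 0 ? remaining.splice(idx, 1)[0] : undefined;
    merged.push({
      url: call.url,
      method: call.method,
      status: call.status,
      startTime: timing?.startTime,
      durationMs: timing?.duration,
      transferSize: timing?.transferSize,
      source: timing ? "both" : "recorder",
    });
  }

  // Whatever is left happened before capture started: URL, timing, and size only.
  for (const t of remaining) {
    merged.push({
      url: t.name,
      startTime: t.startTime,
      durationMs: t.duration,
      transferSize: t.transferSize,
      source: "resource-timing",
    });
  }
  return merged;
}

console.log(
  mergeCapture(
    [{ url: "https://api.example.com/cart", method: "POST", status: 200 }],
    [
      { name: "https://api.example.com/config", startTime: 12, duration: 80, transferSize: 512 },
      { name: "https://api.example.com/cart", startTime: 900, duration: 45, transferSize: 2048 },
    ],
  ),
);
```

The pre-capture config request survives in the merged list with timing and size but no method or status, which matches the "near-complete" wording above.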

Documented limits (by design, not bugs)

  • WebGL/Canvas content is un-automatable by selector — no DOM/hierarchy; use tap_with_fallback with screenshot-derived coordinates.

  • WebView tools are dev/QA only — production App Store builds typically set isInspectable = false; tools return an actionable error and fall back to coordinate taps.

  • WebView content-process memory is unreadable from the app sandbox (platform limit) — use indirect signals (memory warnings, process terminations).

  • Maestro text: matcher is full-string regex (IGNORE_CASE) — partial strings don't match; copy hierarchy text verbatim or anchor with .*.

  • Android is detection-only — every adb path degrades to a structured "adb not found" result.

  • orientation_get is a screenshot-aspect heuristic when no native backend is present — iOS simulators expose no direct orientation query.

  • record_start/record_stop keep state in-process — serialize start → … → stop on one connection; one active recording per udid (a watchdog finalizes one that's never stopped).
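
The full-string matching limit from the list above is easy to trip over, so here is a small emulation of it. It uses JavaScript's RegExp with explicit anchors and the i flag as a stand-in; Maestro itself runs on the JVM, so treat maestroStyleMatch as an approximation of the described behavior, not its implementation.

```typescript
// Hypothetical emulation: the pattern must match the whole text, ignoring case.
function maestroStyleMatch(pattern: string, text: string): boolean {
  return new RegExp(`^(?:${pattern})$`, "i").test(text);
}

const label = "Sign in to continue";
console.log(maestroStyleMatch("Sign in", label)); // false: partial strings don't match
console.log(maestroStyleMatch("sign in to continue", label)); // true: full string, case ignored
console.log(maestroStyleMatch(".*Sign in.*", label)); // true: anchored with .*
```

When a tap or assertion by text fails unexpectedly, comparing the selector against the exact text from inspect_screen is usually the first thing to check.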

Architecture

src/
  index.ts          # MCP server entry — registers every tool group, warms caches
  lib/
    exec.ts         # execFile-based runner (NO shell) + timeout/timedOut flag
    result.ts       # shared ok/error MCP content helpers
    simctl.ts       # xcrun simctl wrappers + device-list TTL cache
    native.ts       # gesture/inspect backend: idb → mobilecli → null (re-probe TTL)
    idb.ts          # idb gesture/inspect adapter
    gesture.ts      # unified native→Maestro executors (shared by screen + steps)
    oracle.ts       # the oracle ladder: WebView-DOM › a11y › Maestro
    maestro.ts      # Maestro engine: flow runner, idb retry, hierarchy
    export-maestro.ts # run_steps → reusable Maestro flow
    har.ts          # HAR 1.2 export for webview_network
    webview.ts      # mobilecli CDP — WebView list/inspect/eval/navigate/network
    metro.ts        # Metro CDP — app discovery, logs, network, state
    crash.ts        # DiagnosticReports crash listing/reading
    recording.ts    # detached screen recording lifecycle + watchdog
  tools/            # one file per group:
                    #   health, device, screen, steps, flow,
                    #   assert, validate, webview, debug
assets/             # bundled offline Maestro cheat sheet + demo.gif
scripts/            # benchmark.ts, compare-mcps.ts
e2e/                # real-simulator smoke suites (smoke / full-smoke / webview-network-live)
docs/               # tool catalog, e2e transcript, roadmap

Development & testing

npm run build       # tsc
npm run typecheck   # tsc --noEmit
npm test            # vitest run — 182 unit/integration tests (exec/network mocked, no sim needed)
npm run benchmark   # spawn a fresh server over stdio and sweep the tool suite
node e2e/smoke.e2e.mjs        # real E2E against a booted simulator (macOS + Xcode)
node e2e/full-smoke.e2e.mjs   # drives all 43 tool handlers (happy + structured-error paths)

182 tests across 16 files, all passing, including the oracle ladder (oracle, assert, validate), recording watchdog + timestamps, gesture parity (screen + steps), HAR export, WebView, and Metro paths.

Standards: TypeScript strict, no as any / @ts-ignore, no shell execution (all commands via lib/exec.ts), tools return structured errors instead of throwing. See CONTRIBUTING.md for the "add a new tool" checklist.

E2E on CI: the E2E (simulator) workflow boots a real iOS simulator on a macOS runner and runs the smoke suites nightly + on demand (not a PR gate — simulator runs are slow). full-smoke.e2e.mjs asserts the happy path where a target exists and the real structured-error path where a dependency is absent (a debug isInspectable app for WebView; a connected RN app for metro_*).

Releasing

server.json is the official MCP Registry manifest. Pushing a v* tag runs Publish to npm then Publish to MCP Registry (GitHub OIDC for the io.github.hoainho/* namespace — no long-lived token). Both workflows run typecheck → build → test as a gate first; the registry publish only succeeds once the matching npm version is live, and versions are immutable.

Prompt playbook & references

  • prompts/ — copy-paste prompts for e2e flows, test cases, feature verification, bug fixing, and device control. Each names the podium tools it drives and was validated on a real simulator. Start with prompts/README.md.

  • docs/tool-catalog.md — authoritative tool-by-tool reference.

  • docs/e2e-demo.md — a real transcript against a booted iPhone 16 Pro simulator running a production RN app.

Design ideas

  • One podium, one connection. A single server fronts every mobile capability so an agent configures one endpoint and discovers all 43 tools at once.

  • Safe by construction. Every external command runs through an execFile layer with an explicit argument array — never a shell string.

  • Never crash the conductor. Tools return structured results and errors instead of throwing; one bad call can't take the server down.

  • Degrade, don't fail. A missing toolchain (e.g. Android's adb) yields an informative result rather than a hard error.

  • Prove it, don't guess. Assertions return evidenced verdicts via the oracle ladder and fail closed when they can't verify.

Contributing

Contributions welcome — see CONTRIBUTING.md and the Code of Conduct. Use the issue templates for bugs and feature requests.

Security

Please report vulnerabilities privately per SECURITY.md — do not open a public issue. SECURITY.md also documents the webview_eval / run_flow trust boundary and the PII-in-transcript caveat.

License

MIT © 2026 hoainho
