
analyze_screen

Analyzes the current mobile screen's view hierarchy, classifies UI elements into forms, CTAs, and tab bars, and outputs candidate test cases for QA automation.

Instructions

The mobile counterpart of analyze_url: dumps the view tree of the foreground app on the current iOS Simulator / Android Emulator / physical device / BlueStacks (via QA_ANDROID_HOST) with `maestro hierarchy`, then classifies it into three module types: form (input fields with hint_text), cta (enabled, clickable elements with text), and tab_bar (2+ tabs with a selected state, aligned on the same y coordinate), each with candidate_tcs attached. A built-in noise filter automatically excludes the iOS status bar and asset-style labels (bg_* / *_filled / digits-only / single ASCII characters, etc.) to keep the results high-signal. Requires the Maestro CLI installed, a booted device, and the app in the foreground. If app_id is given with launch_app=true, the app is first launched via launchApp before the dump.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| `app_id` | No | Optional bundle id (iOS) / package name (Android), e.g. `com.example.app`. Use together with `launch_app=true`, or to label which app is being analyzed in the output. | |
| `launch_app` | No | Used with `app_id`: when true, launches the app via `maestro launchApp` before the hierarchy dump. Uses `clearState: false` (app state is preserved) so the "real" starting screen is captured. If omitted, the app is assumed to already be in the foreground. | `false` |
| `timeout_ms` | No | Timeout for the hierarchy command, in milliseconds. BlueStacks / remote ADB is slower, so when `QA_ANDROID_HOST` is set it is raised to at least 60000 automatically. | `30000` |
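As a concrete illustration, the arguments for a call that launches the app before the dump might be assembled like this (the bundle id and timeout are made-up values, not part of the tool's defaults):

```python
# Hypothetical arguments for an analyze_screen tool call.
args = {
    "app_id": "com.example.app",   # iOS bundle id / Android package name (illustrative)
    "launch_app": True,            # launch via maestro launchApp before the hierarchy dump
    "timeout_ms": 45000,           # raised above the 30000 default for a slower device
}
print(args["app_id"])
```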

Implementation Reference

  • Main handler for analyze_screen: validates Maestro CLI, optionally launches the app, runs `maestro hierarchy`, parses JSON output, walks the tree, and returns classified modules (forms, CTAs, tab_bars) with candidate test cases.
    def analyze_screen(
        app_id: str | None = None,
        launch_app: bool = False,
        timeout_ms: int = 30000,
    ) -> dict[str, Any]:
        """Mobile equivalent of analyze_url. Captures current screen via
        `maestro hierarchy` and surfaces interactive elements as modules.
    
        Requires:
          - Maestro CLI installed (https://maestro.mobile.dev)
          - A simulator / emulator / device booted with the target app foregrounded
    
        Args:
          app_id: Optional. When `launch_app=True`, launches this bundle id first.
          launch_app: When True + app_id given, runs `launchApp` before hierarchy.
          timeout_ms: Subprocess timeout for the hierarchy dump.
    
        Returns same shape as analyze_url (`modules` + `candidate_tcs` per module),
        plus a `screen_summary` describing what was found.
        """
        if not shutil.which("maestro"):
            return {
                "error": "maestro CLI 找不到。安裝:curl -fsSL https://get.maestro.mobile.dev | bash",
            }
    
        # Remote-ADB endpoint (BlueStacks / Genymotion / cloud farm). Best-effort:
        # surface failure as a hint but still let Maestro try — a local emulator
        # may also be booted alongside the configured host.
        from ..config import ANDROID_HOST, connect_android_host
        android_host_ok, android_host_msg = connect_android_host()
        # BlueStacks `hierarchy` over TCP-ADB is typically 2–3× slower than a
        # local emulator. Quietly raise the floor when a remote host is in play
        # so the default 30s ceiling doesn't false-positive a timeout.
        if ANDROID_HOST and timeout_ms < 60000:
            timeout_ms = 60000
    
        # Optional: launch the app first so hierarchy reflects its starting screen.
        # We write the launch flow to a temp file because `maestro test -` (stdin)
        # behaved inconsistently across versions; temp-file is the well-trodden
        # path.
        if app_id and launch_app:
            import os as _os
            import tempfile
            tmp = tempfile.NamedTemporaryFile(
                mode="w", suffix=".yaml", delete=False, encoding="utf-8",
            )
            try:
                tmp.write(
                    f"appId: {app_id}\n"
                    "---\n"
                    "- launchApp:\n"
                    "    clearState: false\n"
                    "- waitForAnimationToEnd:\n"
                    "    timeout: 5000\n"
                )
                tmp.close()
                subprocess.run(
                    ["maestro", "test", tmp.name],
                    capture_output=True,
                    text=True,
                    timeout=timeout_ms / 1000 + 10,
                )
            except subprocess.TimeoutExpired:
                return {"error": "launch app 逾時"}
            except OSError as e:
                return {"error": f"無法啟動 app:{type(e).__name__}: {e}"}
            finally:
                try:
                    _os.unlink(tmp.name)
                except OSError:
                    pass
    
        # Pull current screen hierarchy.
        try:
            result = subprocess.run(
                ["maestro", "hierarchy"],
                capture_output=True,
                text=True,
                timeout=timeout_ms / 1000,
            )
        except subprocess.TimeoutExpired:
            hint = ""
            if ANDROID_HOST:
                hint = (
                    f" QA_ANDROID_HOST={ANDROID_HOST} (BlueStacks / remote ADB); "
                    f"adb connection status: {'OK' if android_host_ok else android_host_msg or 'failed'}. "
                    "Suggestions: 1) confirm BlueStacks is booted and Android Debug Bridge is enabled "
                    "(Settings → Advanced → Android Debug Bridge: ON); "
                    "2) `adb devices` should list that host; 3) restart BlueStacks and retry."
                )
            return {"error": f"maestro hierarchy timed out: simulator unresponsive or no booted device.{hint}"}
        except OSError as e:
            return {"error": f"執行 maestro 失敗:{type(e).__name__}: {e}"}
    
        if result.returncode != 0:
            return {
                "error": "maestro hierarchy 失敗",
                "stderr_tail": (result.stderr or "")[-500:],
            }
    
        # Strip preamble lines ("None:" / device label) — JSON starts at the first `{`.
        raw = result.stdout
        brace = raw.find("{")
        if brace < 0:
            return {"error": "hierarchy 輸出無 JSON 主體", "stdout_tail": raw[-500:]}
        try:
            tree = _json.loads(raw[brace:])
        except _json.JSONDecodeError as e:
            return {"error": f"JSON 解析失敗:{e}", "stdout_tail": raw[brace:brace + 500]}
    
        nodes = []
        _walk_screen(tree, nodes, depth=0)
        modules, summary = _build_screen_modules(nodes)
    
        out: dict[str, Any] = {
            "app_id": app_id,
            "scanned_at": datetime.now().isoformat(timespec="seconds"),
            "module_count": len(modules),
            "modules": modules,
            "screen_summary": summary,
        }
        if ANDROID_HOST:
            out["android_host"] = ANDROID_HOST
            out["android_host_connected"] = android_host_ok
            if android_host_msg:
                out["android_host_message"] = android_host_msg
        return out
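The preamble-stripping step in the handler can be sketched in isolation. The sample string below is fabricated to mimic the shape of `maestro hierarchy` output (a non-JSON preamble followed by the JSON body); it is not real Maestro output:

```python
import json

def extract_hierarchy_json(raw: str) -> dict:
    """Strip non-JSON preamble lines and parse the JSON body that follows."""
    brace = raw.find("{")  # JSON body starts at the first opening brace
    if brace < 0:
        raise ValueError("no JSON body in hierarchy output")
    return json.loads(raw[brace:])

# Fabricated sample: a device-label preamble followed by a tiny hierarchy.
sample = 'None:\nRunning on emulator-5554\n{"attributes": {"text": "Login"}, "children": []}'
tree = extract_hierarchy_json(sample)
print(tree["attributes"]["text"])  # → Login
```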
  • Dispatch handler in _dispatch() that calls analyzer.analyze_screen via asyncio.to_thread, logs telemetry, and returns the result as JSON TextContent.
    if name == "analyze_screen":
        # Sync subprocess call — wrapped in to_thread so it doesn't block the
        # MCP server's asyncio loop while maestro CLI runs.
        result = await asyncio.to_thread(
            analyzer.analyze_screen,
            args.get("app_id"),
            args.get("launch_app", False),
            args.get("timeout_ms", 30000),
        )
        # Telemetry: log discovered modules with the app_id as the "url"
        # so the optimizer's coverage-gap analysis covers mobile screens too.
        if isinstance(result, dict) and "error" not in result:
            telemetry.log_discovered_modules(
                args.get("app_id") or "screen", result.get("modules", []),
            )
        return [TextContent(type="text", text=json.dumps(result, ensure_ascii=False, indent=2))]
  • Tool registration in list_tools() with name='analyze_screen', description, and inputSchema (app_id, launch_app, timeout_ms).
    Tool(
        name="analyze_screen",
        description=(
            "Mobile 版的 analyze_url:透過 `maestro hierarchy` dump 當前 iOS Simulator / "
            "Android Emulator / 實體機 / BlueStacks(透過 QA_ANDROID_HOST)前景 app 的 view tree,"
            "再分類成 form(具 hint_text 的輸入欄位)、cta(enabled + 有文字的可點元件)、"
            "tab_bar(selected 狀態 + 同 y 對齊的 2+ 個 tab)三種 modules 並附 candidate_tcs。"
            "內建 noise filter 自動排除 iOS 狀態列 + asset 命名標籤(bg_* / *_filled / 純數字 / "
            "單一 ASCII 字元等)讓結果信號集中。需 Maestro CLI 已裝、裝置 booted、app 已在前景。"
            "若給 app_id + launch_app=true,會先用 launchApp 啟動再 dump。"
        ),
        inputSchema={
            "type": "object",
            "properties": {
                "app_id": {
                    "type": "string",
                    "description": (
                        "選填,bundle id (iOS) / package name (Android),"
                        "格式如 `com.example.app`。搭配 launch_app=true 使用,"
                        "或為了在輸出標註是分析哪個 app。"
                    ),
                },
                "launch_app": {
                    "type": "boolean",
                    "default": False,
                    "description": (
                        "搭配 app_id:True 時在 hierarchy dump 前用 maestro launchApp 啟動 app。"
                        "用 clearState: false(保留 app 狀態),確保看到「真實」起始畫面。"
                        "省略則假設裝置上 app 已是當前前景。"
                    ),
                },
                "timeout_ms": {
                    "type": "integer",
                    "default": 30000,
                    "description": (
                        "選填,hierarchy 命令超時毫秒。預設 30000;"
                        "BlueStacks / 遠端 ADB 較慢,QA_ANDROID_HOST 有設時會自動拉到 60000 起跳。"
                    ),
                },
            },
        },
    ),
  • Noise filtering heuristics for analyze_screen: regex patterns and _is_noise_text() to discard asset names (bg_*, ic_*, *_filled), punctuation-only, numeric-only, and single ASCII chars.
    _NOISE_PREFIX_RE = re.compile(r"^(bg_|ic_|icon_|img_|image_)")
    _NOISE_SUFFIX_RE = re.compile(r"(_filled|_outline|_image|_logo|_brand_logo|_active|_inactive)$")
    _NOISE_PUNCT_RE = re.compile(r"^[-_.,\s ]+$")
    _NOISE_NUM_ONLY_RE = re.compile(r"^[\d.,\-\+%元$]+$")
    
    
    def _is_noise_text(text: str) -> bool:
        """Return True for labels that look like asset names / placeholders
        rather than user-facing CTA copy."""
        t = (text or "").strip()
        if not t:
            return True
        # Single ASCII character (e.g. "x", "+") is almost never a real button
        # in a CJK app; single Chinese characters can be (e.g. 「我」) so we
        # only filter single-char when ASCII.
        if len(t) == 1 and t.isascii():
            return True
        if _NOISE_PUNCT_RE.match(t):
            return True
        if _NOISE_NUM_ONLY_RE.match(t):
            return True
        if _NOISE_PREFIX_RE.search(t):
            return True
        if _NOISE_SUFFIX_RE.search(t):
            return True
        return False
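A condensed sketch of the same filtering idea, reproducing only two of the patterns above so the demo stays self-contained:

```python
import re

# Two of the noise patterns above, reproduced for a self-contained demo.
_PREFIX_RE = re.compile(r"^(bg_|ic_|icon_|img_|image_)")
_NUM_ONLY_RE = re.compile(r"^[\d.,\-\+%元$]+$")

def is_noise(text: str) -> bool:
    t = (text or "").strip()
    if not t:
        return True
    if len(t) == 1 and t.isascii():  # single ASCII char: almost never real CTA copy
        return True
    return bool(_PREFIX_RE.search(t) or _NUM_ONLY_RE.match(t))

for label in ["ic_home_filled", "x", "99%", "Login", "我的"]:
    print(label, "->", is_noise(label))
# "ic_home_filled", "x" and "99%" are filtered; "Login" and "我的" survive.
```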
  • Helper _walk_screen() that recursively flattens the Maestro hierarchy JSON into attribute dicts for downstream module classification.
    def _walk_screen(node: dict, out: list, depth: int) -> None:
        """Flatten the Maestro hierarchy tree into a list of attribute dicts.
    
        Maestro nests view containers heavily — we keep every node with any
        interactive signal (text / accessibilityText / hintText / resource-id)
        plus its bounds for downstream classification.
        """
        if not isinstance(node, dict) or depth > 60:
            return
        attrs = node.get("attributes") or {}
        if isinstance(attrs, dict):
            flat = {
                "text": (attrs.get("text") or "").strip(),
                "accessibility_text": (attrs.get("accessibilityText") or "").strip(),
                "hint_text": (attrs.get("hintText") or "").strip(),
                "title": (attrs.get("title") or "").strip(),
                "value": (attrs.get("value") or "").strip(),
                "resource_id": (attrs.get("resource-id") or "").strip(),
                "bounds": attrs.get("bounds") or "",
                "enabled": (attrs.get("enabled") or "false").lower() == "true",
                "focused": (attrs.get("focused") or "false").lower() == "true",
                "selected": (attrs.get("selected") or "false").lower() == "true",
                "checked": (attrs.get("checked") or "false").lower() == "true",
                "depth": depth,
            }
            # Keep nodes with at least one identifying signal, plus an enabled flag.
            if any([flat["text"], flat["accessibility_text"], flat["hint_text"],
                    flat["title"], flat["resource_id"]]):
                out.append(flat)
        for child in node.get("children") or []:
            _walk_screen(child, out, depth + 1)
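A simplified sketch of the flattening above, keeping only nodes with text and recording their depth; the sample tree is a toy approximation of Maestro's nested attributes/children shape:

```python
# Simplified flattening: collect text-bearing nodes with their tree depth.
def walk(node: dict, out: list, depth: int = 0) -> None:
    if not isinstance(node, dict) or depth > 60:  # same recursion guard as above
        return
    attrs = node.get("attributes") or {}
    text = (attrs.get("text") or "").strip()
    if text:
        out.append({"text": text, "depth": depth})
    for child in node.get("children") or []:
        walk(child, out, depth + 1)

# Toy tree mimicking Maestro's heavily nested containers.
tree = {"attributes": {}, "children": [
    {"attributes": {"text": "Login"}, "children": []},
    {"attributes": {}, "children": [
        {"attributes": {"text": "Sign up"}, "children": []},
    ]},
]}
nodes: list = []
walk(tree, nodes)
print([(n["text"], n["depth"]) for n in nodes])  # → [('Login', 1), ('Sign up', 2)]
```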
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It details the automatic noise filtering, classification logic, optional app launch, and timeout adjustment for BlueStacks. All key behaviors are disclosed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single paragraph, front-loaded with core purpose, includes all necessary details without redundancy. Could be slightly more structured (e.g., bullet points) but is clear and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given that there is no output schema, the description covers the output (view tree, modules, candidate_tcs) and dependencies. It is complete for a tool of this complexity, though a note on recommended post-analysis actions would improve it.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description adds value beyond it: it explains the format of app_id, the effect of launch_app (clearState: false), and the automatic timeout adjustment for remote ADB.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it's the mobile version of analyze_url, captures view tree, classifies UI elements, and appends candidate test cases. It is distinct from sibling tools like analyze_url (web).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly lists prerequisites (Maestro CLI, device booted, app in foreground) and provides guidance on optional parameters (app_id, launch_app). However, it doesn't explicitly state when NOT to use it or mention alternatives beyond the implicit 'mobile vs web' distinction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

