mobile_dump_ui

Extract UI elements from Android screens as structured JSON with parent-child relationships to enable automated testing and interaction analysis.

Instructions

Get UI elements from Android screen as JSON with hierarchical structure.

Returns a JSON structure where elements contain their child elements, showing parent-child relationships. Only includes focusable elements or elements with text/content_desc/hint attributes.
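As an illustration of that shape, here is a minimal sketch of walking a returned hierarchy. The sample data is invented for the example; its keys (`text`, `class`, `coordinates`, `resource_id`, `children`) follow the implementation shown in the reference section, and `iter_elements` is a hypothetical helper, not part of the server.

```python
from typing import Iterator

# Invented sample that mirrors the keys the tool emits.
sample = [
    {
        "text": "Settings",
        "class": "android.widget.FrameLayout",
        "coordinates": {"x": 540, "y": 1200},
        "children": [
            {
                "text": "Wi-Fi",
                "class": "android.widget.TextView",
                "coordinates": {"x": 540, "y": 300},
                "resource_id": "com.example:id/wifi",
            },
            {
                "text": "Bluetooth",
                "class": "android.widget.TextView",
                "coordinates": {"x": 540, "y": 450},
            },
        ],
    }
]


def iter_elements(nodes: list, depth: int = 0) -> Iterator[tuple]:
    """Yield (depth, element) pairs in document order (hypothetical helper)."""
    for node in nodes:
        yield depth, node
        # "children" is only present when an element has nested elements.
        yield from iter_elements(node.get("children", []), depth + 1)


labels = [(depth, el["text"]) for depth, el in iter_elements(sample)]
print(labels)
```

Because `children` and `resource_id` are optional keys, a consumer should read them with `.get()` rather than direct indexing.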

Input Schema

No arguments

Output Schema

Name    Required  Description  Default
result  Yes       -            -

Implementation Reference

  • main.py:99-100 (registration)
    Registration of the 'mobile_dump_ui' tool using the @mcp.tool() decorator, including function signature and docstring describing input/output schema.
    @mcp.tool()
    def mobile_dump_ui() -> str:
  • main.py:100-108 (handler)
    Handler function for mobile_dump_ui tool, checks device initialization and delegates to internal _mobile_dump_ui.
    def mobile_dump_ui() -> str:
        """Get UI elements from Android screen as JSON with hierarchical structure.
        
        Returns a JSON structure where elements contain their child elements, showing parent-child relationships.
        Only includes focusable elements or elements with text/content_desc/hint attributes.
        """
        if device is None:
            return "Error: Device not initialized. Please call mobile_init() first to establish connection with Android device."
        return _mobile_dump_ui()
  • main.py:110-122 (handler)
    Core handler logic: dumps UI hierarchy XML, parses it with ElementTree, extracts UI elements using helper, and returns as string.
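    def _mobile_dump_ui():
        # Assumes `import json` alongside the module's other imports.
        try:
            # Fetch the current screen's hierarchy as XML and parse it.
            xml_content = device.dump_hierarchy()
            root = ET.fromstring(xml_content)
            
            # Reset the coordinates recorded by the previous dump.
            ui_coords.clear()
    
            ui_elements = extract_ui_elements(root)
            # Serialize as real JSON; str() would emit a Python repr instead.
            return json.dumps(ui_elements, ensure_ascii=False)
    
        except Exception as e:
            return f"Error processing XML: {str(e)}"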
  • main.py:36-83 (helper)
    Recursive helper to extract UI elements: filters focusable or text-bearing elements, computes center coordinates, builds hierarchical structure with children, collects global UI coordinates.
    def extract_ui_elements(element):
        resource_id = element.get('resource-id', '')
        
        text = element.get('text', '').strip()
        content_desc = element.get('content-desc', '').strip()
        hint = element.get('hint', '').strip()
        bounds = parse_bounds(element.get('bounds', ''))
        focusable = element.get('focusable', 'false').lower() == 'true'
        
        has_text = bool(text or content_desc or hint)
        
        children = []
        for child in element:
            children.extend(extract_ui_elements(child))
    
        if not (focusable or has_text):
            return children
        
        display_text = text or content_desc or hint
        if focusable and not display_text:
            child_texts = get_children_texts(element)
            display_text = ' '.join(child_texts).strip()
        
        element_info = {
            "text": display_text,
            "class": element.get('class', ''),
            "coordinates": {"x": bounds["x"], "y": bounds["y"]} if bounds else None
        }
    
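        # Record the tap target only when the bounds parsed successfully;
        # indexing a None result from parse_bounds would raise TypeError.
        if bounds:
            ui_coords.add((bounds["x"], bounds["y"]))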
    
        if resource_id:
            element_info["resource_id"] = resource_id
        
        if children:
            filtered_children = []
            for child in children:
                child_text = child.get("text", "")
                child_coords = child.get("coordinates")
    
                if not (child_text == element_info["text"] and child_coords == element_info["coordinates"]):
                    filtered_children.append(child)
    
            if filtered_children:
                element_info["children"] = filtered_children
        
        return [element_info]
  • main.py:14-24 (helper)
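    Helper that parses a bounds string from the XML (formatted as "[x1,y1][x2,y2]") into the center coordinates used for each UI element.
    def parse_bounds(bounds_str):
        # An empty or missing bounds attribute yields no coordinates.
        if not bounds_str:
            return None
        try:
            # "[x1,y1][x2,y2]" -> ["x1", "y1", "x2", "y2", ""]
            bounds = bounds_str.replace('[', '').replace(']', ',').split(',')
            x1, y1, x2, y2 = map(int, bounds[:4])
            center_x = (x1 + x2) // 2
            center_y = (y1 + y2) // 2
            return {"x": center_x, "y": center_y, "bounds": [x1, y1, x2, y2]}
        except ValueError:
            # Malformed bounds (non-numeric or too few values).
            return None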
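To make the bounds parsing and the inclusion rule concrete, the following is a minimal, self-contained sketch run against an invented hierarchy dump. `center_of` and `is_kept` are hypothetical stand-ins written for this example; they mirror the logic of `parse_bounds` and the filter in `extract_ui_elements` but are not the server's own functions, and the sketch does not build the nested output.

```python
import xml.etree.ElementTree as ET

# Invented hierarchy dump, trimmed to the attributes the filter reads.
SAMPLE_XML = """
<hierarchy>
  <node class="android.widget.FrameLayout" bounds="[0,0][1080,1920]">
    <node class="android.widget.Button" text="OK" focusable="true"
          bounds="[100,200][300,400]" />
    <node class="android.view.View" bounds="[0,0][10,10]" />
  </node>
</hierarchy>
"""


def center_of(bounds_str: str):
    """Return the (x, y) center of a "[x1,y1][x2,y2]" string, or None."""
    try:
        parts = bounds_str.replace("[", "").replace("]", ",").split(",")
        x1, y1, x2, y2 = map(int, parts[:4])
    except ValueError:
        return None
    return (x1 + x2) // 2, (y1 + y2) // 2


def is_kept(node: ET.Element) -> bool:
    """Inclusion rule: focusable, or carrying text/content-desc/hint."""
    has_text = any(
        node.get(key, "").strip() for key in ("text", "content-desc", "hint")
    )
    return node.get("focusable", "false").lower() == "true" or has_text


root = ET.fromstring(SAMPLE_XML)
kept = [
    (node.get("class"), center_of(node.get("bounds", "")))
    for node in root.iter("node")
    if is_kept(node)
]
print(kept)
```

Only the button survives the filter here: the enclosing layout and the bare view are neither focusable nor text-bearing, so they are dropped from the result.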
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It describes the output format and filtering criteria ('Only includes focusable elements or elements with text/content_desc/hint attributes'), which adds useful context beyond basic functionality. However, it lacks details on performance, error handling, or dependencies (e.g., requiring an active Android session).
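Since failures are reported as plain strings beginning with "Error" rather than as exceptions, a caller has to inspect the result itself. The sketch below shows one way a client might do that; `parse_dump_result` is a hypothetical helper, and it assumes the payload may arrive either as strict JSON or as a Python literal.

```python
import ast
import json


def parse_dump_result(result: str) -> list:
    """Turn the tool's string result into Python data (hypothetical helper).

    Raises RuntimeError when the tool returned one of its error strings.
    """
    if result.startswith("Error"):
        raise RuntimeError(result)
    try:
        return json.loads(result)
    except json.JSONDecodeError:
        # Fall back to Python-literal syntax (single quotes, None), which is
        # what str() on a list of dicts produces.
        return ast.literal_eval(result)
```

`ast.literal_eval` only evaluates literals, so the fallback does not execute arbitrary code from the device output.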

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by clarifying details in subsequent sentences. Each sentence adds value: the first defines the action and output, the second explains the hierarchical structure, and the third specifies inclusion criteria. There is no wasted text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (UI element extraction), no annotations, and an output schema present, the description is reasonably complete. It covers the purpose, output format, and filtering logic. However, it could benefit from mentioning prerequisites (e.g., device connectivity) or limitations (e.g., screen state requirements), slightly reducing completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are 0 parameters, and schema description coverage is 100%, so no parameter documentation is needed. The description does not discuss parameters, which is appropriate here. A baseline of 4 is applied as it efficiently handles the lack of parameters without redundancy.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verb ('Get UI elements') and resource ('from Android screen'), and distinguishes it from siblings by focusing on UI element extraction rather than interaction (e.g., mobile_click) or system actions (e.g., mobile_launch_app). It specifies the output format ('as JSON with hierarchical structure'), which is unique among the listed tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for retrieving UI elements in a structured format, but does not explicitly state when to use this tool versus alternatives like mobile_take_screenshot for visual capture or mobile_list_apps for app enumeration. No exclusions or prerequisites are mentioned, leaving usage context somewhat open-ended.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/erichung9060/Android-Mobile-MCP'
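The same request can be issued from Python with only the standard library. This is a minimal sketch: `build_request` and `fetch_server_info` are hypothetical helper names, the URL is the one from the curl command, and the response is returned as raw text because its format is not described here.

```python
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_request(owner: str, repo: str) -> urllib.request.Request:
    """Build the GET request for one server's directory entry."""
    return urllib.request.Request(f"{API_BASE}/{owner}/{repo}", method="GET")


def fetch_server_info(owner: str, repo: str, timeout: float = 10.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(build_request(owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Building the request does not touch the network; fetch_server_info does.
request = build_request("erichung9060", "Android-Mobile-MCP")
print(request.get_method(), request.full_url)
```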

If you have feedback or need assistance with the MCP directory API, please join our Discord server.