Glama
by rahulkr

get_all_text_on_screen

Extract visible text from Android device screens to verify content matches designs during UI testing and visual QA workflows.

Instructions

Extract all visible text from current screen. Useful for verifying text content matches designs.

Input Schema

Name           Required  Description  Default
device_serial  No

Output Schema

Name    Required  Description  Default
result  Yes
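
To illustrate how the output might be used for the design check mentioned above, here is a minimal sketch. It assumes the list built by the handler (dicts with 'text', 'bounds', 'position', and 'element_type' keys) is what arrives under result. The helper missing_texts and the sample data are hypothetical and are not part of the server.

```python
def missing_texts(result: list[dict], expected: list[str]) -> list[str]:
    """Return the expected strings that no on-screen element shows as its text."""
    on_screen = {item["text"].strip() for item in result}
    return [s for s in expected if s not in on_screen]


# Hypothetical sample, shaped like the dicts the handler appends to its list.
sample_result = [
    {
        "text": "Sign in",
        "bounds": {"x1": 100, "y1": 200, "x2": 500, "y2": 300},
        "position": {"x": 300, "y": 250},
        "element_type": "Button",
    },
    {
        "text": "Forgot password?",
        "bounds": {"x1": 100, "y1": 320, "x2": 500, "y2": 380},
        "position": {"x": 300, "y": 350},
        "element_type": "TextView",
    },
]

# Strings taken from a (hypothetical) design spec.
print(missing_texts(sample_result, ["Sign in", "Create account"]))
# → ['Create account']
```

The comparison is an exact match after trimming whitespace. A real check might also want to normalize case or compare positions against the design.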

Implementation Reference

  • Handler function decorated with @mcp.tool(), which registers and implements the get_all_text_on_screen tool. Parses UI hierarchy XML to extract all visible text elements with their bounds and positions.
    @mcp.tool()
    def get_all_text_on_screen(device_serial: str | None = None) -> list[dict]:
        """
        Extract all visible text from current screen.
        Useful for verifying text content matches designs.
        """
        xml = get_ui_hierarchy(device_serial)
        texts = []
        
        for match in re.finditer(r'<node[^>]*>', xml):
            node = match.group()
            
            text_match = re.search(r'text="([^"]*)"', node)
            if text_match and text_match.group(1).strip():
                text_info = {'text': text_match.group(1)}
                
                bounds_match = re.search(r'bounds="\[(\d+),(\d+)\]\[(\d+),(\d+)\]"', node)
                if bounds_match:
                    x1, y1 = int(bounds_match.group(1)), int(bounds_match.group(2))
                    x2, y2 = int(bounds_match.group(3)), int(bounds_match.group(4))
                    text_info['bounds'] = {'x1': x1, 'y1': y1, 'x2': x2, 'y2': y2}
                    text_info['position'] = {'x': (x1 + x2) // 2, 'y': (y1 + y2) // 2}
                
                class_match = re.search(r'class="([^"]*)"', node)
                if class_match:
                    text_info['element_type'] = class_match.group(1).split('.')[-1]
                
                texts.append(text_info)
        
        return texts
  • The @mcp.tool() decorator registers the get_all_text_on_screen function as an MCP tool.
    @mcp.tool()
  • Helper tool get_ui_hierarchy used by get_all_text_on_screen to dump UI XML for parsing.
    @mcp.tool()
    def get_ui_hierarchy(device_serial: str | None = None) -> str:
        """
        Dump the complete UI hierarchy as XML.
        Shows all visible elements, their properties, bounds, and content descriptions.
        """
        run_adb(["shell", "uiautomator", "dump", "/sdcard/ui_dump.xml"], device_serial)
        output = run_adb(["shell", "cat", "/sdcard/ui_dump.xml"], device_serial)
        run_adb(["shell", "rm", "/sdcard/ui_dump.xml"], device_serial)
        return output
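
The handler above matches attributes with regular expressions, so XML entities in the text attribute (for example &amp;amp;) are returned undecoded. As a sketch of one alternative, and not the author's implementation, the same extraction can be written with Python's standard xml.etree.ElementTree, which decodes entities while parsing. The function name parse_text_nodes and the sample XML are assumptions made for illustration. The output shape mirrors the handler's.

```python
import re
import xml.etree.ElementTree as ET

# Same bounds format the handler expects: [x1,y1][x2,y2]
BOUNDS_RE = re.compile(r"\[(\d+),(\d+)\]\[(\d+),(\d+)\]")


def parse_text_nodes(xml: str) -> list[dict]:
    """Hypothetical alternative: collect non-empty text nodes using an XML parser."""
    texts = []
    root = ET.fromstring(xml)
    for node in root.iter("node"):
        text = node.get("text", "")
        if not text.strip():
            continue  # skip nodes without visible text, as the handler does
        info = {"text": text}

        m = BOUNDS_RE.fullmatch(node.get("bounds", ""))
        if m:
            x1, y1, x2, y2 = map(int, m.groups())
            info["bounds"] = {"x1": x1, "y1": y1, "x2": x2, "y2": y2}
            info["position"] = {"x": (x1 + x2) // 2, "y": (y1 + y2) // 2}

        cls = node.get("class")
        if cls:
            info["element_type"] = cls.split(".")[-1]

        texts.append(info)
    return texts


# Hypothetical dump fragment, for illustration only.
sample_xml = (
    '<hierarchy rotation="0">'
    '<node text="" class="android.widget.FrameLayout" bounds="[0,0][1080,1920]">'
    '<node text="Terms &amp; Conditions" class="android.widget.TextView" '
    'bounds="[100,200][500,300]" />'
    "</node></hierarchy>"
)

print(parse_text_nodes(sample_xml))
```

With this sample, the single returned entry has text "Terms & Conditions", whereas the regex-based handler would return the raw "Terms &amp;amp; Conditions" for the same node.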
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions extracting 'visible text' but doesn't specify behavioral traits like whether it requires specific permissions, how it handles dynamic content, error conditions, or performance implications. This leaves gaps in understanding the tool's operation beyond the basic action.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and front-loaded, with two sentences that directly state the purpose and a usage hint. Every word earns its place, and there's no unnecessary information, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (text extraction from a screen), no annotations, and an output schema present, the description is minimally adequate. It covers the basic purpose and a usage hint but lacks details on behavioral aspects and parameter context, which could be important for effective use by an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has one parameter ('device_serial') with 0% description coverage, and the tool description doesn't mention any parameters. Since there's only one parameter and it's optional (default: null), the baseline is moderate. The description doesn't add meaning beyond the schema, but the simplicity of the parameter list keeps it from scoring lower.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Extract all visible text from current screen.' It specifies the verb ('extract') and resource ('all visible text'), making it easy to understand. However, it doesn't explicitly differentiate from sibling tools like 'get_ui_hierarchy' or 'screenshot', which might also provide text-related data, so it doesn't reach a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides implied usage guidance with 'Useful for verifying text content matches designs,' suggesting a context for when to use it. However, it lacks explicit alternatives or exclusions, such as when to choose this over 'get_ui_hierarchy' for text extraction or 'screenshot' for visual verification, leaving some ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/rahulkr/r_adb_mcp_server'