Android MCP Server

find_element

Locate UI elements on Android screens using selectors like text, resource ID, or class name to identify buttons, text fields, or other components for automation tasks.

Instructions

Find a UI element on the Android screen matching the given selector criteria. Returns the first matching element with its properties and bounds. Use this to locate buttons, text fields, or any UI component before interacting with it.

Input Schema

Name	Required	Description	Default
`selector`	Yes	Element selector with one or more criteria
`device_id`	No	Device serial number
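
As a rough illustration of the two parameters above, the sketch below builds the arguments for a find_element call and wraps them in an MCP `tools/call` request. The `ElementSelector` and `FindElementArgs` interfaces and the `buildFindElementRequest` helper are assumptions made for this example; the field names are copied from the selector schema listed under Implementation Reference, and the device serial is a placeholder.

```typescript
// Assumed shape: field names mirror the selector schema on this page; all are optional.
interface ElementSelector {
  text?: string;
  textContains?: string;
  resourceId?: string;
  className?: string;
  contentDesc?: string;
  contentDescContains?: string;
  clickable?: boolean;
  enabled?: boolean;
  packageName?: string;
}

// The two documented inputs: a required selector and an optional device serial.
interface FindElementArgs {
  selector: ElementSelector;
  device_id?: string;
}

// Hypothetical helper: wraps the arguments in a JSON-RPC `tools/call` request.
function buildFindElementRequest(id: number, args: FindElementArgs) {
  return {
    jsonrpc: '2.0' as const,
    id,
    method: 'tools/call' as const,
    params: { name: 'find_element' as const, arguments: args },
  };
}

const request = buildFindElementRequest(1, {
  selector: { resourceId: 'com.app:id/button', clickable: true },
  device_id: 'emulator-5554', // placeholder serial; the parameter is optional
});
```

Combining a stable criterion such as `resourceId` with a state filter such as `clickable` is one way to narrow the result; how the server combines multiple criteria is not stated on this page.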

Implementation Reference

src/controllers/ui-tools.ts:67-103 (handler)

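The MCP tool registration and request handler for 'find_element'. It passes the selector to findElements and returns a JSON text payload containing the total match count (`found`) and at most the first 10 matching elements. Failures are caught and reported as `success: false` with an error message rather than thrown. Note that the tool description promises "the first matching element", whereas this handler returns up to 10.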

server.registerTool(
  'find_element',
  {
    description: 'Find a UI element on the Android screen matching the given selector criteria. Returns the first matching element with its properties and bounds. Use this to locate buttons, text fields, or any UI component before interacting with it.',
    inputSchema: {
      selector: selectorSchema.describe('Element selector with one or more criteria'),
      device_id: z.string().optional().describe('Device serial number'),
    },
  },
  async ({ selector, device_id }) => {
    return await metrics.measure('find_element', device_id || 'default', async () => {
      try {
        const elements = await findElements(selector as ElementSelector, device_id);
        return {
          content: [{
            type: 'text' as const,
            text: JSON.stringify({
              success: true,
              found: elements.length,
              elements: elements.slice(0, 10), // Limit to first 10
            }, null, 2),
          }],
        };
      } catch (error) {
        return {
          content: [{
            type: 'text' as const,
            text: JSON.stringify({
              success: false,
              error: error instanceof Error ? error.message : String(error),
            }, null, 2),
          }],
        };
      }
    });
  }
);
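
A caller has to unwrap the JSON string that the handler places in `content[0].text`. The following is a minimal sketch of that step, not part of the server: `FindElementPayload`, `ToolResult`, and `parseFindElementResult` are names invented here, and the element type is left as `unknown` because the FoundElement shape is not shown on this page.

```typescript
// Payload shapes taken from the two JSON.stringify calls in the handler above.
type FindElementPayload =
  | { success: true; found: number; elements: unknown[] }
  | { success: false; error: string };

// Minimal assumed view of a tool result: a list of content items.
interface ToolResult {
  content: { type: string; text?: string }[];
}

// Hypothetical helper: finds the text item and parses it as the payload.
function parseFindElementResult(result: ToolResult): FindElementPayload {
  const item = result.content.find(
    (c) => c.type === 'text' && typeof c.text === 'string'
  );
  if (!item || item.text === undefined) {
    throw new Error('find_element returned no text content');
  }
  return JSON.parse(item.text) as FindElementPayload;
}

// Sample result: 12 matches on screen, but only 10 elements are returned.
const sample: ToolResult = {
  content: [
    {
      type: 'text',
      text: JSON.stringify({
        success: true,
        found: 12,
        elements: new Array(10).fill({}),
      }),
    },
  ],
};

const payload = parseFindElementResult(sample);
// True when the handler's 10-element limit dropped some matches.
const truncated = payload.success && payload.found > payload.elements.length;
```

Comparing `found` with `elements.length` is the only way, given the handler shown, to tell that the list was cut off.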

src/uiautomator/element-finder.ts:83-97 (handler)

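The single-element lookup, findElement. It calls findElements, returns the first match, and throws ElementNotFoundError if nothing matches. The 'find_element' handler shown above calls findElements directly rather than this function.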

export async function findElement(
  selector: ElementSelector,
  deviceId?: string
): Promise<FoundElement> {
  const found = await findElements(selector, deviceId);

  if (found.length === 0) {
    const selectorStr = Object.entries(selector)
      .map(([k, v]) => `${k}="${v}"`)
      .join(', ');
    throw new ElementNotFoundError(selectorStr, 'compound');
  }

  return found[0];
}
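
To see what selector string findElement hands to ElementNotFoundError, the formatting expression can be run on its own. This sketch reuses the same `Object.entries(...).map(...).join(', ')` chain; `formatSelector` is a name chosen for the example, and the final error message is not reproduced because the ElementNotFoundError class is not shown on this page.

```typescript
// Same formatting expression as in findElement, extracted into a standalone function.
function formatSelector(selector: Record<string, string | boolean>): string {
  return Object.entries(selector)
    .map(([k, v]) => `${k}="${v}"`)
    .join(', ');
}

const selectorStr = formatSelector({ text: 'Sign in', clickable: true });
// selectorStr is: text="Sign in", clickable="true"
```

Boolean criteria are stringified and quoted like text criteria, so `clickable: true` appears as `clickable="true"` in the string.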

src/controllers/ui-tools.ts:8-18 (schema)

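Input schema validation for the element selector used in 'find_element'. Every field is optional, so the schema itself does not enforce the "one or more criteria" mentioned in the parameter description.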

const selectorSchema = z.object({
  text: z.string().optional().describe('Exact text match'),
  textContains: z.string().optional().describe('Partial text match (case-insensitive)'),
  resourceId: z.string().optional().describe('Resource ID (e.g. com.app:id/button)'),
  className: z.string().optional().describe('Class name (e.g. android.widget.Button)'),
  contentDesc: z.string().optional().describe('Content description (accessibility label)'),
  contentDescContains: z.string().optional().describe('Partial content description match'),
  clickable: z.boolean().optional().describe('Filter by clickable state'),
  enabled: z.boolean().optional().describe('Filter by enabled state'),
  packageName: z.string().optional().describe('Filter by package name'),
});
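
The field descriptions above suggest how each criterion might be evaluated against a UI node. The sketch below is an illustration only, not the server's matching logic (that lives in findElements, which is not shown). It assumes that all supplied criteria must hold together, that `contentDescContains` is case-sensitive (the schema does not say), and that a node exposes the attributes named in the hypothetical `UiNode` interface.

```typescript
// Hypothetical node shape; the real FoundElement type is not shown on this page.
interface UiNode {
  text: string;
  resourceId: string;
  className: string;
  contentDesc: string;
  clickable: boolean;
  enabled: boolean;
  packageName: string;
}

// Field names copied from selectorSchema; all optional.
interface SelectorCriteria {
  text?: string;
  textContains?: string;
  resourceId?: string;
  className?: string;
  contentDesc?: string;
  contentDescContains?: string;
  clickable?: boolean;
  enabled?: boolean;
  packageName?: string;
}

// Assumption: every supplied criterion must match (logical AND).
function matchesSelector(node: UiNode, s: SelectorCriteria): boolean {
  if (s.text !== undefined && node.text !== s.text) return false;
  // "Partial text match (case-insensitive)" per the schema description.
  if (
    s.textContains !== undefined &&
    !node.text.toLowerCase().includes(s.textContains.toLowerCase())
  ) return false;
  if (s.resourceId !== undefined && node.resourceId !== s.resourceId) return false;
  if (s.className !== undefined && node.className !== s.className) return false;
  if (s.contentDesc !== undefined && node.contentDesc !== s.contentDesc) return false;
  // Case sensitivity is not documented for this field; case-sensitive here.
  if (
    s.contentDescContains !== undefined &&
    !node.contentDesc.includes(s.contentDescContains)
  ) return false;
  if (s.clickable !== undefined && node.clickable !== s.clickable) return false;
  if (s.enabled !== undefined && node.enabled !== s.enabled) return false;
  if (s.packageName !== undefined && node.packageName !== s.packageName) return false;
  return true;
}

const signInButton: UiNode = {
  text: 'Sign In',
  resourceId: 'com.app:id/button',
  className: 'android.widget.Button',
  contentDesc: 'Sign in button',
  clickable: true,
  enabled: true,
  packageName: 'com.app',
};

const partialMatch = matchesSelector(signInButton, { textContains: 'sign', clickable: true });
const exactMismatch = matchesSelector(signInButton, { text: 'sign in' });
```

Under these assumptions `partialMatch` is true, while `exactMismatch` is false because the exact comparison is case-sensitive in this sketch.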

Tool Definition Quality

A (4/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It effectively discloses key behavioral traits: returns only the first match, returns properties and bounds, and implies a read-only operation. Could improve by mentioning error behavior when no element is found.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three well-structured sentences: first defines the action, second specifies return behavior (first match, properties/bounds), third provides usage context. Zero redundancy; every sentence earns its place with information not duplicated in schema or annotations.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Compensates well for missing output schema by describing return values (properties and bounds). Covers the essential behavioral context for a 2-parameter tool with 100% input schema coverage. Minor gap regarding error handling when no match exists.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, providing detailed descriptions for all selector sub-properties and device_id. The description references 'selector criteria' which aligns with the schema but adds minimal semantic meaning beyond the structured definitions. Baseline score appropriate given schema completeness.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the specific action (Find), resource (UI element on Android screen), and mechanism (selector criteria). It distinguishes from interaction siblings like click_element by emphasizing this is for locating components before interacting.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides implicit workflow guidance ('before interacting with it'), suggesting when to use it in the sequence. However, it lacks explicit alternatives or exclusions (e.g., when to use wait_for_element instead, or when to use detect_elements_visually for visual matching).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/divineDev-dotcom/android_mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.