zillow
by zillow

pressButton

Press hardware buttons on mobile devices to automate testing and control functions. Supports Android and iOS platforms for reliable device interaction.

Instructions

Press a hardware button on the device

Input Schema

Name      Required  Description             Default
button    Yes       The button to press
platform  Yes       Platform of the device
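
As a rough illustration of how these two parameters travel over the wire, the sketch below builds an MCP `tools/call` request for `pressButton`. The envelope follows the JSON-RPC 2.0 shape that MCP uses. The helper name `buildPressButtonCall` and the request id are made up for this example, and the allowed values are copied from the Zod schema under Implementation Reference.

```typescript
// Allowed values copied from the published pressButtonSchema.
type Button = "home" | "back" | "menu" | "power" | "volume_up" | "volume_down" | "recent";
type Platform = "android" | "ios";

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: { button: Button; platform: Platform } };
}

// Hypothetical helper (not part of the server): wraps the arguments in a tools/call envelope.
function buildPressButtonCall(id: number, button: Button, platform: Platform): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "pressButton", arguments: { button, platform } },
  };
}

const request = buildPressButtonCall(1, "back", "android");
console.log(JSON.stringify(request, null, 2));
```

Both arguments are required by the schema, so a client that omits `platform` should expect the call to be rejected during validation.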

Implementation Reference

  • Registration of the 'pressButton' tool using ToolRegistry.registerDeviceAware, linking name, description, schema, and handler.
    ToolRegistry.registerDeviceAware(
      "pressButton",
      "Press a hardware button on the device",
      pressButtonSchema,
      pressButtonHandler,
      true // Supports progress notifications
    );
  • The pressButtonHandler function wraps the PressButton class execution, handles errors, and formats the tool response.
    const pressButtonHandler = async (device: BootedDevice, args: PressButtonArgs, progress?: ProgressCallback) => {
      try {
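        // The platform is taken from the booted device; args.platform is not read here.
        const pressButton = new PressButton(device);
        const result = await pressButton.execute(args.button, progress); // always runs inside observedInteraction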
    
        return createJSONToolResponse({
          message: `Pressed button ${args.button}`,
          observation: result.observation,
          ...result
        });
      } catch (error) {
        throw new ActionableError(`Failed to press button: ${error}`);
      }
    };
  • Zod schema defining input arguments for pressButton: button type and platform.
    export const pressButtonSchema = z.object({
      button: z.enum(["home", "back", "menu", "power", "volume_up", "volume_down", "recent"])
        .describe("The button to press"),
      platform: z.enum(["android", "ios"]).describe("Platform of the device")
    });
  • Core PressButton class with execute method implementing platform-specific button presses (ADB keyevent for Android, Axe for iOS).
    export class PressButton extends BaseVisualChange {
      constructor(device: BootedDevice, adb: AdbUtils | null = null, axe: Axe | null = null) {
        super(device, adb, axe);
        this.device = device;
      }
    
      async execute(
        button: string,
        progress?: ProgressCallback
      ): Promise<PressButtonResult> {
        return this.observedInteraction(
          async () => {
            try {
              // Platform-specific button press execution
              switch (this.device.platform) {
                case "android":
                  return await this.executeAndroidButtonPress(button);
                case "ios":
                  return await this.executeiOSButtonPress(button);
                default:
                  throw new Error(`Unsupported platform: ${this.device.platform}`);
              }
            } catch (error) {
              return {
                success: false,
                button,
                keyCode: -1,
                error: `Failed to press button: ${error instanceof Error ? error.message : String(error)}`
              };
            }
          },
          {
            changeExpected: false,
            timeoutMs: 2000, // Reduce timeout for faster execution
            progress
          }
        );
      }
    
      /**
       * Execute Android-specific button press
       * @param button - Button name to press
       * @returns Result of the button press operation
       */
      private async executeAndroidButtonPress(button: string): Promise<PressButtonResult> {
        const keyCodeMap: Record<string, number> = {
          "home": 3,
          "back": 4,
          "menu": 82,
          "power": 26,
          "volume_up": 24,
          "volume_down": 25,
          "recent": 187,
        };
    
        const keyCode = keyCodeMap[button.toLowerCase()];
        if (!keyCode) {
          return {
            success: false,
            button,
            keyCode: -1,
            error: `Unsupported button: ${button}`
          };
        }
    
        await this.adb.executeCommand(`shell input keyevent ${keyCode}`);
    
        return {
          success: true,
          button,
          keyCode
        };
      }
    
      /**
       * Execute iOS-specific button press
       * @param button - Button name to press
       * @returns Result of the button press operation
       */
      private async executeiOSButtonPress(button: string): Promise<PressButtonResult> {
    
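        // Validate first, then press the requested button instead of always pressing "home".
        // Assumption: the supported names below match the AxeButton values.
        const supportedButtons = ["apple_pay", "home", "lock", "side_button", "siri"];
        const normalized = button.toLowerCase();
        if (!supportedButtons.includes(normalized)) {
          throw new Error(`Unsupported iOS button: ${button}. Supported buttons: ${supportedButtons.join(", ")}`);
        }

        await this.axe.pressButton(normalized as AxeButton);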
    
        return {
          success: true,
          button,
          keyCode: 0, // iOS doesn't use keycodes, using 0 as placeholder
        };
      }
    }
  • TypeScript interface defining the return type of pressButton operations.
    export interface PressButtonResult {
      success: boolean;
      button: string;
      keyCode: number;
      observation?: ObserveResult;
      error?: string;
    }
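
To look at the Android branch in isolation, here is a minimal sketch of the same keycode lookup as a pure function, so it runs without a device or ADB. `androidKeyeventCommand` is a hypothetical name for this example. The keycodes and the `shell input keyevent` command string are taken from the listing above. A `Map` is used here instead of a plain object, which is a choice of this sketch and not the server's.

```typescript
// Keycodes copied from executeAndroidButtonPress in the listing above.
const ANDROID_KEYCODES = new Map<string, number>([
  ["home", 3],
  ["back", 4],
  ["menu", 82],
  ["power", 26],
  ["volume_up", 24],
  ["volume_down", 25],
  ["recent", 187],
]);

// Hypothetical helper: returns the ADB argument string, or null for an unknown button.
function androidKeyeventCommand(button: string): string | null {
  const keyCode = ANDROID_KEYCODES.get(button.toLowerCase());
  return keyCode === undefined ? null : `shell input keyevent ${keyCode}`;
}

console.log(androidKeyeventCommand("back"));      // shell input keyevent 4
console.log(androidKeyeventCommand("Volume_Up")); // shell input keyevent 24
console.log(androidKeyeventCommand("siri"));      // null
```

The lookup lowercases its input, as the original does, so `Volume_Up` resolves the same way as `volume_up`. An unknown name yields `null` here, where the server returns a result with `success: false` and `keyCode: -1`.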
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the action ('press') but doesn't explain what happens after pressing (e.g., device response, side effects, or error conditions). For a hardware interaction tool with zero annotation coverage, this leaves significant gaps in understanding its behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
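
One way to close this gap would be to pair a fuller description with MCP tool annotations. The sketch below is hypothetical and not taken from the server's code. The annotation field names follow `ToolAnnotations` in the MCP specification. The chosen values and the wording of the longer description are assumptions inferred from the handler and result type shown under Implementation Reference.

```typescript
// Field names mirror ToolAnnotations in the MCP specification; all are optional hints.
interface ToolAnnotations {
  title?: string;
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
  openWorldHint?: boolean;
}

// Assumed values for illustration; the server itself publishes no annotations.
const pressButtonAnnotations: ToolAnnotations = {
  title: "Press hardware button",
  readOnlyHint: false,    // a press changes device state
  destructiveHint: false, // assumption: a single press does not delete data
  idempotentHint: false,  // pressing "back" twice is not the same as pressing it once
  openWorldHint: false,   // assumption: the tool only talks to the attached device
};

// A longer description assembled from facts visible in the implementation listing.
const pressButtonDescription = [
  "Press a hardware button on the device.",
  "Android presses are sent as ADB key events; iOS presses go through Axe.",
  "Returns success, button, keyCode, and an optional observation.",
  "Unsupported Android buttons come back with success=false and an error message.",
].join(" ");

console.log(pressButtonDescription);
console.log(JSON.stringify(pressButtonAnnotations));
```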

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, direct sentence with zero waste—it efficiently conveys the core action without unnecessary details. It's appropriately sized and front-loaded, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of a hardware interaction tool with no annotations and no output schema, the description is insufficient. It lacks details on behavioral outcomes, error handling, or dependencies, leaving the agent with incomplete context for safe and effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with both parameters ('button' and 'platform') fully documented in the schema. The description doesn't add any meaning beyond what the schema provides, such as explaining button effects or platform-specific nuances. The baseline score of 3 reflects adequate but minimal value addition.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
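
One non-obvious relationship here is that the valid `button` values depend on the platform. In the implementation listing, the Android key map and the iOS supported-button list overlap only on `home`, while the schema enum carries the Android names. The sketch below records both lists and checks a pair before calling the tool. `SUPPORTED_BUTTONS` and `isButtonSupported` are hypothetical names for this example, and the lists are copied from that listing.

```typescript
type DevicePlatform = "android" | "ios";

// Button names copied from the Android key map and the iOS supportedButtons array.
const SUPPORTED_BUTTONS: Record<DevicePlatform, readonly string[]> = {
  android: ["home", "back", "menu", "power", "volume_up", "volume_down", "recent"],
  ios: ["apple_pay", "home", "lock", "side_button", "siri"],
};

// Hypothetical pre-flight check: is this button name handled on this platform?
function isButtonSupported(platform: DevicePlatform, button: string): boolean {
  return SUPPORTED_BUTTONS[platform].includes(button.toLowerCase());
}

// Buttons handled on both platforms.
const sharedButtons = SUPPORTED_BUTTONS.android.filter((b) => SUPPORTED_BUTTONS.ios.includes(b));

console.log(isButtonSupported("android", "back")); // true
console.log(isButtonSupported("ios", "back"));     // false
console.log(JSON.stringify(sharedButtons));        // ["home"]
```

A check like this could let a client report a platform mismatch before the request reaches the device, where the iOS path in the listing would throw instead.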

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('press') and resource ('hardware button on the device'), making the tool's function immediately understandable. However, it doesn't distinguish this from sibling tools like 'pressKey' or 'tapOn', which might have overlapping functionality, so it doesn't reach the highest score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'pressKey' or 'tapOn'. There's no mention of prerequisites, such as device state or dependencies, or explicit exclusions for when other tools might be more appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/zillow/auto-mobile'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.