Skip to main content
Glama

long_press

Simulate a long press at specific coordinates on a simulator for a defined duration. Integrates with XcodeBuildMCP for precise UI interactions.

Instructions

Long press at specific coordinates for given duration (ms). Use describe_ui for precise coordinates (don't guess from screenshots).

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
durationYes
simulatorUuidYes
xYes
yYes

Implementation Reference

  • Main handler function that performs the long press using AXe 'touch' command with specified coordinates and duration.
    export async function long_pressLogic(
      params: LongPressParams,
      executor: CommandExecutor,
      axeHelpers: AxeHelpers = {
        getAxePath,
        getBundledAxeEnvironment,
        createAxeNotAvailableResponse,
      },
    ): Promise<ToolResponse> {
      const toolName = 'long_press';
      const { simulatorId, x, y, duration } = params;
      // AXe uses touch command with --down, --up, and --delay for long press
      const delayInSeconds = Number(duration) / 1000; // Convert ms to seconds
      const commandArgs = [
        'touch',
        '-x',
        String(x),
        '-y',
        String(y),
        '--down',
        '--up',
        '--delay',
        String(delayInSeconds),
      ];
    
      log(
        'info',
        `${LOG_PREFIX}/${toolName}: Starting for (${x}, ${y}), ${duration}ms on ${simulatorId}`,
      );
    
      try {
        await executeAxeCommand(commandArgs, simulatorId, 'touch', executor, axeHelpers);
        log('info', `${LOG_PREFIX}/${toolName}: Success for ${simulatorId}`);
    
        const warning = getCoordinateWarning(simulatorId);
        const message = `Long press at (${x}, ${y}) for ${duration}ms simulated successfully.`;
    
        if (warning) {
          return createTextResponse(`${message}\n\n${warning}`);
        }
    
        return createTextResponse(message);
      } catch (error) {
        log('error', `${LOG_PREFIX}/${toolName}: Failed - ${error}`);
        if (error instanceof DependencyError) {
          return axeHelpers.createAxeNotAvailableResponse();
        } else if (error instanceof AxeError) {
          return createErrorResponse(
            `Failed to simulate long press at (${x}, ${y}): ${error.message}`,
            error.axeOutput,
          );
        } else if (error instanceof SystemError) {
          return createErrorResponse(
            `System error executing axe: ${error.message}`,
            error.originalError?.stack,
          );
        }
        return createErrorResponse(
          `An unexpected error occurred: ${error instanceof Error ? error.message : String(error)}`,
        );
      }
    }
  • Zod schema definition for long_press parameters including simulatorId, x, y, duration. Public schema omits simulatorId.
    const longPressSchema = z.object({
      simulatorId: z.string().uuid('Invalid Simulator UUID format'),
      x: z.number().int('X coordinate for the long press'),
      y: z.number().int('Y coordinate for the long press'),
      duration: z.number().positive('Duration of the long press in milliseconds'),
    });
    
    // Use z.infer for type safety
    type LongPressParams = z.infer<typeof longPressSchema>;
    
    const publicSchemaObject = longPressSchema.omit({ simulatorId: true } as const).strict();
  • Tool registration exporting the 'long_press' tool with name, description, schema, and session-aware handler wrapping long_pressLogic.
    export default {
      name: 'long_press',
      description:
        "Long press at specific coordinates for given duration (ms). Use describe_ui for precise coordinates (don't guess from screenshots).",
      schema: publicSchemaObject.shape, // MCP SDK compatibility
      handler: createSessionAwareTool<LongPressParams>({
        internalSchema: longPressSchema as unknown as z.ZodType<LongPressParams>,
        logicFunction: (params: LongPressParams, executor: CommandExecutor) =>
          long_pressLogic(params, executor, {
            getAxePath,
            getBundledAxeEnvironment,
            createAxeNotAvailableResponse,
          }),
        getExecutor: getDefaultCommandExecutor,
        requirements: [{ allOf: ['simulatorId'], message: 'simulatorId is required' }],
      }),
    };
  • Helper function to generate warnings about using describe_ui for coordinates based on session timestamps.
    function getCoordinateWarning(simulatorId: string): string | null {
      const session = describeUITimestamps.get(simulatorId);
      if (!session) {
        return 'Warning: describe_ui has not been called yet. Consider using describe_ui for precise coordinates instead of guessing from screenshots.';
      }
    
      const timeSinceDescribe = Date.now() - session.timestamp;
      if (timeSinceDescribe > DESCRIBE_UI_WARNING_TIMEOUT) {
        const secondsAgo = Math.round(timeSinceDescribe / 1000);
        return `Warning: describe_ui was last called ${secondsAgo} seconds ago. Consider refreshing UI coordinates with describe_ui instead of using potentially stale coordinates.`;
      }
    
      return null;
    }
  • Helper function to execute AXe commands, handling binary path, UDID, environment, and errors. Used by long_pressLogic.
    async function executeAxeCommand(
      commandArgs: string[],
      simulatorId: string,
      commandName: string,
      executor: CommandExecutor = getDefaultCommandExecutor(),
      axeHelpers: AxeHelpers = { getAxePath, getBundledAxeEnvironment, createAxeNotAvailableResponse },
    ): Promise<void> {
      // Get the appropriate axe binary path
      const axeBinary = axeHelpers.getAxePath();
      if (!axeBinary) {
        throw new DependencyError('AXe binary not found');
      }
    
      // Add --udid parameter to all commands
      const fullArgs = [...commandArgs, '--udid', simulatorId];
    
      // Construct the full command array with the axe binary as the first element
      const fullCommand = [axeBinary, ...fullArgs];
    
      try {
        // Determine environment variables for bundled AXe
        const axeEnv = axeBinary !== 'axe' ? axeHelpers.getBundledAxeEnvironment() : undefined;
    
        const result = await executor(fullCommand, `${LOG_PREFIX}: ${commandName}`, false, axeEnv);
    
        if (!result.success) {
          throw new AxeError(
            `axe command '${commandName}' failed.`,
            commandName,
            result.error ?? result.output,
            simulatorId,
          );
        }
    
        // Check for stderr output in successful commands
        if (result.error) {
          log(
            'warn',
            `${LOG_PREFIX}: Command '${commandName}' produced stderr output but exited successfully. Output: ${result.error}`,
          );
        }
    
        // Function now returns void - the calling code creates its own response
      } catch (error) {
        if (error instanceof Error) {
          if (error instanceof AxeError) {
            throw error;
          }
    
          // Otherwise wrap it in a SystemError
          throw new SystemError(`Failed to execute axe command: ${error.message}`, error);
        }
    
        // For any other type of error
        throw new SystemError(`Failed to execute axe command: ${String(error)}`);
      }
    }
Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden for behavioral disclosure. While it mentions the need for precise coordinates and references another tool, it doesn't describe what the tool actually does behaviorally (e.g., simulates a long press gesture on a simulator, requires the simulator to be running, might have timing constraints, or what happens if coordinates are invalid). For a tool with 4 parameters and no annotation coverage, this is a significant gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly concise with two sentences that each earn their place: the first states the core functionality, and the second provides critical usage guidance. There's zero waste, and the most important information (what the tool does and how to use it correctly) is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 4 parameters with 0% schema coverage and no annotations or output schema, the description provides good guidance on coordinate sourcing but lacks information about the simulatorUuid parameter, behavioral details, or what happens after execution. It's adequate for basic usage but has clear gaps for a tool that interacts with simulators and requires multiple inputs.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It mentions 'specific coordinates' (mapping to x and y) and 'duration (ms)' (mapping to duration), providing semantic meaning for 3 of the 4 parameters. However, it doesn't explain the simulatorUuid parameter at all, leaving one parameter completely undocumented. The description adds value but doesn't fully compensate for the schema coverage gap.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('long press') with precise resources ('at specific coordinates for given duration'), distinguishing it from sibling tools like 'tap', 'swipe', or 'gesture' which involve different interaction patterns. It explicitly mentions the coordinate requirement and duration parameter, making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool ('for precise coordinates') and when not to use it ('don't guess from screenshots'), directly naming an alternative tool ('describe_ui') for obtaining coordinates. This clearly distinguishes it from other input methods and sets prerequisites for effective usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Related Tools

  • @Rahulec08/appium-mcp
  • @getsentry/XcodeBuildMCP
  • @getsentry/XcodeBuildMCP
  • @joshuayoes/ios-simulator-mcp
  • @getsentry/XcodeBuildMCP
  • @getsentry/XcodeBuildMCP

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/getsentry/XcodeBuildMCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server