XC-MCP: XCode CLI wrapper

by conorluddy

simctl-io

Capture optimized screenshots or record videos from iOS simulators. Resizes images to tile-aligned dimensions for token efficiency. Supports semantic naming for AI reasoning. Choose codec, size, and optional metadata.

Instructions

Capture screenshots or record videos from iOS simulators with automatic optimization.

What it does

Captures simulator screen as optimized PNG images or records video with configurable codecs. Screenshots are automatically resized to tile-aligned dimensions for token efficiency and support semantic naming for AI agent reasoning.

Parameters

  • udid (string, optional): Simulator UDID (auto-detects booted device if omitted)

  • operation (string, required): "screenshot" or "video"

  • outputPath (string, optional): Custom file path (auto-generated if omitted)

  • codec (string, optional): Video codec - h264, hevc, or prores (default: h264)

  • size (string, optional): Screenshot size - half, full, quarter, thumb (default: half)

  • appName (string, optional): App name for semantic naming

  • screenName (string, optional): Screen/view name for semantic naming

  • state (string, optional): UI state for semantic naming
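The parameter list above can be sketched as a TypeScript type. This is an illustrative shape only, not the tool's published typings: the names `SimctlIoParams` and `withDefaults` are hypothetical, and `udid` is marked optional following the parameter list.

```typescript
// Hypothetical type mirroring the documented parameters; not taken from XC-MCP's source.
type Operation = "screenshot" | "video";
type Codec = "h264" | "hevc" | "prores";
type ScreenshotSize = "half" | "full" | "quarter" | "thumb";

interface SimctlIoParams {
  udid?: string;          // auto-detects the booted device if omitted
  operation: Operation;   // the only required parameter in the list above
  outputPath?: string;    // auto-generated if omitted
  codec?: Codec;          // video only, documented default: h264
  size?: ScreenshotSize;  // screenshot only, documented default: half
  appName?: string;       // semantic naming
  screenName?: string;    // semantic naming
  state?: string;         // semantic naming
}

// Hypothetical helper that fills in the documented defaults.
function withDefaults(params: SimctlIoParams): SimctlIoParams {
  return {
    ...params,
    codec: params.codec ?? "h264",
    size: params.size ?? "half",
  };
}

console.log(withDefaults({ operation: "screenshot" }));
```

Using `??` rather than a plain spread keeps an explicit `undefined` from overriding a default.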

Screenshot Size Optimization

Screenshots are automatically optimized for token efficiency:

  • half (default): 256×512 pixels, 1 tile, 170 tokens (50% savings)

  • full: Native resolution, 2 tiles, 340 tokens

  • quarter: 128×256 pixels, 1 tile, 170 tokens

  • thumb: 128×128 pixels, 1 tile, 170 tokens
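The figures in this list imply a cost of 170 tokens per tile. A minimal sketch of that arithmetic follows, assuming the per-tile cost is constant; `estimateTokens` and `savingsVsFull` are hypothetical names, not functions exposed by the tool.

```typescript
type ScreenshotSize = "half" | "full" | "quarter" | "thumb";

// Inferred from the list above: 1 tile -> 170 tokens, 2 tiles -> 340 tokens.
const TOKENS_PER_TILE = 170;

// Tile counts as documented for each size preset.
const TILES_BY_SIZE: Record<ScreenshotSize, number> = {
  half: 1,
  full: 2,
  quarter: 1,
  thumb: 1,
};

// Estimated token cost of one screenshot at the given preset.
function estimateTokens(size: ScreenshotSize = "half"): number {
  return TILES_BY_SIZE[size] * TOKENS_PER_TILE;
}

// Fraction of tokens saved relative to a full-size capture.
function savingsVsFull(size: ScreenshotSize): number {
  return 1 - estimateTokens(size) / estimateTokens("full");
}

console.log(estimateTokens("half"), savingsVsFull("half")); // 170 0.5
```

Under this model quarter and thumb cost the same as half, so they reduce image detail without reducing the token count further.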

Semantic Naming (LLM Optimization)

Provide appName, screenName, and state to generate semantic filenames:

  • Format: {appName}_{screenName}_{state}_{date}.png

  • Example: MyApp_LoginScreen_Empty_2025-01-23.png

  • Enables AI agents to reason about screen context and track state progression
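The filename format above can be reproduced with a small helper. This is a sketch under assumptions: `buildSemanticFilename` is a hypothetical name, and the date is taken in UTC as YYYY-MM-DD, which may differ from how the tool derives it.

```typescript
// Hypothetical helper; not part of the tool's documented API.
interface SemanticName {
  appName: string;
  screenName: string;
  state: string;
}

// Builds {appName}_{screenName}_{state}_{date}.png using a UTC date (assumption).
function buildSemanticFilename(name: SemanticName, date: Date = new Date()): string {
  const day = date.toISOString().slice(0, 10); // "YYYY-MM-DD"
  return `${name.appName}_${name.screenName}_${name.state}_${day}.png`;
}

const filename = buildSemanticFilename(
  { appName: "MyApp", screenName: "LoginScreen", state: "Empty" },
  new Date("2025-01-23T12:00:00Z"),
);
console.log(filename); // MyApp_LoginScreen_Empty_2025-01-23.png
```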

Returns

JSON response with:

  • File path and size information

  • Screenshot optimization metadata (dimensions, token count, savings)

  • Coordinate transform for mapping resized coordinates to device

  • Semantic metadata when provided

  • Guidance for viewing and using the capture

Examples

Capture optimized screenshot (default 256×512)

await simctlIoTool({
  udid: 'device-123',
  operation: 'screenshot'
})

Capture full-size screenshot

await simctlIoTool({
  udid: 'device-123',
  operation: 'screenshot',
  size: 'full'
})

Capture with semantic naming

await simctlIoTool({
  udid: 'device-123',
  operation: 'screenshot',
  appName: 'MyApp',
  screenName: 'LoginScreen',
  state: 'Empty'
})

Record video with custom codec

await simctlIoTool({
  udid: 'device-123',
  operation: 'video',
  codec: 'hevc'
})

Common Use Cases

  1. UI testing: Capture screenshots for visual regression testing

  2. Bug reporting: Record videos demonstrating issues

  3. Documentation: Create screenshots for app documentation

  4. State tracking: Use semantic naming to track UI state progression

  5. Token optimization: Use half/quarter sizes for LLM-based analysis

Coordinate Transform

When screenshots are resized (size ≠ 'full'), a coordinate transform is provided:

  • scaleX: Multiply screenshot X coordinates by this to get device coordinates

  • scaleY: Multiply screenshot Y coordinates by this to get device coordinates

  • guidance: Human-readable scaling instructions

This enables accurate element tapping even with optimized screenshots.
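Applying the transform is a per-axis multiplication. The sketch below assumes the transform object carries `scaleX` and `scaleY` as described; `toDeviceCoords`, the rounding step, and the sample scale values are illustrative assumptions.

```typescript
interface CoordinateTransform {
  scaleX: number;
  scaleY: number;
}

interface Point {
  x: number;
  y: number;
}

// Maps a point picked on the resized screenshot back to device coordinates.
// Rounding to whole units is an assumption, not documented behavior.
function toDeviceCoords(p: Point, t: CoordinateTransform): Point {
  return {
    x: Math.round(p.x * t.scaleX),
    y: Math.round(p.y * t.scaleY),
  };
}

// Hypothetical scale factors; real values come from the tool's response.
const transform: CoordinateTransform = { scaleX: 1.5, scaleY: 1.5 };
console.log(toDeviceCoords({ x: 100, y: 200 }, transform)); // { x: 150, y: 300 }
```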

Important Notes

  • Auto-detection: If udid is omitted, automatically uses the booted device

  • Temp files: Screenshots are saved to /tmp unless a custom path is specified

  • Video recording: Press Ctrl+C to stop video recording

  • Simulator must be booted: Operations require a running simulator

  • File permissions: Ensure output path is writable

Error Handling

  • Simulator not booted: Indicates simulator must be booted first

  • Simulator not found: Validates simulator exists in cache

  • File path errors: Reports if output path is not writable

  • Invalid operation: Validates operation is "screenshot" or "video"
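A caller can mirror the operation check before invoking the tool. This is a minimal sketch; `parseOperation` and its error message are hypothetical and do not reflect the tool's actual error text.

```typescript
type Operation = "screenshot" | "video";

// Type guard for the two documented operation values.
function isOperation(value: unknown): value is Operation {
  return value === "screenshot" || value === "video";
}

// Throws early on anything other than "screenshot" or "video".
function parseOperation(value: unknown): Operation {
  if (!isOperation(value)) {
    throw new Error(
      `Invalid operation: expected "screenshot" or "video", got ${JSON.stringify(value)}`,
    );
  }
  return value;
}

console.log(parseOperation("video")); // video
```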

Next Steps After Capture

  1. View screenshot: open "<file-path>"

  2. Copy to clipboard: pbcopy < "<file-path>"

  3. Analyze with LLM: Use optimized size for token-efficient analysis

  4. Use coordinates: Apply transform to map screenshot coords to device

Input Schema

Name         Required   Description   Default
udid         Yes
operation    Yes
outputPath   No
size         No
codec        No
appName      No
screenName   No
state        No
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, but description thoroughly discloses auto-detection, temp file behavior, video stopping, prerequisite (booted simulator), file permissions, error handling, and return value structure. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-organized with clear headings and subheadings. Every section adds value, from parameter details to examples and next steps. Despite length, it remains focused and front-loaded with core functionality.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (8 parameters, no output schema), the description covers parameters, return values, error handling, coordinate transform, and optimization. No major gaps remain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 0% description coverage, but the description fully explains each parameter's purpose, defaults, and constraints. Goes beyond schema by providing default values and semantic naming details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it captures screenshots or records videos from iOS simulators with optimization. The verb 'capture' and resources 'screenshots' and 'videos' are specific. Implicitly distinguishes from sibling 'screenshot' by adding video and optimization details.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides common use cases and important notes, but does not explicitly compare with sibling tools like 'screenshot'. However, the detailed feature description implies when to use this advanced tool over simpler alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/conorluddy/xc-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.