get_ui_elements
Query visible UI elements in macOS applications using Accessibility API to retrieve roles, positions, sizes, and text content for desktop automation tasks.
Instructions
Query visible UI elements of an application via macOS Accessibility API. Returns element roles, titles, positions (screen coordinates), sizes, and states. May return text content from visible UI elements including sensitive data (passwords in non-secure fields, messages, etc.). Positions are in logical screen coordinates — pass directly to click tool. Coverage varies: native apps expose rich trees; Electron/web apps may expose partial trees; games/custom UIs may expose nothing. Requires Accessibility permission.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| app | No | Target application name. Default: frontmost app. Supports fuzzy matching. | |
| role | No | Filter by AX role: "AXButton", "AXTextField", "AXStaticText", etc. | |
| title | No | Filter by element title (substring, case-insensitive). | |
| max_depth | Yes | Max tree traversal depth (default: 5). |
Implementation Reference
- src/tools/accessibility.ts:62-86 (handler)Handler function for the get_ui_elements tool.
async function handleGetUIElements( args: Record<string, unknown>, ): Promise<CallToolResult> { const parsed = GetUIElementsInputSchema.parse(args); const helperArgs: Record<string, unknown> = { max_depth: parsed.max_depth, }; if (parsed.app) { helperArgs.app = await resolveAppName(parsed.app); } if (parsed.role) helperArgs.role = parsed.role; if (parsed.title) helperArgs.title = parsed.title; const response = await runInputHelper("get_ui_elements", helperArgs); return { content: [ { type: "text" as const, text: JSON.stringify(response, null, 2), }, ], }; } - src/tools/accessibility.ts:10-37 (schema)Input schema definition for get_ui_elements tool.
const GetUIElementsInputSchema = z.object({ app: z .string() .max(1_000) .optional() .describe( "Target application name. Default: frontmost app. Supports fuzzy matching.", ), role: z .string() .max(200) .optional() .describe( 'Filter by AX role: "AXButton", "AXTextField", "AXStaticText", etc.', ), title: z .string() .max(1_000) .optional() .describe("Filter by element title (substring, case-insensitive)."), max_depth: z .number() .int() .min(1) .max(10) .default(5) .describe("Max tree traversal depth (default: 5)."), }); - src/tools/accessibility.ts:41-57 (registration)Tool definition registration for get_ui_elements.
export const accessibilityToolDefinitions: Tool[] = [ { name: "get_ui_elements", description: "Query visible UI elements of an application via macOS Accessibility API. " + "Returns element roles, titles, positions (screen coordinates), sizes, and states. " + "May return text content from visible UI elements including sensitive data (passwords in non-secure fields, messages, etc.). " + "Positions are in logical screen coordinates — pass directly to click tool. " + "Coverage varies: native apps expose rich trees; Electron/web apps may expose partial trees; " + "games/custom UIs may expose nothing. Requires Accessibility permission.", inputSchema: zodToToolInputSchema(GetUIElementsInputSchema), annotations: { readOnlyHint: true, destructiveHint: false, }, }, ]; - src/tools/accessibility.ts:91-96 (registration)Handler dispatcher registration for get_ui_elements.
export const accessibilityToolHandlers: Record< string, (args: Record<string, unknown>) => Promise<CallToolResult> > = { get_ui_elements: (args) => enqueue(() => handleGetUIElements(args)), };