ui.interact
Automate browser interactions by clicking, typing, selecting, or scrolling elements using semantic selectors like data-testid, role, label, or text, with automatic visibility and enabled state handling.
Instructions
Interact with the active browser tab using semantic selectors (data-testid, role+name, label, placeholder, name, text, css, xpath). Supported actions are click, type, select, check/uncheck, keypress, hover, waitForSelector, and scroll. The tool automatically scrolls the target into view and waits for it to be visible and enabled. The extension uses a CDP (Chrome DevTools Protocol) fallback when needed.
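To make the selector and action model concrete, the sketch below builds two argument objects for ui.interact. The field names (`action`, `target.by`, `target.value`, `exact`, `value`) follow the tool's input schema; the type names `Target` and `InteractArgs`, the test id `login-button`, the label `Email`, and the typed text are hypothetical values chosen for illustration.

```typescript
// Locator shape: a strategy (`by`) plus the string to match against.
type Target = {
  by: "testid" | "role" | "label" | "text" | "placeholder" | "name" | "css" | "xpath";
  value: string;
  exact?: boolean;
};

// Top-level arguments (the `options` object is omitted here for brevity).
type InteractArgs = {
  action:
    | "click" | "type" | "select" | "check" | "uncheck"
    | "keypress" | "hover" | "waitForSelector" | "scroll";
  target: Target;
  scopeTarget?: Target;
  value?: string;
};

// Click a button located by its data-testid (hypothetical id).
const clickLogin: InteractArgs = {
  action: "click",
  target: { by: "testid", value: "login-button" },
};

// Type into a field located by its label (hypothetical label and text).
const typeEmail: InteractArgs = {
  action: "type",
  target: { by: "label", value: "Email", exact: true },
  value: "user@example.com",
};

// Print the payloads as they would be serialized for the tool call.
console.log(JSON.stringify([clickLogin, typeEmail], null, 2));
```

Only `type`, `select`, and `keypress` are described as using `value`, so the click example leaves it out.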
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
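| action | Yes | Action to perform: click, type, select, check, uncheck, keypress, hover, waitForSelector, or scroll | |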
| target | Yes | How to locate the element | |
| scopeTarget | No | Optional container to scope the search (e.g., role=tablist) | |
| value | No | Text/value to type/select/keypress when applicable | |
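| options | No | Optional settings for timeouts, visibility/enabled waits, screenshots, CDP fallback, frame selection, scrolling, and post-action assertions | |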
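
The `scopeTarget` and `options` parameters are easier to read through an example. The sketch below uses only field names that appear in the tool's schema (`timeoutMs`, `assertTarget`, `assertTimeoutMs`, `to`, `smooth`), but the helper `scopedClick`, the type names, and all selector values are hypothetical. The comments describe the apparent intent of each option based on its name; the source does not document exact semantics.

```typescript
type Target = { by: string; value: string; exact?: boolean };

// A subset of the `options` object; every field is optional in the schema.
type Options = {
  timeoutMs?: number;
  to?: "top" | "bottom";
  smooth?: boolean;
  assertTarget?: Target;
  assertTimeoutMs?: number;
};

type Args = {
  action: string;
  target: Target;
  scopeTarget?: Target;
  value?: string;
  options?: Options;
};

// Hypothetical helper: click `target` inside the `scope` container and
// name an element that is presumably checked for after the click.
function scopedClick(scope: Target, target: Target, expected: Target): Args {
  return {
    action: "click",
    target,
    scopeTarget: scope,
    options: { timeoutMs: 5000, assertTarget: expected, assertTimeoutMs: 3000 },
  };
}

// Click the "Settings" tab inside a tablist, expecting a panel afterwards.
const openSettingsTab = scopedClick(
  { by: "role", value: "tablist" },
  { by: "text", value: "Settings" },
  { by: "testid", value: "settings-panel" }
);

// Scroll request using the scroll-specific options.
const scrollToBottom: Args = {
  action: "scroll",
  target: { by: "css", value: "main" },
  options: { to: "bottom", smooth: true },
};

console.log(JSON.stringify([openSettingsTab, scrollToBottom], null, 2));
```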
Implementation Reference
- browser-tools-mcp/mcp-server.ts:620-673 (handler) Handler function that executes the ui.interact tool by sending a POST request with the parameters to the browser server's /dom-action endpoint and formatting the response.

  ```ts
  async function handleUiInteract(params: any) {
    return await withServerConnection(async () => {
      try {
        // Forward the tool arguments unchanged to the browser server.
        const targetUrl = `http://${discoveredHost}:${discoveredPort}/dom-action`;
        const response = await fetch(targetUrl, {
          method: "POST",
          headers: {
            "Content-Type": "application/json",
          },
          body: JSON.stringify(params),
        });
        const result = await response.json();
        if (response.ok && result.success) {
          const texts: any[] = [
            {
              type: "text",
              text: `✅ Action '${params.action}' succeeded on target (${params.target.by}=${params.target.value}).`,
            },
          ];
          // Append any details returned by the server as pretty-printed JSON.
          if (result.details) {
            texts.push({
              type: "text",
              text: JSON.stringify(result.details, null, 2),
            });
          }
          return { content: texts };
        } else {
          return {
            content: [
              {
                type: "text",
                text: `❌ Action '${params.action}' failed: ${
                  result.error || "Unknown error"
                }`,
              },
            ],
            isError: true,
          } as any;
        }
      } catch (error: any) {
        // Network or JSON parsing failures end up here.
        const errorMessage =
          error instanceof Error ? error.message : String(error);
        return {
          content: [
            {
              type: "text",
              text: `Failed to perform interaction: ${errorMessage}`,
            },
          ],
          isError: true,
        } as any;
      }
    });
  }
  ```
- browser-tools-mcp/mcp-server.ts:676-767 (registration) Registration of the "ui.interact" tool on the MCP server, including detailed input schema using Zod for validation and description.

  ```ts
  server.tool(
    "ui.interact",
    "Interact with the active browser tab using semantic selectors (data-testid, role+name, label, placeholder, name, text, css, xpath). Supports actions: click, type, select, check/uncheck, keypress, hover, waitForSelector, scroll. Automatically scrolls into view and waits for visibility/enabled. Uses a CDP fallback in the extension when needed.",
    {
      action: z.enum([
        "click",
        "type",
        "select",
        "check",
        "uncheck",
        "keypress",
        "hover",
        "waitForSelector",
        "scroll",
      ]),
      target: z
        .object({
          by: z.enum([
            "testid",
            "role",
            "label",
            "text",
            "placeholder",
            "name",
            "css",
            "xpath",
          ]),
          value: z.string(),
          exact: z.boolean().optional(),
        })
        .describe("How to locate the element"),
      scopeTarget: z
        .object({
          by: z.enum([
            "testid",
            "role",
            "label",
            "text",
            "placeholder",
            "name",
            "css",
            "xpath",
          ]),
          value: z.string(),
          exact: z.boolean().optional(),
        })
        .optional()
        .describe("Optional container to scope the search (e.g., role=tablist)"),
      value: z
        .string()
        .optional()
        .describe("Text/value to type/select/keypress when applicable"),
      options: z
        .object({
          timeoutMs: z.number().optional(),
          waitForVisible: z.boolean().optional(),
          waitForEnabled: z.boolean().optional(),
          waitForNetworkIdleMs: z.number().optional(),
          postActionScreenshot: z.boolean().optional(),
          screenshotLabel: z.string().optional(),
          fallbackToCdp: z.boolean().optional(),
          frameSelector: z.string().optional(),
          // scroll-specific
          scrollX: z.number().optional(),
          scrollY: z.number().optional(),
          to: z.enum(["top", "bottom"]).optional(),
          smooth: z.boolean().optional(),
          // assertion helpers
          assertTarget: z
            .object({
              by: z.enum([
                "testid",
                "role",
                "label",
                "text",
                "placeholder",
                "name",
                "css",
                "xpath",
              ]),
              value: z.string(),
              exact: z.boolean().optional(),
            })
            .optional(),
          assertTimeoutMs: z.number().optional(),
          assertUrlContains: z.string().optional(),
          tabChangeWaitMs: z.number().optional(),
        })
        .optional(),
    },
    handleUiInteract
  );
  ```
- Input schema definition for the ui.interact tool using Zod, specifying parameters like action, target selector, optional scope, value, and extensive options for timeouts, assertions, etc.

  ```ts
  {
    action: z.enum([
      "click",
      "type",
      "select",
      "check",
      "uncheck",
      "keypress",
      "hover",
      "waitForSelector",
      "scroll",
    ]),
    target: z
      .object({
        by: z.enum([
          "testid",
          "role",
          "label",
          "text",
          "placeholder",
          "name",
          "css",
          "xpath",
        ]),
        value: z.string(),
        exact: z.boolean().optional(),
      })
      .describe("How to locate the element"),
    scopeTarget: z
      .object({
        by: z.enum([
          "testid",
          "role",
          "label",
          "text",
          "placeholder",
          "name",
          "css",
          "xpath",
        ]),
        value: z.string(),
        exact: z.boolean().optional(),
      })
      .optional()
      .describe("Optional container to scope the search (e.g., role=tablist)"),
    value: z
      .string()
      .optional()
      .describe("Text/value to type/select/keypress when applicable"),
    options: z
      .object({
        timeoutMs: z.number().optional(),
        waitForVisible: z.boolean().optional(),
        waitForEnabled: z.boolean().optional(),
        waitForNetworkIdleMs: z.number().optional(),
        postActionScreenshot: z.boolean().optional(),
        screenshotLabel: z.string().optional(),
        fallbackToCdp: z.boolean().optional(),
        frameSelector: z.string().optional(),
        // scroll-specific
        scrollX: z.number().optional(),
        scrollY: z.number().optional(),
        to: z.enum(["top", "bottom"]).optional(),
        smooth: z.boolean().optional(),
        // assertion helpers
        assertTarget: z
          .object({
            by: z.enum([
              "testid",
              "role",
              "label",
              "text",
              "placeholder",
              "name",
              "css",
              "xpath",
            ]),
            value: z.string(),
            exact: z.boolean().optional(),
          })
          .optional(),
        assertTimeoutMs: z.number().optional(),
        assertUrlContains: z.string().optional(),
        tabChangeWaitMs: z.number().optional(),
      })
      .optional(),
  },
  ```
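The handler's response shaping can be looked at in isolation from the network call. The sketch below is not code from the repository: `formatDomActionResult` and the type names are hypothetical, and the function takes the HTTP status flag and the parsed JSON body as plain arguments instead of calling `fetch`. It reproduces the same branching and message strings as the handler, which makes the success and failure shapes easy to inspect or unit-test.

```typescript
type TextItem = { type: "text"; text: string };
type ToolResult = { content: TextItem[]; isError?: boolean };

// Assumed shape of the /dom-action JSON body, inferred from the fields the handler reads.
type DomActionResult = { success?: boolean; error?: string; details?: unknown };
type Params = { action: string; target: { by: string; value: string } };

// Mirrors the handler's branching: a success message (plus optional details)
// or an error result flagged with isError.
function formatDomActionResult(
  params: Params,
  httpOk: boolean,
  result: DomActionResult
): ToolResult {
  if (httpOk && result.success) {
    const content: TextItem[] = [
      {
        type: "text",
        text: `✅ Action '${params.action}' succeeded on target (${params.target.by}=${params.target.value}).`,
      },
    ];
    if (result.details) {
      content.push({ type: "text", text: JSON.stringify(result.details, null, 2) });
    }
    return { content };
  }
  return {
    content: [
      {
        type: "text",
        text: `❌ Action '${params.action}' failed: ${result.error || "Unknown error"}`,
      },
    ],
    isError: true,
  };
}

// Hypothetical inputs: one successful response with details, one failure without an error message.
const sampleParams: Params = { action: "click", target: { by: "testid", value: "save" } };
const succeeded = formatDomActionResult(sampleParams, true, { success: true, details: { clicked: true } });
const failed = formatDomActionResult(sampleParams, false, {});

console.log(JSON.stringify({ succeeded, failed }, null, 2));
```

Keeping the formatting in a pure function like this is one possible design; the actual handler performs the request and the formatting together inside `withServerConnection`.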