
BrowserGenie MCP Server

by BrowserGenie

click_element

Simulate human-like clicks on any page element using CSS selectors, XPath, or coordinates. Supports left, right, middle buttons and double-click.

Instructions

Click on any element on the page. Use this for buttons, links, checkboxes, dropdowns, or any interactive element. Simulates real human mouse behavior with Bézier curve movement, randomized delays, and natural press/release timing.
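The movement code itself lives in the Chrome extension and is not shown on this page. As a rough illustration of what "Bézier curve movement" with "randomized delays" can mean, here is a minimal sketch that assumes a cubic curve with two randomly offset control points. The names `cubicBezier`, `bezierPath`, and `randomDelay`, and the ±50 px bend, are assumptions made for this example and are not taken from BrowserGenie.

```typescript
// Hypothetical sketch; not BrowserGenie's actual implementation.
interface Point {
  x: number;
  y: number;
}

// Evaluate a cubic Bézier curve at parameter t in [0, 1].
function cubicBezier(p0: Point, p1: Point, p2: Point, p3: Point, t: number): Point {
  const u = 1 - t;
  const a = u * u * u;
  const b = 3 * u * u * t;
  const c = 3 * u * t * t;
  const d = t * t * t;
  return {
    x: a * p0.x + b * p1.x + c * p2.x + d * p3.x,
    y: a * p0.y + b * p1.y + c * p2.y + d * p3.y,
  };
}

// Build steps + 1 points from `from` to `to` (steps must be >= 1).
// The two control points sit at 1/3 and 2/3 of the straight line,
// each pushed off the line by a random offset so the path bends.
function bezierPath(
  from: Point,
  to: Point,
  steps: number,
  rand: () => number = Math.random,
): Point[] {
  const jitter = (): number => (rand() - 0.5) * 100; // assumed bend of up to ±50 px
  const c1: Point = {
    x: from.x + (to.x - from.x) / 3 + jitter(),
    y: from.y + (to.y - from.y) / 3 + jitter(),
  };
  const c2: Point = {
    x: from.x + (2 * (to.x - from.x)) / 3 + jitter(),
    y: from.y + (2 * (to.y - from.y)) / 3 + jitter(),
  };
  const path: Point[] = [];
  for (let i = 0; i <= steps; i++) {
    path.push(cubicBezier(from, c1, c2, to, i / steps));
  }
  return path;
}

// Pick a delay in [minMs, maxMs) milliseconds.
function randomDelay(minMs: number, maxMs: number, rand: () => number = Math.random): number {
  return minMs + rand() * (maxMs - minMs);
}
```

The path always starts exactly at `from` and ends exactly at `to`; only the intermediate points vary between runs. A call such as `randomDelay(40, 100)` would produce a gap in the range the `doubleClick` parameter describes for its two clicks.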

Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| target | Yes | Target to click - prefer CSS selectors when possible | |
| button | No | Mouse button: left (default), right (context menu), or middle | |
| doubleClick | No | Set to true for double-click. Two rapid clicks (~40-100ms apart) are fired at the same position. Useful for opening files, selecting words, or triggering double-click handlers. Does NOT automatically handle focus/select behavior of the element; the element's own event handlers determine that. | |
| tabId | No | Target tab ID (defaults to active tab) | |
| apiKey | No | API key for authentication if enabled | |
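To make the parameters listed above concrete, here is a sketch of a few argument objects and of the MCP `tools/call` request that would carry them. The type names `ClickTarget` and `ClickElementArgs`, the helper `toToolCall`, the selectors, and the tab ID are all invented for illustration. The union type is also deliberately stricter than the published schema, which validates `type` and `value` independently.

```typescript
// Hypothetical types mirroring the input schema; the names are not from the source.
type ClickTarget =
  | { type: 'css' | 'xpath'; value: string }
  | { type: 'coordinates'; value: { x: number; y: number } };

interface ClickElementArgs {
  target: ClickTarget;
  button?: 'left' | 'right' | 'middle';
  doubleClick?: boolean;
  tabId?: number;
  apiKey?: string;
}

// Left-click a button located by CSS selector (the preferred target type).
const clickSubmit: ClickElementArgs = {
  target: { type: 'css', value: '#submit-button' },
};

// Right-click at viewport coordinates to open a context menu.
const contextMenu: ClickElementArgs = {
  target: { type: 'coordinates', value: { x: 120, y: 340 } },
  button: 'right',
};

// Double-click an element located by XPath in a specific tab (made-up tab ID).
const openFile: ClickElementArgs = {
  target: { type: 'xpath', value: '//li[@data-name="report.pdf"]' },
  doubleClick: true,
  tabId: 42,
};

// Wrap the arguments in an MCP `tools/call` JSON-RPC request.
function toToolCall(id: number, args: ClickElementArgs) {
  return {
    jsonrpc: '2.0' as const,
    id,
    method: 'tools/call' as const,
    params: { name: 'click_element', arguments: args },
  };
}
```

Omitting `button` and `doubleClick`, as in `clickSubmit`, yields a single left click in the active tab according to the parameter descriptions.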

Implementation Reference

  • The actual handler function for the 'click_element' tool. It receives the validated parameters (target, button, doubleClick, tabId, apiKey) and sends a 'click_element' command via the WebSocket bridge to the Chrome extension. On success, it returns a confirmation message with the target details; on failure, it returns an error message.
      async ({ target, button, doubleClick, tabId, apiKey }) => {
        const result = await bridge.sendCommand({
          command: 'click_element',
          params: { target, button, doubleClick },
          tabId,
          apiKey,
        });
        if (!result.success) {
          return { content: [{ type: 'text', text: `Error: ${result.error?.message}` }], isError: true };
        }
        return { content: [{ type: 'text', text: `Clicked on ${target.type === 'coordinates' ? `(${(target.value as any).x}, ${(target.value as any).y})` : target.value}` }] };
      }
    );
  • The targetSchema defines the input validation for locating elements. It accepts type ('coordinates', 'css', 'xpath') and a value (either a string selector or {x, y} coordinate object).
    const targetSchema = z.object({
      type: z.enum(['coordinates', 'css', 'xpath']).describe('Method to locate the element: "css" (most reliable), "xpath", or "coordinates"'),
      value: z.union([
        z.string().describe('CSS selector (e.g., "#submit-button") or XPath expression'),
        z.object({
          x: z.number().describe('X coordinate in pixels from left of viewport'),
          y: z.number().describe('Y coordinate in pixels from top of viewport'),
        }).describe('Exact pixel coordinates'),
      ]).describe('The selector string or {x, y} coordinates object'),
    });
  • The tool parameters schema including target, optional button (left/right/middle), optional doubleClick boolean, optional tabId, and optional apiKey.
    {
      target: targetSchema.describe('Target to click - prefer CSS selectors when possible'),
      button: z.enum(['left', 'right', 'middle']).optional().describe('Mouse button: left (default), right (context menu), or middle'),
      doubleClick: z.boolean().optional().describe('Set to true for double-click. Two rapid clicks (~40-100ms apart) are fired at the same position. Useful for opening files, selecting words, or triggering double-click handlers. Does NOT automatically handle focus/select behavior of the element — the element\'s own event handlers determine that.'),
      tabId: z.number().optional().describe('Target tab ID (defaults to active tab)'),
      apiKey: z.string().optional().describe('API key for authentication if enabled'),
    },
  • The registration function 'registerClickTools' that registers the 'click_element' tool on the MCP server with its description, schema, and handler.
    export function registerClickTools(server: McpServer, bridge: WebSocketBridge) {
      server.tool(
        'click_element',
        'Click on any element on the page. Use this for buttons, links, checkboxes, dropdowns, or any interactive element. Simulates real human mouse behavior with Bézier curve movement, randomized delays, and natural press/release timing.',
        {
          target: targetSchema.describe('Target to click - prefer CSS selectors when possible'),
          button: z.enum(['left', 'right', 'middle']).optional().describe('Mouse button: left (default), right (context menu), or middle'),
          doubleClick: z.boolean().optional().describe('Set to true for double-click. Two rapid clicks (~40-100ms apart) are fired at the same position. Useful for opening files, selecting words, or triggering double-click handlers. Does NOT automatically handle focus/select behavior of the element — the element\'s own event handlers determine that.'),
          tabId: z.number().optional().describe('Target tab ID (defaults to active tab)'),
          apiKey: z.string().optional().describe('API key for authentication if enabled'),
        },
        async ({ target, button, doubleClick, tabId, apiKey }) => {
          const result = await bridge.sendCommand({
            command: 'click_element',
            params: { target, button, doubleClick },
            tabId,
            apiKey,
          });
          if (!result.success) {
            return { content: [{ type: 'text', text: `Error: ${result.error?.message}` }], isError: true };
          }
          return { content: [{ type: 'text', text: `Clicked on ${target.type === 'coordinates' ? `(${(target.value as any).x}, ${(target.value as any).y})` : target.value}` }] };
        }
      );
    }
  • The 'registerAllTools' function in the tools index that calls 'registerClickTools' to register the click_element tool on the MCP server.
    export function registerAllTools(server: McpServer, bridge: WebSocketBridge) {
      registerNavigationTools(server, bridge);
      registerTabManagementTools(server, bridge);
      registerKeyboardTools(server, bridge);
      registerScreenshotTools(server, bridge);
      registerClickTools(server, bridge);
      registerInputTools(server, bridge);
      registerDragDropTools(server, bridge);
      registerHoverTools(server, bridge);
    
      registerDevtoolsSourcesTools(server, bridge);
      registerDevtoolsModifyTools(server, bridge);
      registerDevtoolsNetworkTools(server, bridge);
      registerDevtoolsStorageTools(server, bridge);
      registerDevtoolsConsoleTools(server, bridge);
    
      registerAccessibilityTools(server, bridge);
      registerEmulationTools(server, bridge);
      registerElementTools(server, bridge);
      registerAuditTools(server, bridge);
      registerInteractionTools(server, bridge);
      registerMonitoringTools(server, bridge);
      registerQaTools(server, bridge);
      registerGestureTools(server, bridge);
      registerMacroTools(server, bridge);
      registerVisualRegressionTools(server, bridge);
    }
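One detail of the snippets above is worth spelling out. `targetSchema` validates `type` and `value` independently, so a target such as `{ type: 'coordinates', value: '#a' }` passes validation, and the handler's `(target.value as any).x` cast would then report `Clicked on (undefined, undefined)`. The sketch below shows one possible way to check the pairing and to format the message without a cast. `Target`, `isConsistentTarget`, and `describeTarget` are hypothetical names and are not part of the BrowserGenie source.

```typescript
// Shape accepted by targetSchema: `type` and `value` are validated independently.
interface Target {
  type: 'coordinates' | 'css' | 'xpath';
  value: string | { x: number; y: number };
}

// Hypothetical helper: true when `value` has the shape that `type` implies.
function isConsistentTarget(target: Target): boolean {
  return target.type === 'coordinates'
    ? typeof target.value !== 'string'
    : typeof target.value === 'string';
}

// Hypothetical alternative to the inline template string in the handler.
function describeTarget(target: Target): string {
  const { value } = target;
  // Narrow on the runtime shape of `value`, so no `as any` cast is needed.
  return typeof value === 'string' ? value : `(${value.x}, ${value.y})`;
}
```

For example, `describeTarget({ type: 'coordinates', value: { x: 10, y: 20 } })` produces `(10, 20)`, while a selector target returns the selector string unchanged.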
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses human-like simulation (Bézier curves, randomized delays, natural timing), which adds behavioral context beyond the schema. However, it does not address failure cases (e.g., element not found) or potential side effects (navigation, state changes). Still, the disclosed trait is valuable for agent planning.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
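Although the description is silent on failure cases, the handler in the Implementation Reference does return a recognizable shape: a `content` array of text parts, plus `isError: true` when the bridge reports a failure. A caller could branch on that roughly as follows. `ToolResult` and `interpretClickResult` are hypothetical names, and the error text in the usage note is invented.

```typescript
// Result shape returned by the handler shown in the Implementation Reference.
interface ToolResult {
  content: { type: 'text'; text: string }[];
  isError?: boolean;
}

// Hypothetical caller-side helper: collapse a result into a status and a message.
function interpretClickResult(result: ToolResult): { ok: boolean; message: string } {
  const message = result.content.map((part) => part.text).join('\n');
  // The handler only sets `isError` on failure, so a missing flag means success.
  return { ok: result.isError !== true, message };
}
```

Given a result with `isError: true` and the text `Error: element not found`, the helper returns `ok: false` with that text as the message, which an agent could use to decide whether to retry with a different selector.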

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with the primary purpose, followed by usage and behavioral detail. No extraneous words; every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple clicking tool with well-documented parameters, the description covers the main aspects: what it does, when to use it, and simulation behavior. It lacks mention of return values (no output schema) or handling of invisible elements, but is nearly complete for typical use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so parameters are fully described in the schema. The description does not add extra meaning to parameters beyond what is already present. Baseline 3 is appropriate as the description focuses on tool behavior, not parameter details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Click on any element on the page') and specifies the types of elements it applies to (buttons, links, checkboxes, dropdowns, or any interactive element). This provides a specific verb-resource pairing and distinguishes it as the general click tool among siblings like double_tap or hover_element.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context by suggesting usage for interactive elements, but does not explicitly differentiate from alternatives like double_tap, long_press, or hover_element. It implies single clicks but lacks explicit when-not-to-use or sibling comparisons.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/BrowserGenie/mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.