Skip to main content
Glama

humanizer_click

Click page elements using CSS/XPath selectors, ARIA roles, visible text, form labels, or screen coordinates. Supports left, right, middle, and double clicks with configurable wait time.

Instructions

Click an element. Pass one of: selector (CSS/XPath), role + optional name, text, label, or raw x+y coords as fallback. Locator-based calls auto-wait for visible.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
target_idYesTarget ID from interceptor_browser_launch or interceptor_camoufox_launch
selectorNoCSS or XPath selector (e.g. 'button.submit', '//button[@id="go"]')
roleNoARIA role (e.g. 'button', 'link', 'textbox')
nameNoAccessible name; used with role (e.g. 'Sign in')
textNoVisible text to match (e.g. 'Accept cookies')
labelNoForm-field label text (e.g. 'Email address')
xNoX coordinate fallback when no locator is given
yNoY coordinate fallback when no locator is given
buttonNoMouse button (default: left)left
click_countNoNumber of clicks (default: 1, use 2 for double-click)
timeout_msNoMax ms to wait for locator to be visible + actionable (default: 15000)

Implementation Reference

  • Registration of the 'humanizer_click' MCP tool via server.tool(), including its Zod schema for input validation and the async handler that delegates to humanizerEngine.click()
    server.tool(
      "humanizer_click",
      "Click an element. Pass one of: selector (CSS/XPath), role + optional name, " +
      "text, label, or raw x+y coords as fallback. Locator-based calls auto-wait for visible.",
      {
        target_id: z.string().describe("Target ID from interceptor_browser_launch or interceptor_camoufox_launch"),
        selector: z.string().optional().describe("CSS or XPath selector (e.g. 'button.submit', '//button[@id=\"go\"]')"),
        role: z.string().optional().describe("ARIA role (e.g. 'button', 'link', 'textbox')"),
        name: z.string().optional().describe("Accessible name; used with role (e.g. 'Sign in')"),
        text: z.string().optional().describe("Visible text to match (e.g. 'Accept cookies')"),
        label: z.string().optional().describe("Form-field label text (e.g. 'Email address')"),
        x: z.number().optional().describe("X coordinate fallback when no locator is given"),
        y: z.number().optional().describe("Y coordinate fallback when no locator is given"),
        button: z.enum(["left", "right", "middle"]).optional().default("left")
          .describe("Mouse button (default: left)"),
        click_count: z.number().optional().default(1)
          .describe("Number of clicks (default: 1, use 2 for double-click)"),
        timeout_ms: z.number().optional().default(15000)
          .describe("Max ms to wait for locator to be visible + actionable (default: 15000)"),
      },
      async ({ target_id, selector, role, name, text, label, x, y, button, click_count, timeout_ms }) => {
        try {
          const result = await humanizerEngine.click(target_id, {
            selector,
            role,
            name,
            text,
            label,
            x,
            y,
            button,
            clickCount: click_count,
            timeoutMs: timeout_ms,
          });
          return {
            content: [{
              type: "text",
              text: JSON.stringify({
                status: "success",
                target_id,
                action: "click",
                resolved_by: result.resolvedBy,
                clicked_at: result.clickedAt,
                button,
                click_count,
                stats: { total_ms: result.totalMs, events_dispatched: result.eventsDispatched },
              }),
            }],
          };
        } catch (e) {
          return { content: [{ type: "text", text: JSON.stringify({ status: "error", target_id, action: "click", error: errorToString(e) }) }] };
        }
      },
    );
  • The core click() method on HumanizerEngine that resolves the element via locator (selector, role, text, label) or coordinate fallback, performs the Playwright click, and returns stats
    async click(
      targetId: string,
      opts: ClickTarget & {
        button?: "left" | "right" | "middle";
        clickCount?: number;
        timeoutMs?: number;
      } = {},
    ): Promise<{ totalMs: number; eventsDispatched: number; clickedAt: Point; resolvedBy: string }> {
      const page = await getPageForTarget(targetId);
      const button = opts.button ?? "left";
      const clickCount = opts.clickCount ?? 1;
      const timeout = opts.timeoutMs ?? 15_000;
      const start = Date.now();
      const resolvedBy = resolvedByLabel(opts);
    
      const locator = resolveLocator(page, opts);
      if (locator) {
        await locator.click({ button, clickCount, timeout });
        const box = await locator.boundingBox({ timeout: 5_000 }).catch(() => null);
        const center: Point = box
          ? { x: box.x + box.width / 2, y: box.y + box.height / 2 }
          : { x: 0, y: 0 };
        const state = getMouseState(targetId);
        state.x = center.x;
        state.y = center.y;
        return { totalMs: Date.now() - start, eventsDispatched: 1, clickedAt: center, resolvedBy };
      }
    
      if (opts.x !== undefined && opts.y !== undefined) {
        await page.mouse.click(opts.x, opts.y, { button, clickCount });
        const state = getMouseState(targetId);
        state.x = opts.x;
        state.y = opts.y;
        return {
          totalMs: Date.now() - start,
          eventsDispatched: 1,
          clickedAt: { x: opts.x, y: opts.y },
          resolvedBy,
        };
      }
    
      throw new Error("Provide one of: selector, role (+ name), text, label, or x+y coordinates.");
    }
  • Zod input schema for humanizer_click tool: target_id, selector, role, name, text, label, x, y, button, click_count, timeout_ms
    {
      target_id: z.string().describe("Target ID from interceptor_browser_launch or interceptor_camoufox_launch"),
      selector: z.string().optional().describe("CSS or XPath selector (e.g. 'button.submit', '//button[@id=\"go\"]')"),
      role: z.string().optional().describe("ARIA role (e.g. 'button', 'link', 'textbox')"),
      name: z.string().optional().describe("Accessible name; used with role (e.g. 'Sign in')"),
      text: z.string().optional().describe("Visible text to match (e.g. 'Accept cookies')"),
      label: z.string().optional().describe("Form-field label text (e.g. 'Email address')"),
      x: z.number().optional().describe("X coordinate fallback when no locator is given"),
      y: z.number().optional().describe("Y coordinate fallback when no locator is given"),
      button: z.enum(["left", "right", "middle"]).optional().default("left")
        .describe("Mouse button (default: left)"),
      click_count: z.number().optional().default(1)
        .describe("Number of clicks (default: 1, use 2 for double-click)"),
      timeout_ms: z.number().optional().default(15000)
        .describe("Max ms to wait for locator to be visible + actionable (default: 15000)"),
    },
  • ClickTarget interface and resolveLocator() / resolvedByLabel() helper functions that translate the abstract click options into Playwright locator calls
    export interface ClickTarget {
      selector?: string;
      role?: string;
      name?: string;
      text?: string;
      label?: string;
      x?: number;
      y?: number;
    }
    
    function resolveLocator(page: Page, opts: ClickTarget): Locator | null {
      if (opts.selector) return page.locator(opts.selector);
      if (opts.role) {
        // eslint-disable-next-line @typescript-eslint/no-explicit-any
        return page.getByRole(opts.role as any, opts.name ? { name: opts.name } : undefined);
      }
      if (opts.text) return page.getByText(opts.text);
      if (opts.label) return page.getByLabel(opts.label);
      return null;
    }
    
    function resolvedByLabel(opts: ClickTarget): string {
      if (opts.selector) return "selector";
      if (opts.role) return "role";
      if (opts.text) return "text";
      if (opts.label) return "label";
      return "coords";
    }
  • getPageForTarget() — resolves any target_id (cloakbrowser or camoufox) to a Playwright Page, used by humanizerEngine.click() to get the page instance
    export async function getPageForTarget(targetId: string): Promise<Page> {
      const entry = getEntry(targetId);
      if (isCamoufoxTargetId(targetId)) {
        return ensureCamoufoxPage(entry as CamoufoxEntryWithDriver);
      }
      const browserEntry = entry as BrowserTargetEntry;
      if (browserEntry.page.isClosed()) {
        throw new Error(`Page for browser target '${targetId}' is closed.`);
      }
      return browserEntry.page;
    }
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses key behavioral traits: 'Locator-based calls auto-wait for visible' and the fallback nature of coordinates. With no annotations provided, the description carries the full burden and does so effectively, though it could mention what happens on failure or if multiple locators are provided.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: two sentences with no unnecessary words. It front-loads the core action ('Click an element') and efficiently conveys parameter groupings and fallback behavior.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of 11 parameters and no output schema, the description is reasonably complete. It covers the core mechanic, parameter groupings, and auto-wait. It could be improved by noting error handling or return behavior, but the existing content is sufficient for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds value beyond the input schema by grouping parameters ('Pass one of: selector, role+name, text, label, or raw x+y coords') and explaining mutual exclusivity. It also clarifies the auto-wait behavior tied to locator-based parameters. Since schema coverage is 100%, the baseline is 3, but the description provides additional structural context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Click an element.' It lists multiple ways to specify the target (selector, role+name, text, label, coordinates), leaving no ambiguity about what the tool does. It is distinct from sibling tools like humanizer_move or humanizer_scroll.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly instructs to 'Pass one of' the locator options, making it clear which parameters to use. It also mentions the auto-wait behavior for locator-based calls. However, it does not explicitly state when not to use this tool or provide alternatives for non-click actions, though the tool's name and purpose make this obvious.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/yfe404/proxy-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server