BrowserGenie MCP Server

by BrowserGenie

find_element

Locate any element on a webpage using visible text, ARIA role, label, CSS selector, or XPath. Returns a unique CSS selector and element details for interaction.

Instructions

Find an element by visible text, ARIA role, ARIA label, CSS selector, or XPath. Returns a unique CSS selector and element details. Use this when you need to locate an element but don't know its exact selector. Role matching supports both explicit role="..." attributes AND semantic HTML (e.g., <button> matches role "button", <a> matches role "link"). When multiple filters are provided (e.g., role + text), they are combined with AND logic.
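The role fallback and AND logic described above can be illustrated with a small sketch. This is not BrowserGenie's implementation: the `ElementInfo` shape, the `IMPLICIT_ROLES` map, and the case-sensitive substring comparison are all assumptions made for illustration, and the real extension presumably works on live DOM nodes.

```typescript
// Hypothetical, simplified element record (assumption; not the extension's real type).
interface ElementInfo {
  tag: string;        // lowercase tag name, e.g. "button"
  role?: string;      // explicit role="..." attribute, if present
  text: string;       // visible text content
  ariaLabel?: string; // aria-label attribute, if present
}

interface Filters {
  text?: string;
  role?: string;
  ariaLabel?: string;
}

// Deliberately incomplete map of semantic HTML tags to implicit ARIA roles.
// (Strictly, <a> only has the "link" role when it carries an href.)
const IMPLICIT_ROLES: Record<string, string> = {
  button: "button",
  a: "link",
  nav: "navigation",
  main: "main",
};

// An explicit role attribute wins; otherwise fall back to the tag's implicit role.
function effectiveRole(el: ElementInfo): string | undefined {
  return el.role ?? IMPLICIT_ROLES[el.tag];
}

// Every filter that is provided must match (AND logic).
// Substring matching is case-sensitive here, which is an assumption.
function matches(el: ElementInfo, f: Filters): boolean {
  if (f.role !== undefined && effectiveRole(el) !== f.role) return false;
  if (f.text !== undefined && !el.text.includes(f.text)) return false;
  if (f.ariaLabel !== undefined && !(el.ariaLabel ?? "").includes(f.ariaLabel)) return false;
  return true;
}

function findAll(elements: ElementInfo[], f: Filters): ElementInfo[] {
  return elements.filter((el) => matches(el, f));
}

const page: ElementInfo[] = [
  { tag: "button", text: "Save draft" },
  { tag: "div", role: "button", text: "Save and publish" },
  { tag: "a", text: "Save the whales" },
];

// role + text: the <button> and the div with role="button" match; the link does not.
const saveButtons = findAll(page, { role: "button", text: "Save" });
```

Combining role with text narrows the match set, which is why the two are suggested together for precise targeting.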

Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| text | No | Find by visible text content (substring match). Can be combined with role for precise targeting. | |
| role | No | Find by ARIA role (e.g., "button", "link", "navigation", "main"). Falls back to semantic HTML roles if no explicit role attribute is present. | |
| ariaLabel | No | Find by aria-label attribute (substring match) | |
| css | No | CSS selector (most direct; bypasses all other filters) | |
| xpath | No | XPath expression (bypasses all other filters) | |
| nth | No | Select the Nth match (0-indexed), default: 0 | |
| tabId | No | Target tab ID (defaults to currently active tab) | |
| apiKey | No | API key for authentication if enabled | |
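As a rough illustration of how these parameters interact, the sketch below picks a lookup strategy and applies the `nth` default. The `FindElementArgs` fields mirror the table; `resolveStrategy`, `pickNth`, and the `Strategy` type are hypothetical names, and giving `css` precedence over `xpath` when both are supplied is an assumption, since the page does not say which wins.

```typescript
// Mirrors the documented input parameters; all are optional.
interface FindElementArgs {
  text?: string;
  role?: string;
  ariaLabel?: string;
  css?: string;
  xpath?: string;
  nth?: number;
  tabId?: number;
  apiKey?: string;
}

// Hypothetical description of which lookup path would be used.
type Strategy =
  | { kind: "css"; selector: string }
  | { kind: "xpath"; expression: string }
  | {
      kind: "filters";
      text: string | undefined;
      role: string | undefined;
      ariaLabel: string | undefined;
    };

// css and xpath bypass the text/role/ariaLabel filters.
// Assumption: css is checked before xpath when both are given.
function resolveStrategy(args: FindElementArgs): Strategy {
  if (args.css !== undefined) return { kind: "css", selector: args.css };
  if (args.xpath !== undefined) return { kind: "xpath", expression: args.xpath };
  return { kind: "filters", text: args.text, role: args.role, ariaLabel: args.ariaLabel };
}

// nth is 0-indexed and defaults to 0; an out-of-range index yields undefined.
function pickNth<T>(matches: T[], nth: number = 0): T | undefined {
  return matches[nth];
}

// Example: role is ignored here because css bypasses the other filters.
const strategy = resolveStrategy({ css: "#submit", role: "button" });
const second = pickNth(["first match", "second match"], 1);
```

Under these assumptions, passing `css` together with `role` or `text` has no narrowing effect, so a caller who already has a selector can omit the other filters.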

Implementation Reference

  • Handler function for the find_element tool. It sends a 'find_element' command via the WebSocket bridge with params (text, role, ariaLabel, css, xpath, nth) and returns the result.
      async ({ text, role, ariaLabel, css, xpath, nth, tabId, apiKey }) => {
        const result = await bridge.sendCommand({
          command: 'find_element',
          params: { text, role, ariaLabel, css, xpath, nth },
          tabId,
          apiKey,
          timeout: LONG_TIMEOUT,
        });
        if (!result.success) {
          return { content: [{ type: 'text', text: `Error: ${result.error?.message}` }], isError: true };
        }
        return { content: [{ type: 'text', text: JSON.stringify(result.data, null, 2) }] };
      }
  • Zod schema defining the input parameters for the find_element tool: text, role, ariaLabel, css, xpath (all optional strings), nth (optional number), tabId (optional number), apiKey (optional string).
    {
      text: z.string().optional().describe('Find by visible text content (substring match). Can be combined with role for precise targeting.'),
      role: z.string().optional().describe('Find by ARIA role (e.g., "button", "link", "navigation", "main"). Falls back to semantic HTML roles if no explicit role attribute is present.'),
      ariaLabel: z.string().optional().describe('Find by aria-label attribute (substring match)'),
      css: z.string().optional().describe('CSS selector (most direct — bypasses all other filters)'),
      xpath: z.string().optional().describe('XPath expression (bypasses all other filters)'),
      nth: z.number().optional().describe('Select the Nth match (0-indexed), default: 0'),
      tabId: z.number().optional().describe('Target tab ID (defaults to currently active tab)'),
      apiKey: z.string().optional().describe('API key for authentication if enabled'),
    },
  • Registration of the find_element tool on the MCP server via server.tool() inside registerElementTools(). registerElementTools() is itself called in src/tools/index.ts (line 47).
    export function registerElementTools(server: McpServer, bridge: WebSocketBridge) {
      server.tool(
        'find_element',
        'Find an element by visible text, ARIA role, ARIA label, CSS selector, or XPath. Returns a unique CSS selector and element details. Use this when you need to locate an element but don\'t know its exact selector. Role matching supports both explicit role="..." attributes AND semantic HTML (e.g., <button> matches role "button", <a> matches role "link"). When multiple filters are provided (e.g., role + text), they are combined with AND logic.',
        {
          text: z.string().optional().describe('Find by visible text content (substring match). Can be combined with role for precise targeting.'),
          role: z.string().optional().describe('Find by ARIA role (e.g., "button", "link", "navigation", "main"). Falls back to semantic HTML roles if no explicit role attribute is present.'),
          ariaLabel: z.string().optional().describe('Find by aria-label attribute (substring match)'),
          css: z.string().optional().describe('CSS selector (most direct — bypasses all other filters)'),
          xpath: z.string().optional().describe('XPath expression (bypasses all other filters)'),
          nth: z.number().optional().describe('Select the Nth match (0-indexed), default: 0'),
          tabId: z.number().optional().describe('Target tab ID (defaults to currently active tab)'),
          apiKey: z.string().optional().describe('API key for authentication if enabled'),
        },
        async ({ text, role, ariaLabel, css, xpath, nth, tabId, apiKey }) => {
          const result = await bridge.sendCommand({
            command: 'find_element',
            params: { text, role, ariaLabel, css, xpath, nth },
            tabId,
            apiKey,
            timeout: LONG_TIMEOUT,
          });
          if (!result.success) {
            return { content: [{ type: 'text', text: `Error: ${result.error?.message}` }], isError: true };
          }
          return { content: [{ type: 'text', text: JSON.stringify(result.data, null, 2) }] };
        }
      );
    }
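One detail of the handler shown above: when `result.success` is false and `result.error` is absent, the template literal renders as "Error: undefined". The sketch below shows one way the result mapping could be factored out with a fallback message. The `BridgeResult` and `ToolResult` shapes are inferred from how the handler uses them, and `toToolResult` and the "Unknown error" text are hypothetical, not part of the BrowserGenie source.

```typescript
// Assumed shape of what bridge.sendCommand() resolves to, inferred from the handler.
interface BridgeResult {
  success: boolean;
  data?: unknown;
  error?: { message?: string };
}

// Minimal shape of the tool result returned by the handler.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Hypothetical helper: same mapping as the handler, plus a fallback message
// and a null fallback so JSON.stringify never receives undefined.
function toToolResult(result: BridgeResult): ToolResult {
  if (!result.success) {
    const message = result.error?.message ?? "Unknown error";
    return { content: [{ type: "text", text: `Error: ${message}` }], isError: true };
  }
  return {
    content: [{ type: "text", text: JSON.stringify(result.data ?? null, null, 2) }],
  };
}

const failure = toToolResult({ success: false });
const success = toToolResult({ success: true, data: { a: 1 } });
```

With a helper like this, the handler body would reduce to awaiting `bridge.sendCommand(...)` and returning `toToolResult(result)`.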
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description explains that role matching supports both explicit ARIA roles and semantic HTML and that multiple filters are combined with AND logic. The css/xpath bypass behavior is stated only in the parameter descriptions. However, nothing describes error handling or what happens if no element is found.
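Because failure behavior is undocumented, a caller may want to handle it defensively. The following is a minimal caller-side sketch based only on what the handler visibly returns: a text item holding either pretty-printed JSON or an "Error: ..." string with `isError` set. The names `CallToolResultLike`, `Lookup`, and `readFindElementResult` are hypothetical, and the "Element not found" message in the example is invented for illustration. Whether a missing element is reported as an error or as a successful but empty result is not stated on this page.

```typescript
// Loose shape of a tool call result, as produced by the handler in the implementation reference.
interface CallToolResultLike {
  content: { type: string; text?: string }[];
  isError?: boolean;
}

type Lookup =
  | { ok: true; data: unknown }
  | { ok: false; message: string };

// Hypothetical helper: turn the raw tool result into a tagged outcome.
function readFindElementResult(result: CallToolResultLike): Lookup {
  const text = result.content.find((c) => c.type === "text")?.text ?? "";
  if (result.isError) {
    // The handler prefixes failures with "Error: "; strip it for display.
    return { ok: false, message: text.replace(/^Error: /, "") };
  }
  try {
    // Successful results are JSON.stringify(result.data, null, 2).
    return { ok: true, data: JSON.parse(text) as unknown };
  } catch {
    return { ok: false, message: `Unparseable result: ${text}` };
  }
}

// Illustrative inputs; the error message is made up for this example.
const notFound = readFindElementResult({
  content: [{ type: "text", text: "Error: Element not found" }],
  isError: true,
});
const found = readFindElementResult({
  content: [{ type: "text", text: '{ "matches": 1 }' }],
});
```

The parsed `data` is left as `unknown` on purpose, since the tool publishes no output schema.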

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is five short sentences, each serving a clear purpose: stating the function and return value, providing usage guidance, and detailing behavioral nuances. There is no fluff, and key information is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 8 parameters, no output schema, and no annotations, the description covers the core purpose, key behaviors, and parameter interactions. It could mention the return format or that the returned selector can be used with other tools, but it is largely complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema already describes every parameter, including the bypass behavior of css and xpath. The description adds value by giving concrete examples of the semantic HTML fallback for role (<button>, <a>) and by stating that multiple filters are combined with AND logic, which is not in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool finds an element by various methods (visible text, ARIA role, etc.) and returns a unique CSS selector and details. It distinguishes itself from sibling tools like click_element or hover_element that use the found element.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Use this when you need to locate an element but don't know its exact selector.' It also explains AND logic for multiple filters. However, it does not say when passing css or xpath alone is preferable, or when to use other tools such as query_shadow_dom instead.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/BrowserGenie/mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.