Glama

click

Click labeled elements in web screenshots to enable AI agents to navigate websites visually. This tool uses numeric labels for precise interaction with page elements.

Instructions

Click on an element by its label ID (shown as [N] on the screenshot)
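As a rough illustration of what "clicking by label ID" involves, the sketch below resolves a label to coordinates through a lookup table. The `resolve_label` helper and the sample map are hypothetical; only the shape of the map entries (`x`, `y`, `width`, `height`) and the wording of the error message are taken from the handler in the Implementation Reference.

```python
from typing import Dict, Tuple

# Hypothetical element map: label ID -> element geometry in page pixels.
ElementMap = Dict[int, Dict[str, int]]


def resolve_label(element_map: ElementMap, label_id: int) -> Tuple[int, int]:
    """Return the (x, y) stored for a label, or raise if the label is unknown."""
    if label_id not in element_map:
        # List a few valid labels so the caller can retry with a real one.
        available = list(element_map)[:10]
        raise KeyError(f"Label [{label_id}] not found. Available: {available}")
    element = element_map[label_id]
    return element["x"], element["y"]
```

Reporting the first few available labels on failure gives an agent something concrete to retry with instead of a bare "not found".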

Input Schema

Name     | Required | Description                                              | Default
label_id | Yes      | The numeric label shown on the element (e.g., 5 for [5]) |
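A caller might check its arguments against this schema before invoking the tool. The sketch below hand-rolls the two checks the schema implies (presence and integer type). The `validate_click_args` helper is hypothetical and not part of Atlas-Browser; a real client would more likely use a general JSON Schema validator.

```python
from typing import Any, Dict, Optional

# The click tool's input schema, reduced to the fields the checks below need.
CLICK_SCHEMA = {
    "type": "object",
    "properties": {"label_id": {"type": "integer"}},
    "required": ["label_id"],
}


def validate_click_args(arguments: Dict[str, Any]) -> Optional[str]:
    """Return an error message, or None if the arguments satisfy the schema.

    Hypothetical helper: it covers only the 'required' and 'integer' checks.
    """
    for name in CLICK_SCHEMA["required"]:
        if name not in arguments:
            return f"missing required argument: {name}"
    value = arguments["label_id"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        return "label_id must be an integer"
    return None
```

For example, `validate_click_args({"label_id": 5})` returns `None`, while passing the string `"5"` produces an error message.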

Implementation Reference

  • Main handler implementation for the click tool. Validates the label_id, retrieves element coordinates from the element_map, removes visual labels, performs a humanized click using _human_click_at, waits for page load, and returns a new observation via _observe().
    def _click(self, label_id: int = None, **_) -> BrowserResult:
        """Click element by label ID"""
        if label_id is None:
            return BrowserResult(success=False, error="label_id required")
        
        if label_id not in self._element_map:
            available = list(self._element_map.keys())[:10]
            return BrowserResult(
                success=False, 
                error=f"Label [{label_id}] not found. Available: {available}..."
            )
        
        if self._page is None:
            return BrowserResult(success=False, error="No page open")
        
        element = self._element_map[label_id]
        
        try:
            self._page.evaluate("() => document.querySelectorAll('.atlas-som-label').forEach(el => el.remove())")
            
            self._human_click_at(
                element['x'], 
                element['y'], 
                element['width'], 
                element['height']
            )
            
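            try:
                # Best effort: give the page up to 3 s to settle after the click.
                self._page.wait_for_load_state("networkidle", timeout=3000)
            except Exception:
                # A timeout here is not fatal; many pages never reach network idle.
                pass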
            
            self._page.wait_for_timeout(500)
            return self._observe()
            
        except Exception as e:
            return BrowserResult(
                success=False,
                error=f"Click failed: {str(e)}"
            )
  • Tool schema definition for the click tool. Defines the tool name, description, and input schema with a required label_id parameter (integer type).
    Tool(
        name="click",
        description="Click on an element by its label ID (shown as [N] on the screenshot)",
        inputSchema={
            "type": "object",
            "properties": {
                "label_id": {
                    "type": "integer",
                    "description": "The numeric label shown on the element (e.g., 5 for [5])"
                }
            },
            "required": ["label_id"]
        }
    )
  • Tool dispatch for the click tool. A call named "click" is forwarded to browser.execute on a worker thread via asyncio.to_thread, with action="click" and the label_id taken from the call arguments.
    elif name == "click":
        result = await asyncio.to_thread(
            browser.execute,
            action="click",
            label_id=arguments.get("label_id")
        )
  • Action table that maps each action string to its handler method; "click" maps to _click.
    actions = {
        "navigate": self._navigate,
        "observe": self._observe,
        "click": self._click,
        "multi_click": self._multi_click,
        "type": self._type,
        "scroll": self._scroll,
        "close": self._close
    }
  • Helper function _human_click_at that performs humanized clicking with random offsets from the center position, human-like mouse movement trajectories, and realistic timing delays between mouse down/up actions.
    def _human_click_at(self, x: int, y: int, width: int, height: int):
        """Humanized click at position"""
        if self._humanize:
            max_offset_x = min(10, width * 0.15)
            max_offset_y = min(10, height * 0.15)
            offset_x = random.uniform(-max_offset_x, max_offset_x)
            offset_y = random.uniform(-max_offset_y, max_offset_y)
        else:
            offset_x = 0
            offset_y = 0
        
        target_x = int(x + offset_x)
        target_y = int(y + offset_y)
        
        self._human_move((target_x, target_y))
        
        if self._humanize:
            time.sleep(random.uniform(0.1, 0.3))
        
        self._page.mouse.down()
        if self._humanize:
            time.sleep(random.uniform(0.05, 0.12))
        self._page.mouse.up()
        
        if self._humanize:
            time.sleep(random.uniform(0.1, 0.25))
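The offset logic in _human_click_at can be isolated as a pure function, which makes the clamping easier to see and to test. This is a minimal sketch, not the project's code: `click_target` and its `rng` parameter are hypothetical, and it assumes (x, y) is the element's center, as the helper's description suggests.

```python
import random
from typing import Optional, Tuple


def click_target(
    x: float,
    y: float,
    width: float,
    height: float,
    humanize: bool = True,
    rng: Optional[random.Random] = None,
) -> Tuple[int, int]:
    """Return the pixel to click for an element centered at (x, y)."""
    if not humanize:
        return int(x), int(y)
    rng = rng or random.Random()
    # Jitter by at most 15% of each dimension, capped at 10 px, so the
    # click stays close to the center even on large elements.
    max_dx = min(10, width * 0.15)
    max_dy = min(10, height * 0.15)
    return (
        int(x + rng.uniform(-max_dx, max_dx)),
        int(y + rng.uniform(-max_dy, max_dy)),
    )
```

With a 20 px wide element the horizontal jitter is limited to 3 px either side of the center, and for anything wider than about 67 px the 10 px cap takes over. Passing a seeded `random.Random` makes the result reproducible in tests.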

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/LingTravel/Atlas-Browser'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.