navigate

Navigate to a specified URL and capture a screenshot with interactive elements clearly labeled for automated web interaction and visual analysis.

Instructions

Navigate to a URL and return a screenshot with labeled interactive elements

Input Schema

Name    Required    Description               Default
url     Yes         The URL to navigate to
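
As a rough illustration of this schema, an MCP client invokes the tool with a JSON-RPC 2.0 `tools/call` request whose `arguments` object carries the single required `url` string. The sketch below only builds that payload. The helper name `build_navigate_request` and the example URL are hypothetical, and the non-empty check is an added assumption that goes slightly beyond the schema.

```python
import json


def build_navigate_request(url: str, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 `tools/call` payload for the `navigate` tool.

    Hypothetical helper for illustration; it is not part of the server code.
    """
    # The schema requires `url` to be a string; rejecting an empty string
    # is an extra assumption made here.
    if not isinstance(url, str) or not url:
        raise ValueError("'url' is required and must be a non-empty string")

    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "navigate",
            "arguments": {"url": url},
        },
    }
    return json.dumps(payload)


if __name__ == "__main__":
    # Example URL chosen for illustration only.
    print(build_navigate_request("https://example.com"))
```

How the payload is delivered (stdio, HTTP, or another transport) depends on how the server is launched and is not shown here.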

Implementation Reference

  • The core _navigate method implementation that navigates to a URL using Playwright and returns an observation with labeled interactive elements.
    def _navigate(self, url: str = None, **_) -> BrowserResult:
        """Navigate to URL and return observation"""
        if not url:
            return BrowserResult(success=False, error="URL required")
        
        self._ensure_browser()
        
        try:
            self._page.goto(url, timeout=30000, wait_until="domcontentloaded")
            self._page.wait_for_timeout(1500)
            return self._observe()
            
        except Exception as e:
            return BrowserResult(
                success=False,
                error=f"Navigation failed: {str(e)}"
            )
  • Tool registration for 'navigate' with schema definition requiring a 'url' parameter of type string.
    Tool(
        name="navigate",
        description="Navigate to a URL and return a screenshot with labeled interactive elements",
        inputSchema={
            "type": "object",
            "properties": {
                "url": {
                    "type": "string",
                    "description": "The URL to navigate to"
                }
            },
            "required": ["url"]
        }
    ),
  • The call_tool handler for 'navigate' that extracts the URL argument and calls browser.execute with action='navigate'.
    if name == "navigate":
        result = await asyncio.to_thread(
            browser.execute, 
            action="navigate", 
            url=arguments.get("url")
        )
  • The execute method that maps the 'navigate' action string to the _navigate method handler through its actions dictionary.
    def execute(self, action: str, **kwargs) -> BrowserResult:
        """Execute a browser action"""
        if not _playwright_available:
            return BrowserResult(
                success=False,
                error="Playwright not installed. Run: pip install playwright && playwright install chromium"
            )
        
        actions = {
            "navigate": self._navigate,
            "observe": self._observe,
            "click": self._click,
            "multi_click": self._multi_click,
            "type": self._type,
            "scroll": self._scroll,
            "close": self._close
        }
        
        handler = actions.get(action)
        if not handler:
            return BrowserResult(
                success=False,
                error=f"Unknown action: {action}. Available: {list(actions.keys())}"
            )
        
        try:
            return handler(**kwargs)
        except Exception as e:
            return BrowserResult(
                success=False,
                error=f"Browser error: {str(e)}"
            )
  • The _observe helper method called after navigation to capture screenshot and inject Set-of-Mark labels for interactive elements.
    def _observe(self, **_) -> BrowserResult:
        """Get visual observation of current page"""
        if self._page is None:
            return BrowserResult(success=False, error="No page open. Use navigate first.")
        
        try:
            self._page.wait_for_timeout(500)
            
            elements = self._page.evaluate(self.SOM_INJECT_SCRIPT)
            
            self._element_map = {}
            for el in elements:
                self._element_map[el['id']] = {
                    'x': el['x'],
                    'y': el['y'],
                    'width': el['width'],
                    'height': el['height'],
                    'tag': el['tag'],
                    'type': el['type'],
                    'text': el['text']
                }
            
            screenshot_bytes = self._page.screenshot(
                type="jpeg",
                quality=self.SCREENSHOT_QUALITY
            )
            screenshot_base64 = base64.b64encode(screenshot_bytes).decode('utf-8')
            
            elements_for_llm = []
            for el in elements:
                element_info = {
                    'id': el['id'],
                    'tag': el['tag'],
                }
                if el['text']:
                    element_info['text'] = el['text']
                if el['type']:
                    element_info['type'] = el['type']
                elements_for_llm.append(element_info)
            
            return BrowserResult(
                success=True,
                data={
                    'url': self._page.url,
                    'title': self._page.title(),
                    'screenshot': screenshot_base64,
                    'elements': elements_for_llm,
                    'element_count': len(elements)
                },
                metadata={'has_image': True}
            )
            
        except Exception as e:
            return BrowserResult(
                success=False,
                error=f"Observation failed: {str(e)}"
            )
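
The excerpts above construct `BrowserResult` but do not show its definition. The following is a minimal sketch, assuming it is a plain dataclass with the four keyword arguments used in the listing (`success`, `data`, `error`, `metadata`). The helpers `decode_screenshot` and `describe_elements` are hypothetical and only illustrate how a caller might consume the dictionary returned by `_observe`.

```python
import base64
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class BrowserResult:
    """Assumed shape, inferred from the keyword arguments used in the listing."""
    success: bool
    data: Optional[Dict[str, Any]] = None
    error: Optional[str] = None
    metadata: Dict[str, Any] = field(default_factory=dict)


def decode_screenshot(result: BrowserResult) -> bytes:
    """Return the raw JPEG bytes from a successful observation (hypothetical helper)."""
    if not result.success or not result.data:
        raise ValueError(result.error or "no observation data")
    # `_observe` stores the screenshot as a base64-encoded UTF-8 string.
    return base64.b64decode(result.data["screenshot"])


def describe_elements(result: BrowserResult) -> List[str]:
    """Format each labeled element as '[id] <tag> text-or-type' (hypothetical helper)."""
    lines = []
    for el in (result.data or {}).get("elements", []):
        # `text` and `type` are optional: `_observe` omits them when empty.
        label = el.get("text") or el.get("type") or ""
        lines.append(f"[{el['id']}] <{el['tag']}> {label}".rstrip())
    return lines


if __name__ == "__main__":
    # Fabricated sample data, for illustration only.
    sample = BrowserResult(
        success=True,
        data={
            "screenshot": base64.b64encode(b"\xff\xd8demo").decode("utf-8"),
            "elements": [{"id": 1, "tag": "a", "text": "Home"}],
        },
        metadata={"has_image": True},
    )
    print(len(decode_screenshot(sample)), describe_elements(sample))
```

Keeping failures as `BrowserResult(success=False, error=...)` values rather than raised exceptions, as the listing does, lets the tool handler return a uniform structure to the client regardless of outcome.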
MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/LingTravel/Atlas-Browser'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.