🌐 atlas-browser-mcp

Visual web browsing for AI agents via Model Context Protocol (MCP).

✨ Features

  • 📸 Visual-First: Navigate the web through screenshots, not DOM parsing

  • 🏷️ Set-of-Mark: Interactive elements labeled with clickable [0], [1], [2]... markers

  • 🎭 Humanized: Bezier curve mouse movements, natural typing rhythms

  • 🧩 CAPTCHA-Ready: Multi-click support for image selection challenges

  • 🛡️ Anti-Detection: Built-in measures to avoid bot detection
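
To illustrate what "Bezier curve mouse movements" means in practice, here is a minimal sketch of a curved pointer path. It is not the package's own implementation: the function name `bezier_path` and the parameters `steps`, `spread`, and `rng` are hypothetical, and the control-point placement is an assumption chosen for simplicity.

```python
import random


def bezier_path(start, end, steps=20, spread=100.0, rng=None):
    """Return steps + 1 points along a cubic Bezier curve from start to end."""
    rng = rng or random.Random()
    (x0, y0), (x3, y3) = start, end
    # Two randomly offset control points bend the path away from a straight line.
    x1 = x0 + (x3 - x0) / 3 + rng.uniform(-spread, spread)
    y1 = y0 + (y3 - y0) / 3 + rng.uniform(-spread, spread)
    x2 = x0 + 2 * (x3 - x0) / 3 + rng.uniform(-spread, spread)
    y2 = y0 + 2 * (y3 - y0) / 3 + rng.uniform(-spread, spread)
    points = []
    for i in range(steps + 1):
        t = i / steps
        u = 1 - t
        # Standard cubic Bezier formula, evaluated per coordinate.
        x = u**3 * x0 + 3 * u**2 * t * x1 + 3 * u * t**2 * x2 + t**3 * x3
        y = u**3 * y0 + 3 * u**2 * t * y1 + 3 * u * t**2 * y2 + t**3 * y3
        points.append((x, y))
    return points
```

A caller would move the pointer through the returned points in order, ideally with small, varying delays between them.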

🚀 Quick Start

Installation

pip install atlas-browser-mcp
playwright install chromium

Use with Claude Desktop

Add to your Claude Desktop config (claude_desktop_config.json):

{
  "mcpServers": {
    "browser": {
      "command": "atlas-browser-mcp"
    }
  }
}
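
If the config file already lists other servers, editing it by hand risks overwriting them. The sketch below merges the entry shown above into an existing file. The helper name `add_mcp_server` is hypothetical, and the path to claude_desktop_config.json is left to the caller because it differs between systems.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, command: str) -> dict:
    """Add (or overwrite) one entry under "mcpServers", keeping all other keys."""
    config = {}
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    # Create the "mcpServers" section if it is missing, then set the entry.
    config.setdefault("mcpServers", {})[name] = {"command": command}
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling `add_mcp_server(path, "browser", "atlas-browser-mcp")` produces the same entry as the JSON above. Restart Claude Desktop afterwards so it rereads the file.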

Then ask Claude:

"Navigate to https://news.ycombinator.com and tell me the top 3 stories"

🛠️ Available Tools

Tool          Description
navigate      Go to URL, returns labeled screenshot
screenshot    Capture current page with labels
click         Click element by label ID [N]
multi_click   Click multiple elements (for CAPTCHA)
type          Type text, optionally press Enter
scroll        Scroll page up or down
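
MCP clients invoke tools like these by sending JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request. The helper `tool_call` is hypothetical, and the argument names (`url`, `label_id`) are taken from the usage examples in this README rather than from the server's published input schema, so treat them as assumptions.

```python
import json
from itertools import count

# Each JSON-RPC request needs its own id so responses can be matched to it.
_request_ids = count(1)


def tool_call(name: str, **arguments) -> str:
    """Serialize a JSON-RPC 2.0 "tools/call" request for one tool."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


print(tool_call("navigate", url="https://news.ycombinator.com"))
print(tool_call("click", label_id=3))
```

In normal use the MCP client (Claude Desktop, Cline) builds these requests for you; the sketch only shows what travels over the wire.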

📖 Usage Examples

Basic Navigation

User: Go to google.com
AI: [calls navigate(url="https://google.com")]
AI: I see the Google homepage. The search box is labeled [3].

User: Search for "MCP protocol"
AI: [calls click(label_id=3)]
AI: [calls type(text="MCP protocol", submit=true)]
AI: Here are the search results...

CAPTCHA Handling

User: Select all images with traffic lights
AI: [Looking at the CAPTCHA grid]
AI: I can see traffic lights in images [2], [5], and [8].
AI: [calls multi_click(label_ids=[2, 5, 8])]

πŸ”§ Configuration

Headless Mode

For servers without display:

from atlas_browser_mcp.browser import VisualBrowser

browser = VisualBrowser(
    headless=True,   # No visible browser window
    humanize=False   # Faster, less human-like
)

Custom Viewport

browser = VisualBrowser()
browser.VIEWPORT = {"width": 1920, "height": 1080}

πŸ—οΈ How It Works

  1. Navigate: Browser loads the page

  2. Inject SoM: JavaScript labels all interactive elements

  3. Screenshot: Capture the labeled page

  4. AI Sees: The screenshot shows [0], [1], [2]... on buttons, links, inputs

  5. AI Acts: "Click [5]" → Browser clicks the element at that position

  6. Repeat: New screenshot with updated labels

┌─────────────────────────────────────┐
│  [0] Logo    [1] Search   [2] Menu  │
│                                     │
│  [3] Article Title                  │
│  [4] Read More                      │
│                                     │
│  [5] Subscribe    [6] Share         │
└─────────────────────────────────────┘
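
The labeling and click steps above can be sketched in plain Python. This is an illustration of the Set-of-Mark idea, not the package's code: the `Element` record and both helper functions are hypothetical, and numbering elements in reading order is an assumption, since the actual ordering is not documented here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """Hypothetical record for one interactive element on the page."""
    name: str
    x: float       # left edge, in viewport pixels
    y: float       # top edge, in viewport pixels
    width: float
    height: float


def assign_labels(elements: list[Element]) -> dict[int, Element]:
    """Number elements in reading order: top to bottom, then left to right."""
    ordered = sorted(elements, key=lambda e: (e.y, e.x))
    return dict(enumerate(ordered))


def click_point(labels: dict[int, Element], label_id: int) -> tuple[float, float]:
    """Resolve a label such as [2] to the center of its bounding box."""
    e = labels[label_id]
    return (e.x + e.width / 2, e.y + e.height / 2)


page = [
    Element("Search", x=200, y=10, width=100, height=30),
    Element("Logo", x=10, y=10, width=80, height=30),
    Element("Subscribe", x=10, y=300, width=120, height=40),
]
labels = assign_labels(page)
print(labels[0].name)          # Logo
print(click_point(labels, 2))  # (70.0, 320.0)
```

Because labels are reassigned on every screenshot, a label ID is only meaningful for the most recent screenshot.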

🤝 Integration

With Cline (VS Code)

{
  "mcpServers": {
    "browser": {
      "command": "atlas-browser-mcp"
    }
  }
}

Programmatic Use

from atlas_browser_mcp.browser import VisualBrowser

browser = VisualBrowser()

# Navigate
result = browser.execute("navigate", url="https://example.com")
print(f"Page title: {result.data['title']}")
print(f"Found {result.data['element_count']} interactive elements")

# Click element [0]
result = browser.execute("click", label_id=0)

# Type in focused field
result = browser.execute("type", text="Hello world", submit=True)

# Cleanup
browser.execute("close")

📋 Requirements

  • Python 3.10+

  • Playwright with Chromium

πŸ› Troubleshooting

"Playwright not installed"

pip install playwright
playwright install chromium

"Browser closed unexpectedly"

Try running with headless=False to see what's happening:

browser = VisualBrowser(headless=False)

Elements not being detected

Some dynamic pages need more wait time. The browser waits 1.5s after navigation, but complex SPAs may need longer.
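
One way to give a slow page more time is to poll until elements appear. The helper below is a generic sketch, not part of the package: `wait_for_elements` and its parameters are hypothetical names, and it takes any zero-argument callable that returns the current element count.

```python
import time
from typing import Callable


def wait_for_elements(
    count_elements: Callable[[], int],
    timeout: float = 10.0,
    interval: float = 0.5,
) -> int:
    """Poll until at least one element is reported or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        found = count_elements()
        # Stop as soon as something is found, or once time has run out.
        if found > 0 or time.monotonic() >= deadline:
            return found
        time.sleep(interval)
```

With the programmatic API shown earlier, the callable could be `lambda: browser.execute("screenshot").data["element_count"]`. This assumes the screenshot result exposes `element_count` the same way the navigate result does, which this README does not confirm.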

πŸ“„ License

MIT License - see LICENSE

🙏 Credits

Built for Atlas, an autonomous AI agent.

Inspired by:
