Browser Use Server
by ztobs
# Browser Use Server
[](https://smithery.ai/server/@ztobs/cline-browser-use-mcp)
A Model Context Protocol server for browser automation using Python scripts. For use with Cline
<a href="https://glama.ai/mcp/servers/0aqrsbhx3z"><img width="380" height="200" src="https://glama.ai/mcp/servers/0aqrsbhx3z/badge" alt="Browser Use Server MCP server" /></a>
## Features
### Browser Operations
- `screenshot`: Capture a screenshot of a webpage (full page or viewport)
- `get_html`: Retrieve the HTML content of a webpage
- `execute_js`: Execute JavaScript on a webpage
- `get_console_logs`: Get console logs from a webpage
All operations support custom interaction steps (e.g., clicking elements, scrolling) after page load.
## Prerequisites
1. (Optional but recommended) Install Xvfb for headless browser automation:
```bash
# Ubuntu/Debian
sudo apt-get install xvfb
# CentOS/RHEL
sudo yum install xorg-x11-server-Xvfb
# Arch Linux
sudo pacman -S xorg-server-xvfb
```
Xvfb (X Virtual Frame Buffer) creates a virtual display, allowing browser automation without detection as a bot. Learn more about Xvfb [here](https://www.x.org/releases/X11R7.6/doc/man/man1/Xvfb.1.xhtml).
2. Install Miniconda or Anaconda
3. Create a Conda environment:
```bash
conda create -n browser-use python=3.11
conda activate browser-use
pip install browser-use
```
4. Set up LLM configuration:
The server supports multiple LLM providers. You can use any of the following API keys:
```bash
# Required: Set at least one of these API keys
export GLHF_API_KEY=your_api_key
export GROQ_API_KEY=your_api_key
export OPENAI_API_KEY=your_api_key
export OPENROUTER_API_KEY=your_api_key
export GITHUB_API_KEY=your_api_key
export DEEPSEEK_API_KEY=your_api_key
export GEMINI_API_KEY=your_api_key
export OLLAMA_API_KEY=your_api_key
# Optional: Override default configuration
export MODEL=your_preferred_model # Override the default model
export BASE_URL=your_custom_url # Override the default API endpoint
export USE_VISION=false # Enable/disable vision capabilities (default: false)
```
The server will automatically use the first available API key it finds. You can optionally customize the model and base URL for any provider using the environment variables.
## Installation
### Installing via Smithery
To install Browser Use Server for Claude Desktop automatically via [Smithery](https://smithery.ai/server/@ztobs/cline-browser-use-mcp):
```bash
npx -y @smithery/cli install @ztobs/cline-browser-use-mcp --client claude
```
1. Clone this repository
2. Install dependencies:
```bash
npm install
```
3. Build the server:
```bash
npm run build
```
## MCP Configuration
Add the following configuration to your Cline MCP settings:
```json
"browser-use": {
"command": "node",
"args": [
"/home/YOUR_HOME/Documents/Cline/MCP/browser-use-server/build/index.js"
],
"env": {
// Required: Set at least one API key
"GLHF_API_KEY": "your_api_key",
"GROQ_API_KEY": "your_api_key",
"OPENAI_API_KEY": "your_api_key",
"OPENROUTER_API_KEY": "your_api_key",
"GITHUB_API_KEY": "your_api_key",
"DEEPSEEK_API_KEY": "your_api_key",
"GEMINI_API_KEY": "your_api_key",
"OLLAMA_API_KEY": "your_api_key",
// Optional: Configuration overrides
"MODEL": "your_preferred_model",
"BASE_URL": "your_custom_url",
"USE_VISION": "false"
},
"disabled": false,
"autoApprove": []
}
```
Replace:
- `YOUR_HOME` with your actual home directory name
- `your_api_key` with your actual API keys
## Usage
Run the server:
```bash
node build/index.js
```
The server will be available on stdio and supports the following operations:
### Screenshot
Parameters:
- url: The webpage URL (required)
- full_page: Whether to capture the full page or just the viewport (optional, default: false)
- steps: Comma-separated actions or sentences describing steps to take after page load (optional)
### Get HTML
Parameters:
- url: The webpage URL (required)
- steps: Comma-separated actions or sentences describing steps to take after page load (optional)
### Execute JavaScript
Parameters:
- url: The webpage URL (required)
- script: JavaScript code to execute (required)
- steps: Comma-separated actions or sentences describing steps to take after page load (optional)
### Get Console Logs
Parameters:
- url: The webpage URL (required)
- steps: Comma-separated actions or sentences describing steps to take after page load (optional)
## Example Cline Usage
Here are some example tasks you can accomplish using the browser-use server with Cline:
### Modifying Web Page Elements during Development
To change the color of a heading on a page that requires authentication:
```
Change the colour of the headline with the text "Alle Foren im Überblick." to deep blue on https://localhost:3000/foren/ page
To check/see the page, use browser-use MCP server to:
Open https://localhost:3000/auth,
Login with ztobs:Password123,
Navigate to https://localhost:3000/foren/,
Accept cookies if required
hint: execute all browser actions in one command with multiple comma-separated steps
```
This task demonstrates:
- Multi-step browser automation using comma-separated steps
- Authentication handling
- Cookie acceptance
- DOM manipulation
- CSS styling changes
The server will execute these steps sequentially, handling any required interactions along the way.
## Configuration
### LLM Configuration
The server supports multiple LLM providers with their default configurations:
- GLHF: Uses deepseek-ai/DeepSeek-V3 model
- Ollama: Uses qwen2.5:32b-instruct-q4_K_M model with 32k context window
- Groq: Uses deepseek-r1-distill-llama-70b model
- OpenAI: Uses gpt-4o-mini model
- Openrouter: Uses deepseek/deepseek-chat model
- Github: Uses gpt-4o-mini model
- DeepSeek: Uses deepseek-chat model
- Gemini: Uses gemini-2.0-flash-exp model
You can override these defaults using environment variables:
- `MODEL`: Set a custom model name for any provider
- `BASE_URL`: Set a custom API endpoint URL (if the provider supports it)
### Vision Support
The server supports vision capabilities through the USE_VISION environment variable:
- Set USE_VISION=true to enable vision capabilities for browser operations
- Default is false to optimize performance when vision is not needed
- Useful for tasks that require visual understanding of webpage content
### Xvfb Support
The server automatically detects if Xvfb is installed and:
- Uses xvfb-run when available, enabling better browser automation without bot detection
- Falls back to direct execution when Xvfb is not installed
- Sets RUNNING_UNDER_XVFB environment variable accordingly
### Timeout
Default timeout is 5 minutes (300000 ms). Modify the TIMEOUT constant in `build/index.js` to change this.
## Error Handling
The server provides detailed error messages for:
- Python script execution failures
- Browser operation timeouts
- Invalid parameters
## Debugging
Use the MCP Inspector for debugging:
```bash
npm run inspector
```
## Citation
```
@software{browser_use2024,
author = {Müller, Magnus and Žunič, Gregor},
title = {Browser Use: Enable AI to control your browser},
year = {2024},
publisher = {GitHub},
url = {https://github.com/browser-use/browser-use}
}
```
## License
MIT