---
title: "Agent Settings"
description: "Learn how to configure the agent"
icon: "gear"
---
## Overview
The `Agent` class is the core component of Browser Use that handles browser automation. This page describes the main configuration options available when initializing an agent.
## Basic Settings
```python
from browser_use import Agent
from browser_use.llm import ChatOpenAI

agent = Agent(
    task="Search for latest news about AI",
    llm=ChatOpenAI(model="gpt-4o"),
)
```
### Required Parameters
- `task`: The instruction for the agent to execute
- `llm`: A chat model instance. See <a href="/customize/supported-models">Supported Models</a> for the available options.
## Agent Behavior
Control how the agent operates:
```python
agent = Agent(
    task="your task",
    llm=llm,
    controller=custom_controller,  # For custom tool calling
    use_vision=True,  # Enable vision capabilities
    save_conversation_path="logs/conversation",  # Save chat logs
)
```
### Behavior Parameters
- `controller`: Registry of functions the agent can call. Defaults to base Controller. See <a href="/customize/custom-functions">Custom Functions</a> for details.
- `use_vision`: Enable/disable vision capabilities. Defaults to `True`.
  - When enabled, the model processes visual information from web pages.
  - Disable it to reduce costs or to use models without vision support.
  - For GPT-4o, image processing costs approximately 800-1000 tokens (~$0.002 USD) per image, depending on the configured screen size.
- `vision_detail_level`: Controls the detail level of screenshots sent to the vision model. Can be `'low'`, `'high'`, or `'auto'` (default). Using `'low'` can significantly reduce token consumption and cost for simpler visual tasks, while `'high'` provides more detail for complex visual analysis.
- `save_conversation_path`: Path to save the complete conversation history. Useful for debugging.
- `override_system_message`: Completely replace the default system prompt with a custom one.
- `extend_system_message`: Append extra instructions to the default system prompt.
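One way to keep these settings in one place is to collect them in a plain dictionary and unpack it into `Agent(...)`. The sketch below is only an illustration: `build_agent_kwargs` and its `low_cost` flag are hypothetical names that are not part of Browser Use, and the prompt text is a placeholder.

```python
from typing import Any


def build_agent_kwargs(task: str, llm: Any, low_cost: bool = False) -> dict[str, Any]:
    """Assemble keyword arguments for Agent(...) from the parameters listed above.

    Hypothetical helper for illustration; not part of browser_use.
    """
    kwargs: dict[str, Any] = {
        "task": task,
        "llm": llm,
        "use_vision": True,  # documented default
        "vision_detail_level": "auto",  # documented default
        "extend_system_message": "Prefer official sources.",  # placeholder text
    }
    if low_cost:
        # 'low' detail reduces token consumption, per the parameter description.
        kwargs["vision_detail_level"] = "low"
    return kwargs


# Usage (requires browser_use and a configured chat model):
#   from browser_use import Agent
#   agent = Agent(**build_agent_kwargs("your task", llm, low_cost=True))
```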
<Note>
Vision capabilities are recommended for better web interaction understanding,
but can be disabled to reduce costs or when using models without vision
support.
</Note>
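To get a feel for the cost side of this trade-off, the per-image figure quoted above can be multiplied out over a run. This is back-of-the-envelope arithmetic under two assumptions that the text does not state: one screenshot is sent per step, and the ~$0.002 per-image estimate holds for your screen size. `estimate_vision_cost_usd` is a hypothetical helper, not part of Browser Use.

```python
def estimate_vision_cost_usd(steps: int, cost_per_image_usd: float = 0.002) -> float:
    """Rough screenshot cost for a run, assuming one image per step."""
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return round(steps * cost_per_image_usd, 4)


# Upper bound for a run that uses all 100 steps, under the assumptions above.
print(estimate_vision_cost_usd(100))
```

Under these assumptions a full 100-step run adds roughly $0.20 in image tokens, on top of the text tokens for each step.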
### Reuse Existing Browser Context
By default, browser-use launches its own built-in browser using Playwright Chromium.
You can also connect to a remote browser or pass any of the following
existing objects to the `Agent`: `page`, `browser_context`, `browser`, `browser_session`, or `browser_profile`.
These are all passed down to create a `BrowserSession` for the `Agent`:
```python
agent = Agent(
    task='book a flight to fiji',
    llm=llm,
    browser_profile=browser_profile,  # use this profile to create a BrowserSession
    browser_session=BrowserSession(  # use an existing BrowserSession
        cdp_url=...,  # remote CDP browser to connect to
        # or
        wss_url=...,  # remote wss playwright server provider
        # or
        browser_pid=...,  # pid of a locally running browser process to attach to
        # or
        executable_path=...,  # provide a custom chrome binary path
        # or
        channel=...,  # specify chrome, chromium, ms-edge, etc.
        # or
        page=page,  # use an existing playwright Page object
        # or
        browser_context=browser_context,  # use an existing playwright BrowserContext object
        # or
        browser=browser,  # use an existing playwright Browser object
    ),
)
```
For example, to connect to an existing browser over CDP:
```python
agent = Agent(
    ...
    browser_session=BrowserSession(cdp_url='http://localhost:9222'),
)
```
Or, to attach to a locally running Chrome instance by its process ID:
```python
agent = Agent(
    ...
    browser_session=BrowserSession(browser_pid=1234),
)
```
See <a href="/customize/real-browser">Connect to your Browser</a> for more info.
<Note>
You can reuse the same `BrowserSession` after an agent has completed running.
If you do nothing, the browser is closed automatically when `run()`
completes, but only if it was launched by browser-use.
</Note>
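Reusing one session across several agents could look roughly like the sketch below. It is duck-typed and makes assumptions: `run_tasks_in_one_session` and `make_agent` are hypothetical names, and whether the browser stays open between runs depends on how the session was created, as described in the note above.

```python
from typing import Any, Callable, Iterable


async def run_tasks_in_one_session(
    make_agent: Callable[..., Any],
    tasks: Iterable[str],
    browser_session: Any,
) -> list[Any]:
    """Run tasks one after another, passing the same session to every agent.

    make_agent is a hypothetical factory, e.g. functools.partial(Agent, llm=llm).
    """
    histories = []
    for task in tasks:
        agent = make_agent(task=task, browser_session=browser_session)
        histories.append(await agent.run())  # finish this task before starting the next
    return histories


# Usage (requires browser_use; session is an existing BrowserSession):
#   import asyncio
#   from functools import partial
#   asyncio.run(run_tasks_in_one_session(partial(Agent, llm=llm), ['task one', 'task two'], session))
```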
## Running the Agent
The agent is executed using the async `run()` method, which accepts the following parameter:
- `max_steps`: Maximum number of steps the agent can take during execution. Defaults to `100`. This prevents infinite loops and helps control execution time.
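Because `run()` is async, it has to be awaited or driven by an event loop. The sketch below wraps the call with a step limit and an optional wall-clock timeout. `run_with_limits` and `timeout_s` are hypothetical names introduced for this example; the timeout is plain `asyncio.wait_for`, not an `Agent` feature.

```python
import asyncio
from typing import Any, Optional


async def run_with_limits(
    agent: Any, max_steps: int = 100, timeout_s: Optional[float] = None
) -> Any:
    """Await agent.run(max_steps=...), optionally bounded by a wall-clock timeout.

    timeout_s is an assumption of this sketch, not an Agent parameter.
    """
    # asyncio.wait_for cancels the run and raises a timeout error if it takes too long;
    # with timeout=None it simply waits for the run to finish.
    return await asyncio.wait_for(agent.run(max_steps=max_steps), timeout=timeout_s)


# Usage from synchronous code:
#   history = asyncio.run(run_with_limits(agent, max_steps=25, timeout_s=300))
```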
## Agent History
The `run()` method returns an `AgentHistoryList` object containing the complete execution history. This history is useful for debugging, analysis, and creating reproducible scripts.
```python
# Example of accessing history
history = await agent.run()
# Access (some) useful information
history.urls() # List of visited URLs
history.screenshot_paths() # List of screenshot paths
history.action_names() # Names of executed actions
history.extracted_content() # Content extracted during execution
history.errors() # Any errors that occurred
history.model_actions() # All actions with their parameters
```
The `AgentHistoryList` provides many helper methods to analyze the execution:
- `final_result()`: Get the final extracted content
- `is_done()`: Check if the agent completed successfully
- `has_errors()`: Check if any errors occurred
- `model_thoughts()`: Get the agent's reasoning process
- `action_results()`: Get results of all actions
<Note>
For a complete list of helper methods and detailed history analysis
capabilities, refer to the [AgentHistoryList source
code](https://github.com/browser-use/browser-use/blob/main/browser_use/agent/views.py#L111).
</Note>
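The helper methods listed above can be combined into a small post-run report. This is a minimal sketch that only calls methods named in this section. `summarize_history` is a hypothetical helper, and filtering out empty error entries is a defensive assumption about what `errors()` may return.

```python
from typing import Any


def summarize_history(history: Any) -> dict[str, Any]:
    """Condense an AgentHistoryList-like object into a small report."""
    # Drop empty entries in case errors() includes placeholders for clean steps (an assumption).
    errors = [e for e in history.errors() if e]
    return {
        "done": history.is_done(),
        "has_errors": history.has_errors(),
        "final_result": history.final_result(),
        "visited_urls": history.urls(),
        "errors": errors,
    }


# Usage:
#   history = await agent.run()
#   print(summarize_history(history))
```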
## Run initial actions without LLM
You can run a fixed list of initial actions directly, without calling the LLM, as shown in [this example](https://github.com/browser-use/browser-use/blob/main/examples/features/initial_actions.py).
Specify each action as a dictionary where the key is the action name and the value is a dictionary of action parameters. You can find all available actions in the [Controller](https://github.com/browser-use/browser-use/blob/main/browser_use/controller/service.py) source code.
```python
initial_actions = [
    {'go_to_url': {'url': 'https://www.google.com', 'new_tab': True}},
    {'go_to_url': {'url': 'https://en.wikipedia.org/wiki/Randomness', 'new_tab': True}},
    {'scroll_down': {'amount': 1000}},
]
agent = Agent(
    task='What theories are displayed on the page?',
    initial_actions=initial_actions,
    llm=llm,
)
```
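When the list is built programmatically, a small helper can generate the dictionaries and check their shape before they reach the agent. The sketch below is an illustration only: `open_in_new_tabs` and `check_action_shape` are hypothetical helpers, and the check covers only the `{action_name: {params}}` structure described above, not whether the action exists in the Controller.

```python
from typing import Any

Action = dict[str, dict[str, Any]]


def open_in_new_tabs(urls: list[str]) -> list[Action]:
    """Build one go_to_url action per URL, each opening a new tab."""
    return [{"go_to_url": {"url": url, "new_tab": True}} for url in urls]


def check_action_shape(actions: list[Action]) -> None:
    """Raise ValueError unless every action is {action_name: {params}}."""
    for i, action in enumerate(actions):
        if len(action) != 1:
            raise ValueError(f"action {i} must have exactly one key, got {list(action)}")
        ((name, params),) = action.items()
        if not isinstance(name, str) or not isinstance(params, dict):
            raise ValueError(f"action {i} must map an action name to a parameter dict")


initial_actions = open_in_new_tabs(["https://www.google.com"]) + [
    {"scroll_down": {"amount": 1000}},
]
check_action_shape(initial_actions)  # passes silently when the shape is valid
```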
### Optional Parameters
- `initial_actions`: List of initial actions to run before the main task.
- `max_actions_per_step`: Maximum number of actions to run in a step. Defaults to `10`.
- `max_failures`: Maximum number of failures before giving up. Defaults to `3`.
- `retry_delay`: Time to wait between retries in seconds when rate limited. Defaults to `10`.
- `generate_gif`: Enable/disable GIF generation. Defaults to `False`. Set to `True` or a string path to save the GIF.
## Memory
Memory management in browser-use has been significantly improved since version 0.3.2. The agent's context handling and state management are now robust enough that the previous memory system (`mem0`) is no longer needed or supported.
The agent maintains its context and task progress through:
- Detailed history tracking of actions and results
- Structured state management
- Clear goal setting and evaluation at each step
The `enable_memory` parameter has been removed as the new system provides better context management by default.
<Note>
If you're upgrading from an older version that used `enable_memory`, simply remove this parameter. The agent will automatically use the improved context management system.
</Note>
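If agent settings live in a shared dictionary or config file, the removed parameter can be stripped before the settings are passed to `Agent(...)`. This is a minimal sketch: `drop_removed_options` is a hypothetical helper and the `legacy` dictionary is made-up example data.

```python
from typing import Any


def drop_removed_options(kwargs: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of kwargs without the removed enable_memory parameter."""
    return {k: v for k, v in kwargs.items() if k != "enable_memory"}


legacy = {"task": "your task", "enable_memory": True, "use_vision": True}
print(drop_removed_options(legacy))  # {'task': 'your task', 'use_vision': True}
```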