Server Configuration
Environment variables used to configure the server. Only GEMINI_API_KEY is required; the others fall back to the defaults listed.
Name | Required | Description | Default |
---|---|---|---|
HEADLESS | No | Set to 'true' to run the browser in headless mode, which is faster | false |
GEMINI_MODEL | No | Gemini model to use | gemini-2.5-computer-use-preview-10-2025 |
SCREEN_WIDTH | No | Browser screen width in pixels (default recommended by Google) | 1440 |
SCREEN_HEIGHT | No | Browser screen height in pixels (default recommended by Google) | 900 |
GEMINI_API_KEY | Yes | Your Gemini API key (required by Google) | |
SCREENSHOT_OUTPUT_DIR | No | Directory to save screenshots | output_screenshots |
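
The table above can be read as a small configuration loader. The following is a minimal Python sketch, not the server's actual startup code: `ServerConfig` and `load_config` are hypothetical names introduced for illustration, and only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerConfig:
    """Hypothetical container for the settings listed in the table."""
    gemini_api_key: str
    gemini_model: str
    headless: bool
    screen_width: int
    screen_height: int
    screenshot_output_dir: str


def load_config(env: Optional[Mapping[str, str]] = None) -> ServerConfig:
    """Read settings from the environment, applying the documented defaults."""
    env = os.environ if env is None else env

    # GEMINI_API_KEY is the only variable marked as required.
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        raise ValueError("GEMINI_API_KEY is required")

    return ServerConfig(
        gemini_api_key=api_key,
        gemini_model=env.get(
            "GEMINI_MODEL", "gemini-2.5-computer-use-preview-10-2025"
        ),
        # Assumption: only the literal string 'true' (any case) enables headless mode.
        headless=env.get("HEADLESS", "false").strip().lower() == "true",
        screen_width=int(env.get("SCREEN_WIDTH", "1440")),
        screen_height=int(env.get("SCREEN_HEIGHT", "900")),
        screenshot_output_dir=env.get(
            "SCREENSHOT_OUTPUT_DIR", "output_screenshots"
        ),
    )
```

Calling `load_config()` with no arguments reads from `os.environ`; passing a plain dictionary instead makes the defaults easy to check without touching the real environment.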
Schema
Prompts
Interactive templates invoked by user choice
Name | Description |
---|---|
No prompts |
Resources
Contextual data attached and managed by the client
Name | Description |
---|---|
No resources |
Tools
Functions exposed to the LLM to take actions
Name | Description |
---|---|
browse_web | Browse the web to complete a task using AI-powered browser automation.
The AI agent can navigate websites, click buttons, fill forms, search for information,
and interact with web pages just like a human user. This runs synchronously and returns
when the task is complete.
Args:
task: What you want to accomplish (e.g., "Find the top 3 gaming laptops on Amazon")
url: Starting webpage (defaults to Google)
Returns:
Dictionary containing:
- ok: Boolean indicating success
- data: Task completion message with results
- screenshot_dir: Path to saved screenshots
- session_id: Unique session identifier
- progress: List of actions taken during browsing
- error: Error message (if task failed)
Examples:
- "Search for Python tutorials and summarize the top result"
- "Go to example.com and click the login button"
- "Find product reviews for iPhone 15 Pro"
Note: For long-running tasks, consider using start_web_task instead. |
get_web_screenshots | Retrieve screenshots captured during a web browsing session.
Each browsing session saves screenshots of the pages visited. Use this to
review what the AI agent saw and did during task execution.
Args:
session_id: Session ID returned from browse_web or check_web_task
Returns:
Dictionary containing:
- ok: Boolean indicating success
- screenshots: List of screenshot file paths
- session_id: The session identifier
- count: Number of screenshots found
- error: Error message (if session not found)
Example:
get_web_screenshots("20251017_143022_a1b2c3d4") |
start_web_task | Start a web browsing task in the background and return immediately.
Use this for tasks that might take a while (30+ seconds). The task runs
asynchronously while you continue working. Check progress with check_web_task().
Args:
task: What you want to accomplish on the web
url: Starting webpage (defaults to Google)
Returns:
Dictionary containing:
- ok: Boolean indicating task was started successfully
- task_id: Unique ID to check progress later
- status: Will be "running"
- message: Instructions for checking progress
Examples:
- start_web_task("Research top 10 AI companies and their products")
- start_web_task("Find and compare prices for MacBook Pro on 5 different sites")
Next steps:
Use check_web_task(task_id) to monitor progress.
Wait at least 5 seconds between status checks. |
check_web_task | Check progress of a background web browsing task.
Returns a summary of task progress. By default, returns compact format to
avoid filling your context window with verbose progress logs.
IMPORTANT: To prevent context bloat, wait at least 3-5 seconds between
checks. Use the 'recommended_poll_after' timestamp as guidance.
Args:
task_id: Task ID from start_web_task()
compact: Return summary only (default: True). Set to False for full details.
Returns:
Dictionary containing:
- ok: Boolean indicating success
- task_id: Task identifier
- status: "pending", "running", "completed", "failed", or "cancelled"
- progress_summary: Recent actions (compact mode only)
- progress: Full action history (full mode only)
- result: Task results (when completed)
- error: Error message (when failed)
- recommended_poll_after: Timestamp to check again (when running)
- polling_guidance: Message about polling frequency
Examples:
- check_web_task("abc-123-def") # Compact summary
- check_web_task("abc-123-def", compact=False) # Full details
Best Practice:
Only poll every 3-5 seconds to keep your context window clean.
Use the wait() tool to pause between checks if your platform doesn't
support automatic delays.
Recommended workflow:
1. start_web_task("...")
2. wait(5)
3. check_web_task(task_id)
4. If still running, repeat steps 2-3 |
stop_web_task | Stop a running web browsing task.
Immediately halts task execution and cleans up browser resources. Use this
when you need to cancel a long-running task that's no longer needed.
Args:
task_id: Task ID from start_web_task()
Returns:
Dictionary containing:
- ok: Boolean indicating success
- message: Confirmation message
- task_id: The stopped task ID
- error: Error message (if task not found or already completed)
Example:
- stop_web_task("abc-123-def")
Note: Tasks that have already completed or failed cannot be stopped. |
wait | Wait for a specified number of seconds before continuing.
Use this when you need to pause between operations, such as:
- Waiting between status checks to avoid rapid polling
- Giving a web task time to make progress
- Rate limiting your requests
- Waiting for external processes to complete
Args:
seconds: Number of seconds to wait (1-60)
Returns:
Dictionary containing:
- ok: Boolean indicating success
- waited_seconds: How long the wait lasted
- message: Confirmation message
Examples:
- wait(5) # Wait 5 seconds
- wait(10) # Wait 10 seconds
Best Practice:
Use this instead of immediately polling check_web_task multiple times.
Recommended wait time between status checks: 3-5 seconds.
Note: Maximum wait time is 60 seconds to prevent timeout issues. |
list_web_tasks | List all web browsing tasks, including active and completed ones.
Shows a summary of all tasks in the current session. Useful for tracking
multiple concurrent browsing operations.
Returns:
Dictionary containing:
- ok: Boolean indicating success
- tasks: Array of task status objects (compact format)
- count: Total number of tasks
- active_count: Number of currently running tasks
Example:
- list_web_tasks()
Note: Returns compact task summaries. Use check_web_task(task_id) for details. |
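
The recommended workflow for background tasks (start, wait, check, repeat) can be sketched as a polling loop. This is an illustrative Python sketch under stated assumptions, not client code shipped with the server: `call_tool` stands in for whatever function your MCP client uses to invoke a tool by name, and `run_web_task`, `poll_seconds`, and `max_polls` are hypothetical names. The tool names, arguments, and status values are the ones documented in the table above.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical client hook: call_tool(name, **arguments) -> result dictionary.
ToolCaller = Callable[..., Dict[str, Any]]

# Statuses after which check_web_task will not report further progress.
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


def run_web_task(
    call_tool: ToolCaller,
    task: str,
    poll_seconds: float = 5.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Start a background task and poll it until it reaches a terminal status."""
    started = call_tool("start_web_task", task=task)
    if not started.get("ok"):
        return started  # The task could not be started; surface the error as-is.

    task_id = started["task_id"]
    for _ in range(max_polls):
        # Pause between checks, as the tool descriptions recommend.
        sleep(poll_seconds)
        status = call_tool("check_web_task", task_id=task_id)
        if not status.get("ok") or status.get("status") in TERMINAL_STATUSES:
            return status

    # Assumption: after max_polls checks we give up and cancel the task.
    call_tool("stop_web_task", task_id=task_id)
    return {"ok": False, "task_id": task_id, "error": "timed out waiting for task"}
```

An agent that can only call tools, rather than run Python, would replace `sleep` with a call to the `wait` tool. The compact default of `check_web_task` is kept here so that each poll adds little to the context window.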