Gemini Web Automation MCP

start_web_task

Launch background web browsing tasks that run asynchronously while you continue working. Use for research, price comparisons, or data collection that may take 30+ seconds, then monitor progress with check_web_task.

Instructions

Start a web browsing task in the background and return immediately.

Use this for tasks that might take a while (30+ seconds). The task runs
asynchronously while you continue working. Check progress with check_web_task().

Args:
    task: What you want to accomplish on the web
    url: Starting webpage (defaults to Google)

Returns:
    Dictionary containing:
    - ok: Boolean indicating task was started successfully
    - task_id: Unique ID to check progress later
    - status: Will be "running"
    - message: Instructions for checking progress

Examples:
    - start_web_task("Research top 10 AI companies and their products")
    - start_web_task("Find and compare prices for MacBook Pro on 5 different sites")

Next steps:
    Use check_web_task(task_id) to monitor progress.
    Wait at least 5 seconds between status checks.
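
The start-then-poll flow described above can be sketched on the client side roughly as follows. This is a minimal illustration, not part of the server: `wait_for_task` is a hypothetical helper, and it assumes the status checker returns a dictionary whose "status" key stays "running" until the task finishes (the shape of the check_web_task response is not shown on this page).

```python
import time
from typing import Any, Callable


def wait_for_task(
    check: Callable[[str], dict[str, Any]],
    task_id: str,
    interval: float = 5.0,
    max_checks: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll check(task_id) until the reported status is no longer "running".

    `check` stands in for a call to check_web_task; the "status" key is an
    assumption about its response.
    """
    for _ in range(max_checks):
        status = check(task_id)
        if status.get("status") != "running":
            return status
        # Respect the documented minimum of 5 seconds between checks.
        sleep(interval)
    raise TimeoutError(f"task {task_id} still running after {max_checks} checks")
```

Passing the checker in as a callable keeps the helper independent of any particular MCP client, and the `max_checks` cap avoids polling forever if a task never leaves the running state.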

Input Schema

Name	Required	Description	Default
`task`	Yes	(none)	(none)
`url`	No	(none)	https://www.google.com
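
For orientation, the input table can be written out as a JSON Schema-style dictionary. The JSON Schema view itself is not captured on this page, so the sketch below is inferred from the handler signature (`task: str`, `url: str` with a default) and should be read as an assumption; `missing_required` is a hypothetical helper added for illustration.

```python
import json

# Inferred from the handler signature, not copied from the server's schema.
input_schema = {
    "type": "object",
    "properties": {
        "task": {"type": "string"},
        "url": {"type": "string", "default": "https://www.google.com"},
    },
    "required": ["task"],
}


def missing_required(arguments: dict) -> list[str]:
    """Return the required parameter names absent from `arguments`."""
    return [name for name in input_schema["required"] if name not in arguments]


print(json.dumps(input_schema, indent=2))
```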

Output Schema

Name	Required	Description	Default
No arguments

Implementation Reference

server.py:145-199 (handler)

The primary handler for the 'start_web_task' tool. This function is registered via the @mcp.tool() decorator. It creates a background task using task_manager.create_task and initiates execution asynchronously via task_manager.start_task, returning a task_id immediately for later polling.

@mcp.tool()
async def start_web_task(task: str, url: str = "https://www.google.com") -> dict[str, Any]:
    """
    Start a web browsing task in the background and return immediately.

    Use this for tasks that might take a while (30+ seconds). The task runs
    asynchronously while you continue working. Check progress with check_web_task().

    Args:
        task: What you want to accomplish on the web
        url: Starting webpage (defaults to Google)

    Returns:
        Dictionary containing:
        - ok: Boolean indicating task was started successfully
        - task_id: Unique ID to check progress later
        - status: Will be "running"
        - message: Instructions for checking progress

    Examples:
        - start_web_task("Research top 10 AI companies and their products")
        - start_web_task("Find and compare prices for MacBook Pro on 5 different sites")

    Next steps:
        Use check_web_task(task_id) to monitor progress.
        Wait at least 5 seconds between status checks.
    """
    logger.info(f"Starting async web browsing task: {task}")

    # Create task
    task_id = task_manager.create_task(task, url)

    # Start task in background using anyio (FastMCP best practice)
    # Use anyio.to_thread.run_sync to run the blocking start_task in a thread
    # We await it but start_task itself just spawns the thread and returns immediately
    success = await anyio.to_thread.run_sync(
        task_manager.start_task,
        task_id,
        logger
    )

    if not success:
        return {
            "ok": False,
            "error": "Failed to start task"
        }

    logger.info(f"Task {task_id} started in background, returning immediately")

    return {
        "ok": True,
        "task_id": task_id,
        "status": "running",
        "message": f"Task started. Use check_web_task('{task_id}') to monitor progress."
    }
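
The "await the starter, not the work" pattern in this handler can be illustrated in isolation. The sketch below swaps in the standard library's `asyncio.to_thread` for `anyio.to_thread.run_sync` so that it runs without third-party packages; `start_in_background` and `handler` are hypothetical names, not the server's code.

```python
import asyncio
import threading


def start_in_background(work, done: threading.Event) -> bool:
    """Blocking-style starter: spawn a daemon thread and return at once."""
    def runner():
        work()
        done.set()

    threading.Thread(target=runner, daemon=True).start()
    return True


async def handler(work, done: threading.Event) -> dict:
    # Awaiting here waits only for the starter to return, not for `work`.
    started = await asyncio.to_thread(start_in_background, work, done)
    if not started:
        return {"ok": False, "error": "Failed to start task"}
    return {"ok": True, "status": "running"}
```

Because the starter only spawns a thread, the handler returns while the work is still in progress, which is what lets the tool hand back a task_id immediately.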

task_manager.py:100-116 (helper)

Helper method in BrowserTaskManager that creates a new BrowserTask instance and stores it, returning the unique task_id. Called by start_web_task.

def create_task(self, task_description: str, url: str = "https://www.google.com") -> str:
    """Create a new browser automation task.

    Args:
        task_description: Description of the browsing task
        url: Starting URL

    Returns:
        task_id: Unique identifier for the task
    """
    task_id = str(uuid.uuid4())
    task = BrowserTask(task_id, task_description, url)

    with self._lock:
        self.tasks[task_id] = task

    return task_id
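
The BrowserTask and TaskStatus definitions are not shown on this page. A plausible minimal shape, inferred only from the attributes and status names the helpers use, might look like the sketch below; the enum string values, field defaults, and types are assumptions.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Optional


class TaskStatus(Enum):
    # Member names appear in the helpers; the string values are assumed.
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


@dataclass
class BrowserTask:
    task_id: str
    task_description: str
    url: str
    status: TaskStatus = TaskStatus.PENDING  # start_task requires PENDING
    started_at: Optional[str] = None
    completed_at: Optional[str] = None
    result: Any = None
    error: Optional[str] = None
    agent: Any = None
    progress_updates: list = field(default_factory=list)


task = BrowserTask(str(uuid.uuid4()), "Compare laptop prices", "https://www.google.com")
```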

task_manager.py:118-148 (helper)

Helper method that transitions the task to RUNNING status and spawns a daemon thread to execute _execute_task in the background. Called by start_web_task.

def start_task(self, task_id: str, logger=None) -> bool:
    """Start executing a task in the background.

    Args:
        task_id: Task identifier
        logger: Optional logger instance

    Returns:
        True if task started, False if task not found or already running
    """
    with self._lock:
        task = self.tasks.get(task_id)
        if not task or task.status != TaskStatus.PENDING:
            return False

        task.status = TaskStatus.RUNNING
        task.started_at = datetime.now(timezone.utc).isoformat()

    # Run task in background thread (don't store reference)
    thread = threading.Thread(
        target=self._execute_task,
        args=(task_id, logger),
        daemon=True,
        name=f"BrowserTask-{task_id[:8]}"
    )
    thread.start()

    if logger:
        logger.info(f"Started background thread for task {task_id[:8]}... Thread: {thread.name}")

    return True
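
The status guard is what makes start_task safe to call twice: the check and the transition happen under one lock, so only the first caller sees PENDING. A stripped-down sketch of that guard, using a hypothetical `MiniManager` with plain strings in place of TaskStatus:

```python
import threading


class MiniManager:
    """Toy stand-in for BrowserTaskManager that keeps only the status guard."""

    def __init__(self):
        self._lock = threading.Lock()
        self.status = {}  # task_id -> "pending" or "running"

    def create(self, task_id):
        with self._lock:
            self.status[task_id] = "pending"

    def start(self, task_id) -> bool:
        with self._lock:
            # Unknown or already-started tasks are rejected.
            if self.status.get(task_id) != "pending":
                return False
            self.status[task_id] = "running"
        return True


m = MiniManager()
m.create("t1")
print(m.start("t1"), m.start("t1"), m.start("missing"))  # True False False
```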

task_manager.py:150-186 (helper)

Private helper method, run in a background thread, that instantiates GeminiBrowserAgent and calls its execute_task method to perform the actual web browsing. It records the result or the error on the task and cleans up the browser in either case.

def _execute_task(self, task_id: str, logger=None):
    """Execute the browser automation task (runs in background thread)."""
    if logger:
        logger.info(f"[Thread {threading.current_thread().name}] Starting execution for task {task_id[:8]}...")

    with self._lock:
        task = self.tasks.get(task_id)
        if not task:
            if logger:
                logger.error(f"Task {task_id} not found in _execute_task")
            return

    try:
        # Create browser agent
        agent = GeminiBrowserAgent(logger=logger)
        task.agent = agent

        # Execute the task
        result = agent.execute_task(task.task_description, task.url)

        with self._lock:
            task.result = result
            task.progress_updates = agent.progress_updates.copy()
            task.status = TaskStatus.COMPLETED
            task.completed_at = datetime.now(timezone.utc).isoformat()

    except Exception as e:
        with self._lock:
            task.error = str(e)
            task.status = TaskStatus.FAILED
            task.completed_at = datetime.now(timezone.utc).isoformat()

    finally:
        # Clean up browser
        if task.agent:
            task.agent.cleanup_browser()
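
The try/except/finally structure guarantees that the browser is cleaned up whether the task succeeds or fails. The sketch below reproduces that control flow with a hypothetical `FakeAgent` in place of GeminiBrowserAgent and a plain dictionary in place of the task object; it is an illustration of the pattern, not the server's code.

```python
class FakeAgent:
    """Hypothetical stand-in for GeminiBrowserAgent."""

    def __init__(self, fail: bool):
        self.fail = fail
        self.cleaned_up = False

    def execute_task(self, description, url):
        if self.fail:
            raise RuntimeError("browser crashed")
        return {"summary": f"done: {description}"}

    def cleanup_browser(self):
        self.cleaned_up = True


def run(agent, description, url):
    record = {"status": "running", "result": None, "error": None}
    try:
        record["result"] = agent.execute_task(description, url)
        record["status"] = "completed"
    except Exception as exc:
        # Failures are recorded on the task rather than propagated.
        record["error"] = str(exc)
        record["status"] = "failed"
    finally:
        # Runs on both paths, so the browser is always released.
        agent.cleanup_browser()
    return record
```

Since the exception is stored rather than raised, a caller learns about a failure only by checking the task's status and error fields.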

Tool Definition Quality

Grade A (4.7/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key traits: the task runs asynchronously in the background, returns immediately, requires monitoring with check_web_task, and has a default URL. However, it doesn't mention potential errors, timeouts, or resource limits, leaving some behavioral aspects uncovered.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and appropriately sized, with clear sections (purpose, args, returns, examples, next steps). Every sentence adds value, such as explaining asynchronous behavior, providing usage examples, and outlining follow-up steps, with no redundant or wasted content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (asynchronous operation with 2 parameters) and the presence of an output schema (which covers return values), the description is complete. It explains the tool's purpose, usage, parameters, and next steps adequately, compensating for the lack of annotations and low schema coverage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 0%, so the description must compensate. It adds meaningful context for both parameters: 'task' is described as 'What you want to accomplish on the web' with examples, and 'url' is clarified as the 'Starting webpage (defaults to Google)'. This goes beyond the basic schema types, though it could provide more detail on URL formatting or task constraints.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Start a web browsing task in the background') and resource ('web browsing task'), distinguishing it from siblings like browse_web (likely synchronous) and check_web_task (monitoring). It explicitly mentions returning immediately and running asynchronously, which differentiates its purpose from other tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool ('for tasks that might take a while (30+ seconds)'), when not to use it (implied for shorter tasks), and alternatives (check_web_task for monitoring progress). It also specifies prerequisites like waiting 5 seconds between checks, making usage context clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/vincenthopf/computer-use-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.