
MCP Browser Agent

by ashley-ha

execute_actions

Execute planned browser automation actions from the MCP Browser Agent's planner state to interact with web pages, navigate sites, and manipulate elements.

Instructions

Execute actions from the planner state.

Args:
    actions: A dictionary containing the planner state and actions in format:
            {
                "current_state": {
                    "evaluation_previous_goal": str,
                    "memory": str,
                    "next_goal": str
                },
                "action": [
                    {"action_name": {"param1": "value1"}},
                    ...
                ]
            }
            
Note: If the page state changes (new elements appear) during action execution,
the sequence will be interrupted and you'll need to get a new planner state.
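
A minimal sketch of a payload in the format above, with a shape check that mirrors the two validations the handler performs (the input is a dictionary with an `action` key, and each entry is a dictionary with exactly one key). The action names `go_to_url` and `click_element` and their parameters are assumptions for illustration only; the valid names come from the controller's registry. `check_actions_payload` is a hypothetical helper, not part of the server.

```python
from typing import Any, Dict, Optional


def check_actions_payload(actions: Any) -> Optional[str]:
    """Return an error message if the payload shape is wrong, else None."""
    if not isinstance(actions, dict) or "action" not in actions:
        return "Error: Actions must be a dictionary containing 'action' list"
    # An empty or missing list is not a shape error; there is simply nothing to run.
    for entry in actions["action"] or []:
        if not isinstance(entry, dict) or len(entry) != 1:
            return "Error: Each action must be a dictionary with exactly one key-value pair"
    return None


payload: Dict[str, Any] = {
    "current_state": {
        "evaluation_previous_goal": "Unknown - no previous action",
        "memory": "Nothing done yet",
        "next_goal": "Open the site and click the first link",
    },
    "action": [
        {"go_to_url": {"url": "https://example.com"}},  # hypothetical action name
        {"click_element": {"index": 0}},                # hypothetical action name
    ],
}

print(check_actions_payload(payload))
```

Checking the shape on the client side before calling the tool avoids a round trip that would only return one of the two error strings.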

Input Schema

Name      Required   Description   Default
actions   Yes        -             -

Implementation Reference

  • The execute_actions tool handler, registered via @mcp.tool(). Validates the actions input dictionary, converts actions to models using Controller.registry, executes them sequentially with DOM change detection to prevent stale selectors, and returns execution results or errors.
    @mcp.tool()
    async def execute_actions(actions: Dict[str, Any], ctx: Context) -> str:
        """Execute actions from the planner state.
        
        Args:
            actions: A dictionary containing the planner state and actions in format:
                    {
                        "current_state": {
                            "evaluation_previous_goal": str,
                            "memory": str,
                            "next_goal": str
                        },
                        "action": [
                            {"action_name": {"param1": "value1"}},
                            ...
                        ]
                    }
                    
        Note: If the page state changes (new elements appear) during action execution,
        the sequence will be interrupted and you'll need to get a new planner state.
        """
        browser_context = await browser_initialized_check()
        controller = ctx.request_context.lifespan_context["controller"]
        
        try:
            # Validate input format
            if not isinstance(actions, dict) or "action" not in actions:
                return "Error: Actions must be a dictionary containing 'action' list"
            
            action_list = actions["action"]
            if not action_list:
                return "No actions to execute"
            
            # Get initial state for DOM change detection
            initial_state = await browser_context.get_state()
            initial_path_hashes = set(e.hash.branch_path_hash for e in initial_state.selector_map.values())
            
            # Convert system prompt action format to action models
            action_models = []
            for action_dict in action_list:
                if not isinstance(action_dict, dict) or len(action_dict) != 1:
                    return "Error: Each action must be a dictionary with exactly one key-value pair"
                    
                action_name = list(action_dict.keys())[0]
                params = action_dict[action_name]
                
                # Create action model using the controller's registry
                action_model = controller.registry.create_action_model()(**{action_name: params})
                action_models.append(action_model)
            
            # Execute actions one by one to check for DOM changes
            results = []
            for i, action_model in enumerate(action_models):
                # Execute single action
                result = await controller.act(action_model, browser_context)
                results.append(result)
                
                # Check if this action requires element interaction
                requires_elements = any(param in str(action_model) for param in ["index", "xpath"])
                
                # If this is not the last action, re-read the page state to detect DOM changes
                if i < len(action_models) - 1:
                    new_state = await browser_context.get_state()
                    new_path_hashes = set(e.hash.branch_path_hash for e in new_state.selector_map.values())
                    
                    # If this action touched elements and new elements appeared, break the sequence
                    if requires_elements and not new_path_hashes.issubset(initial_path_hashes):
                        msg = f"Page state changed after action {i + 1}/{len(action_models)}. Please get new planner state before continuing."
                        logger.info(msg)
                        results.append(ActionResult(extracted_content=msg, include_in_memory=True))
                        break
                
                # Stop if there was an error
                if result.error:
                    break
            
            # Process results
            output = []
            for result in results:
                if result.extracted_content:
                    output.append(result.extracted_content)
                elif result.error:
                    output.append(f"Error: {result.error}")
                else:
                    output.append("Action executed successfully")
                    
            return "\n".join(output)
        except Exception as e:
            logger.error(f"Error executing actions: {str(e)}")
            return f"Error executing actions: {str(e)}"
  • browser-use.py:210-210 (registration)
    MCP tool registration decorator for execute_actions.
    @mcp.tool()
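
The DOM change check in the handler reduces to a set comparison: the sequence is interrupted when the current page contains an element hash that was not in the snapshot taken before the first action. A minimal sketch of that comparison, using plain strings in place of the real `branch_path_hash` values; `page_changed` is a hypothetical helper name.

```python
from typing import Iterable, Set


def page_changed(initial_hashes: Set[str], current_hashes: Iterable[str]) -> bool:
    """True if any current element hash was absent from the initial snapshot."""
    return not set(current_hashes).issubset(initial_hashes)


initial = {"a1", "b2", "c3"}
print(page_changed(initial, {"a1", "b2"}))        # False: elements only disappeared
print(page_changed(initial, {"a1", "b2", "d4"}))  # True: a new element appeared
```

Because the test is a subset check, elements that vanish do not count as a change; only new elements do. The comparison is always against the snapshot taken before the first action, not against the state after the previous action.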
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses that execution can be interrupted by page state changes, which is a key behavioral trait, but doesn't cover other aspects like error handling, side effects, or response format. It adds some context but is incomplete for a mutation tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded with the main purpose, followed by an 'Args' section and a note. The structure is clear, but the note could be more integrated; overall, it's efficient with minimal waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (1 parameter with nested objects, no annotations, no output schema), the description covers the parameter structure well and includes a behavioral note. However, it lacks details on return values, error cases, and full usage context, making it adequate but with gaps for a tool that likely performs mutations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
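
On the return value: the handler shown in the Implementation Reference returns a single string with one line per executed action. A minimal sketch of that formatting step, assuming a stand-in `Result` dataclass in place of the real `ActionResult` (only the two fields the handler reads are modeled); `format_results` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Result:
    """Stand-in for ActionResult; models only the fields the handler reads."""
    extracted_content: Optional[str] = None
    error: Optional[str] = None


def format_results(results: List[Result]) -> str:
    """Join per-action outcomes into the newline-separated string the tool returns."""
    lines = []
    for r in results:
        if r.extracted_content:
            lines.append(r.extracted_content)
        elif r.error:
            lines.append(f"Error: {r.error}")
        else:
            lines.append("Action executed successfully")
    return "\n".join(lines)


print(format_results([
    Result(extracted_content="Navigated"),
    Result(),
    Result(error="Element not found"),
]))
```

Extracted content takes precedence over an error on the same result, and execution stops after the first action that reports an error, so an `Error:` line is normally the last line of the output.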

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 0%, so the description must compensate. It provides a detailed example of the 'actions' parameter structure, including nested objects and keys like 'current_state' and 'action', which adds significant meaning beyond the schema's generic 'object' type. This effectively documents the parameter semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description reads 'Execute actions from the planner state', which provides a verb ('Execute') and a resource ('actions from the planner state'), but it is vague about what the 'actions' entail (e.g., UI interactions, API calls) and does not clearly distinguish this tool from its sibling 'get_planner_state'. It is not tautological, but it lacks specificity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes a note about interruption when 'the page state changes', which implies a usage context (e.g., web automation), but it doesn't explicitly state when to use this tool versus alternatives like 'get_planner_state' or provide prerequisites. The guidance is minimal and not comprehensive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
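
One plausible way the two tools fit together is a loop: fetch the planner state, plan, execute, and fetch again if the sequence was interrupted. The sketch below assumes a generic async `call_tool(name, arguments)` callable supplied by the MCP client and a `plan` function that turns the state text into an `actions` dictionary; both are hypothetical, and the empty argument dictionary passed to `get_planner_state` is also an assumption. The "Page state changed" text comes from the handler's interruption message.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

CallTool = Callable[[str, Dict[str, Any]], Awaitable[str]]  # hypothetical client hook


async def run_step(
    call_tool: CallTool,
    plan: Callable[[str], Dict[str, Any]],
    max_rounds: int = 3,
) -> str:
    """Fetch state, plan, execute; re-plan if the page changed mid-sequence."""
    result = ""
    for _ in range(max_rounds):
        state = await call_tool("get_planner_state", {})
        result = await call_tool("execute_actions", {"actions": plan(state)})
        if "Page state changed" not in result:
            break  # sequence ran to completion or ended with an error
    return result
```

Capping the number of rounds keeps a page that changes on every action from producing an endless loop.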

