Glama

browserbase_stagehand_act

Execute specific web page actions like clicking buttons or typing text through automated browser control for precise web automation tasks.

Instructions

Performs an action on a web page element. Act actions should be as atomic and specific as possible, i.e. "Click the sign in button" or "Type 'hello' into the search input". AVOID actions that are more than one step, i.e. "Order me pizza" or "Send an email to Paul asking him to call me".

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| action | Yes | The action to perform. Should be as atomic and specific as possible, i.e. 'Click the sign in button' or 'Type 'hello' into the search input'. AVOID actions that are more than one step, i.e. 'Order me pizza' or 'Send an email to Paul asking him to call me'. The instruction should be just as specific as possible, and have a strong correlation to the text on the page. If unsure, use observe before using act. | |
| variables | No | Variables used in the action template. ONLY use variables if you're dealing with sensitive data or dynamic content. For example, if you're logging in to a website, you can use a variable for the password. When using variables, you MUST have the variable key in the action template. For example: {"action": "Fill in the password", "variables": {"password": "123456"}} | |
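
The `variables` description requires the variable key to appear in the action template, yet its own example action ("Fill in the password") does not mention the key. The sketch below shows one way a caller might check this before sending a request. It assumes keys are referenced as `%key%` placeholders, which is the convention I understand Stagehand to use; neither that syntax nor the helper `missingPlaceholders` appears on this page, so treat both as assumptions.

```typescript
// Hypothetical pre-flight check, not part of the server code.
// ASSUMPTION: variables are referenced in the action as %key% placeholders.
interface ActInput {
  action: string;
  variables?: Record<string, string>;
}

// Returns the variable keys that the action template never references.
function missingPlaceholders(input: ActInput): string[] {
  const keys = Object.keys(input.variables ?? {});
  return keys.filter((key) => !input.action.includes(`%${key}%`));
}

// The example from the schema description: the key is not in the template.
const fromSchema: ActInput = {
  action: "Fill in the password",
  variables: { password: "123456" },
};

// A template that references the variable explicitly.
const withPlaceholder: ActInput = {
  action: "Type %password% into the password field",
  variables: { password: "123456" },
};

console.log(missingPlaceholders(fromSchema)); // reports the "password" key
console.log(missingPlaceholders(withPlaceholder)); // reports no keys
```

A check like this keeps the sensitive value out of the instruction text while still catching templates that would silently ignore it.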

Implementation Reference

  • The handler function `handleAct` that performs the core logic of the tool by invoking `stagehand.page.act()` with the action and optional variables.
    async function handleAct(
      context: Context,
      params: ActInput,
    ): Promise<ToolResult> {
      const action = async (): Promise<ToolActionResult> => {
        try {
          const stagehand = await context.getStagehand();
    
          await stagehand.page.act({
            action: params.action,
            variables: params.variables,
          });
    
          return {
            content: [
              {
                type: "text",
                text: `Action performed: ${params.action}`,
              },
            ],
          };
        } catch (error) {
          const errorMsg = error instanceof Error ? error.message : String(error);
          throw new Error(`Failed to perform action: ${errorMsg}`);
        }
      };
    
      return {
        action,
        waitForNetwork: false,
      };
    }
  • The tool schema, which defines the name 'browserbase_stagehand_act' and the description, and references the input schema for validation.
    const actSchema: ToolSchema<typeof ActInputSchema> = {
      name: "browserbase_stagehand_act",
      description:
        "Performs an action on a web page element. Act actions should be as atomic and " +
        'specific as possible, i.e. "Click the sign in button" or "Type \'hello\' into the search input". ' +
        'AVOID actions that are more than one step, i.e. "Order me pizza" or "Send an email to Paul ' +
        'asking him to call me".',
      inputSchema: ActInputSchema,
    };
  • The Zod input schema defining the parameters: action (required string) and optional variables object.
    const ActInputSchema = z.object({
      action: z
        .string()
        .describe(
          "The action to perform. Should be as atomic and specific as possible, " +
            "i.e. 'Click the sign in button' or 'Type 'hello' into the search input'. AVOID actions that are more than one " +
            "step, i.e. 'Order me pizza' or 'Send an email to Paul asking him to call me'. The instruction should be just as specific as possible, " +
            "and have a strong correlation to the text on the page. If unsure, use observe before using act.",
        ),
      variables: z
        .object({})
        .optional()
        .describe(
          "Variables used in the action template. ONLY use variables if you're dealing " +
            "with sensitive data or dynamic content. For example, if you're logging in to a website, " +
            "you can use a variable for the password. When using variables, you MUST have the variable " +
            'key in the action template. For example: {"action": "Fill in the password", "variables": {"password": "123456"}}',
        ),
    });
  • The central TOOLS array where actTool (containing the browserbase_stagehand_act tool) is registered alongside other tools for use in the MCP server.
    export const TOOLS = [
      ...multiSessionTools,
      ...sessionTools,
      navigateTool,
      actTool,
      extractTool,
      observeTool,
      screenshotTool,
      getUrlTool,
    ];
  • src/tools/act.ts:71-77 (registration)
    The actTool object that registers the schema and handler together, exported for inclusion in the tools index.
    const actTool: Tool<typeof ActInputSchema> = {
      capability: "core",
      schema: actSchema,
      handle: handleAct,
    };
    
    export default actTool;
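
The excerpts above suggest a two-phase flow: `handleAct` returns a deferred `action` function plus a `waitForNetwork` flag, and the tool is found through the `TOOLS` array. The following is a minimal, self-contained sketch of how a caller might look up the tool by name and run the deferred action. The types are simplified stand-ins, and `FakePage` and `callTool` are hypothetical names introduced for illustration; the real server's dispatch code is not shown on this page.

```typescript
// Simplified stand-ins for the server's types (assumptions, not the real ones).
type ToolActionResult = { content: { type: "text"; text: string }[] };
type ToolResult = { action: () => Promise<ToolActionResult>; waitForNetwork: boolean };
type ActInput = { action: string; variables?: Record<string, string> };

// Hypothetical stub standing in for stagehand.page.
interface FakePage {
  act(input: ActInput): Promise<void>;
}

interface Tool {
  schema: { name: string };
  handle(page: FakePage, params: ActInput): Promise<ToolResult>;
}

// Mirrors the shape of handleAct: nothing touches the page until `action` runs.
const actTool: Tool = {
  schema: { name: "browserbase_stagehand_act" },
  async handle(page, params) {
    const action = async (): Promise<ToolActionResult> => {
      try {
        await page.act(params);
        return { content: [{ type: "text", text: `Action performed: ${params.action}` }] };
      } catch (error) {
        const errorMsg = error instanceof Error ? error.message : String(error);
        throw new Error(`Failed to perform action: ${errorMsg}`);
      }
    };
    return { action, waitForNetwork: false };
  },
};

const TOOLS: Tool[] = [actTool];

// Hypothetical dispatcher: find the tool, build the deferred action, then run it.
async function callTool(name: string, page: FakePage, params: ActInput): Promise<string> {
  const tool = TOOLS.find((t) => t.schema.name === name);
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  const result = await tool.handle(page, params);
  const output = await result.action();
  return output.content.map((c) => c.text).join("\n");
}

// A page stub that records calls and fails on one specific instruction.
const calls: string[] = [];
const page: FakePage = {
  async act(input) {
    calls.push(input.action);
    if (input.action === "Click the missing button") throw new Error("element not found");
  },
};

callTool("browserbase_stagehand_act", page, { action: "Click the sign in button" })
  .then((text) => console.log(text));
```

Because the handler wraps every failure in "Failed to perform action: ...", a caller in this sketch sees one uniform error shape whether the element was missing or the page call failed for another reason.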
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively communicates that actions should be atomic and specific, which is crucial for web automation. However, it lacks details on error handling, performance characteristics, or what happens if the element isn't found, leaving some behavioral aspects unclear.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in three sentences: one stating the purpose, one giving examples of atomic actions, and one setting clear usage boundaries. Every sentence adds essential value without redundancy, making it easy to parse and apply.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 2 parameters, 100% schema coverage, and no output schema, the description provides strong contextual guidance on usage and limitations. It effectively complements the schema by emphasizing atomicity and referencing sibling tools. The main gap is the lack of output information, but given the tool's focus on action rather than observation, this is somewhat mitigated.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description reinforces the guidance for the 'action' parameter but doesn't add significant semantic value beyond what's in the schema descriptions. This meets the baseline expectation when schema coverage is complete.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Performs an action on a web page element' with specific examples like clicking buttons or typing text. It distinguishes itself from siblings like 'observe' or 'navigate' by focusing on direct interaction rather than observation or navigation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool: for atomic, specific actions on web elements, and when not to use it: for multi-step actions. It references the 'observe' sibling as an alternative when unsure, offering clear context for tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/vaibhavAtlys/mcp-server-browserbase'
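
The same request can be sketched in TypeScript. The base URL comes from the curl example above; the helper names `serverUrl` and `fetchServer` are hypothetical, and the assumption that the endpoint returns JSON is mine, since the response format is not described on this page. The global `fetch` requires Node 18 or later, or a browser.

```typescript
// Base URL taken from the curl example; owner and repository are path segments.
const API_BASE = "https://glama.ai/api/mcp/v1/servers";

// Builds the server URL, encoding each path segment.
function serverUrl(owner: string, repo: string): string {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// ASSUMPTION: the endpoint responds with JSON. The shape is not documented
// here, so the result is typed as unknown and left to the caller to inspect.
async function fetchServer(owner: string, repo: string): Promise<unknown> {
  const response = await fetch(serverUrl(owner, repo));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

console.log(serverUrl("vaibhavAtlys", "mcp-server-browserbase"));
```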

If you have feedback or need assistance with the MCP directory API, please join our Discord server.