by RonsDad

multi_browserbase_stagehand_extract_session

Extract structured data and text content from web pages using specific instructions and JSON schema for scraping, information gathering, or content collection.

Instructions

Extracts structured information and text content from the current web page based on specific instructions and a defined schema. This tool is ideal for scraping data, gathering information, or pulling specific content from web pages. Use this tool when you need to get text content, data, or information from a page rather than interacting with elements. For interactive elements like buttons, forms, or clickable items, use the observe tool instead. The extraction works best when you provide clear, specific instructions about what to extract and a well-defined JSON schema for the expected output format. This ensures the extracted data is properly structured and usable. (for a specific session)

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| sessionId | Yes | The session ID to use | |
| instruction | Yes | The specific instruction for what information to extract from the current page. Be as detailed and specific as possible about what you want to extract. For example: 'Extract all product names and prices from the listing page' or 'Get the article title, author, and publication date from this blog post'. The more specific your instruction, the better the extraction results will be. Avoid vague instructions like 'get everything' or 'extract the data'. Instead, be explicit about the exact elements, text, or information you need. | |
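
As a rough illustration of how the two required inputs might be supplied, the sketch below builds an MCP `tools/call` request for this tool. The `ExtractSessionArgs` and `ToolCallRequest` types, the `buildExtractRequest` helper, and the `session-123` value are hypothetical names introduced for this example; only the tool name and the two parameter names come from this page.

```typescript
// Hypothetical argument shape mirroring the two required inputs listed above.
interface ExtractSessionArgs {
  sessionId: string;
  instruction: string;
}

// JSON-RPC envelope for an MCP `tools/call` request (simplified).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: ExtractSessionArgs };
}

// Hypothetical helper: rejects empty inputs, then wraps them in the envelope.
function buildExtractRequest(id: number, args: ExtractSessionArgs): ToolCallRequest {
  if (!args.sessionId.trim()) throw new Error("sessionId is required");
  if (!args.instruction.trim()) throw new Error("instruction is required");
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "multi_browserbase_stagehand_extract_session",
      arguments: args,
    },
  };
}

const request = buildExtractRequest(1, {
  sessionId: "session-123", // placeholder; use the ID of a session that exists on the server
  instruction: "Extract all product names and prices from the listing page",
});
console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library would normally build this envelope for you; the point here is only the shape of the `arguments` object.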

Implementation Reference

  • Core handler logic for the 'multi_browserbase_stagehand_extract_session' tool. It extracts the sessionId from input, retrieves the Stagehand session, overrides context methods to target that session, and delegates execution to the original extract tool handler.
    handle: async (
      context: Context,
      params: z.infer<typeof newInputSchema>,
    ): Promise<ToolResult> => {
      const { sessionId, ...originalParams } = params;
    
      // Get the session
      const session = stagehandStore.get(sessionId);
      if (!session) {
        throw new Error(`Session ${sessionId} not found`);
      }
    
      // Create a temporary context that points to the specific session
      const sessionContext = Object.create(context);
      sessionContext.currentSessionId =
        session.metadata?.bbSessionId || sessionId;
      sessionContext.getStagehand = async () => session.stagehand;
      sessionContext.getActivePage = async () => session.page;
      sessionContext.getActiveBrowser = async () => session.browser;
    
      // Call the original tool's handler with the session-specific context
      return originalTool.handle(sessionContext, originalParams);
    },
  • Generates the tool schema for 'multi_browserbase_stagehand_extract_session', setting the name by prefixing 'multi_' to original name 'browserbase_stagehand_extract' and suffixing '_session', updating description, and adding 'sessionId' to input schema.
    schema: {
      name: `${namePrefix}${originalTool.schema.name}${nameSuffix}`,
      description: `${originalTool.schema.description} (for a specific session)`,
      inputSchema: newInputSchema,
    },
  • Registers/exports the specific multi-session extract tool instance named 'multi_browserbase_stagehand_extract_session' using the factory with appropriate prefixes.
    export const extractWithSessionTool = createMultiSessionAwareTool(extractTool, {
      namePrefix: "multi_",
      nameSuffix: "_session",
    });
  • Delegated handler from the original extract tool. It performs the actual extraction by calling stagehand.page.extract() with the provided instruction; as written, only the instruction string is passed and no schema argument is forwarded.
    async function handleExtract(
      context: Context,
      params: ExtractInput,
    ): Promise<ToolResult> {
      const action = async (): Promise<ToolActionResult> => {
        try {
          const stagehand = await context.getStagehand();
    
          const extraction = await stagehand.page.extract(params.instruction);
    
          return {
            content: [
              {
                type: "text",
                text: `Extracted content:\n${JSON.stringify(extraction, null, 2)}`,
              },
            ],
          };
        } catch (error) {
          const errorMsg = error instanceof Error ? error.message : String(error);
          throw new Error(`Failed to extract content: ${errorMsg}`);
        }
      };
    
      return {
        action,
        waitForNetwork: false,
      };
    }
  • Base schema for the extract tool, providing the name 'browserbase_stagehand_extract' used in constructing the multi-session tool name, and input schema requiring 'instruction'.
    const extractSchema: ToolSchema<typeof ExtractInputSchema> = {
      name: "browserbase_stagehand_extract",
      description:
        "Extracts structured information and text content from the current web page based on specific instructions " +
        "and a defined schema. This tool is ideal for scraping data, gathering information, or pulling specific " +
        "content from web pages. Use this tool when you need to get text content, data, or information from a page " +
        "rather than interacting with elements. For interactive elements like buttons, forms, or clickable items, " +
        "use the observe tool instead. The extraction works best when you provide clear, specific instructions " +
        "about what to extract and a well-defined JSON schema for the expected output format. This ensures " +
        "the extracted data is properly structured and usable.",
      inputSchema: ExtractInputSchema,
    };
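
The wrapping pattern in the snippets above can be sketched in isolation. The following is a simplified, self-contained approximation and not the server's actual code: `Session`, `Context`, `Tool`, `sessionStore`, and `createSessionAwareTool` are reduced stand-ins invented for this example, and the handler returns a plain string instead of a `ToolResult`. It shows how `Object.create` yields a per-call context that shadows a few members while leaving the shared context untouched.

```typescript
// Simplified stand-ins for the server's types (assumptions for illustration only).
interface Session { page: string; metadata?: { bbSessionId?: string } }
interface Context {
  currentSessionId: string;
  getActivePage: () => Promise<string>;
}
interface ExtractParams { instruction: string }
interface Tool<P> {
  schema: { name: string; description: string };
  handle: (context: Context, params: P) => Promise<string>;
}

// Hypothetical in-memory session registry keyed by session ID.
const sessionStore = new Map<string, Session>();

function createSessionAwareTool(
  original: Tool<ExtractParams>,
  opts: { namePrefix: string; nameSuffix: string },
): Tool<ExtractParams & { sessionId: string }> {
  return {
    schema: {
      name: `${opts.namePrefix}${original.schema.name}${opts.nameSuffix}`,
      description: `${original.schema.description} (for a specific session)`,
    },
    handle: async (context, params) => {
      // Split the routing parameter from the parameters the original tool expects.
      const { sessionId, ...originalParams } = params;
      const session = sessionStore.get(sessionId);
      if (!session) throw new Error(`Session ${sessionId} not found`);

      // The new object delegates to `context` through its prototype, so the
      // assignments below shadow members without mutating the shared context.
      const sessionContext: Context = Object.create(context);
      sessionContext.currentSessionId = session.metadata?.bbSessionId || sessionId;
      sessionContext.getActivePage = async () => session.page;

      return original.handle(sessionContext, originalParams);
    },
  };
}

// Usage with a fake base tool that echoes what it sees on the context.
const baseTool: Tool<ExtractParams> = {
  schema: { name: "browserbase_stagehand_extract", description: "Extracts content" },
  handle: async (ctx, p) =>
    `${ctx.currentSessionId}:${await ctx.getActivePage()}:${p.instruction}`,
};
const sharedContext: Context = {
  currentSessionId: "default",
  getActivePage: async () => "default-page",
};
sessionStore.set("s1", { page: "page-for-s1" });

const wrapped = createSessionAwareTool(baseTool, {
  namePrefix: "multi_",
  nameSuffix: "_session",
});
console.log(wrapped.schema.name);
```

Because the overrides live on the derived object, concurrent calls for different sessions each get their own view of the context, which is presumably why the wrapper avoids mutating `context` directly.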
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes the tool's function as extraction/scraping, specifies it works on the 'current web page', mentions it's 'for a specific session', and provides guidance on how to achieve best results with clear instructions and schemas. However, it doesn't mention potential limitations like rate limits, authentication needs, or error conditions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded with the core purpose in the first sentence. While it contains some redundancy (e.g., repeating the need for clear instructions), most sentences add value by providing usage guidance and best practices. The parenthetical note at the end could be better integrated.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 2-parameter extraction tool with no annotations and no output schema, the description provides good context about what the tool does, when to use it, and how to use it effectively. It covers the core functionality well but doesn't describe the return format or potential limitations, which would be helpful given the absence of output schema and annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
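
Although the description does not document the return format, the delegated handler shown in the Implementation Reference returns a single text item of the form `Extracted content:` followed by a newline and pretty-printed JSON. A caller could recover the structured value along these lines; `TextContent`, `parseExtraction`, and the sample payload are hypothetical names and data for this sketch, and the prefix is an implementation detail that may change.

```typescript
// Minimal shape of one text item in the tool result (assumption for this sketch).
interface TextContent {
  type: "text";
  text: string;
}

// Prefix emitted by the handler shown in the Implementation Reference.
const EXTRACTION_PREFIX = "Extracted content:\n";

// Finds the extraction text item and parses the JSON that follows the prefix.
function parseExtraction(content: TextContent[]): unknown {
  const item = content.find(
    (c) => c.type === "text" && c.text.startsWith(EXTRACTION_PREFIX),
  );
  if (!item) throw new Error("No extraction text item found");
  return JSON.parse(item.text.slice(EXTRACTION_PREFIX.length));
}

// Made-up sample result, formatted the same way the handler formats its output.
const sampleContent: TextContent[] = [
  {
    type: "text",
    text: EXTRACTION_PREFIX + JSON.stringify({ title: "Hello" }, null, 2),
  },
];
console.log(parseExtraction(sampleContent));
```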

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description mentions providing 'clear, specific instructions' and 'a well-defined JSON schema', which aligns with the schema's instruction parameter description but doesn't add significant semantic value beyond what's already in the structured schema documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as extracting structured information and text content from web pages based on instructions and a schema. It specifically distinguishes this extraction-focused tool from interactive sibling tools like 'observe', making the verb+resource+scope explicit and differentiating it from alternatives.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool ('when you need to get text content, data, or information from a page') and when not to use it ('For interactive elements like buttons, forms, or clickable items, use the observe tool instead'). It names the specific alternative tool and provides clear context for appropriate usage scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/RonsDad/mcp-server-browserbase'
