multi_browserbase_stagehand_extract_session
Extract structured data and text content from web pages using specific instructions and JSON schema for scraping, information gathering, or content collection.
Instructions
Extracts structured information and text content from the current web page based on specific instructions and a defined schema. This tool is ideal for scraping data, gathering information, or pulling specific content from web pages. Use this tool when you need to get text content, data, or information from a page rather than interacting with elements. For interactive elements like buttons, forms, or clickable items, use the observe tool instead. The extraction works best when you provide clear, specific instructions about what to extract and a well-defined JSON schema for the expected output format. This ensures the extracted data is properly structured and usable. (for a specific session)
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| sessionId | Yes | The session ID to use | |
| instruction | Yes | The specific instruction for what information to extract from the current page. Be as detailed and specific as possible about what you want to extract. For example: 'Extract all product names and prices from the listing page' or 'Get the article title, author, and publication date from this blog post'. The more specific your instruction, the better the extraction results will be. Avoid vague instructions like 'get everything' or 'extract the data'. Instead, be explicit about the exact elements, text, or information you need. | |
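
As a rough illustration of the two inputs in the table, the sketch below builds an argument object and runs a small pre-flight check on it. The `ExtractSessionArgs` interface, the `checkExtractArgs` helper, the `"session-a"` ID, and the four-word threshold are all hypothetical and chosen only for this example; none of them are part of the server.

```typescript
// Hypothetical argument shape mirroring the two fields in the input schema table.
interface ExtractSessionArgs {
  sessionId: string;
  instruction: string;
}

// Illustrative helper (not part of the server): reports missing fields and
// flags very short instructions, using an arbitrary four-word threshold.
function checkExtractArgs(args: Partial<ExtractSessionArgs>): string[] {
  const problems: string[] = [];
  if (!args.sessionId) problems.push("sessionId is required");
  if (!args.instruction) {
    problems.push("instruction is required");
  } else if (args.instruction.trim().split(/\s+/).length < 4) {
    problems.push("instruction looks vague; name the exact fields you want");
  }
  return problems;
}

const goodArgs: ExtractSessionArgs = {
  sessionId: "session-a", // placeholder; use the ID of a session that exists
  instruction: "Extract all product names and prices from the listing page",
};
const vagueArgs: ExtractSessionArgs = {
  sessionId: "session-a",
  instruction: "get everything",
};

console.log(checkExtractArgs(goodArgs)); // no problems reported
console.log(checkExtractArgs(vagueArgs)); // one problem: the instruction is too short
```

A word count is only a crude proxy for specificity; the point is that both fields are required and that the instruction should name the concrete items to extract.
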
Implementation Reference
- src/tools/multiSession.ts:56-78 (handler) Core handler logic for the 'multi_browserbase_stagehand_extract_session' tool. It extracts the sessionId from the input, retrieves the Stagehand session, overrides context methods to target that session, and delegates execution to the original extract tool handler.

      handle: async (
        context: Context,
        params: z.infer<typeof newInputSchema>,
      ): Promise<ToolResult> => {
        const { sessionId, ...originalParams } = params;

        // Get the session
        const session = stagehandStore.get(sessionId);
        if (!session) {
          throw new Error(`Session ${sessionId} not found`);
        }

        // Create a temporary context that points to the specific session
        const sessionContext = Object.create(context);
        sessionContext.currentSessionId = session.metadata?.bbSessionId || sessionId;
        sessionContext.getStagehand = async () => session.stagehand;
        sessionContext.getActivePage = async () => session.page;
        sessionContext.getActiveBrowser = async () => session.browser;

        // Call the original tool's handler with the session-specific context
        return originalTool.handle(sessionContext, originalParams);
      },

- src/tools/multiSession.ts:51-55 (schema) Generates the tool schema for 'multi_browserbase_stagehand_extract_session'. It builds the name by prefixing 'multi_' and suffixing '_session' to the original name 'browserbase_stagehand_extract', appends ' (for a specific session)' to the description, and adds 'sessionId' to the input schema.

      schema: {
        name: `${namePrefix}${originalTool.schema.name}${nameSuffix}`,
        description: `${originalTool.schema.description} (for a specific session)`,
        inputSchema: newInputSchema,
      },

- src/tools/multiSession.ts:253-256 (registration) Registers and exports the multi-session extract tool instance named 'multi_browserbase_stagehand_extract_session', created by the factory with the 'multi_' prefix and '_session' suffix.

      export const extractWithSessionTool = createMultiSessionAwareTool(extractTool, {
        namePrefix: "multi_",
        nameSuffix: "_session",
      });

- src/tools/extract.ts:34-62 (handler) Handler of the original extract tool, to which the multi-session handler delegates. It performs the actual extraction by calling stagehand.page.extract() with the provided instruction.

      async function handleExtract(
        context: Context,
        params: ExtractInput,
      ): Promise<ToolResult> {
        const action = async (): Promise<ToolActionResult> => {
          try {
            const stagehand = await context.getStagehand();
            const extraction = await stagehand.page.extract(params.instruction);
            return {
              content: [
                {
                  type: "text",
                  text: `Extracted content:\n${JSON.stringify(extraction, null, 2)}`,
                },
              ],
            };
          } catch (error) {
            const errorMsg = error instanceof Error ? error.message : String(error);
            throw new Error(`Failed to extract content: ${errorMsg}`);
          }
        };

        return {
          action,
          waitForNetwork: false,
        };
      }

- src/tools/extract.ts:21-32 (schema) Base schema for the extract tool. It provides the name 'browserbase_stagehand_extract', which is used to construct the multi-session tool name, and an input schema that requires 'instruction'.

      const extractSchema: ToolSchema<typeof ExtractInputSchema> = {
        name: "browserbase_stagehand_extract",
        description:
          "Extracts structured information and text content from the current web page based on specific instructions " +
          "and a defined schema. This tool is ideal for scraping data, gathering information, or pulling specific " +
          "content from web pages. Use this tool when you need to get text content, data, or information from a page " +
          "rather than interacting with elements. For interactive elements like buttons, forms, or clickable items, " +
          "use the observe tool instead. The extraction works best when you provide clear, specific instructions " +
          "about what to extract and a well-defined JSON schema for the expected output format. This ensures " +
          "the extracted data is properly structured and usable.",
        inputSchema: ExtractInputSchema,
      };
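
The wrapping pattern described in the references above (compose the tool name, look up the session, shadow the context methods on a prototype-linked copy, then delegate) can be sketched in isolation. This is a simplified, dependency-free illustration, not the server's code: `Ctx`, `Tool`, `Session`, `sessions`, `withSession`, and `getPageTitle` are hypothetical stand-ins for the real `Context`, tool types, `stagehandStore`, and `createMultiSessionAwareTool`.

```typescript
// Simplified stand-ins for the server's context and tool types (assumptions).
interface Ctx {
  currentSessionId: string;
  getPageTitle: () => Promise<string>;
}
interface Tool<P> {
  name: string;
  handle: (ctx: Ctx, params: P) => Promise<string>;
}
interface Session {
  id: string;
  title: string;
}

// Hypothetical in-memory session store.
const sessions = new Map<string, Session>([
  ["a", { id: "a", title: "Page A" }],
  ["b", { id: "b", title: "Page B" }],
]);

// Wraps a single-session tool so that it accepts a sessionId and runs against that session.
function withSession<P extends object>(
  tool: Tool<P>,
  opts: { namePrefix: string; nameSuffix: string },
): Tool<P & { sessionId: string }> {
  return {
    name: `${opts.namePrefix}${tool.name}${opts.nameSuffix}`,
    handle: async (ctx, params) => {
      const { sessionId, ...rest } = params;
      const session = sessions.get(sessionId);
      if (!session) throw new Error(`Session ${sessionId} not found`);

      // Object.create gives a copy that inherits from ctx; the assignments below
      // create own properties on the copy and leave the shared ctx untouched.
      const scoped: Ctx = Object.create(ctx);
      scoped.currentSessionId = session.id;
      scoped.getPageTitle = async () => session.title;

      // Delegate to the original handler with the session-specific context.
      return tool.handle(scoped, rest as unknown as P);
    },
  };
}

const baseTool: Tool<{ instruction: string }> = {
  name: "browserbase_stagehand_extract",
  handle: async (ctx, params) => `${await ctx.getPageTitle()}: ${params.instruction}`,
};

const sessionTool = withSession(baseTool, { namePrefix: "multi_", nameSuffix: "_session" });
const defaultCtx: Ctx = {
  currentSessionId: "default",
  getPageTitle: async () => "Default page",
};

console.log(sessionTool.name); // multi_browserbase_stagehand_extract_session
sessionTool
  .handle(defaultCtx, { sessionId: "b", instruction: "get the title" })
  .then((out) => console.log(out)); // Page B: get the title
```

Shadowing methods on an `Object.create` copy, rather than mutating the shared context, keeps concurrent calls for different sessions from interfering with each other, which appears to be the motivation for the temporary context in the handler.
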