browse
Navigate to web pages and extract visible content including titles and text for web interaction and data collection.
Instructions
Browse to a URL and return the page title and visible text
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | The URL to navigate to | |
| headless | No | Whether to run the browser in headless mode | `false` |
| waitFor | No | Time to wait after page load (milliseconds) | `1000` |
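
As a rough illustration of how these inputs resolve, the sketch below applies the defaults stated in the schema entry under Implementation Reference (`headless` false, `waitFor` 1000 ms) to a partial argument object. `BrowseArgs`, `BROWSE_DEFAULTS`, and `normalizeBrowseArgs` are hypothetical names introduced here. The server itself validates with Zod, and the `new URL()` check is only an approximate stand-in for `z.string().url()`.

```typescript
// Hypothetical mirror of the documented inputs; the real tool validates with Zod.
interface BrowseArgs {
  url: string;
  headless: boolean;
  waitFor: number;
}

// Defaults as stated in the tool's schema: headless = false, waitFor = 1000 ms.
const BROWSE_DEFAULTS = { headless: false, waitFor: 1000 } as const;

// Illustrative helper (an assumption, not part of the server code).
function normalizeBrowseArgs(input: {
  url: string;
  headless?: boolean;
  waitFor?: number;
}): BrowseArgs {
  // The URL constructor throws a TypeError when the string is not an absolute URL.
  new URL(input.url);
  return {
    url: input.url,
    headless: input.headless ?? BROWSE_DEFAULTS.headless,
    waitFor: input.waitFor ?? BROWSE_DEFAULTS.waitFor,
  };
}

// Only `url` is supplied, so both optional fields fall back to their defaults.
const args = normalizeBrowseArgs({ url: "https://example.com" });
console.log(args);
```

Because `headless` defaults to false, a caller who wants no visible browser window has to pass `headless: true` explicitly.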
Implementation Reference
- src/index.ts:54-129 (handler) The handler function for the 'browse' tool. It launches a stealth Chromium browser instance using Patchright, creates a new page, navigates to the specified URL, extracts the page title and visible text content, takes a screenshot, stores the instance globally, and returns a structured response with preview text and resource locations.

      async ({ url, headless, waitFor }: { url: string; headless: boolean; waitFor: number }) => {
        try {
          // Generate unique IDs for tracking
          const browserId = randomUUID();
          const pageId = randomUUID();

          // Launch browser with stealth settings
          const browser = await chromium.launch({
            headless: headless,
            args: [
              '--no-sandbox',
              '--disable-setuid-sandbox',
              '--disable-dev-shm-usage',
              '--disable-accelerated-2d-canvas',
              '--no-first-run',
              '--no-zygote',
              '--disable-gpu'
            ]
          });

          // Create a context and page
          const context = await browser.newContext({
            viewport: null // Avoid detection by not using default viewport
          });
          const page = await context.newPage();

          // Store browser and page references
          browserInstances.set(browserId, {
            browser,
            pages: new Map([[pageId, page]])
          });

          // Navigate to the URL using isolated context for stealth
          await page.goto(url);
          await page.waitForTimeout(waitFor);

          // Get page title
          const title = await page.title();

          // Extract visible text with stealth (isolated context)
          // This ensures the page doesn't detect us using Runtime.evaluate
          const visibleText = await page.evaluate(`
            Array.from(document.querySelectorAll('body, body *'))
              .filter(element => {
                const style = window.getComputedStyle(element);
                return style.display !== 'none' &&
                       style.visibility !== 'hidden' &&
                       style.opacity !== '0';
              })
              .map(element => element.textContent)
              .filter(text => text && text.trim().length > 0)
              .join('\\n')
          `) as string;

          // Take a screenshot
          const screenshotPath = path.join(TEMP_DIR, `screenshot-${pageId}.png`);
          await page.screenshot({ path: screenshotPath });

          // Return a formatted response for the AI model
          return {
            content: [
              {
                type: "text",
                text: `Successfully browsed to: ${url}\n\nPage Title: ${title}\n\nVisible Text Preview:\n${visibleText.substring(0, 1500)}${visibleText.length > 1500 ? '...' : ''}\n\nBrowser ID: ${browserId}\nPage ID: ${pageId}\nScreenshot saved to: ${screenshotPath}`
              }
            ]
          };
        } catch (error) {
          return {
            content: [
              {
                type: "text",
                text: `Failed to browse: ${error}`
              }
            ]
          };
        }
      }

- src/index.ts:49-53 (schema) Input schema for the 'browse' tool defined using Zod, validating URL (required), headless mode (optional boolean, default false), and wait time (optional number, default 1000 ms).

      {
        url: z.string().url().describe("The URL to navigate to"),
        headless: z.boolean().default(false).describe("Whether to run the browser in headless mode"),
        waitFor: z.number().default(1000).describe("Time to wait after page load (milliseconds)")
      },

- src/index.ts:46-130 (registration) The registration of the 'browse' tool on the MCP server using server.tool(), specifying the tool name, description, input schema, and handler function.

      server.tool(
        "browse",
        "Browse to a URL and return the page title and visible text",
        {
          url: z.string().url().describe("The URL to navigate to"),
          headless: z.boolean().default(false).describe("Whether to run the browser in headless mode"),
          waitFor: z.number().default(1000).describe("Time to wait after page load (milliseconds)")
        },
        async ({ url, headless, waitFor }: { url: string; headless: boolean; waitFor: number }) => {
          try {
            // Generate unique IDs for tracking
            const browserId = randomUUID();
            const pageId = randomUUID();

            // Launch browser with stealth settings
            const browser = await chromium.launch({
              headless: headless,
              args: [
                '--no-sandbox',
                '--disable-setuid-sandbox',
                '--disable-dev-shm-usage',
                '--disable-accelerated-2d-canvas',
                '--no-first-run',
                '--no-zygote',
                '--disable-gpu'
              ]
            });

            // Create a context and page
            const context = await browser.newContext({
              viewport: null // Avoid detection by not using default viewport
            });
            const page = await context.newPage();

            // Store browser and page references
            browserInstances.set(browserId, {
              browser,
              pages: new Map([[pageId, page]])
            });

            // Navigate to the URL using isolated context for stealth
            await page.goto(url);
            await page.waitForTimeout(waitFor);

            // Get page title
            const title = await page.title();

            // Extract visible text with stealth (isolated context)
            // This ensures the page doesn't detect us using Runtime.evaluate
            const visibleText = await page.evaluate(`
              Array.from(document.querySelectorAll('body, body *'))
                .filter(element => {
                  const style = window.getComputedStyle(element);
                  return style.display !== 'none' &&
                         style.visibility !== 'hidden' &&
                         style.opacity !== '0';
                })
                .map(element => element.textContent)
                .filter(text => text && text.trim().length > 0)
                .join('\\n')
            `) as string;

            // Take a screenshot
            const screenshotPath = path.join(TEMP_DIR, `screenshot-${pageId}.png`);
            await page.screenshot({ path: screenshotPath });

            // Return a formatted response for the AI model
            return {
              content: [
                {
                  type: "text",
                  text: `Successfully browsed to: ${url}\n\nPage Title: ${title}\n\nVisible Text Preview:\n${visibleText.substring(0, 1500)}${visibleText.length > 1500 ? '...' : ''}\n\nBrowser ID: ${browserId}\nPage ID: ${pageId}\nScreenshot saved to: ${screenshotPath}`
                }
              ]
            };
          } catch (error) {
            return {
              content: [
                {
                  type: "text",
                  text: `Failed to browse: ${error}`
                }
              ]
            };
          }
        }
      );

- src/index.ts:28-35 (helper) Helper data structure: interface and Map that persist browser instances and pages across multiple tool invocations, enabling subsequent tools like interact and extract.

      // Keep track of browser instances and pages
      interface BrowserInstance {
        browser: Browser;
        pages: Map<string, Page>;
      }

      const browserInstances = new Map<string, BrowserInstance>();
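
The helper entry says the Map lets later tools reuse a page opened by 'browse'. One way such a lookup could work is sketched below, keyed by the Browser ID and Page ID that the handler returns. `findPage` and `Instance` are hypothetical names that do not appear in src/index.ts, and the generic parameters stand in for Patchright's `Browser` and `Page` types so the sketch runs without a browser installed.

```typescript
// Generic stand-in for the documented BrowserInstance shape;
// in the server, B and P would be Patchright's Browser and Page.
interface Instance<B, P> {
  browser: B;
  pages: Map<string, P>;
}

// Hypothetical lookup helper (an assumption, not taken from the source file).
function findPage<B, P>(
  instances: Map<string, Instance<B, P>>,
  browserId: string,
  pageId: string
): P {
  const instance = instances.get(browserId);
  if (instance === undefined) {
    throw new Error(`Unknown browser ID: ${browserId}`);
  }
  const page = instance.pages.get(pageId);
  if (page === undefined) {
    throw new Error(`Unknown page ID: ${pageId}`);
  }
  return page;
}

// Usage with placeholder objects in place of real browser handles.
const registry = new Map<string, Instance<{ name: string }, { url: string }>>();
registry.set("browser-1", {
  browser: { name: "chromium" },
  pages: new Map([["page-1", { url: "https://example.com" }]]),
});
console.log(findPage(registry, "browser-1", "page-1").url);
```

Throwing on an unknown ID keeps a stale or mistyped identifier from silently acting on the wrong page. A tool built on this would likely catch the error and report it in the same text-content shape the 'browse' handler uses for failures.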