Webby MCP Server

MIT License


webby-mcp

BROWSER_AUTOMATION_PLAN.md•9.41 kB

# Browser Automation Implementation Plan

## Architecture Overview

The three placeholder tools will run on headless browser automation, with **Playwright** as the default driver:

1. **WebPageTest** - Form submission + result scraping (Playwright)
2. **Lighthouse** - `lighthouse` npm package talking to Chrome over the Chrome DevTools Protocol (Chrome is started by `chrome-launcher`)
3. **WebCheck** - Web scraping with Playwright (or self-hosted API calls)

## Shared Browser Utilities

Create `src/shared/browser-utils.ts`:

```typescript
import { chromium, Browser, Page, BrowserContext } from 'playwright';

export interface BrowserOptions {
  headless?: boolean;
  timeout?: number;
  userAgent?: string;
}

export class BrowserManager {
  private browser: Browser | null = null;
  private context: BrowserContext | null = null;

  async launch(options: BrowserOptions = {}): Promise<void> {
    // Headless unless explicitly disabled
    this.browser = await chromium.launch({
      headless: options.headless !== false,
      args: ['--no-sandbox', '--disable-setuid-sandbox'],
    });
    this.context = await this.browser.newContext({
      userAgent: options.userAgent,
    });
    // Apply the configured timeout to every page created from this context
    if (options.timeout !== undefined) {
      this.context.setDefaultTimeout(options.timeout);
    }
  }

  async newPage(): Promise<Page> {
    if (!this.context) throw new Error('Browser not launched');
    return await this.context.newPage();
  }

  async close(): Promise<void> {
    if (this.context) await this.context.close();
    if (this.browser) await this.browser.close();
    this.browser = null;
    this.context = null;
  }

  isLaunched(): boolean {
    return this.browser !== null;
  }
}

// Singleton instance for reuse across tests.
// The launch promise is cached (not the manager itself) so that concurrent
// callers all wait for the same launch instead of receiving a half-started manager.
let globalBrowserManager: Promise<BrowserManager> | null = null;

export function getBrowserManager(): Promise<BrowserManager> {
  if (!globalBrowserManager) {
    globalBrowserManager = (async () => {
      const manager = new BrowserManager();
      await manager.launch();
      return manager;
    })();
    // Allow a retry if the launch failed
    globalBrowserManager.catch(() => {
      globalBrowserManager = null;
    });
  }
  return globalBrowserManager;
}

export async function closeBrowserManager(): Promise<void> {
  if (globalBrowserManager) {
    const pending = globalBrowserManager;
    globalBrowserManager = null;
    await (await pending).close();
  }
}
```

## Tool-Specific Implementations

### 1. WebPageTest (Form Automation + Scraping)

**Strategy**: Navigate to webpagetest.org, fill in the form, submit it, wait for the results page, then scrape it

**Implementation** (`src/performance/webpagetest.ts`):

```typescript
import { getBrowserManager } from '../shared/browser-utils';

// NOTE: the selectors below are placeholders; verify them against the live page.
async function submitWebPageTest(url: string): Promise<string> {
  const browser = await getBrowserManager();
  const page = await browser.newPage();

  try {
    // Navigate to WebPageTest
    await page.goto('https://www.webpagetest.org/');

    // Fill form
    await page.fill('input[name="url"]', url);
    await page.selectOption('select[name="location"]', 'Dulles:Chrome');

    // Submit
    await page.click('input[type="submit"]');

    // Wait for results page
    await page.waitForURL(/.*\/result\/.*/);

    // Extract test ID from URL; fail loudly rather than return undefined
    const testId = page.url().match(/\/result\/([^\/]+)/)?.[1];
    if (!testId) {
      throw new Error(`Could not extract test ID from ${page.url()}`);
    }
    return testId;
  } finally {
    await page.close();
  }
}

async function scrapeWebPageTestResults(testId: string): Promise<WebPageTestResult> {
  const browser = await getBrowserManager();
  const page = await browser.newPage();

  try {
    await page.goto(`https://www.webpagetest.org/result/${testId}/`);

    // Wait for results to load
    await page.waitForSelector('.result-summary', { timeout: 300000 }); // 5 min max

    // Scrape results (textContent returns null when the element is missing)
    const grade = await page.textContent('.grade');
    const loadTime = await page.textContent('.loadTime');
    const fcp = await page.textContent('[data-metric="firstContentfulPaint"]');

    return {
      tool: 'webpagetest',
      success: true,
      test_id: testId,
      results_url: page.url(),
      grade,
      summary: { loadTime, firstContentfulPaint: fcp },
    };
  } finally {
    await page.close();
  }
}
```

**Pros**: Free 300 tests/month, no API key needed

**Cons**: Slower than API (form submission + polling), fragile to UI changes

**Alternative**: Detect if user has API key, use API instead

### 2. Lighthouse (Chrome DevTools Protocol)

**Strategy**: Use the official `lighthouse` npm package against a Chrome instance started by `chrome-launcher`

**Add dependency**:

```bash
npm install lighthouse chrome-launcher
```

**Implementation** (`src/shared/lighthouse.ts`):

```typescript
import lighthouse from 'lighthouse';
import { launch } from 'chrome-launcher';

interface LighthouseOptions {
  categories?: string[];
  formFactor?: 'mobile' | 'desktop';
}

// Lighthouse scores are 0-1, or null when a category could not be computed
const toPercent = (score: number | null | undefined): number | null =>
  score == null ? null : Math.round(score * 100);

async function analyzeLighthouse(url: string, options: LighthouseOptions = {}) {
  const formFactor = options.formFactor || 'mobile';
  const chrome = await launch({
    chromeFlags: ['--headless', '--no-sandbox'],
  });

  try {
    const runnerResult = await lighthouse(url, {
      port: chrome.port,
      output: 'json',
      onlyCategories: options.categories || ['performance', 'accessibility', 'seo', 'best-practices'],
      formFactor,
    });
    if (!runnerResult) throw new Error('Lighthouse returned no result');

    const results = runnerResult.lhr;

    return {
      tool: 'lighthouse',
      success: true,
      url,
      formFactor,
      performance_score: toPercent(results.categories.performance?.score),
      accessibility_score: toPercent(results.categories.accessibility?.score),
      seo_score: toPercent(results.categories.seo?.score),
      best_practices_score: toPercent(results.categories['best-practices']?.score),
      metrics: {
        firstContentfulPaint: results.audits['first-contentful-paint']?.numericValue,
        largestContentfulPaint: results.audits['largest-contentful-paint']?.numericValue,
        // ... other metrics
      },
    };
  } finally {
    await chrome.kill();
  }
}
```

**Pros**: Official tool, comprehensive results, runs locally

**Cons**: Slower than PageSpeed API (no caching), more CPU intensive

**Note**: PageSpeed Insights already runs Lighthouse, so this is redundant unless you need offline testing

### 3. WebCheck (Self-Hosted or Web Scraping)

**Option A: Self-Hosted WebCheck API**

If the user self-hosts WebCheck, we can call their API directly:

```typescript
interface WebCheckOptions {
  instanceUrl?: string;
}

async function analyzeWebCheck(url: string, options: WebCheckOptions = {}) {
  const instanceUrl = options.instanceUrl || 'http://localhost:3000';

  // Call self-hosted WebCheck API.
  // NOTE: the /api/analyze route and POST body are placeholders;
  // confirm them against the instance you are running.
  const response = await fetch(`${instanceUrl}/api/analyze`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ url }),
  });

  // fetch() does not reject on HTTP error statuses, so check explicitly
  if (!response.ok) {
    throw new Error(`WebCheck request failed: ${response.status} ${response.statusText}`);
  }

  return await response.json();
}
```

**Option B: Scrape Public WebCheck Instance**

```typescript
import { getBrowserManager } from '../shared/browser-utils';

// NOTE: the selectors below are placeholders; verify them against the live page.
async function analyzeWebCheckScrape(url: string): Promise<WebCheckResult> {
  const browser = await getBrowserManager();
  const page = await browser.newPage();

  try {
    // Navigate to web-check.xyz
    await page.goto('https://web-check.xyz/');

    // Enter URL
    await page.fill('input[type="url"]', url);
    await page.click('button[type="submit"]');

    // Wait for results
    await page.waitForSelector('.results-container', { timeout: 60000 });

    // Scrape results
    const sitemapFound = await page.isVisible('.sitemap-result .success');
    const robotsFound = await page.isVisible('.robots-result .success');

    return {
      tool: 'webcheck',
      success: true,
      url,
      seo: {
        sitemap_found: sitemapFound,
        robots_txt_found: robotsFound,
      },
    };
  } finally {
    await page.close();
  }
}
```

**Pros (Self-Hosted)**: Full control, comprehensive results

**Cons (Self-Hosted)**: Requires self-hosting

**Pros (Scraping)**: No setup needed

**Cons (Scraping)**: Fragile to UI changes, slower

## Configuration Options

Add to `package.json` or environment variables. JSON does not allow comments, so the notes on individual keys follow the block:

```json
{
  "webby": {
    "browser": {
      "headless": true,
      "timeout": 120000,
      "reuseInstance": true
    },
    "webpagetest": {
      "useAPI": false,
      "apiKey": null
    },
    "lighthouse": {
      "enabled": true,
      "chromePath": null
    },
    "webcheck": {
      "mode": "disabled",
      "instanceUrl": "http://localhost:3000"
    }
  }
}
```

- `webpagetest.useAPI`: set to `true` if the user has an API key
- `lighthouse.enabled`: set to `false` to use PageSpeed Insights instead
- `lighthouse.chromePath`: auto-detected if `null`
- `webcheck.mode`: one of `"self-hosted"`, `"scrape"`, or `"disabled"`

## Recommended Implementation Order

1. **✅ Keep PageSpeed Insights** as primary (fast, reliable, free API)
2. **Implement WebPageTest** via browser automation (adds value - different testing locations)
3. **Skip Lighthouse** (redundant with PageSpeed Insights unless offline testing needed)
4. **Skip WebCheck** (minimal SEO value, better covered by Lighthouse SEO)

## Performance Considerations

**Browser Instance Management**:

- ✅ Reuse browser instance across tests (singleton pattern)
- ✅ Configure timeouts appropriately (WebPageTest can take 2-5 minutes)
- ✅ Clean up browser on MCP server shutdown

**Parallelization**:

- ⚠️ Limit concurrent browser tests (memory intensive)
- ✅ Run non-browser tests (PageSpeed, Mozilla Observatory, SSL Labs) in parallel
- ✅ Queue browser tests if multiple requested simultaneously

## Error Handling

All browser automation should handle:

- Network timeouts
- Element not found errors
- UI changes (selector updates)
- Browser crashes
- Graceful degradation (fall back to PageSpeed if Lighthouse fails)

## Testing Strategy

Create integration tests:

```typescript
// __tests__/webpagetest.test.ts
// analyzeWebPageTest is assumed to wrap submitWebPageTest + scrapeWebPageTestResults
test('WebPageTest automation', async () => {
  const result = await analyzeWebPageTest('https://example.com');

  expect(result.success).toBe(true);
  expect(result.test_id).toBeDefined();
}, 360_000); // a live run can take 2-5 minutes, well past Jest's 5 s default
```

Mock browser for unit tests:

```typescript
// Minimal stand-in for a Playwright Browser; extend as the tests require
const mockBrowser = { newContext: jest.fn(), close: jest.fn() };

jest.mock('playwright', () => ({
  chromium: {
    // chromium.launch() returns a Promise in the real API
    launch: jest.fn(async () => mockBrowser),
  },
}));
```

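The parallelization advice under Performance Considerations (limit concurrent browser tests, queue the rest) could be sketched roughly as follows. `BrowserTaskQueue` is a hypothetical helper, not part of the plan or of Playwright, and the limit passed to it is an assumption to tune per machine.

```typescript
// Hypothetical helper: caps how many browser-backed tasks run at once
// and queues the remainder in arrival order.
class BrowserTaskQueue {
  private active = 0;
  private readonly waiting: Array<() => void> = [];
  private readonly maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  async run<T>(task: () => Promise<T>): Promise<T> {
    if (this.active >= this.maxConcurrent) {
      // Wait until a finishing task hands its slot over; `active` already
      // counts this task once the promise resolves.
      await new Promise<void>((resolve) => this.waiting.push(resolve));
    } else {
      this.active++;
    }
    try {
      return await task();
    } finally {
      const next = this.waiting.shift();
      if (next) {
        next(); // transfer the slot directly so no newcomer can jump the queue
      } else {
        this.active--;
      }
    }
  }
}
```

Browser-based tools would wrap their work in `queue.run(() => ...)`, while the non-browser tests (PageSpeed, Mozilla Observatory, SSL Labs) can still run in parallel outside the queue.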
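The graceful-degradation item under Error Handling (fall back to PageSpeed if Lighthouse fails) might look something like this. It is a minimal sketch: `runWithFallback` and the `error` field are assumptions introduced for illustration, and the `tool`/`success` fields mirror the result objects used elsewhere in the plan.

```typescript
// Result shape shared by the tools in this plan; `error` is an assumed addition.
interface ToolResult {
  tool: string;
  success: boolean;
  error?: string;
}

type Analyzer = () => Promise<ToolResult>;

// Hypothetical helper: try each analyzer in order and return the first success.
async function runWithFallback(analyzers: Analyzer[]): Promise<ToolResult> {
  const errors: string[] = [];

  for (const analyze of analyzers) {
    try {
      const result = await analyze();
      if (result.success) return result;
      errors.push(`${result.tool}: ${result.error ?? 'reported failure'}`);
    } catch (err) {
      // Timeouts, missing selectors, and browser crashes all surface as thrown errors
      errors.push(err instanceof Error ? err.message : String(err));
    }
  }

  // Nothing succeeded: report every collected error instead of throwing
  return { tool: 'none', success: false, error: errors.join('; ') };
}
```

A caller would pass the preferred tool first and the fallback second, for example a Lighthouse wrapper followed by a PageSpeed Insights wrapper, so a browser failure degrades to the API-based result.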