Skip to main content
Glama

firefox-devtools-mcp

migration-to-bidi.md9.48 kB
# Migration from RDP to WebDriver BiDi ## Summary This document describes the migration from Firefox's legacy Remote Debugging Protocol (RDP) to the modern WebDriver BiDi protocol using Selenium WebDriver. ## Background ### The Problem with RDP The initial implementation used Firefox's native Remote Debugging Protocol (RDP): - **Complex custom implementation** - Required custom TCP transport with length-prefixed JSON framing - **Actor-based architecture** - Complex state management with actor IDs - **Limited documentation** - Firefox RDP is poorly documented and Firefox-specific - **Maintenance burden** - Custom protocol parsing, connection handling, and error recovery - **Deprecated protocol** - Mozilla is moving away from RDP in favor of WebDriver BiDi ### Why WebDriver BiDi? WebDriver BiDi is the modern, cross-browser automation protocol: - **W3C Standard** - Well-documented, standardized protocol - **Future-proof** - Actively developed and maintained by browser vendors - **Better DevTools access** - Improved console, network, and performance monitoring - **Simplified implementation** - Use battle-tested `selenium-webdriver` library - **Cross-browser potential** - Same API works with Chrome, Edge, Safari (future) ## Migration Approach ### What Changed **Before (RDP):** ``` /src/firefox/ ├── rdp-client.ts ❌ Custom RDP implementation (~500 lines) ├── transport.ts ❌ Custom TCP framing (~200 lines) ├── bidi-client.ts ❌ Custom BiDi WebSocket client ├── launcher.ts ❌ Custom Firefox launcher ├── devtools.ts ❌ Wrapper combining RDP + BiDi └── types.ts RDP-specific types ``` **After (Selenium BiDi):** ``` /src/firefox/ ├── devtools.ts ✅ Selenium WebDriver wrapper (~200 lines) └── types.ts ✅ BiDi types ``` **Reduction:** ~1000 lines of custom code → ~200 lines of wrapper code ### Implementation Strategy Instead of writing a custom BiDi client, we use `selenium-webdriver`: ```typescript import { Builder, Browser, WebDriver } from 'selenium-webdriver'; import firefox from 'selenium-webdriver/firefox.js'; // Enable BiDi const firefoxOptions = new firefox.Options(); firefoxOptions.enableBidi(); // Launch Firefox const driver = await new Builder() .forBrowser(Browser.FIREFOX) .setFirefoxOptions(firefoxOptions) .build(); // Get browsing context const contextId = await driver.getWindowHandle(); // Subscribe to console events (BiDi) const bidi = await driver.getBidi(); await bidi.subscribe('log.entryAdded', contextId, (event) => { console.log(event.params); }); ``` ## Key Technical Decisions ### 1. Direct Proxy Pattern The `FirefoxDevTools` class is a **thin wrapper** around Selenium's `WebDriver`: ```typescript class FirefoxDevTools { private driver: WebDriver | null = null; async evaluate(script: string): Promise<unknown> { // Direct passthrough - no "smart" modifications return await this.driver.executeScript(script); } } ``` **Lesson learned:** Don't try to be clever. Selenium knows what it's doing. ### 2. The `evaluate()` Anti-Pattern **❌ What DIDN'T work:** ```typescript async evaluate(script: string): Promise<unknown> { // Trying to be "helpful" by auto-adding return let finalScript = script.trim(); if (!finalScript.startsWith('return ')) { finalScript = `return ${finalScript}`; // ❌ BREAKS multi-line scripts! } return await this.driver.executeScript(finalScript); } ``` **Problem:** When user passes: ```javascript evaluate(` return { timing: {...} }; `) ``` It becomes: ```javascript return return { // ❌ Syntax error! timing: {...} }; ``` **✅ What WORKS:** ```typescript async evaluate(script: string): Promise<unknown> { // Just pass it through. Selenium handles it. return await this.driver.executeScript(script); } ``` **Key insight:** Selenium already handles single-line expressions vs. multi-line scripts correctly. Don't second-guess it. ### 3. Console Listener Setup **Critical order of operations:** ```typescript async connect(): Promise<void> { // 1. Launch Firefox this.driver = await new Builder() .forBrowser(Browser.FIREFOX) .setFirefoxOptions(firefoxOptions) .build(); // 2. Get browsing context ID this.currentContextId = await this.driver.getWindowHandle(); // 3. Subscribe to console BEFORE navigation const bidi = await this.driver.getBidi(); await bidi.subscribe('log.entryAdded', this.currentContextId, callback); // 4. Navigate (console events will now be captured) if (this.options.startUrl) { await this.driver.get(this.options.startUrl); } } ``` **Why this order matters:** - Console listener must be attached to the browsing context - Context ID is obtained from `getWindowHandle()` - Listener must be active BEFORE navigation to capture early console messages ## API Compatibility The high-level API remains unchanged for MCP tools: ```typescript // Same API as before await firefox.connect(); await firefox.navigate('https://example.com'); const title = await firefox.evaluate('return document.title'); const messages = await firefox.getConsoleMessages(); await firefox.close(); ``` **MCP tools work without changes** - the migration is transparent to end users. ## Dependencies ### Added ```json { "dependencies": { "selenium-webdriver": "^4.36.0" }, "devDependencies": { "geckodriver": "^6.0.2" } } ``` ### Removed - Custom RDP transport code - Custom BiDi WebSocket client - Custom Firefox launcher - `ws` package (no longer needed) ## Build Configuration Added `selenium-webdriver` as external dependency in `tsup.config.ts`: ```typescript export default defineConfig({ external: [ 'selenium-webdriver' // Don't bundle, load from node_modules ], }); ``` **Why external:** Selenium uses dynamic requires for browser drivers, which don't work when bundled. ## Testing ### Test Script Evolution **Original working script** (`test-firefox-bidi.js`): ```javascript const driver = await new Builder() .forBrowser(Browser.FIREFOX) .setFirefoxOptions(firefoxOptions) .build(); const bidi = await driver.getBidi(); await bidi.subscribe('log.entryAdded', windowHandle, callback); ``` **Final implementation** (`devtools.ts`): ```typescript // EXACT copy of working script logic this.driver = await new Builder() .forBrowser(Browser.FIREFOX) .setFirefoxOptions(firefoxOptions) .build(); const bidi = await this.driver.getBidi(); await bidi.subscribe('log.entryAdded', this.currentContextId, callback); ``` **Key principle:** When you have working code, **don't modify it**. Wrap it. ## Debugging Journey ### Issue: `evaluate()` returning `null` **Symptoms:** ``` ✅ Navigation complete - Title: Centrum.cz ❌ evaluate('document.title') returns: null ``` **Root cause:** Auto-adding `return` broke multi-line scripts **Investigation:** ```javascript await firefox.evaluate('return 1 + 1'); // ✅ 2 await firefox.evaluate('return document.title'); // ✅ "Centrum.cz" await firefox.evaluate('document.title'); // ❌ null ``` **Solution:** Remove all "smart" logic. Just proxy to `executeScript()`. ### Issue: Console events not captured **Symptoms:** ``` ✅ Console listener active ❌ Found 0 console messages ``` **Root cause:** Testing showed navigation worked but context was wrong **Investigation:** - Verified listener was subscribed - Verified navigation completed - Found that context ID needs to be obtained BEFORE subscribing **Solution:** Get `windowHandle` before `bidi.subscribe()`, not after. ## Performance Impact ### Bundle Size | Metric | RDP | BiDi (Selenium) | Change | |--------|-----|-----------------|--------| | Source code | ~1200 lines | ~200 lines | **-83%** | | Bundle size | 525 KB | 524 KB | -0.2% | | Dependencies | ws | selenium-webdriver | N/A | ### Runtime Performance - **Startup:** Slightly slower (geckodriver initialization) - **Execution:** Comparable (same Firefox backend) - **Console capture:** Faster (native BiDi events vs RDP polling) ## Future Improvements ### Short-term - [ ] Network monitoring via BiDi (`network.beforeRequestSent`, `network.responseCompleted`) - [ ] Better error messages from Selenium - [ ] Connection retry logic ### Long-term - [ ] Multi-browser support (Chrome, Edge) - [ ] Advanced BiDi features (request interception, authentication) - [ ] Session persistence and reconnection ## Lessons Learned 1. **Use existing libraries** - Don't reinvent the wheel. `selenium-webdriver` is battle-tested. 2. **Direct proxy pattern** - When wrapping a library, don't add "helpful" logic. Just pass through. 3. **Copy working code** - If you have a working test script, make your implementation **identical** to it. 4. **Test incrementally** - Small, focused tests (`return 1+1`, `return document.title`) reveal issues quickly. 5. **Trust the library** - Selenium knows how to handle scripts. Don't second-guess it. ## Conclusion The migration from custom RDP to Selenium WebDriver BiDi was successful: - ✅ **83% less code** to maintain - ✅ **Simpler architecture** - standard library vs custom protocol - ✅ **Better future** - W3C standard, actively developed - ✅ **Same API** - transparent to MCP tool users - ✅ **Working console capture** - BiDi events in real-time **Final takeaway:** Sometimes the best code is the code you don't write. Use proven libraries and keep it simple.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/freema/firefox-devtools-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server