DevServer MCP

playwright_testing_challenges.md•7.68 KiB

# Playwright Testing Challenges and Architecture Analysis ## Background While implementing tests for the `browser_resize` functionality, we discovered fundamental architectural issues that make it difficult to write meaningful tests that follow our testing guidelines. This document analyzes the problem, explains what was considered, and proposes solutions. ## The Testing Guidelines Conflict Our testing guidelines (from `docs/writing_tests.md`) state: 1. **Only mock at system boundaries** (external APIs, third-party services) 2. **Never mock your own functions or classes** 3. **If a test needs more than 2 mocks, stop and reconsider** 4. **Test user behavior, not implementation details** ## Current Architecture The Playwright integration follows this structure: ``` MCP Tool (browser_resize) ↓ DevServerManager.playwright_resize() ↓ PlaywrightOperator.resize() ↓ Playwright API (page.set_viewport_size()) ``` ### The Problem 1. **Tight Coupling**: `PlaywrightOperator` is instantiated directly in `DevServerManager.__init__()`: ```python def _init_playwright_if_enabled(self): if self._playwright_config_enabled: from devserver_mcp.playwright import PlaywrightOperator self._playwright_operator = PlaywrightOperator(headless=True) ``` 2. **No Dependency Injection**: The operator is created internally, making it impossible to inject a test double without mocking our own class. 3. **Complex System Boundary**: The actual Playwright API requires multiple mock objects: - `async_playwright()` context manager - `playwright.chromium.launch()` → Browser - `browser.new_context()` → Context - `context.new_page()` → Page - `page.set_viewport_size()` → The actual method we care about This violates the "no more than 2 mocks" rule. ## What Was Considered ### 1. Mock PlaywrightOperator (Current Approach) ```python with patch("devserver_mcp.playwright.PlaywrightOperator") as mock_playwright: mock_instance = MagicMock() mock_instance.resize = AsyncMock(return_value={...}) ``` **Problem**: Violates "never mock your own classes" rule. Tests become circular - we're testing that our mocks work, not actual behavior. ### 2. Mock at True System Boundary ```python mock_page = MagicMock() mock_page.set_viewport_size = AsyncMock() # ... setup 4+ more mocks for browser, context, etc. with patch("playwright.async_api.async_playwright", ...): ``` **Problem**: Requires 5+ mocks, violating the complexity rule. Tests become brittle and coupled to Playwright's API structure. ### 3. Integration Tests with Real Playwright ```python # No mocks, use actual Playwright manager = DevServerManager(config) result = await manager.playwright_resize(1920, 1080) # Verify by taking screenshot and checking dimensions? ``` **Problem**: Requires browser installation, slow tests, CI complexity, and how do you verify viewport size without depending on implementation? ## Refactoring Suggestions ### 1. Dependency Injection Pattern ```python class DevServerManager: def __init__(self, config: Config, playwright_factory=None): # Allow injection of factory self._playwright_factory = playwright_factory or PlaywrightOperator self._init_playwright_if_enabled() def _init_playwright_if_enabled(self): if self._playwright_config_enabled: self._playwright_operator = self._playwright_factory(headless=True) ``` This allows tests to inject a test double at the boundary between our code and the Playwright wrapper. ### 2. Interface Segregation ```python from abc import ABC, abstractmethod class BrowserAutomation(ABC): @abstractmethod async def resize(self, width: int, height: int) -> dict[str, Any]: pass @abstractmethod async def navigate(self, url: str) -> dict[str, Any]: pass class PlaywrightOperator(BrowserAutomation): # Current implementation ``` This creates a clear contract and makes the system boundary explicit. ### 3. Adapter Pattern at System Boundary ```python # Thin adapter that only wraps Playwright API calls class PlaywrightAdapter: async def set_viewport_size(self, page, dimensions): await page.set_viewport_size(dimensions) async def navigate_to(self, page, url): await page.goto(url) # Our business logic uses the adapter class PlaywrightOperator: def __init__(self, adapter=None): self._adapter = adapter or PlaywrightAdapter() ``` Tests can inject a mock adapter with only 1-2 methods to verify. ## Best Testing Architecture (No Rules Constraints) If I could design the ideal testing structure without external constraints: ### 1. Three-Layer Testing Strategy **Unit Tests**: Mock at the adapter boundary ```python async def test_resize_calls_adapter_correctly(): mock_adapter = Mock() operator = PlaywrightOperator(adapter=mock_adapter) await operator.resize(1920, 1080) mock_adapter.set_viewport_size.assert_called_once_with( ANY, # page object {"width": 1920, "height": 1080} ) ``` **Contract Tests**: Verify adapter behavior with real Playwright ```python @pytest.mark.integration async def test_adapter_viewport_resize_contract(): # Use real Playwright but in isolated test async with async_playwright() as p: browser = await p.chromium.launch() page = await browser.new_page() adapter = PlaywrightAdapter() await adapter.set_viewport_size(page, {"width": 800, "height": 600}) # Verify through Playwright's API viewport = page.viewport_size assert viewport["width"] == 800 assert viewport["height"] == 600 ``` **E2E Tests**: Full stack with real browser (sparingly) ```python @pytest.mark.e2e async def test_mcp_browser_resize_full_stack(): # Real browser, real MCP server async with TestMCPServer() as server: result = await server.call_tool("browser_resize", {"width": 1920, "height": 1080}) assert result["status"] == "success" # Take screenshot and verify dimensions screenshot = await server.call_tool("browser_screenshot") assert analyze_image_dimensions(screenshot) == (1920, 1080) ``` ### 2. Test Data Builders ```python class PlaywrightTestBuilder: def with_page(self, url="https://example.com"): self.page = Mock(url=url) return self def with_viewport_size(self, width, height): self.page.viewport_size = {"width": width, "height": height} return self def build(self): return self.page ``` ### 3. Behavior Specifications ```python class TestBrowserResize: """Browser resize should...""" async def test_change_viewport_to_requested_dimensions(self): # Given a browser at default size page = PlaywrightTestBuilder().with_viewport_size(1024, 768).build() # When resizing to new dimensions result = await resize_viewport(page, 1920, 1080) # Then viewport matches requested size assert result.confirms_viewport_size(1920, 1080) ``` ## Conclusion The current architecture makes meaningful testing difficult because: 1. Direct instantiation prevents dependency injection 2. No clear abstraction boundary between our code and Playwright 3. The actual system boundary (Playwright API) is too complex to mock reasonably The best solution is to introduce a thin adapter layer at the true system boundary, allowing us to: - Mock a simple interface in unit tests - Test the adapter contract separately - Keep our business logic (PlaywrightOperator) testable - Follow all testing guidelines while maintaining meaningful tests This refactoring would make the codebase more testable without compromising functionality or adding significant complexity.

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Uninen/devserver-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

playwright_testing_challenges.md•7.68 KiB