Playwright MCP Server

plan.md•9.32 kB

# MCP Playwright Server Implementation Plan ## Overview Create an MCP server that connects to a remote browserless/playwright instance, providing web automation tools with configurable endpoints and reasonable defaults. ## Project Structure ``` playwright-mcp/ ├── package.json ├── tsconfig.json ├── src/ │ ├── index.ts # Main server entry point │ ├── server.ts # MCP server implementation │ ├── playwright.ts # Playwright connection and utilities │ └── types.ts # TypeScript type definitions └── README.md ``` ## Dependencies - `@modelcontextprotocol/sdk` - Official MCP TypeScript SDK - `playwright` - Browser automation library - `zod` - Input validation and schema definition - `typescript` & `@types/node` - TypeScript support ## Configuration System - Accept playwright URL via command line argument - Default to `ws://localhost:3000/playwright/chromium` - Support environment variable override: `PLAYWRIGHT_URL` - Validate URL format and connectivity on startup ## MCP Tools to Implement ### Core Tools (Implemented - Microsoft Standard Names) 1. `browser_navigate` (renamed from `navigate`) - **Purpose**: Navigate to a URL - **Inputs**: `url` (string), optional `waitUntil` (networkidle, domcontentloaded, load) - **Returns**: Success status and final URL 2. `browser_take_screenshot` (renamed from `screenshot`) - **Purpose**: Take a screenshot of current page - **Inputs**: optional `fullPage` (boolean), `selector` (string) - **Returns**: Base64 encoded PNG data - **Defaults**: fullPage=false, viewport screenshot 3. `browser_get_html` (renamed from `get_html`) - **Purpose**: Extract HTML content - **Inputs**: optional `selector` (string for specific element) - **Returns**: HTML string - **Defaults**: Full page HTML 4. `browser_click` (renamed from `click`) - **Purpose**: Click an element - **Inputs**: `selector` (string) - **Returns**: Success status 5. `browser_type` (renamed from `type_text`) - **Purpose**: Type text into an input field - **Inputs**: `selector` (string), `text` (string) - **Returns**: Success status 6. `browser_wait_for` (renamed from `wait_for_element`) - **Purpose**: Wait for element to appear - **Inputs**: `selector` (string), optional `timeout` (number, default 30000ms) - **Returns**: Success status 7. `browser_evaluate` (renamed from `evaluate`) - **Purpose**: Execute JavaScript in browser context - **Inputs**: `script` (string) - **Returns**: Script execution result ### High Priority Missing Tools (Based on Microsoft Analysis) 8. `browser_snapshot` ✅ **IMPLEMENTED** - **Purpose**: Get accessibility tree snapshot for LLM-friendly element identification - **Inputs**: optional `selector` (string for specific element) - **Returns**: Structured accessibility data with element references - **Why needed**: Replaces fragile CSS selectors with semantic descriptions - **Status**: Fully implemented and tested. Provides detailed accessibility tree information including roles, names, states, and CSS selectors for better element identification. 9. `browser_file_upload` ✅ **IMPLEMENTED** - **Purpose**: Upload files to file input elements - **Inputs**: `selector` (string), `paths` (array of file paths) - **Returns**: Success status and file count - **Why needed**: Critical for testing file upload functionality - completely missing - **Status**: Fully implemented with comprehensive file validation, absolute/relative path support, and proper error handling. 10. `browser_handle_dialog` - **Purpose**: Handle JavaScript dialogs (alerts, confirms, prompts) - **Inputs**: `accept` (boolean), optional `promptText` (string) - **Returns**: Dialog text and action result - **Why needed**: Prevents test hangs from unexpected dialogs 11. `browser_navigate_back` - **Purpose**: Navigate back in browser history - **Inputs**: None - **Returns**: Success status and new URL - **Why needed**: Essential for testing multi-step workflows 12. `browser_navigate_forward` - **Purpose**: Navigate forward in browser history - **Inputs**: None - **Returns**: Success status and new URL - **Why needed**: Complete browser navigation testing 13. `browser_refresh` ✅ **IMPLEMENTED** - **Purpose**: Refresh the current page - **Inputs**: optional `waitUntil` (string), optional `timeout` (number) - **Returns**: Success status and current URL - **Why needed**: Test page refresh behavior, form resubmission scenarios - **Status**: Fully implemented with configurable wait conditions and timeout support. Handles redirects and provides URL feedback. ### Medium Priority Missing Tools 14. `browser_hover` - **Purpose**: Hover mouse over element - **Inputs**: `selector` (string) - **Returns**: Success status - **Why needed**: Test hover menus, tooltips, hover states 15. `browser_drag` - **Purpose**: Drag and drop between elements - **Inputs**: `startSelector` (string), `endSelector` (string) - **Returns**: Success status - **Why needed**: Test drag-drop file uploads, sortable lists, Kanban boards 16. `browser_select_option` - **Purpose**: Select option from dropdown/select elements - **Inputs**: `selector` (string), `value` (string) or `label` (string) - **Returns**: Success status - **Why needed**: Complete form interaction beyond basic text input 17. `browser_check` - **Purpose**: Check checkbox or radio button - **Inputs**: `selector` (string) - **Returns**: Success status - **Why needed**: Form completion for checkboxes and radio buttons 18. `browser_uncheck` - **Purpose**: Uncheck checkbox - **Inputs**: `selector` (string) - **Returns**: Success status - **Why needed**: Toggle checkbox states in forms 19. `browser_fill` - **Purpose**: Fill multiple form fields at once - **Inputs**: `fields` (object with selector-value pairs) - **Returns**: Success status - **Why needed**: Efficient form completion vs individual type_text calls ### Low Priority Missing Tools 20. `browser_console_messages` - **Purpose**: Get browser console messages - **Inputs**: None - **Returns**: Array of console messages - **Why needed**: Debug JavaScript errors and application logs 21. `browser_pdf_save` - **Purpose**: Save current page as PDF - **Inputs**: optional `filename` (string) - **Returns**: PDF data or file path - **Why needed**: Generate test reports, test PDF export features 22. `browser_download` - **Purpose**: Handle file downloads - **Inputs**: optional `downloadPath` (string) - **Returns**: Download status and file path - **Why needed**: Test file download functionality 23. `browser_clear` - **Purpose**: Clear input field content - **Inputs**: `selector` (string) - **Returns**: Success status - **Why needed**: Reset form fields before new input 24. `browser_focus` - **Purpose**: Focus on specific element - **Inputs**: `selector` (string) - **Returns**: Success status - **Why needed**: Test focus management, keyboard navigation 25. `browser_close` - **Purpose**: Close the browser/page - **Inputs**: None - **Returns**: Success status - **Why needed**: Clean shutdown of browser sessions ## Implementation Details ### Error Handling - Graceful connection failures to playwright endpoint - Timeout handling for all operations - Clear error messages for invalid selectors/operations - Automatic retry logic for connection issues ### Browser Management - Single browser instance with page pooling - Automatic cleanup of resources - Configurable viewport size (default: 1920x1080) - User agent configuration ### Security Considerations - Validate and sanitize all user inputs - Restrict JavaScript execution in `evaluate` tool - Timeout limits on all operations - URL allowlist option for production use ### File Handling - Support for absolute file paths in upload/download operations - Temporary file management for downloads - File type validation for uploads - Size limits for file operations ## Development Steps 1. ✅ Set up TypeScript project with dependencies 2. ✅ Implement basic MCP server structure 3. ✅ Add playwright connection management 4. ✅ Implement core tools (navigate, screenshot, get_html) 5. ✅ Add interactive tools (click, type_text, wait_for_element) 6. ✅ Implement evaluate tool with safety measures 7. ✅ Add comprehensive error handling 8. ✅ Create configuration system 9. ✅ Write documentation and examples 10. ✅ Add tests and validation ### Next Phase: Missing Tools Implementation 11. ✅ Implement browser_snapshot for accessibility tree 12. Add browser_file_upload for file testing 13. Implement browser_handle_dialog for dialog handling 14. Add browser navigation tools (back/forward/refresh) 15. Implement advanced element interactions (hover, drag, select) 16. Add debugging tools (console messages, PDF generation) 17. Implement form utilities (fill, clear, check/uncheck) 18. Add file download handling 19. Create comprehensive test suite for new tools 20. Update documentation with new capabilities

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/b3nw/playwright-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server