# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Overview
This is a Model Context Protocol (MCP) server that provides browser automation capabilities using Playwright. It enables LLMs to interact with web pages, take screenshots, execute JavaScript, and perform HTTP requests through a set of MCP tools.
Server name: `playwright-mcp` (important for tool name length constraints - some clients like Cursor have a 60-character limit for `server_name:tool_name`)
## Command Line Options
### Session Persistence
**By default**, browser session data (cookies, localStorage, sessionStorage) is automatically saved and persists across browser restarts.
**Available flags:**
- `--no-save-session` - Disable session persistence (start fresh each time)
- `--user-data-dir <path>` - Customize where session data is stored (default: `./.mcp-web-inspector`)
**Examples:**
```bash
# Default behavior - sessions saved in ./.mcp-web-inspector
mcp-web-inspector
# Disable session persistence
mcp-web-inspector --no-save-session
# Custom session directory
mcp-web-inspector --user-data-dir ./my-custom-sessions
# Custom session directory (absolute path)
mcp-web-inspector --user-data-dir /tmp/browser-sessions
```
### Default Behavior
Data is organized in `./.mcp-web-inspector/`:
```
.mcp-web-inspector/
├── user-data/ # Browser sessions (cookies, localStorage, sessionStorage)
└── screenshots/ # Screenshot files
```
- Browser sessions persist across restarts
- Screenshots saved to dedicated directory
- No configuration needed - just works out of the box
### Clearing Saved Data
```bash
# Clear all data (default directory)
rm -rf ./.mcp-web-inspector
# Clear only sessions
rm -rf ./.mcp-web-inspector/user-data
# Clear only screenshots
rm -rf ./.mcp-web-inspector/screenshots
# Custom directory
rm -rf ./my-custom-sessions
```
### Security Best Practices
**⚠️ IMPORTANT**: Always add `.mcp-web-inspector/` to `.gitignore` to prevent committing:
- Browser session data (cookies, localStorage, sessionStorage)
- Saved screenshots (may contain sensitive information)
- Authentication tokens and credentials
**Why this matters:**
- Session data contains cookies and authentication tokens
- Screenshots may capture sensitive user data
- Committing this data could leak credentials to the repository
- Session files can bloat git history
**Recommended `.gitignore` entry:**
```gitignore
# MCP Web Inspector data
.mcp-web-inspector/
```
## Development Commands
### Building
```bash
npm run build # Compile TypeScript to dist/
npm run watch # Watch mode for development
npm run prepare # Build and make dist files executable (runs on npm install)
```
### Testing
```bash
npm test # Run tests without coverage
npm run test:coverage # Run tests with coverage report (outputs to coverage/)
npm run test:custom # Run tests using custom script (node run-tests.cjs)
node run-tests.cjs # Alternative way to run tests with coverage
```
### Docs
```bash
npm run generate:readme # Regenerate README tool docs from code metadata
```
- README tool descriptions are generated by `generate:readme`. Do not hand‑edit those sections in `README.md` — update tool metadata (`getMetadata()` in the tool classes) and re‑run the generator instead.
- A `prepublishOnly` script checks that `README.md` is up to date and will fail if regeneration is needed.
## Available Tools
**See `src/tools/common/registry.ts` for complete tool definitions and up-to-date descriptions.**
### Tool Categories
**Inspection** (10 tools):
- `inspect_dom`, `inspect_ancestors`, `compare_element_alignment`, `get_computed_styles`, `check_visibility`, `query_selector`, `get_test_ids`, `measure_element`, `find_by_text`, `element_exists`
**Navigation** (4 tools):
- `go_history`, `navigate`, `scroll_by`, `scroll_to_element`
**Interaction** (7 tools):
- `click`, `drag`, `fill`, `hover`, `press_key`, `select`, `upload_file`
**Content** (3 tools):
- `get_html`, `get_text`, `visual_screenshot_for_humans`
**Console** (2 tools):
- `clear_console_logs`, `get_console_logs`
**Evaluation** (1 tool):
- `evaluate`
**Network** (2 tools):
- `get_request_details`, `list_network_requests`
**Waiting** (2 tools):
- `wait_for_element`, `wait_for_network_idle`
**Lifecycle** (2 tools):
- `close`, `set_color_scheme`
**Other** (1 tool):
- `confirm_output`
## Architecture
### Server Structure (MCP Protocol)
The server follows the Model Context Protocol specification:
1. **Entry Point** (`src/index.ts`): Sets up MCP server with stdio transport, creates tool definitions, and configures request handlers with graceful shutdown logic.
2. **Request Handler** (`src/requestHandler.ts`): Implements MCP protocol handlers:
   - `ListResourcesRequestSchema`: Exposes console logs and screenshots as resources
   - `ReadResourceRequestSchema`: Retrieves resource contents
   - `ListToolsRequestSchema`: Returns available tools
   - `CallToolRequestSchema`: Delegates to `handleToolCall` for execution
3. **Tool Handler** (`src/toolHandler.ts`): Central dispatcher for tool execution:
   - Manages global browser state (browser, page, currentBrowserType)
   - Implements `ensureBrowser()` to lazily launch browsers with reconnection handling
   - Routes tool calls to appropriate tool instances
   - Contains browser cleanup and error recovery logic
### Tool Organization
All tools are registered through `src/tools/common/registry.ts` with implementations in `src/tools/browser/`:
- **Browser tool registry**: Use `getBrowserToolNames()` to list all tools requiring a Playwright browser instance (navigate, click, visual_screenshot_for_humans, inspect_dom, etc.)
All browser tools extend `BrowserToolBase` (`src/tools/browser/base.ts`) which provides:
- `safeExecute()`: Wrapper with browser connection validation and error handling
- `ensurePage()` and `validatePageAvailable()`: Page availability checks
- `normalizeSelector()`: Converts test ID shortcuts to full selectors (e.g., `testid:foo` → `[data-testid="foo"]`)
- `selectPreferredLocator()`: **STANDARD** - Selects elements with visibility preference (see Element Selection below)
- `formatElementSelectionInfo()`: Formats warnings for duplicate selectors with testid tip
- Automatic browser state reset on disconnection errors
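The `testid:` shorthand handled by `normalizeSelector()` can be pictured with a minimal sketch. It covers only the one shortcut documented above; the real method in `src/tools/browser/base.ts` may recognize more prefixes and escape values differently, so treat the function body as an assumption.

```typescript
// Hypothetical stand-in for BrowserToolBase.normalizeSelector(); only the
// `testid:` shorthand documented above is handled here.
const TEST_ID_PREFIX = "testid:";

function normalizeSelector(selector: string): string {
  if (!selector.startsWith(TEST_ID_PREFIX)) {
    return selector; // Already a regular selector; pass it through unchanged.
  }
  const id = selector.slice(TEST_ID_PREFIX.length);
  // Escape backslashes and double quotes so the attribute value stays valid.
  const escaped = id.replace(/\\/g, "\\\\").replace(/"/g, '\\"');
  return `[data-testid="${escaped}"]`;
}

console.log(normalizeSelector("testid:foo")); // [data-testid="foo"]
console.log(normalizeSelector("#submit"));    // #submit
```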
**Tool parameters are documented in `src/tools/common/registry.ts`.**
### Tool Implementation Pattern
Tools follow a consistent pattern implementing the `ToolHandler` interface (`src/tools/common/types.ts`):
```typescript
export interface ToolHandler {
execute(args: any, context: ToolContext): Promise<ToolResponse>;
}
```
The `ToolContext` provides browser, page, apiContext, and server references. Tools return standardized `ToolResponse` objects with content arrays and error flags.
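As an illustration of this pattern, here is a minimal sketch of a tool that reports the page title. `PageTitleTool` is hypothetical, and the `ToolContext` and `ToolResponse` shapes are simplified stand-ins with assumed field names; the real definitions are in `src/tools/common/types.ts`.

```typescript
// Simplified stand-ins for the types in src/tools/common/types.ts.
// The field names below are assumptions made for this sketch.
interface ToolResponse {
  content: { type: "text"; text: string }[];
  isError: boolean;
}

interface ToolContext {
  // Only the single page method this sketch needs.
  page?: { title(): Promise<string> };
}

interface ToolHandler {
  execute(args: any, context: ToolContext): Promise<ToolResponse>;
}

// Hypothetical tool: reports the title of the current page.
class PageTitleTool implements ToolHandler {
  async execute(_args: any, context: ToolContext): Promise<ToolResponse> {
    if (!context.page) {
      return { content: [{ type: "text", text: "No page available" }], isError: true };
    }
    const title = await context.page.title();
    return { content: [{ type: "text", text: `Title: ${title}` }], isError: false };
  }
}
```

Returning an error response instead of throwing keeps the failure visible to the client as ordinary tool output.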
### Browser Lifecycle
Browser instances are managed globally in `toolHandler.ts`:
- Browsers launch on first tool use (lazy initialization)
- Support for chromium, firefox, and webkit engines
- Automatic reconnection handling for disconnected browsers
- Console message registration via `registerConsoleMessage()` for all page events
- Clean shutdown on SIGINT/SIGTERM signals
The `ensureBrowser()` function handles:
- Browser disconnection detection and cleanup
- Closed page recovery (creates new page)
- Browser type switching (e.g., chromium → firefox)
- Viewport, user agent, and headless mode configuration
- Retry logic for initialization failures
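The lazy-launch, reconnection, and type-switching behavior listed above can be sketched as follows. This is not the code in `toolHandler.ts`: `BrowserLike`, `PageLike`, and the injected `launch` function are hypothetical, chosen so the sketch runs without Playwright installed. The method names `isConnected()`, `newPage()`, `close()`, and `isClosed()` match Playwright's `Browser` and `Page` classes. Viewport configuration and retry logic are left out.

```typescript
// Minimal shapes this sketch relies on (hypothetical interfaces).
interface PageLike { isClosed(): boolean; }
interface BrowserLike {
  isConnected(): boolean;
  newPage(): Promise<PageLike>;
  close(): Promise<void>;
}
type BrowserType = "chromium" | "firefox" | "webkit";
type Launcher = (type: BrowserType) => Promise<BrowserLike>;

// Global singletons, mirroring the state described above.
let browser: BrowserLike | undefined;
let page: PageLike | undefined;
let currentBrowserType: BrowserType | undefined;

async function ensureBrowser(type: BrowserType, launch: Launcher): Promise<PageLike> {
  // Drop stale state if the browser has disconnected.
  if (browser && !browser.isConnected()) {
    browser = undefined;
    page = undefined;
  }
  // Switching engines forces a restart.
  if (browser && currentBrowserType !== type) {
    await browser.close().catch(() => {});
    browser = undefined;
    page = undefined;
  }
  // Lazy launch on first use.
  if (!browser) {
    browser = await launch(type);
    currentBrowserType = type;
  }
  // Recover from a closed page by opening a new one.
  if (!page || page.isClosed()) {
    page = await browser.newPage();
  }
  return page;
}
```

With Playwright, `launch` would presumably wrap a call such as `chromium.launch()`; injecting it here just keeps the sketch testable with a fake browser.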
### File Structure
```
src/
├── index.ts                  # MCP server entry point
├── requestHandler.ts         # MCP protocol handlers
├── toolHandler.ts            # Tool execution dispatcher, browser state
├── tools/
│   ├── common/registry.ts    # Tool registry, definitions, and helpers
│   ├── common/types.ts       # Shared interfaces (ToolContext, ToolResponse)
│   └── browser/              # Browser automation tools
│       ├── base.ts               # BrowserToolBase class with normalizeSelector()
│       ├── interaction.ts        # Click, fill, select, hover, etc.
│       ├── navigation.ts         # Navigate, go back/forward
│       ├── screenshot.ts         # Screenshot tool
│       ├── console.ts            # Console logs retrieval
│       ├── visiblePage.ts        # Get visible text/HTML
│       ├── output.ts             # PDF generation
│       ├── elementInspection.ts  # Element visibility and position tools
│       └── ...
└── __tests__/                # Jest test suites
    └── tools/
        └── browser/
            ├── elementInspection.test.ts  # Tests for visibility & position tools
            └── ...
```
## Testing Notes
- Tests use Jest with ts-jest preset
- Test environment: Node.js
- Coverage excludes `src/index.ts`
- Tests are in `src/__tests__/**/*.test.ts`
- Module name mapper handles `.js` imports in ESM context
- Uses `tsconfig.test.json` for test-specific TypeScript config
## Important Constraints
1. **Tool Naming**: Keep tool names short - some clients (Cursor) limit `server_name:tool_name` to 60 characters. Server name is `playwright-mcp`.
2. **Browser State**: Browser and page instances are global singletons. Browser type changes (chromium/firefox/webkit) force browser restart.
3. **Error Handling**: Tools must handle browser disconnection gracefully. The `BrowserToolBase.safeExecute()` method automatically resets state on connection errors.
4. **ESM Modules**: This project uses ES modules (type: "module" in package.json). Import statements must include `.js` extensions even for `.ts` files.
5. **Console Logging**: Page console messages, exceptions, and unhandled promise rejections are captured via `registerConsoleMessage()` and stored in `ConsoleLogsTool`.
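The naming constraint in item 1 is easy to check mechanically. The helper below is a hypothetical sketch, not part of the repository; it assumes the 60-character limit is inclusive, so a qualified name of exactly 60 characters is accepted.

```typescript
const SERVER_NAME = "playwright-mcp";
// Limit cited above for clients such as Cursor; treated as inclusive here.
const MAX_QUALIFIED_NAME_LENGTH = 60;

// Returns the tool names whose `server_name:tool_name` form exceeds the limit.
function findOverlongToolNames(toolNames: string[], serverName: string = SERVER_NAME): string[] {
  return toolNames.filter(
    (name) => `${serverName}:${name}`.length > MAX_QUALIFIED_NAME_LENGTH,
  );
}

// "playwright-mcp:" takes 15 characters, leaving 45 for the tool name.
console.log(findOverlongToolNames(["visual_screenshot_for_humans"])); // []
```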
### Parameter Discipline (Important)
- Prefer sensible defaults and internal heuristics over adding new tool parameters.
- Keep inputs minimal (aim for 1–3). Avoid schema churn; parameter changes impact clients and tests.
- Only add parameters when strictly necessary; justify in the PR and keep them primitive types.
- If added, update `getMetadata()` descriptions precisely and regenerate README via `generate:readme`.
- Consider session flags or server config for global behavior instead of per-call params.
## Console Logs Tool — Planned Changes (Token Efficiency)
Backward compatibility is not required now; optimize for simplicity and token-efficiency.
- Separate clearing tool
  - Add `clear_console_logs` (atomic, no params). Returns the number of entries cleared.
  - Remove the `clear` parameter from `get_console_logs`.
- Safer defaults
  - Default `limit = 20`.
  - Default `since = 'last-interaction'`.
- Output format
  - `format: 'grouped'` (default): deduplicate identical logs by `(type + text + source when available)`. One line per group with count, e.g., `[log] Message (× 47)`.
  - `format: 'raw'`: chronological, no grouping; for order/race-condition debugging.
- Grouping semantics
  - Group globally across the filtered result set (not consecutive-only). Sort groups by first occurrence time for stable ordering.
- Large-output guard (aligned with `get_html`)
  - If output exceeds a preview threshold (e.g., ~2,000 chars), especially in `format: 'raw'`, return preview + counts and a one-time `confirmToken`. Call again with the token to fetch the full payload.
  - Always include: `totalMatched`, `shownCount`/`groupsShown`, `truncated` flag, and refinement tips (`search`, `type`, `since`, `limit`).
- Explicit metadata
  - Update `getMetadata()` to enumerate all fields and conditional outputs (counts, truncation flags, confirmToken presence, and exact line formats for grouped).
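One possible reading of the `grouped` format is sketched below. The `LogEntry` shape and the `formatGrouped` name are assumptions for illustration, and the sketch expects its input to be in chronological order already. Groups are keyed on type, text, and source, counted across the whole list, and emitted in order of first occurrence.

```typescript
// Hypothetical log entry shape; the real ConsoleLogsTool may store other fields.
interface LogEntry {
  type: string;    // e.g. "log", "error"
  text: string;
  source?: string; // e.g. "app.js:12", when available
}

// Groups identical entries across the whole list (not only consecutive ones)
// and keeps groups in order of first occurrence.
function formatGrouped(entries: LogEntry[]): string[] {
  const groups = new Map<string, { entry: LogEntry; count: number }>();
  for (const entry of entries) {
    const key = JSON.stringify([entry.type, entry.text, entry.source ?? ""]);
    const group = groups.get(key);
    if (group) {
      group.count += 1;
    } else {
      groups.set(key, { entry, count: 1 }); // Map preserves insertion order.
    }
  }
  return Array.from(groups.values(), ({ entry, count }) => {
    const line = `[${entry.type}] ${entry.text}`;
    return count > 1 ? `${line} (× ${count})` : line;
  });
}

const lines = formatGrouped([
  { type: "log", text: "Ready" },
  { type: "error", text: "Boom" },
  { type: "log", text: "Ready" },
]);
console.log(lines); // [ '[log] Ready (× 2)', '[error] Boom' ]
```

Relying on `Map` insertion order gives the stable first-occurrence ordering without a separate sort step.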
Implementation checklist (future PR):
- Update `src/tools/browser/console/get_console_logs.ts` to support `format`, guard, new defaults, and drop `clear`.
- Add `src/tools/browser/console/clear_console_logs.ts` (or co-locate class) and register it.
- Extend tests in `src/tools/browser/console/__tests__/` for grouped/raw, defaults, and guard.
- Regenerate README via `npm run generate:readme`.
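The large-output guard from the planned changes could take roughly the following shape. Everything here is a hypothetical sketch: the function names, the in-memory token store, and the counter-based token are illustrative, and the threshold is the approximate figure quoted above. In the real tool, both paths would presumably sit behind the same `get_console_logs` call.

```typescript
const PREVIEW_THRESHOLD = 2000; // Characters; the "~2,000" figure quoted above.

interface GuardedOutput {
  text: string;
  truncated: boolean;
  confirmToken?: string; // Present only when the output was truncated.
}

// Full payloads awaiting confirmation, keyed by one-time token.
const pendingPayloads = new Map<string, string>();
let tokenCounter = 0;

// First call: return the full text if small, otherwise a preview plus token.
function guardOutput(full: string): GuardedOutput {
  if (full.length <= PREVIEW_THRESHOLD) {
    return { text: full, truncated: false };
  }
  // A real implementation would likely use a random, unguessable token.
  const confirmToken = `confirm-${++tokenCounter}`;
  pendingPayloads.set(confirmToken, full);
  return { text: full.slice(0, PREVIEW_THRESHOLD), truncated: true, confirmToken };
}

// Second call: exchange the token for the full payload, exactly once.
function redeemToken(confirmToken: string): string | undefined {
  const payload = pendingPayloads.get(confirmToken);
  pendingPayloads.delete(confirmToken);
  return payload;
}
```

Deleting the entry on redemption is what makes the token one-time; a second call with the same token returns `undefined`.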
## Environment Variables
- `CHROME_EXECUTABLE_PATH`: Custom path to Chrome/Chromium executable (optional)
## Contributing Notes
When adding new tools:
1. Create a tool class in `src/tools/browser/` extending `BrowserToolBase`
2. Append the class to the `BROWSER_TOOL_CLASSES` array in `src/tools/browser/register.ts`
3. Update `handleToolCall()` in `src/toolHandler.ts` if the tool requires new context handling
4. Add tests in `src/__tests__/tools/browser/`
5. **Follow principles in `TOOL_DESIGN_PRINCIPLES.md`**
### Tool Design Principles (Summary)
**See `TOOL_DESIGN_PRINCIPLES.md` for comprehensive guidelines.**
**Backward compatibility is not important at this stage.**
Key principles:
- **Atomic operations** - One tool does ONE thing
- **Minimal parameters** - Aim for 1-3, max 5
- **Primitive types** - Use string/number/boolean over nested objects
- **Token-efficient output** - Use compact text format, not verbose JSON
- **Explicit documentation** - List ALL outputs in description, even conditional ones
**Critical Rule:** Tool descriptions MUST explicitly list all possible outputs. Don't use vague terms like "layout properties" - instead specify exactly what fields are returned (e.g., "width constraints (w, max-w, min-w), margins (↑↓←→), flexbox context when set"). See TOOL_DESIGN_PRINCIPLES.md §13 for examples.