
Playwright MCP Server

by Leviathangk
design.md (13.8 kB)
# Design Document

## Overview

Playwright MCP Server is a browser automation server built on the Model Context Protocol. It uses Playwright to provide browser control. The server follows a session-management architecture and supports multiple concurrent, isolated sessions, each identified by a unique sessionId. All sessions share a single Browser instance and are isolated from one another through BrowserContext, which keeps resource usage low while preserving session independence.

The server communicates with MCP clients over stdio and exposes a set of tools that an AI agent can call to perform automation tasks such as page navigation and element interaction.

## Architecture

### System Architecture Diagram

```
┌─────────────────────────────────────────────────────────────┐
│                       MCP Client (AI)                       │
└───────────────────────────┬─────────────────────────────────┘
                            │ stdio (JSON-RPC)
┌───────────────────────────▼─────────────────────────────────┐
│                      MCP Server Layer                       │
│  ┌────────────────────────────────────────────────────────┐ │
│  │ Tool Handlers (create_session, navigate, click, etc.)  │ │
│  └────────────────────┬───────────────────────────────────┘ │
└───────────────────────┼─────────────────────────────────────┘
                        │
┌───────────────────────▼─────────────────────────────────────┐
│                       Session Manager                       │
│  ┌──────────────────────────────────────────────────────┐   │
│  │ Sessions Map: sessionId -> SessionContext            │   │
│  │  - context: BrowserContext                           │   │
│  │  - page: Page                                        │   │
│  │  - createdAt: timestamp                              │   │
│  │  - expiresAt: timestamp                              │   │
│  │  - timeoutHandle: NodeJS.Timeout                     │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                             │
│  Configuration:                                             │
│  - maxSessions: number                                      │
│  - sessionTimeout: number (ms)                              │
└───────────────────────┬─────────────────────────────────────┘
                        │
┌───────────────────────▼─────────────────────────────────────┐
│                      Playwright Layer                       │
│  ┌──────────────────────────────────────────────────────┐   │
│  │ Shared Browser Instance (chromium/firefox/webkit)    │   │
│  │  ├─ BrowserContext 1 -> Page 1                       │   │
│  │  ├─ BrowserContext 2 -> Page 2                       │   │
│  │  └─ BrowserContext N -> Page N                       │   │
│  └──────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
```

### Core Components

1. **MCP Server Layer**: handles MCP protocol communication; registers and dispatches tool calls
2. **Session Manager**: manages the session lifecycle, validates sessionIds, and performs automatic cleanup
3. **Playwright Layer**: wraps the Playwright API and provides the browser operations

## Components and Interfaces

### 1. Configuration Interface

```typescript
interface ServerConfig {
  browser: 'chromium' | 'firefox' | 'webkit';
  headless: boolean;
  sessionTimeout: number; // milliseconds
  maxSessions: number;
}
```

**Defaults:**

- `browser`: 'chromium'
- `headless`: false
- `sessionTimeout`: 300000 (5 minutes)
- `maxSessions`: 10

### 2. Session Context

```typescript
interface SessionContext {
  sessionId: string;
  context: BrowserContext;
  page: Page;
  createdAt: number;
  expiresAt: number;
  timeoutHandle: NodeJS.Timeout;
}
```

### 3. Error Response

```typescript
interface ErrorResponse {
  errorCode: string;
  message: string;
  sessionId?: string;
  details?: any;
}
```

**Error codes:**

- `SESSION_NOT_FOUND`: the session does not exist
- `SESSION_EXPIRED`: the session has expired
- `MAX_SESSIONS_REACHED`: the maximum number of sessions has been reached
- `NAVIGATION_FAILED`: navigation failed
- `ELEMENT_NOT_FOUND`: the element was not found
- `ELEMENT_NOT_CLICKABLE`: the element is not clickable
- `ELEMENT_NOT_EDITABLE`: the element is not editable
- `BROWSER_ERROR`: a browser-level error occurred
- `INVALID_PARAMETERS`: the parameters are invalid

### 4. Tool Interfaces

#### create_session

**Input:**

```typescript
{
  // No additional parameters in the current version; the server's configured default timeout is used
}
```

**Output:**

```typescript
{
  sessionId: string;
  expiresAt: number; // Unix timestamp
  message: string;
}
```

#### close_session

**Input:**

```typescript
{
  sessionId: string;
}
```

**Output:**

```typescript
{
  success: boolean;
  message: string;
}
```

#### navigate

**Input:**

```typescript
{
  sessionId: string;
  url: string;
  waitUntil?: 'load' | 'domcontentloaded' | 'networkidle';
  timeout?: number;
}
```

**Output:**

```typescript
{
  success: boolean;
  title: string;
  url: string;    // final URL (may differ after redirects)
  status: number; // HTTP status code
}
```

#### click

**Input:**

```typescript
{
  sessionId: string;
  selector: string;    // CSS or XPath
  timeout?: number;
  force?: boolean;
  clickCount?: number; // number of clicks, default 1
}
```

**Output:**

```typescript
{
  success: boolean;
  message: string;
}
```

#### type

**Input:**

```typescript
{
  sessionId: string;
  selector: string;
  text: string;
  delay?: number;  // delay between characters (milliseconds)
  timeout?: number;
  clear?: boolean; // whether to clear the input field first
}
```

**Output:**

```typescript
{
  success: boolean;
  message: string;
}
```

## Data Models

### SessionManager Class

```typescript
class SessionManager {
  private sessions: Map<string, SessionContext>;
  private browser: Browser | null;
  private config: ServerConfig;

  constructor(config: ServerConfig);

  async initialize(): Promise<void>;
  async createSession(): Promise<{ sessionId: string; expiresAt: number }>;
  async closeSession(sessionId: string): Promise<void>;
  getSession(sessionId: string): SessionContext | null;
  validateSession(sessionId: string): { valid: boolean; error?: ErrorResponse };
  private scheduleCleanup(sessionId: string, timeout: number): void;
  private cleanupSession(sessionId: string): Promise<void>;
  async shutdown(): Promise<void>;
}
```

**Core logic:**

1. **initialize()**: launches the shared Browser instance
2. **createSession()**:
   - Checks whether the maxSessions limit has been reached
   - Generates a UUID v4 as the sessionId
   - Creates a BrowserContext and a Page
   - Sets the expiration time and the cleanup timer
   - Stores the session in the sessions Map
3. **validateSession()**:
   - Checks whether the sessionId exists
   - Checks whether the session has expired
   - Returns the validation result and any error information
4. **scheduleCleanup()**: schedules automatic cleanup with setTimeout
5. **cleanupSession()**: closes the Page and Context, clears the timer, and removes the session from the Map

### Tool Handler Functions

Every tool handler follows the same pattern:

```typescript
async function handleToolCall(
  sessionManager: SessionManager,
  args: any
): Promise<any> {
  // 1. Validate the parameters
  // 2. Validate the session (if required)
  // 3. Perform the Playwright operation
  // 4. Catch errors and return a structured response
  // 5. Return the success result
}
```

## Error Handling

### Error Handling Strategy

1. **Session validation errors**: validated at the start of every operation that requires a sessionId
   - Session does not exist: return `SESSION_NOT_FOUND`
   - Session has expired: return `SESSION_EXPIRED`
2. **Playwright operation errors**: caught with try-catch
   - TimeoutError: converted to the corresponding error code (for example `ELEMENT_NOT_FOUND`)
   - Other errors: return `BROWSER_ERROR` and include the original error message
3. **Resource limit errors**:
   - Maximum number of sessions reached: return `MAX_SESSIONS_REACHED`
4. **Parameter validation errors**:
   - Missing required parameter: return `INVALID_PARAMETERS`

### Error Response Format

All errors are returned in the following format:

```typescript
{
  isError: true,
  content: [{
    type: "text",
    text: JSON.stringify({
      errorCode: "ERROR_CODE",
      message: "Human readable message",
      sessionId: "optional-session-id",
      details: { /* optional additional context */ }
    })
  }]
}
```

## Testing Strategy

### Test Frameworks

- **Unit tests**: Jest
- **Property-based tests**: fast-check (a property-based testing library for JavaScript/TypeScript)
- **Minimum iterations**: 100

### Test Layers

1. **SessionManager unit tests**
   - Session creation and destruction
   - Expiration time calculation
   - Concurrency limits
   - Automatic cleanup
2. **Tool handler unit tests**
   - Parameter validation
   - Error handling
   - Success paths
3. **Integration tests**
   - Full session lifecycle
   - Multiple concurrent sessions
   - End-to-end browser operations
4. **Property-based tests**
   - Session management properties
   - Error handling consistency

## Correctness Properties

*A property is a characteristic or behavior that should hold true across all valid executions of a system; essentially, a formal statement about what the system should do. Properties serve as the bridge between human-readable specifications and machine-verifiable correctness guarantees.*

### Property 1: SessionId Uniqueness

*For any* sequence of session creation calls, all returned sessionIds should be unique (no duplicates).

**Validates: Requirements 1.1**

### Property 2: Session Expiration Time Calculation

*For any* configured timeout value, when a session is created, the expiresAt timestamp should equal createdAt plus the timeout duration.

**Validates: Requirements 1.3**

### Property 3: Session Closure Removes SessionId

*For any* valid session, after calling close_session, that sessionId should no longer exist in the session manager.

**Validates: Requirements 2.3**

### Property 4: Invalid SessionId Returns Consistent Error

*For any* operation requiring a sessionId (navigate, click, type, close_session), providing a non-existent sessionId should return an error with errorCode "SESSION_NOT_FOUND".

**Validates: Requirements 2.4, 3.3, 4.4, 5.4**

### Property 5: Navigation Returns Complete Response

*For any* successful navigation operation, the response should contain title, url, and status fields.

**Validates: Requirements 3.2**

### Property 6: Successful Operations Return Success Field

*For any* successful click or type operation, the response should contain a success field set to true.

**Validates: Requirements 4.2, 5.2**

### Property 7: Browser Type Configuration

*For any* valid browser type parameter (chromium, firefox, webkit), the server should initialize with that browser type.

**Validates: Requirements 6.2**

### Property 8: Headless Mode Configuration

*For any* boolean value provided for the headless parameter, the browser should launch in the corresponding mode.

**Validates: Requirements 6.4**

### Property 9: Session Timeout Configuration

*For any* valid timeout value provided as a startup parameter, sessions should use that timeout duration.

**Validates: Requirements 6.6**

### Property 10: Max Sessions Configuration

*For any* valid maxSessions value provided as a startup parameter, the server should enforce that limit.

**Validates: Requirements 6.8**

### Property 11: Shared Browser Instance

*For any* number of sessions created, all should share the same Browser instance (verified by internal reference equality).

**Validates: Requirements 7.2**

### Property 12: Session Isolation

*For any* two concurrent sessions, setting a cookie or localStorage item in one session should not be visible in the other session.

**Validates: Requirements 7.5**

### Property 13: Error Response Structure

*For any* operation that fails, the error response should contain both errorCode and message fields.

**Validates: Requirements 8.1, 8.2**

### Property 14: Session-Related Errors Include SessionId

*For any* error related to a specific session, the error response should include the sessionId field.

**Validates: Requirements 8.3**

### Property 15: Distinct Error Codes for Different Failures

*For any* two different error scenarios (e.g., session not found vs. element not found), the errorCode values should be different.

**Validates: Requirements 8.5**
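
The session lifecycle described for the SessionManager class (limit check, UUID generation, expiry timer, validation, cleanup) can be illustrated with a minimal sketch. This is not the project's implementation: `SessionRegistry`, `Closable`, and `openResource` are hypothetical names, and the Playwright BrowserContext/Page pair is replaced by a stub so that the example runs without a browser.

```typescript
import { randomUUID } from "node:crypto";

// Hypothetical stand-in for the BrowserContext/Page pair held by a session.
interface Closable {
  close(): Promise<void>;
}

interface SessionEntry {
  sessionId: string;
  resource: Closable;
  createdAt: number;
  expiresAt: number;
  timeoutHandle: ReturnType<typeof setTimeout>;
}

type Validation =
  | { valid: true }
  | { valid: false; error: { errorCode: string; message: string; sessionId: string } };

class SessionRegistry {
  private sessions = new Map<string, SessionEntry>();
  private pending = 0; // slots reserved by creations still in flight
  private maxSessions: number;
  private sessionTimeout: number; // milliseconds
  private openResource: () => Promise<Closable>;

  constructor(maxSessions: number, sessionTimeout: number, openResource: () => Promise<Closable>) {
    this.maxSessions = maxSessions;
    this.sessionTimeout = sessionTimeout;
    this.openResource = openResource;
  }

  async createSession(): Promise<{ sessionId: string; expiresAt: number }> {
    // Count in-flight creations too, so concurrent calls cannot exceed the limit.
    if (this.sessions.size + this.pending >= this.maxSessions) {
      throw new Error("MAX_SESSIONS_REACHED");
    }
    this.pending++;
    let resource: Closable;
    try {
      resource = await this.openResource();
    } finally {
      this.pending--;
    }
    const sessionId = randomUUID(); // UUID v4
    const createdAt = Date.now();
    const expiresAt = createdAt + this.sessionTimeout;
    // Automatic cleanup once the timeout elapses.
    const timeoutHandle = setTimeout(() => {
      void this.closeSession(sessionId);
    }, this.sessionTimeout);
    this.sessions.set(sessionId, { sessionId, resource, createdAt, expiresAt, timeoutHandle });
    return { sessionId, expiresAt };
  }

  validateSession(sessionId: string): Validation {
    const entry = this.sessions.get(sessionId);
    if (!entry) {
      return { valid: false, error: { errorCode: "SESSION_NOT_FOUND", message: "Session does not exist", sessionId } };
    }
    if (Date.now() >= entry.expiresAt) {
      return { valid: false, error: { errorCode: "SESSION_EXPIRED", message: "Session has expired", sessionId } };
    }
    return { valid: true };
  }

  // Returns false when the sessionId is unknown.
  async closeSession(sessionId: string): Promise<boolean> {
    const entry = this.sessions.get(sessionId);
    if (!entry) return false;
    clearTimeout(entry.timeoutHandle);
    this.sessions.delete(sessionId);
    await entry.resource.close();
    return true;
  }

  async shutdown(): Promise<void> {
    await Promise.all(Array.from(this.sessions.keys()).map((id) => this.closeSession(id)));
  }
}
```

In a real server, `openResource` would presumably call `browser.newContext()` followed by `context.newPage()` on the shared Browser instance, and `close()` would close the context. Removing the entry from the Map before awaiting `close()` means a session stops validating as soon as closure begins.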

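
The error handling strategy and the unified tool handler pattern can be sketched in the same spirit. The helper names `errorResult`, `successResult`, `toErrorResponse`, and `runTool` are hypothetical. The sketch assumes that Playwright timeouts surface as an `Error` whose `name` is `"TimeoutError"`, and that success results are also returned as JSON text content, which the design document does not specify.

```typescript
interface ErrorResponse {
  errorCode: string;
  message: string;
  sessionId?: string;
  details?: unknown;
}

interface ToolResult {
  isError?: boolean;
  content: { type: "text"; text: string }[];
}

// Wraps an ErrorResponse in the unified error format.
function errorResult(error: ErrorResponse): ToolResult {
  return { isError: true, content: [{ type: "text", text: JSON.stringify(error) }] };
}

// Assumption: success payloads are serialized the same way as errors.
function successResult(payload: object): ToolResult {
  return { content: [{ type: "text", text: JSON.stringify(payload) }] };
}

// Maps a thrown value to an error code: timeouts become the operation-specific
// code (for example ELEMENT_NOT_FOUND), everything else becomes BROWSER_ERROR.
function toErrorResponse(err: unknown, timeoutCode: string, sessionId?: string): ErrorResponse {
  const message = err instanceof Error ? err.message : String(err);
  const isTimeout = err instanceof Error && err.name === "TimeoutError";
  return { errorCode: isTimeout ? timeoutCode : "BROWSER_ERROR", message, sessionId };
}

// Validate parameters, run the operation, and convert failures into structured
// responses. Session validation would sit between the two steps.
async function runTool(
  sessionId: unknown,
  timeoutCode: string,
  action: () => Promise<object>,
): Promise<ToolResult> {
  if (typeof sessionId !== "string" || sessionId.length === 0) {
    return errorResult({ errorCode: "INVALID_PARAMETERS", message: "sessionId is required" });
  }
  try {
    return successResult(await action());
  } catch (err) {
    return errorResult(toErrorResponse(err, timeoutCode, sessionId));
  }
}
```

A click handler could then call `runTool(args.sessionId, "ELEMENT_NOT_FOUND", ...)` with the Playwright operation inside the callback, so that every tool reports failures in the same shape.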