AI Vision MCP Server

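Visual UI Debug Agent MCP

An autonomous debugging MCP server that lets AI models analyze, debug, and interact with web interfaces through Playwright. It enables any AI model, including models without built-in vision, to visually inspect web pages, find UI bugs, test user workflows, and verify application performance, all without human intervention.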

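Autonomous UI Debugging Agent

This MCP server acts as an AI-driven autonomous debugging agent that can:

  • Perform comprehensive visual analysis of web applications
  • Detect UI issues by inspecting visual elements and their properties
  • Automatically test common user workflows without manually written test scripts
  • Test API endpoints and validate backend responses
  • Track visual changes between application versions
  • Monitor console logs for errors and warnings
  • Analyze performance metrics to identify bottlenecks
  • Generate detailed reports with screenshots and recommendations

The server is designed to work efficiently: it reuses browser sessions, avoids creating unnecessary files, and focuses on the most important aspects of the application.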

Installation Options

Using an MCP Gateway (Recommended)

The easiest way to install this MCP server is through any MCP-compatible gateway:

# Example with Claude gateway
claude-gateway install visual-ui-debug-agent-mcp

Quick Install Script

Use the one-line install script:

curl -s https://raw.githubusercontent.com/samihalawa/visual-ui-debug-agent-mcp/main/scripts/install-global.sh | bash

NPM Installation

Install globally via npm:

# Install globally
npm install -g visual-ui-debug-agent-mcp

# Start the server
visual-ui-debug-agent-mcp

Docker Hub Installation

For containerized deployments:

# Pull the image from Docker Hub
docker pull samihalawa/visual-ui-debug-agent-mcp:latest

# Run the container
docker run -p 8080:8080 samihalawa/visual-ui-debug-agent-mcp:latest

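Smithery Integration

This package is fully compatible with Smithery through the included configuration file:

# Install with Smithery
smithery install visual-ui-debug-agent-mcp

# Or run with your API key
npm run smithery:key YOUR_SMITHERY_API_KEY

For complete installation and usage instructions, see the Smithery integration guide.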

Complete Tool Reference
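
The examples in this reference call `mcp.callTool(name, args)` without showing where `mcp` comes from. The following is a minimal sketch of one possible adapter, assuming an underlying MCP client whose `callTool` method takes a single `{ name, arguments }` object (the request shape used by the official MCP TypeScript SDK client). `createMcp` and `stubClient` are hypothetical names introduced only for illustration and are not part of this server.

```javascript
// Hypothetical adapter: wraps a client exposing callTool({ name, arguments })
// so tools can be invoked as mcp.callTool(name, args), as in the examples here.
function createMcp(client) {
  return {
    callTool(name, args = {}) {
      if (typeof name !== "string" || name.length === 0) {
        throw new TypeError("tool name must be a non-empty string");
      }
      return client.callTool({ name, arguments: args });
    },
  };
}

// Stub client used only to demonstrate the call shape; a real client would
// forward the request to the running MCP server.
const recordedCalls = [];
const stubClient = {
  callTool(request) {
    recordedCalls.push(request);
    return Promise.resolve({ content: [] });
  },
};

const mcp = createMcp(stubClient);
mcp.callTool("screenshot_url", { url: "https://example.com", fullPage: true });
```

With a real client in place of the stub, the same `mcp` object could be passed to every example in this reference unchanged.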

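Primary Visual Analysis Tools

1. enhanced_page_analyzer

Performs a comprehensive analysis of a web page, including interactive element mapping, performance metrics, and visual inspection.

const analysis = await mcp.callTool("enhanced_page_analyzer", {
  url: "https://example.com/dashboard",
  includeConsole: true,
  mapElements: true,
  fullPage: true
});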

2. ui_workflow_validator

Automatically tests a complete user journey by executing and validating a sequence of UI interactions.

const result = await mcp.callTool("ui_workflow_validator", {
  startUrl: "https://example.com/login",
  taskDescription: "User login flow",
  steps: [
    { description: "Enter username", action: "fill", selector: "#username", value: "test" },
    { description: "Enter password", action: "fill", selector: "#password", value: "pass" },
    { description: "Click login", action: "click", selector: "button[type='submit']" },
    { description: "Verify dashboard loads", action: "verifyElementVisible", selector: ".dashboard" }
  ],
  captureScreenshots: "all"
});
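
Because a workflow is only as reliable as its step definitions, it can help to check the steps before sending them. The sketch below is a hypothetical client-side pre-flight check, not part of the server. It assumes that every step needs an `action` and a `selector`, and that `fill` steps need a string `value`, which is inferred from the example above rather than from a published schema.

```javascript
// Hypothetical pre-flight check for ui_workflow_validator steps.
// Assumption: every step needs an action and a selector; "fill" needs a value.
function findInvalidSteps(steps) {
  const problems = [];
  steps.forEach((step, index) => {
    if (!step.action) problems.push(`step ${index + 1}: missing action`);
    if (!step.selector) problems.push(`step ${index + 1}: missing selector`);
    if (step.action === "fill" && typeof step.value !== "string") {
      problems.push(`step ${index + 1}: "fill" requires a string value`);
    }
  });
  return problems;
}

const steps = [
  { description: "Enter username", action: "fill", selector: "#username", value: "test" },
  { description: "Enter password", action: "fill", selector: "#password" }, // value omitted on purpose
  { description: "Click login", action: "click", selector: "button[type='submit']" },
];

const problems = findInvalidSteps(steps);
console.log(problems); // one problem, reported for step 2
```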

3. visual_comparison 👁️

Compares two web pages or UI states to identify visual differences.

const diff = await mcp.callTool("visual_comparison", {
  url1: "https://example.com/before",
  url2: "https://example.com/after",
  threshold: 0.05
});
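
The example passes `threshold: 0.05`, but this page does not define how the threshold is applied. One common interpretation is the fraction of pixels allowed to differ before two images count as different. The sketch below illustrates that interpretation only; `diffRatio` is a hypothetical helper, and the server's actual comparison algorithm may work differently.

```javascript
// Fraction of pixels that differ between two equally sized RGBA buffers.
// Assumption: "threshold" means the tolerated fraction of changed pixels.
function diffRatio(a, b) {
  if (a.length !== b.length || a.length % 4 !== 0) {
    throw new RangeError("buffers must be RGBA data of identical size");
  }
  const pixelCount = a.length / 4;
  if (pixelCount === 0) return 0;
  let changed = 0;
  for (let i = 0; i < a.length; i += 4) {
    const same =
      a[i] === b[i] && a[i + 1] === b[i + 1] &&
      a[i + 2] === b[i + 2] && a[i + 3] === b[i + 3];
    if (!same) changed += 1;
  }
  return changed / pixelCount;
}

// Two 2x2 images that differ in exactly one pixel.
const before = new Uint8Array(16).fill(255);
const after = Uint8Array.from(before);
after[4] = 0; // red channel of the second pixel

const ratio = diffRatio(before, after); // 1 of 4 pixels differs
const threshold = 0.05;
console.log(ratio > threshold ? "visual change detected" : "within threshold");
```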

4. screenshot_url 📸

Captures a high-quality screenshot of any URL, with options for the full page or a specific element.

const screenshot = await mcp.callTool("screenshot_url", {
  url: "https://example.com/profile",
  fullPage: true,
  device: "iPhone 13"
});

5. batch_screenshot_urls 📷

Takes screenshots of multiple URLs in a single operation for efficient comparison.

const screenshots = await mcp.callTool("batch_screenshot_urls", {
  urls: ["https://example.com/page1", "https://example.com/page2"],
  fullPage: true
});

User Flow Testing Tools

6. navigation_flow_validator

Tests multi-step navigation sequences with validation.

const navResult = await mcp.callTool("navigation_flow_validator", {
  startUrl: "https://example.com",
  steps: [
    { action: "click", selector: "a.products" },
    { action: "wait", waitTime: 1000 },
    { action: "click", selector: ".product-item" }
  ],
  captureScreenshots: true
});

7. api_endpoint_tester

Tests multiple API endpoints and validates their responses for backend verification.

const apiTest = await mcp.callTool("api_endpoint_tester", {
  url: "https://api.example.com/v1",
  endpoints: [
    { path: "/users", method: "GET" },
    { path: "/products", method: "GET" }
  ],
  authToken: "Bearer token123"
});
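
To make the configuration above more concrete, the sketch below expands it into the individual requests it appears to describe. This is an illustration under stated assumptions, not the server's implementation: `buildRequests` is a hypothetical helper, each `path` is assumed to be appended to the base `url`, and `authToken` is assumed to be sent verbatim in an `Authorization` header.

```javascript
// Hypothetical expansion of an api_endpoint_tester config into plain requests.
// Assumptions: path is appended to the base url; authToken is sent as-is.
function buildRequests({ url, endpoints, authToken }) {
  const base = url.replace(/\/+$/, ""); // drop trailing slashes
  return endpoints.map(({ path, method = "GET" }) => ({
    url: `${base}/${path.replace(/^\/+/, "")}`,
    method,
    headers: authToken ? { Authorization: authToken } : {},
  }));
}

const requests = buildRequests({
  url: "https://api.example.com/v1",
  endpoints: [
    { path: "/users", method: "GET" },
    { path: "/products", method: "GET" },
  ],
  authToken: "Bearer token123",
});

console.log(requests.map((r) => `${r.method} ${r.url}`));
```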

DOM and Performance Analysis

8. dom_inspector

Inspects DOM elements and their properties in detail.

const elementInfo = await mcp.callTool("dom_inspector", {
  url: "https://example.com",
  selector: "nav.main-menu",
  includeChildren: true,
  includeStyles: true
});

9. console_monitor 📟

Monitors and captures console logs to detect errors.

const logs = await mcp.callTool("console_monitor", {
  url: "https://example.com/app",
  filterTypes: ["error", "warning"],
  duration: 5000
});
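
Once logs have been captured, a quick summary by type is often the first thing worth looking at. The sketch below shows one way to do that on the client side. The response format of `console_monitor` is not documented on this page, so the `{ type, text }` entry shape and the sample data are assumptions made purely for illustration.

```javascript
// Assumption: each captured entry looks like { type, text }; the server's
// actual response format is not documented here.
function countByType(entries) {
  const counts = {};
  for (const { type } of entries) {
    counts[type] = (counts[type] || 0) + 1;
  }
  return counts;
}

// Made-up sample entries standing in for a real console_monitor result.
const sampleLogs = [
  { type: "error", text: "Uncaught TypeError: x is undefined" },
  { type: "warning", text: "Deprecated API used" },
  { type: "error", text: "Failed to load resource" },
];

const counts = countByType(sampleLogs);
console.log(counts);
```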

10. performance_analysis

Measures and analyzes page load performance metrics.

const perfMetrics = await mcp.callTool("performance_analysis", {
  url: "https://example.com/dashboard",
  iterations: 3
});
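
Running several iterations matters because a single page load can be an outlier. The sketch below shows one way a client might reduce per-iteration timings to a typical value using the median. How the server itself aggregates iterations is not stated on this page, and the sample timings are invented for the example.

```javascript
// Median of a list of numbers; less sensitive to one slow run than the mean.
function median(values) {
  if (values.length === 0) throw new RangeError("no samples");
  const sorted = [...values].sort((x, y) => x - y);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

const loadTimesMs = [1240, 980, 1010]; // made-up samples for three iterations
const typicalLoadMs = median(loadTimesMs);
console.log(`typical load time: ${typicalLoadMs} ms`);
```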

Low-Level Playwright Control

11. screenshot_local_files

Takes screenshots of local HTML files.

const localScreenshot = await mcp.callTool("screenshot_local_files", {
  filePath: "/path/to/local/file.html"
});

12. Direct Playwright Actions

A complete set of low-level Playwright controls for precise automation:

  • playwright_navigate: navigate to a specific URL
  • playwright_click: click an element
  • playwright_iframe_click: click an element inside an iframe
  • playwright_fill: fill in a form field
  • playwright_select: select a dropdown option
  • playwright_hover: hover over an element
  • playwright_evaluate: run JavaScript in the page context
  • playwright_console_logs: get console logs
  • playwright_get_visible_text: extract visible text
  • playwright_get_visible_html: get the visible HTML
  • playwright_go_back: navigate back
  • playwright_go_forward: navigate forward
  • playwright_press_key: press a keyboard key
  • playwright_drag: drag and drop an element
  • playwright_screenshot: take a custom screenshot

Autonomous Debugging Workflows

The MCP server can carry out complete debugging workflows autonomously by combining tools. For example:

Visual Regression Testing

// 1. Analyze the current version
const currentAnalysis = await mcp.callTool("enhanced_page_analyzer", {...});

// 2. Compare with previous version
const comparisonResult = await mcp.callTool("visual_comparison", {...});

// 3. Generate visual difference report
const report = await mcp.callTool("ui_workflow_validator", {...});

End-to-End User Flow Validation

// 1. Start with login flow
const loginResult = await mcp.callTool("ui_workflow_validator", {...});

// 2. Validate core features
const featureResults = await mcp.callTool("navigation_flow_validator", {...});

// 3. Test API endpoints
const apiResults = await mcp.callTool("api_endpoint_tester", {...});
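
In a chained workflow like this, later stages usually depend on earlier ones, so it is reasonable to stop at the first failure. The sketch below is a hypothetical runner that does this. It assumes a failed tool call surfaces as a rejected promise, and it uses a stub in place of a real MCP client. `runWorkflow` and `stubMcp` are illustrative names, not part of the server.

```javascript
// Runs stages in order and stops at the first one whose call rejects.
// Assumption: a failed tool call surfaces as a rejected promise.
async function runWorkflow(mcp, stages) {
  const report = [];
  for (const { tool, args } of stages) {
    try {
      const result = await mcp.callTool(tool, args);
      report.push({ tool, ok: true, result });
    } catch (error) {
      report.push({ tool, ok: false, error: String(error.message || error) });
      break; // later stages depend on earlier ones, so stop here
    }
  }
  return report;
}

// Stub standing in for a real MCP client; the second tool is made to fail.
const stubMcp = {
  async callTool(tool) {
    if (tool === "navigation_flow_validator") throw new Error("element not found");
    return { tool };
  },
};

const stages = [
  { tool: "ui_workflow_validator", args: {} },
  { tool: "navigation_flow_validator", args: {} },
  { tool: "api_endpoint_tester", args: {} },
];

const pending = runWorkflow(stubMcp, stages);
pending.then((report) => console.log(report.map((s) => `${s.tool}: ${s.ok}`)));
```

The report keeps the results of the stages that ran, so a caller can see which stage failed and why without rerunning the earlier ones.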

Performance Optimization

// 1. Analyze initial performance
const initialPerformance = await mcp.callTool("performance_analysis", {...});

// 2. Identify slow-loading elements
const elementPerformance = await mcp.callTool("dom_inspector", {...});

// 3. Monitor console for errors
const consoleErrors = await mcp.callTool("console_monitor", {...});

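Visual Analysis Examples

Element Mapping

The MCP server automatically maps all interactive elements on a page, making the UI structure easy for AI models to understand.

Visual Comparison

The visual comparison tool highlights differences between UI states, which is useful for catching unintended visual changes.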

Integration Options

Integrating with Smithery

# smithery.yaml configuration
startCommand:
  type: stdio
  configSchema:
    type: object
    properties:
      port:
        type: number
        description: Port number for the MCP server
      debug:
        type: boolean
        description: Enable debug mode

Integrating with Glama

// glama.json configuration
{
  "name": "visual-ui-debug-agent-mcp",
  "version": "1.0.2",
  "settings": {
    "port": 8080,
    "headless": true,
    "maxConcurrentSessions": 5
  }
}

Integrating with Non-Vision Models

The MCP server converts visual information into structured data that any AI model can use, including models without vision capabilities:

// The model receives structured data about visual elements
{
  "interactiveElements": [
    {
      "tagName": "button",
      "text": "Submit",
      "bounds": {"x": 120, "y": 240, "width": 100, "height": 40},
      "visible": true
    },
    // More elements...
  ]
}
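
Given structured data of this shape, a model or client can work out where to act on the page without seeing any pixels. The sketch below finds a visible element by its text and computes the center of its bounding box. `clickPointFor` is a hypothetical helper, and the sample data mirrors the single element shown above.

```javascript
// Finds a visible element by its text and returns the center of its bounds,
// the point an agent would typically click.
function clickPointFor(elements, text) {
  const match = elements.find((el) => el.visible && el.text === text);
  if (!match) return null;
  const { x, y, width, height } = match.bounds;
  return { x: x + width / 2, y: y + height / 2 };
}

const page = {
  interactiveElements: [
    {
      tagName: "button",
      text: "Submit",
      bounds: { x: 120, y: 240, width: 100, height: 40 },
      visible: true,
    },
  ],
};

const point = clickPointFor(page.interactiveElements, "Submit");
console.log(point); // { x: 170, y: 260 }
```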

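CI/CD Integration

This MCP server includes GitHub Actions workflows for continuous integration and deployment:

  • Build and test: verifies code quality
  • NPM publishing: automates package publishing
  • Docker publishing: builds and pushes the Docker image
  • Smithery publishing: deploys to the Smithery platform

License

This project is licensed under the ISC License.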
