AI Vision MCP Server
A Model Context Protocol (MCP) server that provides AI-powered visual analysis capabilities for Claude and other MCP-compatible AI assistants.
Features
- Screenshot URL: Capture screenshots of any website by providing a URL
- Visual Analysis: Analyze UI elements, layouts, and content in screenshots
- File Operations: Read and modify files with line-specific precision
- Report Generation: Create comprehensive UI/UX analysis reports
- Debugging Session: Maintain context across multiple analysis steps
Installation
Usage
Starting the Server
Configuration
Add the server to your MCP configuration:
Available Tools
screenshot_url
Take a screenshot of a URL using a web browser.
Parameters:
- url (string, required): URL to capture a screenshot of (e.g., http://localhost:4999, https://google.com)
- fullPage (boolean, optional): Whether to capture the full page or just the viewport. Default: false
- waitForSelector (string, optional): CSS selector to wait for before taking the screenshot
- waitTime (number, optional): Time to wait in milliseconds before taking the screenshot. Default: 1000
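As a rough illustration of how the parameters above fit together, the TypeScript sketch below models the arguments and fills in the two documented defaults. The `ScreenshotUrlArgs` and `ResolvedScreenshotArgs` types and the `withDefaults` helper are hypothetical names chosen for this example; they are not taken from the server's source.

```typescript
// Hypothetical argument shape for screenshot_url, mirroring the parameter list above.
interface ScreenshotUrlArgs {
  url: string;
  fullPage?: boolean;
  waitForSelector?: string;
  waitTime?: number; // milliseconds
}

// Same shape after the documented defaults have been applied.
interface ResolvedScreenshotArgs extends ScreenshotUrlArgs {
  fullPage: boolean;
  waitTime: number;
}

// Apply the defaults stated in the README: fullPage false, waitTime 1000 ms.
function withDefaults(args: ScreenshotUrlArgs): ResolvedScreenshotArgs {
  return {
    ...args,
    fullPage: args.fullPage ?? false,
    waitTime: args.waitTime ?? 1000,
  };
}

// Only url is required; here fullPage resolves to false and waitTime to 1000.
const resolved = withDefaults({ url: "http://localhost:4999" });
console.log(resolved);
```

Explicit values always win over the defaults, so passing `fullPage: true` captures the whole page while `waitTime` still falls back to 1000.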
analyze_screen
Analyze a screenshot with AI vision.
Parameters: None (uses the most recent screenshot)
read_file
Read content from a file between specified line numbers.
Parameters:
- path (string): Path to the file
- startLine (number): Starting line number (1-indexed)
- endLine (number): Ending line number (1-indexed)
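The 1-indexed line range can be sketched as follows. This is a minimal illustration that works on an in-memory string rather than a file, and it assumes `endLine` is inclusive, which the README does not state. `sliceLines` is a hypothetical helper, not the server's implementation.

```typescript
// Return lines startLine..endLine of text, treating both bounds as 1-indexed.
// Assumption: endLine is inclusive.
function sliceLines(text: string, startLine: number, endLine: number): string {
  if (!Number.isInteger(startLine) || !Number.isInteger(endLine) || startLine < 1 || endLine < startLine) {
    throw new RangeError(`invalid line range ${startLine}-${endLine}`);
  }
  const lines = text.split("\n");
  // slice() is 0-indexed with an exclusive end, so only the start needs shifting.
  return lines.slice(startLine - 1, endLine).join("\n");
}

const sample = "alpha\nbeta\ngamma\ndelta";
console.log(sliceLines(sample, 2, 3)); // prints "beta" then "gamma"
```

An `endLine` past the end of the text is clamped by `slice()`, so the sketch returns whatever lines exist instead of failing.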
modify_file
Modify content in a file between specified line numbers.
Parameters:
- path (string): Path to the file
- startLine (number): Starting line number to replace (1-indexed)
- endLine (number): Ending line number to replace (1-indexed)
- content (string): New content to replace the specified lines
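A line-range replacement of this kind might look like the sketch below. It operates on a string rather than a file and again assumes both bounds are inclusive. `replaceLines` is a hypothetical name used only for illustration.

```typescript
// Replace lines startLine..endLine (1-indexed, assumed inclusive) with content.
function replaceLines(text: string, startLine: number, endLine: number, content: string): string {
  if (!Number.isInteger(startLine) || !Number.isInteger(endLine) || startLine < 1 || endLine < startLine) {
    throw new RangeError(`invalid line range ${startLine}-${endLine}`);
  }
  const lines = text.split("\n");
  // Remove (endLine - startLine + 1) lines starting at the 0-indexed position
  // and insert the replacement lines in their place.
  lines.splice(startLine - 1, endLine - startLine + 1, ...content.split("\n"));
  return lines.join("\n");
}

const before = "one\ntwo\nthree\nfour";
console.log(replaceLines(before, 2, 3, "TWO-AND-THREE")); // prints "one", "TWO-AND-THREE", "four"
```

Because the replacement may contain a different number of lines than the range it replaces, line numbers after the edit can shift; re-reading the file before a second edit avoids working from stale numbers.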
generate_report
Generate a comprehensive UI/UX analysis report.
Parameters:
- testUrl (string): URL of the application being tested
- appName (string, optional): Name of the application being analyzed
- date (string, optional): Date of the analysis (YYYY-MM-DD)
- observations (object): Observations structured as components, data state, interactions, etc.
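To make the argument shape more concrete, here is a hedged TypeScript sketch. The README does not document the exact keys inside `observations`, so the keys shown (`components`, `dataState`, `interactions`) are guesses based on the description above, and `GenerateReportArgs` and `isReportDate` are hypothetical names.

```typescript
// Hypothetical argument shape for generate_report, mirroring the parameter list above.
interface GenerateReportArgs {
  testUrl: string;
  appName?: string;
  date?: string; // YYYY-MM-DD
  // The exact keys are undocumented, so the type is left open-ended.
  observations: Record<string, unknown>;
}

// Check that a string has the documented YYYY-MM-DD shape and names a real calendar date.
function isReportDate(value: string): boolean {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(value)) return false;
  const parsed = new Date(`${value}T00:00:00Z`);
  // Impossible dates either fail to parse or roll over to a different day.
  return !Number.isNaN(parsed.getTime()) && parsed.toISOString().slice(0, 10) === value;
}

const reportArgs: GenerateReportArgs = {
  testUrl: "http://localhost:4999",
  appName: "Example App", // placeholder name
  date: "2024-01-15", // illustrative date
  observations: {
    components: ["header", "login form"], // assumed key
    dataState: "empty list", // assumed key
    interactions: ["submit button has no hover state"], // assumed key
  },
};
console.log(isReportDate(reportArgs.date ?? ""));
```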
Example Workflow
- Take a screenshot of a website with screenshot_url.
- Analyze the screenshot with analyze_screen.
- Generate a report based on the analysis with generate_report.
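The three steps above can be sketched as the MCP `tools/call` requests a client would send. MCP messages are JSON-RPC 2.0, and a tool invocation carries the tool name and its arguments under `params`. In practice the AI assistant issues these calls itself; the `toolCall` helper below is hypothetical, the argument values are placeholders, and the `observations` key is an assumption.

```typescript
// Shape of an MCP tool invocation (JSON-RPC 2.0 request with method "tools/call").
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Build a tools/call request with a fresh numeric id.
function toolCall(name: string, args: Record<string, unknown> = {}): ToolCallRequest {
  return { jsonrpc: "2.0", id: nextId++, method: "tools/call", params: { name, arguments: args } };
}

const workflow: ToolCallRequest[] = [
  // 1. Capture the page.
  toolCall("screenshot_url", { url: "http://localhost:4999", fullPage: true }),
  // 2. Analyze the most recent screenshot (no parameters).
  toolCall("analyze_screen"),
  // 3. Turn the findings into a report.
  toolCall("generate_report", {
    testUrl: "http://localhost:4999",
    observations: { components: ["header"] }, // assumed key
  }),
];

for (const request of workflow) {
  console.log(JSON.stringify(request));
}
```

The order matters: analyze_screen takes no parameters and works on the most recent screenshot, so it has to follow a screenshot_url call.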
Requirements
- Node.js 14+
- Playwright for browser automation
- Gemini API key for AI vision analysis
License
MIT