Integrates with Gemini 3 Flash to provide agentic vision capabilities, enabling iterative screenshot analysis, change detection, and automated annotation for visual regression testing.
Provides tools for capturing and comparing screenshots from the iOS Simulator to facilitate visual regression testing on mobile layouts.
Enables visual regression testing by capturing and comparing screenshots directly from the macOS platform to detect and investigate UI changes.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Where's Waldo Rick compare the current layout to the baseline and show me what changed"
That's it! The server will respond to your query, and you can continue using it as needed.
Where's Waldo Rick - Visual Regression MCP Server
A Model Context Protocol (MCP) server that brings agentic vision capabilities to Claude Code for visual regression testing using Gemini 3 Flash.
Overview
Never again have ambiguous conversations about visual changes. See exactly what changed, circled and annotated, with each change classified as expected or unintended.
Problem Solved
Developer works for hours on UI changes
Build passes, code is "clean"
You open the app... same exact layout
You ask: "What specifically changed?"
Dev says: "We added 2 pixels to the card"
You ask: "Where? Top? Bottom? Inside the box? Around it?"
Wasted time, unclear communication
Solution
Where's Waldo Rick provides:
Screenshot capture from multiple platforms (macOS, iOS Simulator, Web)
Pixel-level comparison with configurable thresholds
Agentic vision analysis using Gemini 3 Flash (iterative zoom/crop/annotate)
Expected vs unintended change detection
Conversational investigation ("Not that box, the child item")
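To make the idea of a thresholded pixel comparison concrete, here is a minimal sketch in plain Python. It is not the project's implementation (the roadmap mentions OpenCV for that); the function name `diff_images`, the nested-list image representation, and the default threshold are all assumptions chosen for illustration.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[int, int, int]              # (R, G, B), each channel 0-255
Image = List[List[Pixel]]                 # list of rows, each a list of pixels
Box = Tuple[int, int, int, int]           # (left, top, right, bottom), inclusive


def diff_images(
    baseline: Image, current: Image, threshold: int = 10
) -> Tuple[float, Optional[Box]]:
    """Compare two same-sized images (hypothetical helper).

    A pixel counts as changed when any channel differs by more than
    `threshold`. Returns the fraction of changed pixels and the bounding
    box around all changed pixels, or None when nothing changed.
    """
    if len(baseline) != len(current) or any(
        len(a) != len(b) for a, b in zip(baseline, current)
    ):
        raise ValueError("images must have identical dimensions")

    changed = 0
    total = 0
    box: Optional[List[int]] = None
    for y, (row_a, row_b) in enumerate(zip(baseline, current)):
        for x, (pa, pb) in enumerate(zip(row_a, row_b)):
            total += 1
            # Largest per-channel difference decides whether the pixel changed.
            if max(abs(ca - cb) for ca, cb in zip(pa, pb)) > threshold:
                changed += 1
                if box is None:
                    box = [x, y, x, y]
                else:
                    box = [min(box[0], x), min(box[1], y),
                           max(box[2], x), max(box[3], y)]

    ratio = changed / total if total else 0.0
    bounding_box = (box[0], box[1], box[2], box[3]) if box else None
    return ratio, bounding_box
```

Raising `threshold` makes the comparison tolerant of anti-aliasing and compression noise; setting it to 0 flags every differing pixel.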
Installation
Requirements
Python 3.10+
Gemini API key (free tier: 15 requests/minute)
Install from GitHub
Configure Claude Code
Add to your Claude Code MCP configuration (~/.claude/mcp.json or project-specific):
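The original configuration snippet is not reproduced here. As a rough sketch, the following Python builds an entry in the `mcpServers` shape that MCP clients commonly use. The server key `wheres-waldo-rick`, the module name `wheres_waldo_rick`, and the `GEMINI_API_KEY` variable name are assumptions, not values confirmed by this project; check the repository for the real ones.

```python
import json


def build_mcp_entry(api_key_env: str = "GEMINI_API_KEY") -> dict:
    """Return a hypothetical MCP configuration entry for this server."""
    return {
        "mcpServers": {
            # Hypothetical server key; use whatever name you prefer.
            "wheres-waldo-rick": {
                "command": "python",
                # Hypothetical module name; the real entry point may differ.
                "args": ["-m", "wheres_waldo_rick"],
                # Placeholder value; never commit a real key.
                "env": {api_key_env: "<your-gemini-api-key>"},
            }
        }
    }


# Print JSON that could be merged into the MCP configuration file.
print(json.dumps(build_mcp_entry(), indent=2))
```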
Usage
Basic Workflow
MCP Tools
visual_capture
Capture a screenshot and store it for visual regression testing.
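How the server captures screenshots internally is not shown here. As one possible approach, the sketch below maps a platform name to a standard Apple command-line tool: `screencapture` on macOS and `xcrun simctl io booted screenshot` for a booted iOS Simulator. The function name `capture_command` and the platform strings are hypothetical; web capture is left out because it would need a browser automation tool.

```python
from typing import List


def capture_command(platform: str, output_path: str) -> List[str]:
    """Build (but do not run) a screenshot command for the given platform."""
    if platform == "macos":
        # -x suppresses the capture sound.
        return ["screencapture", "-x", output_path]
    if platform == "ios-simulator":
        # Captures the currently booted simulator device.
        return ["xcrun", "simctl", "io", "booted", "screenshot", output_path]
    raise ValueError(f"unsupported platform: {platform!r}")
```

On a Mac, the resulting list can be passed to `subprocess.run(..., check=True)` to take the screenshot.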
visual_prepare
Declare a baseline with expected changes before development.
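The tool's actual parameters are not listed here, so the following is only a sketch of what a baseline declaration might carry: a name, the stored screenshot, and the changes the developer expects to make. The class and field names (`Baseline`, `ExpectedChange`, `region`) are assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Tuple


@dataclass
class ExpectedChange:
    """A change the developer intends to make (hypothetical structure)."""
    description: str
    region: Tuple[int, int, int, int]   # (left, top, right, bottom) in pixels


@dataclass
class Baseline:
    """A stored screenshot plus the changes declared before development."""
    name: str
    screenshot_path: str
    expected_changes: List[ExpectedChange] = field(default_factory=list)

    def to_json(self) -> str:
        # asdict() recurses into the nested ExpectedChange dataclasses.
        return json.dumps(asdict(self), indent=2)


baseline = Baseline(
    name="settings-screen",
    screenshot_path="baselines/settings-screen.png",
    expected_changes=[
        ExpectedChange("Add 2 px padding to the card", (16, 120, 344, 260)),
    ],
)
```

Declaring the expected region up front is what lets a later comparison separate intended edits from regressions.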
visual_compare
Compare two screenshots with pixel-level precision and agentic vision.
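One simple way to separate expected from unintended changes is to check whether each detected region overlaps a region declared in the baseline. The sketch below assumes rectangular regions and a plain overlap test; the real tool also uses Gemini's vision analysis, which is not modeled here, and the names `overlaps` and `classify_changes` are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]   # (left, top, right, bottom), inclusive


def overlaps(a: Box, b: Box) -> bool:
    """True when two inclusive rectangles share at least one pixel."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def classify_changes(detected: List[Box], expected: List[Box]) -> Dict[str, List[Box]]:
    """Sort detected change regions into expected and unintended."""
    result: Dict[str, List[Box]] = {"expected": [], "unintended": []}
    for box in detected:
        key = "expected" if any(overlaps(box, e) for e in expected) else "unintended"
        result[key].append(box)
    return result
```

Any region that lands in `unintended` is a candidate regression worth a closer look.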
visual_cleanup
Clean up old screenshots and cache.
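As an illustration of age-based cleanup, the sketch below deletes PNG files older than a retention window. The directory layout, the `.png` filter, and the seven-day default are assumptions; the tool's real storage scheme and options may differ.

```python
import time
from pathlib import Path
from typing import List, Optional


def cleanup_screenshots(
    directory: Path, max_age_days: float = 7.0, now: Optional[float] = None
) -> List[Path]:
    """Delete *.png files in `directory` older than `max_age_days`.

    Returns the paths that were removed. `now` can be injected for testing.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400   # seconds per day
    removed: List[Path] = []
    for path in sorted(directory.glob("*.png")):
        # Compare the file's last-modified time against the cutoff.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```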
Development
Setup
Project Structure
Roadmap
Phase 1: Foundation (MCP server skeleton, types, storage)
Phase 2: Capture & Baselines (multi-platform screenshots)
Phase 3: Comparison Engine (OpenCV + Gemini integration) HIGH RISK
Phase 4: Operations (caching, progressive resolution, reporting)
Phase 5: Polish (conversational investigation)
See ROADMAP.md for complete execution plan.
Contributing
Contributions welcome! Please read REQUIREMENTS.md and ROADMAP.md before contributing.
License
MIT License - See LICENSE file for details
Acknowledgments
Built with: