agnt

Overview Schema Related Servers Score Discussions

agnt

VISUAL_REGRESSION_SPEC.md•10.4 KiB

# Visual Regression Testing - Implementation Spec ## Overview Add visual regression testing capability to agnt, allowing before/after screenshot comparison to verify only expected changes occurred during development. ## User Flow ### 1. Before Making Changes ```bash # Capture baseline snapshots agnt snapshot baseline --name "before-header-refactor" --url "http://localhost:3000" # Or capture multiple pages agnt snapshot baseline --name "before-auth-update" --urls " http://localhost:3000/login http://localhost:3000/signup http://localhost:3000/dashboard " # Or use current proxy agnt snapshot baseline --proxy dev --pages "/, /about, /contact" ``` **What happens:** - Captures screenshots of specified pages - Stores in `~/.agnt/baselines/{name}/` - Saves metadata (URL, viewport, timestamp, git commit) - Returns snapshot ID for later reference ### 2. Make Changes Developer or AI agent makes code changes, refactors, upgrades dependencies, etc. ### 3. After Changes - Compare ```bash # Compare current state to baseline agnt snapshot compare --baseline "before-header-refactor" # Or auto-compare to last baseline agnt snapshot compare --last # With specific viewports agnt snapshot compare --baseline "before-auth-update" --viewports "mobile,tablet,desktop" ``` **What happens:** - Captures new screenshots of same pages - Performs pixel-level diff - Generates diff images highlighting changes - Sends to Claude with vision API for analysis - Returns intelligent report ### 4. Claude Analysis Output ``` Visual Regression Report: before-header-refactor → current ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 📄 Page: / (Home) ✓ Header height: 80px → 64px (expected - header refactor) ✓ Logo position: adjusted (expected - mentioned in task) ⚠️ Navigation spacing changed slightly (minor) 📄 Page: /about ✓ No visual changes detected 📄 Page: /contact ❌ Footer position: moved down 50px (UNEXPECTED) ❌ Form submit button: partially cut off (UNEXPECTED) ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Summary: 2 unexpected changes found View diffs: ~/.agnt/diffs/before-header-refactor/ ``` ## Architecture ### Storage Structure ``` ~/.agnt/ ├── baselines/ │ ├── before-header-refactor/ │ │ ├── metadata.json │ │ ├── home.png │ │ ├── about.png │ │ └── contact.png │ └── before-auth-update/ │ └── ... ├── diffs/ │ └── before-header-refactor-20251213/ │ ├── home-diff.png │ ├── home-baseline.png │ ├── home-current.png │ └── report.json └── config.json ``` ### Metadata Format ```json { "name": "before-header-refactor", "timestamp": "2025-12-13T22:30:00Z", "git_commit": "a645d91", "git_branch": "feature/header-refactor", "pages": [ { "url": "http://localhost:3000/", "viewport": { "width": 1920, "height": 1080 }, "screenshot": "home.png", "dom_snapshot": "home.html", "computed_styles": "home.css.json" } ], "config": { "diff_threshold": 0.01, "ignore_regions": [ { "selector": ".timestamp", "reason": "Dynamic content" }, { "selector": ".ad-banner", "reason": "Third-party" } ] } } ``` ## Implementation Components ### 1. New MCP Tool: `snapshot` ```javascript { name: "snapshot", description: "Capture and compare visual snapshots for regression testing", inputSchema: { action: "baseline" | "compare" | "list" | "delete", name: "string", // Baseline name baseline: "string", // For compare action urls: "string[]", // Pages to capture proxy_id: "string", // Use existing proxy viewports: "string[]", // mobile, tablet, desktop ignore: "string[]", // CSS selectors to ignore threshold: "number" // Diff sensitivity (0-1) } } ``` ### 2. Backend Components **`internal/snapshot/manager.go`** - Manage baseline storage - Coordinate capture process - Perform diff operations **`internal/snapshot/capture.go`** - Use html2canvas (existing) or headless browser - Capture screenshots via proxy - Save DOM snapshots **`internal/snapshot/differ.go`** - Pixel-level comparison using image library - Generate diff images - Calculate change percentage - Ignore specified regions **`internal/snapshot/analyzer.go`** - Send images to Claude vision API - Provide context about expected changes - Parse AI response into structured report ### 3. Image Diffing Library Options: - **pixelmatch** (JS) - Fast pixel diffing - **go-diff** (Go) - Native Go image comparison - **playwright** - Built-in screenshot comparison ### 4. Claude Vision Integration ```go func AnalyzeDiff(baseline, current, diff image.Image, context string) (*Report, error) { prompt := fmt.Sprintf(` Compare these before/after screenshots of a web application. Context: %s Expected changes: - [List from commit message or task description] Analyze: 1. Visual differences between baseline and current 2. Which changes were expected vs unexpected 3. Any regressions or bugs introduced 4. Layout issues, cut-off elements, overlaps Return structured report. `, context) // Send to Claude with vision response := claude.AnalyzeImages(prompt, baseline, current, diff) return parseReport(response) } ``` ## Advanced Features ### Ignore Dynamic Content ```json { "ignore_regions": [ { "selector": ".timestamp", "reason": "Dynamic date/time" }, { "selector": ".user-avatar", "reason": "User-specific" }, { "selector": "[data-testid='ad']", "reason": "Third-party ads" } ] } ``` ### Multiple Viewports ```bash agnt snapshot baseline --viewports " mobile: 375x667, tablet: 768x1024, desktop: 1920x1080 " ``` ### Git Integration ```bash # Auto-baseline on feature branch creation git checkout -b feature/new-header agnt snapshot baseline --auto --git # Auto-compare before committing git commit -m "Update header" # → Triggers: agnt snapshot compare --baseline feature/new-header ``` ### CI/CD Integration ```yaml # .github/workflows/visual-regression.yml - name: Visual Regression Test run: | agnt snapshot compare --baseline main if [ $? -ne 0 ]; then echo "Visual regressions detected!" exit 1 fi ``` ### Interactive Approval ```bash agnt snapshot compare --interactive # Shows each diff: # ┌─────────────────────────────────────┐ # │ home.png - Header changed │ # │ [Baseline] [Current] [Diff] │ # │ │ # │ Expected change? [y/n/skip]: │ # └─────────────────────────────────────┘ # Approved diffs become new baseline ``` ## Benefits for AI-Assisted Development ### Scenario 1: Refactoring ``` Human: "Refactor the header component to use flexbox" Claude: *refactors code* Claude: "Let me verify only expected changes occurred..." Claude: agnt snapshot compare --baseline before-refactor Claude: "✓ Changes look good! Only header layout changed as expected." ``` ### Scenario 2: Dependency Upgrade ``` Human: "Upgrade React from 18.2 to 18.3" Claude: agnt snapshot baseline --name before-react-upgrade Claude: *upgrades package.json, runs build* Claude: agnt snapshot compare --baseline before-react-upgrade Claude: "⚠️ Warning: Button styling changed unexpectedly. React 18.3 may have different default styles." ``` ### Scenario 3: CSS Changes ``` Human: "Make the login button bigger" Claude: *updates CSS* Claude: agnt snapshot compare --last Claude: "❌ Problem detected! The bigger button is now pushing the footer off-screen on mobile. Let me fix that..." ``` ## Implementation Priority ### Phase 1: MVP (P0) - ~2-3 days - [x] Basic baseline capture via existing screenshot - [x] Simple pixel diff (Go image library) - [x] File storage in ~/.agnt/baselines/ - [x] MCP tool integration - [x] Basic diff report ### Phase 2: Intelligence (P1) - ~2-3 days - [x] Claude vision API integration - [x] Intelligent diff analysis - [x] Expected vs unexpected changes - [x] Structured report generation ### Phase 3: Polish (P2) - ~1-2 days - [ ] Multiple viewports - [ ] Ignore regions - [ ] Git integration - [ ] Interactive approval ### Phase 4: Advanced (P3) - Future - [ ] CI/CD runners - [ ] Headless browser option - [ ] DOM diffing (not just pixels) - [ ] Performance regression detection - [ ] A/B comparison mode ## API Examples ### Programmatic Usage ```javascript // In browser __devtool.snapshot.capture('before-changes') // Later __devtool.snapshot.compare('before-changes', { threshold: 0.01, ignore: ['.timestamp', '.ad'], viewports: ['mobile', 'desktop'] }) ``` ### MCP Tool Usage ```json { "tool": "snapshot", "input": { "action": "baseline", "name": "before-upgrade", "proxy_id": "dev", "pages": ["/", "/about", "/contact"] } } ``` ## Configuration ### Global Config (~/.agnt/config.json) ```json { "snapshot": { "storage_path": "~/.agnt/baselines", "diff_threshold": 0.01, "max_baselines": 10, "claude_api_key": "sk-...", "default_viewports": { "mobile": { "width": 375, "height": 667 }, "tablet": { "width": 768, "height": 1024 }, "desktop": { "width": 1920, "height": 1080 } }, "auto_git_baseline": false, "ci_mode": false } } ``` ### Project Config (.agnt.yml) ```yaml snapshot: pages: - url: / name: home - url: /login name: auth - url: /dashboard name: dashboard ignore: - selector: .timestamp reason: Dynamic content - selector: .user-avatar reason: User-specific git_hooks: pre_commit: compare post_checkout: baseline ``` ## Success Metrics - ✅ Catches 90%+ of visual regressions - ✅ <5 second capture time per page - ✅ <10 second analysis time with Claude - ✅ <5% false positive rate - ✅ Zero configuration for basic usage - ✅ Works in CI/CD pipelines ## Notes - Use existing html2canvas for screenshots (already in codebase) - Leverage existing proxy infrastructure - Store baselines locally (no cloud dependency for MVP) - Claude vision API calls are optional (can do pixel diff only) - Compatible with existing agnt workflow

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/standardbeagle/agnt'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

VISUAL_REGRESSION_SPEC.md•10.4 KiB