semantic_compare

Compares before and after screenshots to evaluate if UI changes match the intended design request, using semantic analysis beyond pixel diffing.

Instructions

AI-powered visual comparison. Captures before/after screenshots and provides a structured methodology for Claude to semantically evaluate whether UI changes match the intended design request. Goes beyond pixel diffing to understand intent.

Returns both screenshots as images, a pixel-level diff image, the difference percentage, and a detailed semantic methodology prompt. Claude's vision analyzes the screenshots to determine if the changes match what was requested, checking for regressions and unintended side effects.

This tool is FREE — it runs entirely within Claude Code using the user's existing plan. No API keys needed.
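
The diff image and difference percentage the tool returns reflect a conventional pixel-level comparison. As a rough sketch of that mechanism (an assumption for illustration, not uimax-mcp's actual implementation), a library such as pixelmatch can produce both; the file paths here are hypothetical:

```typescript
import fs from "node:fs";
import { PNG } from "pngjs";
import pixelmatch from "pixelmatch";

// Hypothetical paths; the real tool captures its own screenshots.
const before = PNG.sync.read(fs.readFileSync("before.png"));
const after = PNG.sync.read(fs.readFileSync("after.png"));
const { width, height } = before;
const diff = new PNG({ width, height });

// pixelmatch fills diff.data with a visual diff image and returns
// the number of mismatched pixels.
const mismatched = pixelmatch(before.data, after.data, diff.data, width, height, {
  threshold: 0.1,
});

fs.writeFileSync("diff.png", PNG.sync.write(diff));
console.log(`Difference: ${((mismatched / (width * height)) * 100).toFixed(2)}%`);
```

The semantic evaluation then happens on top of this: the screenshots and the methodology prompt are handed to Claude's vision rather than judged by the pixel count alone.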

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| urlBefore | Yes | URL of the 'before' state (e.g., http://localhost:3000) | |
| urlAfter | Yes | URL of the 'after' state (e.g., http://localhost:3001) | |
| changeDescription | Yes | What was the intended change? (e.g., 'Changed the hero section background to a gradient and increased heading font size') | |
| width | No | Viewport width in pixels | |
| height | No | Viewport height in pixels | |
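
Given this schema, a client call passes the URLs and the intended change as tool arguments. The sketch below uses the official TypeScript MCP SDK; the server launch command, URLs, and viewport values are assumptions for illustration, not values prescribed by the server:

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Hypothetical launch command for the server.
const transport = new StdioClientTransport({ command: "npx", args: ["-y", "uimax-mcp"] });
const client = new Client({ name: "example-client", version: "1.0.0" });
await client.connect(transport);

const result = await client.callTool({
  name: "semantic_compare",
  arguments: {
    urlBefore: "http://localhost:3000",
    urlAfter: "http://localhost:3001",
    changeDescription:
      "Changed the hero section background to a gradient and increased heading font size",
    width: 1280, // optional viewport overrides
    height: 800,
  },
});
console.log(result.content); // screenshots, diff image, and the methodology prompt
```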
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Despite having no structured annotations, the description fully discloses that the tool is free, runs within Claude Code, requires no API keys, and returns both screenshots, a diff image, a difference percentage, and a semantic methodology prompt. It also explains the use of Claude's vision. It omits details on error handling and prerequisites but covers the key behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two short paragraphs: the first explains purpose and differentiation, the second details return values and pricing. Every sentence adds value, with no redundancy, tautology, or fluff, and the description is front-loaded with purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description explains return values, methodology, and cost model, which is good for a tool with no output schema. It lacks details on error handling (e.g., invalid URLs) or prerequisites (e.g., running servers), which are minor gaps. Overall, it provides sufficient context for usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents each parameter's meaning. The tool description adds context for the overall workflow (e.g., that changeDescription states the intended change) but does not enhance individual parameter semantics beyond the schema, so the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool performs 'AI-powered visual comparison' for semantic evaluation of UI changes, differentiating it from pixel diffing. It establishes a specific verb ('compare', implicitly) and resource (before/after screenshots with a change description). Among siblings like compare_screenshots, compare_sites, and compare_to_baseline, it uniquely focuses on semantic intent.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description clearly implies when to use the tool ('to understand intent', to 'determine if the changes match what was requested') but does not explicitly say when not to use it or name alternatives. Even so, the context is clear enough for an agent to choose this tool over its pixel-based siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP directory API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/prembobby39-gif/uimax-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.