quality.evaluate

Read-only · Idempotent

Evaluate web design quality by scoring originality, craftsmanship, and contextuality. Detects AI cliches, audits accessibility compliance, tests responsive layouts, and provides improvement recommendations.

Instructions

Evaluate web design quality on 3 axes (originality, craftsmanship, contextuality) with AI cliche detection

Input Schema

All parameters are optional.

pageId: WebPage ID (UUID, from DB)
html: HTML content (passed directly, max 10MB)
weights: Axis weights (must sum to 1.0)
targetIndustry: Target industry (e.g. healthcare, finance, technology)
targetAudience: Target audience (e.g. enterprise, consumer, professionals)
includeRecommendations: Include improvement recommendations (default: true)
strict: Strict mode with stricter AI cliche detection (default: false)
patternComparison: Pattern comparison options for pattern-driven evaluation (v0.1.0)
context: Evaluation context (v0.1.0)
use_playwright: Use Playwright for runtime aXe accessibility testing (default: false, which uses JSDOM)
responsive_evaluation: Responsive quality evaluation using Playwright (v0.1.0). Measures touch targets, readability, overflow, and responsive images across viewports.
summary: Lightweight mode that returns a summary only, excluding detailed info (v0.1.0, MCP-RESP-01; default: true). When true, output lists are truncated: recommendations and contextualRecommendations to 3 items each, patternAnalysis arrays to 3, axeAccessibility.violations to 5, and clicheDetection.patterns to 3. Set to false for full details.
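As a sketch of how an agent might assemble arguments for this tool, the helper below enforces the constraints stated in the schema. Two points are assumptions not confirmed by the description: that exactly one of pageId or html must be supplied, and that weights maps the three axis names to floats summing to 1.0. The function name build_evaluate_args is hypothetical.

```python
def build_evaluate_args(html=None, page_id=None, weights=None, summary=True):
    """Assemble a quality.evaluate argument dict, checking documented constraints.

    Assumption: the tool expects exactly one of `html` or `pageId`
    (the schema marks both as optional but gives no interaction rule).
    """
    if (html is None) == (page_id is None):
        raise ValueError("provide exactly one of html or pageId")
    # The schema caps direct HTML input at 10MB.
    if html is not None and len(html.encode("utf-8")) > 10 * 1024 * 1024:
        raise ValueError("html exceeds the documented 10MB limit")

    args = {"summary": summary}
    if html is not None:
        args["html"] = html
    else:
        args["pageId"] = page_id

    if weights is not None:
        # The schema says axis weights must sum to 1.0.
        if abs(sum(weights.values()) - 1.0) > 1e-9:
            raise ValueError("axis weights must sum to 1.0")
        args["weights"] = weights
    return args

args = build_evaluate_args(
    html="<html><body><h1>Hello</h1></body></html>",
    weights={"originality": 0.4, "craftsmanship": 0.4, "contextuality": 0.2},
)
```

Keeping summary at its default of true gives the truncated output; an agent that needs the full violation and recommendation lists would pass summary=False.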
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and idempotentHint=true, establishing the safety profile. The description adds context about the evaluation methodology (three axes, cliche detection) but omits significant behavioral details like Playwright vs JSDOM execution, responsive evaluation capabilities, pattern comparison features, and the summary mode's truncation behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The single sentence is information-dense with no waste, clearly listing the three evaluation axes and the cliche-detection feature. However, given the tool's complexity (12 parameters, nested objects, multiple evaluation modes), the description may be under-sized rather than appropriately concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with 12 parameters including responsive evaluation, accessibility testing via Playwright, pattern comparison, and contextual analysis capabilities, the description is incomplete. It mentions only the core 3-axis evaluation and cliche detection, missing major functional areas that would help an agent understand the full scope.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the baseline is 3. The description implicitly references the 'weights' parameter via the three axes mention but provides no additional parameter guidance, syntax examples, or explanations beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool evaluates web design quality using three specific axes (originality, craftsmanship, contextuality) and mentions AI cliche detection. However, it fails to distinguish this from siblings like accessibility.audit, design.compare, or page.analyze, which could confuse tool selection.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like accessibility.audit or design.compare, nor does it clarify input requirements (e.g., when to use pageId vs html) or prerequisites. No alternatives or exclusions are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
