quality.evaluate

Read-only · Idempotent

Evaluate web design quality by scoring originality, craftsmanship, and contextuality. Detects AI cliches, audits accessibility compliance, tests responsive layouts, and provides improvement recommendations.

Instructions

Evaluate web design quality on 3 axes (originality, craftsmanship, contextuality) with AI cliche detection

Input Schema

All parameters are optional.

pageId: WebPage ID (UUID, from DB)
html: HTML content (passed directly, max 10MB)
weights: Axis weights (must sum to 1.0)
targetIndustry: Target industry (e.g. healthcare, finance, technology)
targetAudience: Target audience (e.g. enterprise, consumer, professionals)
includeRecommendations: Include improvement recommendations (default: true)
strict: Strict mode with stricter AI cliche detection (default: false)
patternComparison: Pattern comparison options for pattern-driven evaluation (v0.1.0)
context: Evaluation context (v0.1.0)
use_playwright: Use Playwright for runtime aXe accessibility testing (default: false, which uses JSDOM)
responsive_evaluation: Responsive quality evaluation using Playwright (v0.1.0). Measures touch targets, readability, overflow, and responsive images across viewports.
summary: Lightweight mode that returns a summary only, excluding detailed info (v0.1.0, MCP-RESP-01; default: true). When true, output lists are truncated: recommendations and contextualRecommendations to 3 items each, patternAnalysis arrays to 3, axeAccessibility.violations to 5, and clicheDetection.patterns to 3. Set to false for full details.
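As a sketch of how an agent might assemble arguments for this tool, the helper below enforces the constraints stated in the schema. Two points are assumptions not confirmed by the description: that exactly one of pageId or html must be supplied, and that weights maps the three axis names to floats summing to 1.0. The function name build_evaluate_args is hypothetical.

```python
def build_evaluate_args(html=None, page_id=None, weights=None, summary=True):
    """Assemble a quality.evaluate argument dict, checking documented constraints.

    Assumption: the tool expects exactly one of `html` or `pageId`
    (the schema marks both as optional but gives no interaction rule).
    """
    if (html is None) == (page_id is None):
        raise ValueError("provide exactly one of html or pageId")
    # The schema caps direct HTML input at 10MB.
    if html is not None and len(html.encode("utf-8")) > 10 * 1024 * 1024:
        raise ValueError("html exceeds the documented 10MB limit")

    args = {"summary": summary}
    if html is not None:
        args["html"] = html
    else:
        args["pageId"] = page_id

    if weights is not None:
        # The schema says axis weights must sum to 1.0.
        if abs(sum(weights.values()) - 1.0) > 1e-9:
            raise ValueError("axis weights must sum to 1.0")
        args["weights"] = weights
    return args

args = build_evaluate_args(
    html="<html><body><h1>Hello</h1></body></html>",
    weights={"originality": 0.4, "craftsmanship": 0.4, "contextuality": 0.2},
)
```

Keeping summary at its default of true gives the truncated output; an agent that needs the full violation and recommendation lists would pass summary=False.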
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and idempotentHint=true, establishing the safety profile. The description adds context about the evaluation methodology (three axes, cliche detection) but omits significant behavioral details like Playwright vs JSDOM execution, responsive evaluation capabilities, pattern comparison features, and the summary mode's truncation behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The single sentence is information-dense with no waste, clearly listing the three evaluation axes and the cliche-detection feature. However, given the tool's complexity (12 parameters, nested objects, multiple evaluation modes), the description may be under-sized rather than appropriately concise.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with 12 parameters including responsive evaluation, accessibility testing via Playwright, pattern comparison, and contextual analysis capabilities, the description is incomplete. It mentions only the core 3-axis evaluation and cliche detection, missing major functional areas that would help an agent understand the full scope.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the baseline is 3. The description implicitly references the 'weights' parameter via the three axes mention but provides no additional parameter guidance, syntax examples, or explanations beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool evaluates web design quality using three specific axes (originality, craftsmanship, contextuality) and mentions AI cliche detection. However, it fails to distinguish this from siblings like accessibility.audit, design.compare, or page.analyze, which could confuse tool selection.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like accessibility.audit or design.compare, nor does it clarify input requirements (e.g., when to use pageId vs html) or prerequisites. No alternatives or exclusions are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
