mk-qa-master

Overview Schema Related Servers Score Discussions

get_failure_details

Retrieve root-cause analysis for failed tests, including traces, screenshots, and videos, to diagnose why each test broke.

Instructions

Extract full root-cause-analysis materials for every failed test in the most recent run.

Behavior:

Reads report.json, filters tests where outcome == 「failed」
pytest: parses Playwright trace.zip → extracts real API call sequence (Frame., Page., Locator., ElementHandle. events) as steps[]
Maestro: parses flow YAML for takeScreenshot: directives → resolves .png at PROJECT_ROOT root
Best-effort resolves screenshot / trace.zip / video / recording paths from --output / --debug-output artifact directories Returns: list[{nodeid, title, message, duration, steps[], screenshot, trace, video}]

When to use:

run_tests just reported failed > 0 → drill into each case
User asks 「why did it fail / show me the trace / what broke」
Filing a JIRA bug → use the artifact paths to attach screenshot+trace
Comparing failure signatures across runs (pair with get_test_history)

When NOT to use:

Want the summary count only → use get_test_report (lighter)
No tests have been run yet → returns [{error: 「找不到報告」}]
Want details for PASSING tests too → not supported here; the HTML reporter renders those via a different path

Edge cases:

test_id substring matches nothing → empty list, no error
screenshot/trace/video missing on disk → those fields are null but the entry stays
Retry-recovered flake (was failed, now passed) → not listed here; surfaces in summary.flaky_in_run instead

Input Schema

TableJSON Schema

Name	Required	Description	Default
`test_id`	No	選填，僅回傳 nodeid 含此關鍵字的 case（substring match，不分大小寫）。省略則回傳全部失敗 case。常用模式：先全部抓→看到特定模式後再用 test_id 收斂。

Tool Definition Quality

A4.9/5.0

Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses reading report.json, filtering by outcome, parsing behavior for pytest/Maestro, artifact resolution, error handling for missing files, and edge cases. Very transparent about behavior and limitations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (Behavior, Use cases, Non-use cases, Edge cases) and front-loaded. However, it is somewhat lengthy and could be slightly more concise, though every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (multi-framework, multiple artifact types), no annotations, and no output schema, the description covers behavior, parameter details, return format, and edge cases comprehensively. It is complete for an agent to understand and use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Only one parameter (test_id) with 100% schema description coverage. The description adds meaning: 'optional, substring match, case-insensitive, omit for all failures, common usage pattern', which goes beyond the schema and helps the agent use it correctly.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool extracts full root-cause-analysis materials for every failed test in the most recent run. It specifies the verb 'extract' and the resource 'full root-cause-analysis materials for failed tests', and differentiates from siblings like get_test_report and get_test_history.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly provides when to use (after run_tests with failures, user asking about failures, filing bugs, comparing across runs) and when NOT to use (want summary only, no tests run, want passing tests). Also covers edge cases, giving comprehensive guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Latest Blog Posts

Lightport: Open-Sourcing Glama's AI Gateway
By punkpeye on April 27, 2026.
open source
OpenAI
Tool Definition Quality Score (TDQS)
By punkpeye on April 3, 2026.
mcp
The Hackers Who Tracked My Sleep Cycle
By punkpeye on March 26, 2026.
security

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kao273183/mk-qa-master'

If you have feedback or need assistance with the MCP directory API, please join our Discord server