Glama

verify_plan

Check each critical point in a QA plan against evidence to produce a structured checklist with per-point satisfaction and an overall status of passed, incomplete, or failed.

Instructions

v0.9.1 (extended in v0.9.2 with auto-discovery): Walk a plan's critical points (CPs) and check each against evidence. Pairs with qa_plan: it must be called with the plan_id returned by a prior qa_plan call. Returns a structured checklist with per-CP satisfaction plus an overall status (passed / incomplete / failed).

Matching rule: a CP is satisfied when its verification_hint appears (case-insensitively, as a substring) in any evidence item's stringified form. Evidence items may be strings, dicts, or nested structures — the matcher flattens them.
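The matching rule can be illustrated with a minimal Python sketch. The helper names `flatten` and `find_match` are hypothetical, and joining flattened values with spaces is an assumption about how the "stringified form" is built; the real matcher may stringify differently.

```python
from typing import Any, Iterable, Iterator, Optional


def flatten(item: Any) -> Iterator[str]:
    """Yield every key and leaf value of a nested evidence item as a string."""
    if isinstance(item, dict):
        for key, value in item.items():
            yield str(key)
            yield from flatten(value)
    elif isinstance(item, (list, tuple, set)):
        for value in item:
            yield from flatten(value)
    else:
        yield str(item)


def find_match(verification_hint: str, evidence: Iterable[Any]) -> Optional[Any]:
    """Return the first evidence item whose flattened text contains the hint."""
    needle = verification_hint.lower()
    if not needle:
        # Assumption: an empty hint never counts as satisfied.
        return None
    for item in evidence:
        haystack = " ".join(flatten(item)).lower()
        if needle in haystack:
            return item
    return None


evidence = [
    "login page loaded",
    {"nodeid": "tests/test_auth.py::test_LOGIN_ok", "outcome": "passed"},
]
matched = find_match("test_login_ok", evidence)  # matches the dict, ignoring case
```

Because the check is a plain substring test, a short or generic verification_hint (for example "ok") will match many unrelated evidence items, so specific hints give more meaningful results.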

v0.9.2, auto_discover mode: set auto_discover: true and the verifier reads the project's pytest-json-report at <QA_PROJECT_ROOT>/report.json (or the path given by MK_QA_REPORT_PATH or the report_path arg) and adds its tests list to the evidence stream. This is best-effort: a missing or malformed report is silently skipped and is NOT a hard error. The response's evidence_sources field reports what was used.
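The best-effort behaviour of auto_discover can be sketched as follows. This is an illustration under stated assumptions, not the server's code: `load_report_tests` is a hypothetical name, and the only fact relied on is that a pytest-json-report file is a JSON object with a top-level `tests` array.

```python
import json
from pathlib import Path
from typing import Any, List


def load_report_tests(report_path: str) -> List[Any]:
    """Return the report's `tests` list, or [] if the report is missing or malformed."""
    try:
        data = json.loads(Path(report_path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing file, unreadable file, or invalid JSON: skip silently.
        return []
    tests = data.get("tests") if isinstance(data, dict) else None
    return tests if isinstance(tests, list) else []
```

Returning an empty list instead of raising mirrors the "silently skipped" wording: the caller simply ends up with no auto-discovered evidence.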

status semantics:

  • 'passed': every CP satisfied

  • 'incomplete': some satisfied, some not

  • 'failed': zero CPs satisfied (or empty evidence)

Even if the host claims 'all good', verify_plan returns 'incomplete' when any CP is unsatisfied. That's the design — ground truth wins over capability claims.
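The status semantics above reduce to a small decision function. The sketch below is illustrative; `overall_status` is a hypothetical name, and treating a plan with no CPs at all as 'failed' is an assumption, since the description does not cover that case.

```python
from typing import List


def overall_status(satisfied_flags: List[bool]) -> str:
    """Map per-CP satisfaction flags to 'passed', 'incomplete', or 'failed'."""
    if satisfied_flags and all(satisfied_flags):
        return "passed"
    if any(satisfied_flags):
        return "incomplete"
    # Zero CPs satisfied. Empty evidence lands here too, because nothing can match.
    return "failed"
```

Note that the function takes only the match results as input: nothing the host asserts about the run can turn an unsatisfied CP into a pass.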

v0.9.3 — When persistence is enabled (see qa_plan), an in-memory cache miss transparently falls back to disk. The response's plan_source field reports where the plan came from: 'memory' (cache hit) or 'disk' (loaded from <plans_dir>/<plan_id>.json after a restart / eviction).
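A minimal sketch of the memory-then-disk lookup, assuming the in-memory cache is a plain dict and that `plans_dir` is `None` when persistence is disabled. Both of those, and the function name `load_plan`, are assumptions made for illustration.

```python
import json
from pathlib import Path
from typing import Any, Dict, Optional, Tuple


def load_plan(
    plan_id: str,
    cache: Dict[str, Any],
    plans_dir: Optional[str],
) -> Optional[Tuple[Any, str]]:
    """Return (plan, plan_source), or None when the plan cannot be found."""
    if plan_id in cache:
        return cache[plan_id], "memory"
    if plans_dir is None:
        # Assumption: persistence disabled, so there is no disk fallback.
        return None
    path = Path(plans_dir) / f"{plan_id}.json"
    try:
        plan = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None
    return plan, "disk"
```

A `None` result here would correspond to the plan_not_found error shape.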

Returns: {plan_id, task, kind, status, checklist[{id, description, verification_hint, satisfied, matched_evidence}], unmet[], summary{total, satisfied, unsatisfied}, evidence_sources{explicit_count, autodiscovered, autodiscovered_count, report_path}, plan_source ('memory' or 'disk'), verified_at}.
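For readers who prefer a typed view, the return shape can be written as Python `TypedDict`s. The field names come from the description above; every field type below is an assumption (for example, that `verified_at` is a timestamp string and `autodiscovered` is a boolean), as are the class names.

```python
from typing import Any, List, Literal, Optional, TypedDict


class ChecklistItem(TypedDict):
    id: str
    description: str
    verification_hint: str
    satisfied: bool
    matched_evidence: Any  # assumed: the evidence item that matched, if any


class Summary(TypedDict):
    total: int
    satisfied: int
    unsatisfied: int


class EvidenceSources(TypedDict):
    explicit_count: int
    autodiscovered: bool  # assumed type
    autodiscovered_count: int
    report_path: Optional[str]  # assumed to be absent/None without auto_discover


class VerifyPlanResponse(TypedDict):
    plan_id: str
    task: str
    kind: str
    status: Literal["passed", "incomplete", "failed"]
    checklist: List[ChecklistItem]
    unmet: List[Any]  # element shape is not specified in the description
    summary: Summary
    evidence_sources: EvidenceSources
    plan_source: Literal["memory", "disk"]
    verified_at: str  # assumed: timestamp string
```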

Error shapes: no_plan_id / plan_not_found / no_evidence (only when both explicit evidence AND auto_discover are omitted) / bad_evidence.
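One way to picture when each error shape arises is the validation sketch below. It is a guess at the ordering, not the server's logic: the function name `validate_args`, the order of checks, and the rule that `bad_evidence` means "evidence is not a list" are all assumptions.

```python
from typing import Any, Collection, Mapping, Optional


def validate_args(
    args: Mapping[str, Any],
    known_plan_ids: Collection[str],
) -> Optional[str]:
    """Return an error code from the list above, or None if the call looks valid."""
    plan_id = args.get("plan_id")
    if not plan_id:
        return "no_plan_id"
    if plan_id not in known_plan_ids:
        return "plan_not_found"
    evidence = args.get("evidence")
    if evidence is None and not args.get("auto_discover"):
        # Only an error when BOTH explicit evidence and auto_discover are omitted.
        return "no_evidence"
    if evidence is not None and not isinstance(evidence, list):
        # Assumption: bad_evidence signals a non-list evidence argument.
        return "bad_evidence"
    return None
```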

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| plan_id | Yes | Required. The plan_id returned by qa_plan. | |
| evidence | No | Optional when `auto_discover: true`. Each item is searched for each CP's verification_hint. Pass structured payloads: test result rows from `get_test_report`, scan findings from `run_api_security_scan`, log lines, screenshot paths, etc. | |
| auto_discover | No | v0.9.2: When true, read the project's pytest-json-report and add its `tests` array to the evidence stream. Useful for verifying a CP set against the most recent test run without manually copying report rows into the call. | |
| report_path | No | v0.9.2: Override the report.json location when auto_discover is true. Defaults to `MK_QA_REPORT_PATH` env, then `<QA_PROJECT_ROOT>/report.json`, then `./report.json`. | |
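The fallback order described for report_path can be sketched as a small resolver. `resolve_report_path` is a hypothetical name, and reading `QA_PROJECT_ROOT` from the environment is an assumption: the description writes `<QA_PROJECT_ROOT>` as a placeholder without saying where the value comes from.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_report_path(
    report_path: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Pick the report.json location using the documented fallback order."""
    env = os.environ if env is None else env
    if report_path:
        # 1. Explicit report_path argument overrides everything.
        return Path(report_path)
    if env.get("MK_QA_REPORT_PATH"):
        # 2. MK_QA_REPORT_PATH environment variable.
        return Path(env["MK_QA_REPORT_PATH"])
    if env.get("QA_PROJECT_ROOT"):
        # 3. <QA_PROJECT_ROOT>/report.json (assumed to be an environment variable).
        return Path(env["QA_PROJECT_ROOT"]) / "report.json"
    # 4. ./report.json relative to the current working directory.
    return Path("report.json")
```

Passing `env` explicitly keeps the function easy to test without mutating the real process environment.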
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully discloses behavioral traits: matching rule (case-insensitive substring), flattening of evidence, auto_discover behavior (best-effort, silent skip), status semantics, persistence cache, and error shapes. Comprehensive.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with layered detail: main purpose first, then matching rule, then version updates. Every sentence provides necessary information without verbosity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Completely describes return shape with field details, error shapes, and behavioral nuances. Without an output schema, the description fully compensates.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description adds significant context: matching rule for evidence, auto_discover details, report_path fallback logic. Adds value beyond schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it verifies a plan's critical points against evidence, names the sibling `qa_plan` for pairing, and specifies the return type. It distinguishes itself from other tools by focusing on verification of plans.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly mandates calling with the `plan_id` from a prior `qa_plan` call and describes auto_discover usage. It does not explicitly state when not to use the tool, but the pairing guidance is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kao273183/mk-qa-master'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.