Server Configuration

Describes the environment variables required to run the server.

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| SPEC_SOURCE | Yes | Adapter to use (markdown_local / github_issues / linear / jira / notion / figma). | |
| SPEC_PROJECT_ROOT | Yes | Root path; the traceability index is stored at SPEC_PROJECT_ROOT/.mk-spec-master/index.json. | |
| SPEC_PROJECT_KEY | No | Optional. Team key (Linear), project key (JIRA), database ID (Notion), or file key (Figma). | |
| GITHUB_TOKEN | No | Optional. GitHub token for the github_issues adapter (falls back to the gh CLI if unset). | |
| JIRA_BASE_URL | No | Required if SPEC_SOURCE=jira (e.g., https://your-domain.atlassian.net). | |
| JIRA_EMAIL | No | Required if SPEC_SOURCE=jira. | |
| JIRA_API_TOKEN | No | Required if SPEC_SOURCE=jira. | |
| LINEAR_API_KEY | No | Required if SPEC_SOURCE=linear. | |
| NOTION_TOKEN | No | Required if SPEC_SOURCE=notion. | |
| FIGMA_TOKEN | No | Required if SPEC_SOURCE=figma. | |
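
The Required column is conditional: most variables are only needed when SPEC_SOURCE selects their adapter. A minimal startup check expressing that rule could look like the sketch below; the mapping is read straight off the table, and the code is an illustration, not the server's actual validation logic.

```python
import os

# Read off the table above: each adapter needs its own credentials.
# Illustrative only; the real server's checks may differ.
REQUIRED_BY_SOURCE = {
    "jira": ["JIRA_BASE_URL", "JIRA_EMAIL", "JIRA_API_TOKEN"],
    "linear": ["LINEAR_API_KEY"],
    "notion": ["NOTION_TOKEN"],
    "figma": ["FIGMA_TOKEN"],
    "github_issues": [],  # GITHUB_TOKEN is optional (falls back to the gh CLI)
    "markdown_local": [],
}

source = os.environ["SPEC_SOURCE"]        # always required
root = os.environ["SPEC_PROJECT_ROOT"]    # always required
missing = [v for v in REQUIRED_BY_SOURCE.get(source, []) if not os.environ.get(v)]
if missing:
    raise SystemExit(f"SPEC_SOURCE={source} also needs: {', '.join(missing)}")
```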

Capabilities

Features and capabilities supported by this server

| Capability | Details |
| --- | --- |
| tools | `{"listChanged": false}` |
| resources | `{"subscribe": false, "listChanged": false}` |
| experimental | `{}` |

Tools

Functions exposed to the LLM to take actions

get_runner_info

Returns the test runner currently selected by the QA_RUNNER environment variable (one of pytest / jest / cypress / go / maestro), plus the full list of runners built into the server. Recommended as the first call of every session: the AI uses it to decide whether later steps should produce Playwright .py or Maestro .yaml files and whether a headed browser is needed, which avoids grabbing the wrong template later. Also useful for confirming the project environment is set up correctly: QA_PROJECT_ROOT points to the right place and QA_RUNNER is not misspelled. Return shape: {active: 'pytest', available: ['pytest', 'jest', ...]}.
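
For illustration, this is how a client might branch on the documented return shape. The dict literal below stands in for the tool's result; the actual call goes through whatever MCP client you use.

```python
# Stand-in for get_runner_info's documented return shape.
info = {"active": "pytest", "available": ["pytest", "jest", "cypress", "go", "maestro"]}

# Pick the right artifact template for later generate_test calls.
if info["active"] == "maestro":
    template = "flow.yaml"   # Maestro flows are YAML
else:
    template = "test_*.py"   # Playwright/pytest-style skeletons
```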

list_tests

Lists every runnable test in the project under test via each runner's native collection mechanism: pytest uses pytest --collect-only, Jest uses npx jest --listTests, Cypress globs cypress/e2e/*.cy.*, Go uses go test -list .*, and Maestro scans recursively for *.yaml files. Returns a line-per-test list of nodeids / file names. Usage: call before run_tests to confirm collection is not missing anything, and before generate_test to avoid duplicating an existing case.
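
For reference, the per-runner collection mechanisms above gathered into one dispatch table (paraphrased from the description; the Cypress and Maestro entries are file scans rather than CLI invocations):

```python
# How list_tests collects tests per runner, per the description above.
COLLECTION = {
    "pytest": "pytest --collect-only",
    "jest": "npx jest --listTests",
    "cypress": "glob: cypress/e2e/*.cy.*",
    "go": "go test -list .*",
    "maestro": "recursive scan for *.yaml",
}
```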

run_tests

Execute the test suite under the active QA_RUNNER and produce a structured report. The single most-called tool: invoke it whenever a user says "run / test / check / verify / execute" (跑 / 驗證 / 執行), after generate_test (to verify the new test), or after a fix (to confirm the bug is gone).

Behavior:

  • Invokes the runner's native CLI under QA_PROJECT_ROOT — pytest with --screenshot=on / --tracing=on / --video=retain-on-failure, or npx jest --json, npx cypress run --reporter json, go test -json, maestro test --format junit

  • Optional filter narrows the scope: pytest -k expr, jest -t pattern, cypress --spec glob, go -run regex, maestro flow-name substring

  • Writes report.json (pytest-json-report shape, runner-agnostic) + JUnit XML

  • Snapshots the run into history/ and auto-triggers optimizer.write_plan() → optimization-plan.md is refreshed

  • Maestro: auto-retries flows that failed on the first attempt (MAESTRO_RETRY=true) and surfaces a flaky_in_run count

Returns: {exit_code, raw_exit_code, stdout_tail, stderr_tail, retry_enabled, flaky_in_run, ...}

When to use:

  • After writing a new test → verify it actually passes

  • Smoke before a release

  • Whenever the user prompt contains a run/test verb

When NOT to use:

  • Inspecting last results without re-running → use get_test_report (cheaper)

  • Re-running only failed cases → use run_failed (way faster)

  • Enumerating which tests exist → use list_tests

Edge cases:

  • No tests match the filter → exit_code != 0 with "no tests ran" in stderr_tail

  • QA_TIMEOUT_SECONDS exceeded → exit_code 124 + [TIMEOUT…] tag in stderr_tail

  • filter starting with - or containing .. → blocked by the security guardrail, returns {error: …} (a minimal sketch of this check follows)
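
A minimal sketch of that guardrail, assuming it is a plain string check (the server's real implementation may be stricter):

```python
def is_safe_filter(expr: str) -> bool:
    """Reject filters that could inject CLI flags or traverse paths."""
    return not expr.startswith("-") and ".." not in expr

assert is_safe_filter("login and not slow")
assert not is_safe_filter("--help")      # would be parsed as an extra flag
assert not is_safe_filter("../other")    # path traversal
```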

run_failed

Re-runs only the tests that failed last time, which is much faster than the whole suite and ideal for verify-after-fix iteration. pytest uses --lf (last-failed); Jest uses --onlyFailures; Cypress parses failures[] from the previous report.json to find the specs to re-run; Go assembles the failed Test names into a regex for -run; Maestro maps nodeids back to their .yaml flows and re-runs those. Requires a prior run_tests (otherwise report.json does not exist). The return shape matches run_tests, so inspect results the same way via get_test_report / get_failure_details.

get_test_report

Reads the report.json left by the last run_tests and returns a lightweight summary: total / passed / failed / skipped / flaky_in_run (the number of tests rescued by auto-retry) / duration (seconds). Far cheaper than re-running the suite, so it suits repeated status checks in the middle of a longer sequence of operations. If nothing has run yet it returns {error: 找不到報告,請先執行 run_tests} (the error string means "report not found, run run_tests first"). If the summary shows failed > 0, follow up with get_failure_details for the error specifics.
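
An illustrative summary (field names from the description above; the numbers are invented):

```python
summary = {
    "total": 42, "passed": 39, "failed": 2, "skipped": 1,
    "flaky_in_run": 1,   # rescued by auto-retry
    "duration": 87.4,    # seconds
}
if summary["failed"] > 0:
    pass  # drill in with get_failure_details
```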

get_failure_details

Extract full root-cause-analysis materials for every failed test in the most recent run.

Behavior:

  • Reads report.json and filters tests where outcome == "failed"

  • pytest: parses Playwright trace.zip → extracts real API call sequence (Frame., Page., Locator., ElementHandle. events) as steps[]

  • Maestro: parses the flow YAML for takeScreenshot: directives → resolves the .png files at the PROJECT_ROOT root

  • Best-effort resolves screenshot / trace.zip / video / recording paths from the --output / --debug-output artifact directories

Returns: list[{nodeid, title, message, duration, steps[], screenshot, trace, video}]

When to use:

  • run_tests just reported failed > 0 → drill into each case

  • User asks "why did it fail / show me the trace / what broke"

  • Filing a JIRA bug → use the artifact paths to attach screenshot+trace

  • Comparing failure signatures across runs (pair with get_test_history)

When NOT to use:

  • Want the summary count only → use get_test_report (lighter)

  • No tests have been run yet → returns [{error: "找不到報告"}] (the error string means "report not found")

  • Want details for PASSING tests too → not supported here; the HTML reporter renders those via a different path

Edge cases:

  • test_id substring matches nothing → empty list, no error

  • screenshot/trace/video missing on disk → those fields are null but the entry stays

  • Retry-recovered flake (was failed, now passed) → not listed here; surfaces in summary.flaky_in_run instead
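
An illustrative entry matching the Returns shape above (the nodeid, message, and artifact paths are invented):

```python
failure = {
    "nodeid": "tests/test_login.py::test_invalid_password",
    "title": "test_invalid_password",
    "message": "AssertionError: expected error banner to be visible",
    "duration": 3.2,
    "steps": ["Page.goto", "Locator.fill", "Locator.click", "Page.wait_for_selector"],
    "screenshot": "test-results/test-invalid-password/test-failed-1.png",
    "trace": "test-results/test-invalid-password/trace.zip",
    "video": None,  # artifact missing on disk -> field is null, entry stays
}
```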

generate_test

Generates a pytest-playwright test skeleton. Recommended flow: call analyze_url first to get candidate_tcs, then call generate_test once per TC you want to cover, passing the whole candidate_tc string as description. That string is written verbatim as the test function's docstring, and the HTML report displays it as the case name. If url + module (from analyze_url's modules[]) are provided, the skeleton is pre-filled into an executable version using the module's selectors. To handle an entire URL in one shot without orchestrating this yourself, use auto_generate_tests instead.
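
For illustration, a skeleton produced this way might look like the following. The URL, selectors, and candidate_tc text are invented; the point is that the description lands in the docstring, which the HTML report uses as the case name.

```python
from playwright.sync_api import Page, expect

def test_login_required_fields(page: Page):
    """Login form: submitting with required fields empty shows a validation error."""
    # ^ the candidate_tc passed as `description` becomes the docstring / case name
    page.goto("https://example.com/login")
    page.locator("button[type=submit]").click()
    expect(page.locator(".error")).to_be_visible()
```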

codegen

Launch interactive test recording for the active runner. Useful as a baseline-builder before refining with generate_test.

Behavior:

  • pytest-playwright: spawns playwright codegen -o <output> <url> — a real Chromium window opens, you click / type / navigate, Playwright transcribes every action into runnable pytest code, output is saved to PROJECT_ROOT/ on browser close

  • Maestro: returns a human-readable hint string pointing at maestro studio (no shell-able codegen exists for it)

  • jest / cypress / go runners: the same Maestro-style fallback hint

Returns: a string with the saved path or the manual-record hint.

When to use:

  • Building a baseline happy-path test interactively (you click, it transcribes)

  • Site has complex auth / JS state you'd rather not script by hand

  • Quick prototype before refining with generate_test

  • User says "record / 錄製 / use codegen / 紀錄操作"

When NOT to use:

  • Headless CI / container environments → can't open Chromium

  • Need structured, AI-driven test generation from analysis → use generate_test or auto_generate_tests instead

  • One-shot per-module test coverage → use auto_generate_tests

  • Mobile UI flows → returns a hint anyway, consider analyze_screen + generate_test instead

Edge cases:

  • output contains .. or is absolute → blocked by security guardrail

  • Chromium not installed → playwright codegen fails; user sees the playwright install hint in stderr

generate_html_report

Renders the latest run_tests results into a single self-contained HTML file: base64-inlined screenshots, embedded step lists, a history sparkline, a collapsed Passed section, and expanded Failed cards. No external CSS/JS dependencies, so it can be emailed, dropped onto a static host, or pasted into Slack as-is. Default output is PROJECT_ROOT/report.html. Implemented in reporters/html.py, following the same design as sample_report.html.

get_test_history

Walks the test-results/history/*.json snapshots (archived automatically after every run_tests) and returns a run-by-run summary: timestamp / total / passed / failed / skipped / duration / pass_rate (0-100). Use it for flake analysis ("has this test been failing all week?"), speed-regression analysis ("is duration creeping up?"), and coverage trend charts. Returns the last 10 runs by default; limit is adjustable from 1 to 100. For actionable recommendations, follow up with get_optimization_plan, which already combines history + telemetry.

get_optimization_plan

Combines the history/ snapshots, telemetry tool-usage data, and the modules detected by analyze_url into a three-layer self-reinforcing analysis: (1) test suite quality: each test gets an outcomes string (like PFPFP) that yields a flake_score; failure error signatures are fingerprint-matched, three consecutive identical signatures escalate the test to broken, a duration regression beyond 1.5x flags slow_regression, and everything else is stable_passing (see the sketch below); (2) MCP usage patterns: top tools, repeated args, error rates, and common call chains (A→B co-occurrence); (3) AI test-generation effectiveness: whether tests written by generate_test show up in the next run, and whether modules detected by analyze_url map to test files (adoption rate vs. coverage gaps). Returns structured JSON and also writes PROJECT_ROOT/optimization-plan.md. A refresh is auto-triggered at the end of every run_tests, so this tool is for reading the latest result on demand.
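
A rough sketch of the layer-1 classification, under stated assumptions: the description gives the rules (three identical failure signatures in a row → broken, duration beyond 1.5x → slow_regression) but not the flake_score formula, so the transition-ratio scoring here is a guess.

```python
def classify(outcomes: str, sig_streak: int, dur_ratio: float) -> tuple[str, float]:
    """outcomes: per-run history like 'PFPFP' (P=pass, F=fail), newest last.
    sig_streak: consecutive runs sharing an identical error signature.
    dur_ratio: latest duration / baseline duration."""
    flips = sum(a != b for a, b in zip(outcomes, outcomes[1:]))
    flake_score = flips / max(len(outcomes) - 1, 1)   # assumed formula
    if sig_streak >= 3:                 # 3 identical signatures -> broken
        return "broken", flake_score
    if dur_ratio > 1.5:                 # duration regression beyond 1.5x
        return "slow_regression", flake_score
    return "stable_passing", flake_score
```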

analyze_url

Probe a live web page in headless Chromium and return a structured map of testable modules plus the API endpoints the page actually called. The web counterpart of analyze_screen.

Behavior:

  • page.goto(url) with DOMContentLoaded + 5s networkidle wait

  • DOM probe extracts five module kinds: form (with fields[] + required flags), nav (link lists), dialog (modal containers), section (labeled regions), cta (action buttons matching action keywords like 登入 / 送出 / Login / Submit)

  • Each module gets a candidate_tcs[] — domain-aware test case strings ready to paste into generate_test

  • Records every fetch/XHR the page issues, dedupes by (method, path), adds endpoint-specific candidate TCs (401, 404, 4xx, payload-too-large…)

  • Layout overflow scan flags visible elements whose content escapes the container by >2 px horizontally / >10 px vertically (跑版, i.e. broken layout / text-overflow)

Returns: {url, page_title, scanned_at, modules[], api_endpoints[], layout_warnings[]}

When to use:

  • User wants tests for a specific URL or page

  • Designing regression coverage from real user-facing behavior

  • Need backend API coverage hints (api_endpoints[] gives methods + paths)

  • Investigating layout bugs at the current viewport

  • Pair with generate_test(module=…) for one runnable test per module

When NOT to use:

  • Mobile apps (no DOM) → use analyze_screen

  • Want analysis + immediate test generation → use auto_generate_tests (one-shot version)

  • Looking for existing tests → use list_tests

  • Single-page testing prototype → use codegen instead

Edge cases:

  • URL unreachable / timeout → returns {error: "打開頁面失敗…", url} (the error string means "failed to open the page")

  • Page has 0 forms / 0 ctas → modules[] is empty but the call succeeds

  • Login-walled URL with no auth_cookie → analyzes the login page (less useful) — pass auth_cookie to reach post-login pages

  • SPA with delayed hydration → bump timeout_ms to 30000+
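
An illustrative slice of the return value (shape per the Returns line above; concrete values invented), showing how a module's candidate_tcs feed straight into generate_test:

```python
module = {
    "kind": "form",
    "fields": [
        {"name": "email", "required": True},
        {"name": "password", "required": True},
    ],
    "candidate_tcs": [
        "Login form: submit with required fields empty and expect validation errors",
        "Login form: submit valid credentials and expect navigation to the dashboard",
    ],
}
# Each string above is ready to pass to generate_test as `description`.
```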

analyze_screen

The mobile counterpart of analyze_url: runs a maestro hierarchy dump of the foreground app's view tree on the current iOS Simulator / Android Emulator / physical device / BlueStacks (via QA_ANDROID_HOST), then classifies it into three module kinds, each with candidate_tcs attached: form (input fields with hint_text), cta (enabled, text-bearing tappable elements), and tab_bar (2+ tabs aligned on the same y axis with a selected state). A built-in noise filter drops the iOS status bar and asset-name labels (bg_*, *_filled, pure digits, single ASCII characters, etc.) to keep the signal concentrated. Requires the Maestro CLI installed, a booted device, and the app in the foreground. If app_id is provided with launch_app=true, it runs launchApp first and then dumps.

init_qa_knowledge

Creates a starter qa-knowledge.md template at the root of the project under test (PROJECT_ROOT), with five H2 sections: Business Rules / Historical Bugs / Standard Assertion Text / User Journeys / Technical Constraints, each carrying a TODO prompt (see the sketch below). Idempotent: an existing file is never overwritten (unless overwrite=true). New users should call it once on their first MCP run. The file is later read by get_qa_context and passed into generate_test as business_context, so the AI writes tests with real business logic instead of generic monkey testing.
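
A rough sketch of what gets created, assuming English section titles; the five H2 sections come from the description, while the exact headings and TODO wording in the real template (and the invented hint text below) may differ.

```python
import os
from pathlib import Path

# Assumed template; section names per the description, TODO hints invented.
TEMPLATE = """## Business Rules
<!-- TODO: pricing tiers, permission matrix, validation rules -->

## Historical Bugs
<!-- TODO: regressions worth permanent coverage -->

## Standard Assertion Text
<!-- TODO: canonical error/success messages to assert on -->

## User Journeys
<!-- TODO: critical end-to-end paths -->

## Technical Constraints
<!-- TODO: browsers, devices, rate limits, test data -->
"""

path = Path(os.environ["QA_PROJECT_ROOT"]) / "qa-knowledge.md"
if not path.exists():   # idempotent: never clobber unless overwrite=true
    path.write_text(TEMPLATE, encoding="utf-8")
```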

get_qa_context

Reads the project's qa-knowledge.md (domain knowledge: business rules / historical bugs / standard assertion text / User Journeys, and so on), split by its ## H2 sections. Usage: call it first to fetch the whole file or a specific section, then pass the relevant parts to generate_test as business_context; the generated tests then carry business-aware comments rather than monkey testing. If the file does not exist, it falls back to built-in generic knowledge (the seven ISTQB principles plus equivalence partitioning, boundary values, decision tables, state transitions, and a mobile checklist), which works as a starting point; run init_qa_knowledge later to create a project-specific version.

auto_generate_tests

One-call delivery: internally runs analyze_url, then runs generate_test once per detected module using its candidate_tcs, writing a full set of pytest test skeletons into PROJECT_ROOT/tests/. Equivalent to an automated version of "analyze_url, then manually run generate_test N times, once per module", suited to "here is a URL, handle the rest" quick-coverage scenarios (see the sketch below). Each candidate_tc becomes the matching test function's docstring, which the HTML report displays as the case name after run_tests. Returns the list of generated file paths plus how many tests each module received. Defaults to 1 test per module; raise tests_per_module for denser coverage.
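
The internal sequence, expressed as hypothetical client-side pseudocode; analyze_url and generate_test here stand for the corresponding MCP tool calls.

```python
def auto_generate_tests(url: str, tests_per_module: int = 1):
    # Hypothetical equivalent of the one-shot tool, per the description above.
    result = analyze_url(url)                   # stands for the MCP tool call
    for module in result["modules"]:
        for tc in module["candidate_tcs"][:tests_per_module]:  # default: 1/module
            generate_test(url=url, module=module, description=tc)
    # Skeletons land in PROJECT_ROOT/tests/; each tc becomes a docstring/case name.
```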

Prompts

Interactive templates invoked by user choice


No prompts

Resources

Contextual data attached and managed by the client

| Name | Description |
| --- | --- |
| Latest Test Report (HTML) | The most recent test report, rendered on the fly as self-contained HTML |
| Latest Test Report (JSON) | The raw report.json (in each runner's native format) |
| Optimization Plan (Markdown) | Self-reinforcing analysis: the next-round action list, produced automatically after every run |

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/kao273183/mk-qa-master'

If you have feedback or need assistance with the MCP directory API, please join our Discord server