run_tests
Execute test suites under the QA runner and generate structured reports with exit codes, logs, and flaky-test detection. Supports an optional filter to target specific tests across pytest, Jest, Cypress, Go, and Maestro.
Instructions
Execute the test suite under the active QA_RUNNER and produce a structured report. This is the most frequently called tool: invoke it whenever a user says 「跑/run/test/check/驗證/執行」 (跑 = run, 驗證 = verify, 執行 = execute), after generate_test (to verify the new test), or after a fix (to confirm the bug is gone).
Behavior:
Invokes the runner's native CLI under QA_PROJECT_ROOT: pytest with --screenshot=on / --tracing=on / --video=retain-on-failure, or npx jest --json, npx cypress run --reporter json, go test -json, maestro test --format junit
Optional filter narrows the scope: pytest -k expr, jest -t pattern, cypress --spec glob, go -run regex, maestro flow-name substring
Writes report.json (pytest-json-report shape, runner-agnostic) + JUnit XML
Snapshots the run into history/ and auto-triggers optimizer.write_plan() → optimization-plan.md is refreshed
Maestro: auto-retries flows that failed on first attempt (MAESTRO_RETRY=true), surfaces flaky_in_run count
Returns: {exit_code, raw_exit_code, stdout_tail, stderr_tail, retry_enabled, flaky_in_run, ...}
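The filter mapping described above can be sketched as a small argv builder. This is only an illustration, not the tool's actual implementation: `build_command`, `BASE_COMMANDS`, and the `flows` parameter are hypothetical names, and the Go package pattern `./...` is an assumption added so the command is complete. The flags themselves are the ones listed in the Behavior notes.

```python
from typing import List, Optional

# Base commands as listed in the Behavior notes. The "./..." package
# pattern for Go is an assumption, not something the source states.
BASE_COMMANDS = {
    "pytest": ["pytest", "--screenshot=on", "--tracing=on",
               "--video=retain-on-failure"],
    "jest": ["npx", "jest", "--json"],
    "cypress": ["npx", "cypress", "run", "--reporter", "json"],
    "go": ["go", "test", "-json", "./..."],
    "maestro": ["maestro", "test", "--format", "junit"],
}


def build_command(runner: str, test_filter: Optional[str] = None,
                  flows: Optional[List[str]] = None) -> List[str]:
    """Hypothetical helper: translate a filter into runner-specific argv."""
    if runner not in BASE_COMMANDS:
        raise ValueError(f"unsupported runner: {runner}")
    cmd = list(BASE_COMMANDS[runner])
    if runner == "maestro":
        # No filter flag is used for Maestro: flow files are selected
        # by substring match on their names.
        selected = [f for f in (flows or [])
                    if not test_filter or test_filter in f]
        return cmd + selected
    if not test_filter:
        return cmd
    if runner == "pytest":
        return cmd + ["-k", test_filter]
    if runner == "jest":
        return cmd + ["-t", test_filter]
    if runner == "cypress":
        return cmd + ["--spec", f"**/*{test_filter}*"]
    # runner == "go": place -run before the package pattern
    return cmd[:-1] + ["-run", test_filter, cmd[-1]]
```

For example, `build_command("pytest", "login and not slow")` appends `-k "login and not slow"` to the pytest base command, while the Maestro branch narrows the list of flow files instead of adding a flag.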
When to use:
After writing a new test → verify it actually passes
Smoke before a release
Whenever the user prompt contains a run/test verb
When NOT to use:
Inspecting last results without re-running → use get_test_report (cheaper)
Re-running only failed cases → use run_failed (way faster)
Enumerating which tests exist → use list_tests
Edge cases:
No tests match filter → exit_code != 0 with "no tests ran" in stderr_tail
QA_TIMEOUT_SECONDS exceeded → exit_code 124 + [TIMEOUT…] tag in stderr_tail
filter starting with - or containing .. → blocked by security guardrail, returns {error: …}
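A caller might tell these edge cases apart roughly as follows. This is a minimal sketch under stated assumptions: `check_filter` and `classify_result` are hypothetical helpers, the error message text is a placeholder (the real envelope content is not documented here), and the timeout tag is matched only by its `[TIMEOUT` prefix because the rest of the tag is elided in the source.

```python
from typing import Optional


def check_filter(test_filter: Optional[str]) -> Optional[dict]:
    """Return an error envelope if the filter trips the guardrail, else None."""
    if test_filter and (test_filter.startswith("-") or ".." in test_filter):
        # Placeholder message; the actual error text is an assumption.
        return {"error": "filter rejected by security guardrail"}
    return None


def classify_result(result: dict) -> str:
    """Map a run_tests response onto the edge cases listed above."""
    if "error" in result:
        return "blocked"
    stderr_tail = result.get("stderr_tail", "")
    exit_code = result.get("exit_code")
    if exit_code == 124 and "[TIMEOUT" in stderr_tail:
        return "timeout"
    if exit_code != 0 and "no tests ran" in stderr_tail:
        return "no_tests_matched"
    return "passed" if exit_code == 0 else "failed"
```

Checking the guardrail before classifying keeps a rejected filter from being mistaken for an ordinary test failure.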
Plan bookend (v0.10.0): pass plan_id from a prior qa_plan call and the response auto-attaches plan_verification, which checks the plan's critical points against the just-written report.json via the same flow run_api_security_scan uses. Omit plan_id to keep the legacy shape (no plan_verification key). When verify_plan fails (unknown / expired plan_id), the run still succeeds; the error envelope is surfaced under plan_verification.
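The three response states described above (legacy shape, verification attached, verification failed) could be distinguished along these lines. The helper name `summarize_plan_check` is hypothetical, and the assumption that a failed verify_plan appears as an `error` key inside `plan_verification` is a guess at the envelope shape, which this page does not spell out.

```python
def summarize_plan_check(response: dict) -> str:
    """Describe the plan bookend state of a run_tests response."""
    if "plan_verification" not in response:
        # No plan_id was supplied, so the legacy shape is returned.
        return "legacy shape"
    verification = response["plan_verification"]
    # Assumption: a failed verify_plan surfaces an "error" key here.
    if isinstance(verification, dict) and "error" in verification:
        return f"run completed, plan check failed: {verification['error']}"
    return "plan verification attached"
```

Note that a failed plan check does not change how the test run itself should be read: exit_code still reflects the tests, and only the plan_verification envelope carries the plan error.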
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
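| filter | No | Optional. Keyword matched against test names. pytest passes it as a -k expression (supports and/or/not), Jest as -t, Cypress as --spec '**/*<filter>*', Go as a -run regex, and Maestro as a substring match on flow file names. | |
| headed | No | Optional; only takes effect for pytest-playwright. When True, the browser runs with a visible UI (useful for debugging and for watching flaky behavior as it happens). By default runs are headless, which suits CI and large suites. | |
| browser | No | Optional; only takes effect for pytest-playwright. Selects the browser engine Playwright launches. The browser must be installed beforehand with `playwright install <browser>`. | chromium |
| plan_id | No | Optional, v0.10.0+. Plan id returned by qa_plan. When supplied, the response gains a `plan_verification` envelope that checks every critical point against the just-written report.json. Same shape as run_api_security_scan's plan bookend. | |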
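As an illustration of the schema above, the argument payloads for a few typical calls might be assembled like this. `make_arguments` is a hypothetical helper, not part of the tool, and the plan id value is a made-up placeholder; only the four parameter names come from the table.

```python
ALLOWED_KEYS = {"filter", "headed", "browser", "plan_id"}


def make_arguments(**kwargs) -> dict:
    """Build a run_tests argument payload, rejecting unknown parameters."""
    unknown = set(kwargs) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    # All parameters are optional, so unset ones are simply omitted.
    return {key: value for key, value in kwargs.items() if value is not None}


# Full suite with defaults (headless, chromium).
smoke = make_arguments()

# Debug a subset of pytest-playwright tests in a visible Firefox window.
debug = make_arguments(filter="login and not slow", headed=True,
                       browser="firefox")

# Attach plan verification; "plan-123" is a placeholder id.
with_plan = make_arguments(filter="checkout", plan_id="plan-123")
```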