Which integrations are available for this server?

- Provides test automation and self-healing fix capabilities for Android apps, including detection of race conditions, memory leaks, and security issues.
- Analyzes Dart code for issues such as missing super.dispose() calls and provides syntax-validated fixes for Flutter apps.
- Automates Flutter widget testing and applies fixes such as adding missing dispose() overrides for AnimationController.
- Provides test automation and self-healing fix capabilities for iOS apps, including detection of retain cycles and closure capture in Swift.
- Detects and suggests fixes for Kotlin concurrency issues like MutableStateFlow mutated off Dispatchers.Main and Flow collected without proper context.
- Detects and fixes React-specific issues such as useState setters called after await without mount guards and missing useEffect cleanup.
- Detects and suggests fixes for Swift concurrency issues like @Published mutation outside @MainActor and concurrent DispatchQueue writes without barrier.

How do I use test-genie-mcp?

1. Click "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it shows a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@test-genie-mcp vibe check my app".

The server will respond to your query, and you can continue using it as needed.

test-genie-mcp

by MUSE-CODE-SPACE

TypeScript

Local

Built for vibe coders: one command, get a prioritized list of what's actually broken about your project.

Self-healing test automation for iOS, Android, Flutter, React Native and Web apps — as an MCP server.

v3.1.1 — vibe-check + honest auto-fix. One MCP call, ~30 seconds: race conditions + security issues + memory leaks + logic errors + perf smells, prioritized. Stays on your machine, no telemetry. Pass autoFix: true for the small, safe mechanical fixes (weak-hash, simple Math.random assignment) — backup + syntax-validate + rollback-on-syntax-fail. For test-verified application of harder fixes, use v3.0.0's iterate-fix loop.

Vibe coders quickstart

You don't read the docs. You open the project, talk to Claude, and want a verdict. Here it is:

In Claude (with test-genie-mcp installed):

/vibe-check /Users/me/my-app

Claude calls diagnose_project under the hood. ~30 seconds later you see:

# vibe-check report

- Project: /Users/me/my-app
- Platform: web
- Findings: 11 total — 4 critical, 4 high, 1 medium, 1 low
- Estimated fix time: ~85 min

## Top 5 issues

### 1. [CRIT] Hardcoded AWS access key id found in source
- File: `server.js:7`
- Category: security / secret (CWE-798)
- Confidence: 95%
- Fix: Move the value to an env var, gitignore the config, rotate the leaked key.

### 2. [CRIT] SQL string built by concatenating user input
- File: `server.js:21`
- Category: security / injection (CWE-89)
- Fix: Use parameterized queries (`db.query("... WHERE id = ?", [id])`).

### 3. [HIGH] useState setter called after await without mount guard
- File: `UserProfile.tsx:16`
- Category: race-condition / react-setstate-after-await (CWE-362)
- Confidence: 78%
- Fix: Use AbortController and check signal.aborted before calling setters.

… (top 5 shown — full list at output: "detailed")

## Next steps
1. Address the critical / high findings above.
2. Re-run diagnose_project after fixing to confirm convergence.
3. Use run_iterative_fix_loop for test-driven verification of each fix.

If any finding is autoFixable: true and is at high/critical severity, the diagnose_project call accepts autoFix: true to apply the mechanical replacement directly (with backup + syntax validation — see SAFETY.md for the exact guards). The v3.1.1 honest scope is narrow: weak hash (createHash('md5'|'sha1') → createHash('sha256')) and standalone Math.random() in security-sensitive files. For broader/structural fixes (race conditions, eval, exec injection) run run_iterative_fix_loop separately — it re-runs tests and auto-rolls-back on regression.

Why test-genie?

The bottleneck in mobile + cross-platform test automation isn't writing tests — it's the loop between a failing test and a passing test. test-genie closes that loop:

failing test → analyzer flags issue → fix proposed → dry-run + syntax check →
applied with backup → affected tests re-run → regression check → loop or stop

This full loop is the run_iterative_fix_loop tool. The diagnose_project autoFix: true path in v3.1.1 covers a strict subset — backup + dry-run + syntax-validate + apply, without re-running tests (so no test-regression rollback in that path). Use the right tool for the job — and see SAFETY.md for the exact guards on each.

Other tools (Detox, Maestro, Playwright, xcodebuild test) run tests. test-genie runs tests and drives the fix until the bar is met or it can no longer make progress — without you scrubbing through stack traces.

5-minute Quickstart

# 1. Install
npm install -g test-genie-mcp

# 2. Add to the Claude Desktop config (macOS: ~/Library/Application Support/Claude/claude_desktop_config.json)
{
  "mcpServers": {
    "test-genie": {
      "command": "npx",
      "args": ["test-genie-mcp"],
      "env": {
        "TEST_GENIE_ALLOWED_ROOT": "/path/to/your/project"
      }
    }
  }
}

# 3. Restart Claude Desktop. From a chat:
#    "Run the iterate-fix loop on /Users/me/my-rn-app with autoApply=false"

Expected output (truncated):

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Iterative fix loop f8b3… — PAUSED-FOR-CONFIRMATION
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Iterations completed: 1
Fixes applied:        0
Regressions rolled back: 0
Final tests:          7/10 passing (3 failing)

Pending confirmations (3):
  - 71fbe…: Fix: useEffect missing cleanup for setInterval (confidence: 85)
  - 92ad1…: Fix: Force-unwrap on possibly-undefined name (confidence: 85)
  - …

Resume token: f8b3…

Re-call with autoApply: true (or resumeToken: "f8b3…") to actually patch the files.

Real use cases

The flows below describe the run_iterative_fix_loop path (v3.0 headline) — full detect → propose → dry-run → apply-with-backup → re-run-tests → rollback-on-regression. The diagnose_project autoFix path in v3.1.1 is the narrower mechanical-replacement-only path; see SAFETY.md §4 for what that one actually touches.

1. React Native memory-leak self-healing

A team adds setInterval(...) in a useEffect and forgets the cleanup. test-genie's detect_memory_leaks flags it, and suggest_fixes proposes return () => clearInterval(id) (src/tools/fixing/suggestFixes.ts:169-179). The loop then dry-runs the patch through the TS compiler, applies it with a backup, re-runs only the affected snapshot test, confirms a 100% pass rate, and stops. Before: 1 failing snapshot. After: 0 failing, 1 fix applied, 1 backup at .test-genie-backups/.
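The shape of that cleanup fix can be sketched without React. This is an illustration only, not the code in suggestFixes.ts: `startTicker` is a hypothetical helper standing in for the body of a `useEffect` callback, and the point is simply that the effect returns a function that clears the interval.

```typescript
// Hypothetical helper standing in for a useEffect body.
// Inside a component it would be used as:
//   useEffect(() => startTicker(tick, 1000), []);
type Cleanup = () => void;

function startTicker(onTick: () => void, intervalMs: number): Cleanup {
  const id = setInterval(onTick, intervalMs);
  // Returning the cleanup is the fix: without it the interval keeps
  // firing, and keeps its closure alive, after the component unmounts.
  return () => clearInterval(id);
}

// Usage: start the ticker, then clean up as React would on unmount.
let ticks = 0;
const stop = startTicker(() => {
  ticks += 1;
}, 1000);
stop();
```

Because `stop()` runs before the first interval elapses, the callback never fires in this usage.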

2. Flutter widget `dispose()` automation

An AnimationController is left undisposed. test-genie sees the missing dispose() override, generates a Dart @override dispose() { controller.dispose(); super.dispose(); } block (suggestFixes.ts:214-217), runs dart analyze on the patched file, applies the patch, re-runs flutter test, and converges.

3. iOS retain-cycle (closure capture)

Given self.timer = Timer.scheduledTimer(...) { _ in self.tick() }, the rule-based detector flags the closure's strong capture of self, and the fixer rewrites the closure to [weak self] _ in guard let self = self else { return }; self.tick() (suggestFixes.ts:239-242). If swiftc is on PATH the syntax check is real; otherwise test-genie reports "downgraded validation" so you know.

How the iterate-fix loop works

┌────────────────────┐
│   collect tests    │  (run_scenario_test / supplied list)
└─────────┬──────────┘
          │
   pass-rate ≥ threshold? ── yes ──▶  SUCCESS
          │ no
          ▼
┌────────────────────┐
│  detect issues     │   memory + logic analyzers
└─────────┬──────────┘
          │
┌────────────────────┐
│  suggest fixes     │   rule-based (default) → LLM (hybrid, optional)
└─────────┬──────────┘
          │
┌────────────────────┐
│  dry-run + syntax  │   TS compiler API / platform compiler / brace check
└─────────┬──────────┘
          │
┌────────────────────┐
│  apply with backup │   per-file `.test-genie-backups/`
└─────────┬──────────┘
          │
┌────────────────────┐
│  re-run tests      │   regression?  yes → auto-rollback
└─────────┬──────────┘
          │
          ▼
   loop (≤ maxIterations, ≤ totalTimeout)

See docs/ITERATE_FIX_LOOP.md for a sequence diagram and the full safety-guard list.
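The control flow in the diagram can be sketched as a small function. This is a simplified, synchronous illustration under stated assumptions, not the tool's implementation: every name (`LoopDeps`, `iterateFixLoop`, `Fix`, and so on) is hypothetical, and the sketch leaves out the total timeout, the confirmation pause, and the on-disk backups.

```typescript
// Every name below is hypothetical and mirrors a box in the diagram.
interface TestRun { passed: number; total: number }
interface Fix { id: string; apply(): void; rollback(): void }

interface LoopDeps {
  runTests(): TestRun;
  proposeFixes(): Fix[];                // "detect issues" + "suggest fixes"
  passesSyntaxCheck(fix: Fix): boolean; // "dry-run + syntax"
}

interface LoopResult {
  status: "success" | "stalled" | "max-iterations";
  iterations: number;
  applied: string[];
  rolledBack: string[];
}

function iterateFixLoop(deps: LoopDeps, threshold: number, maxIterations: number): LoopResult {
  const applied: string[] = [];
  const rolledBack: string[] = [];
  const passRate = (run: TestRun): number => (run.total === 0 ? 1 : run.passed / run.total);

  let current = deps.runTests();
  for (let iteration = 0; iteration < maxIterations; iteration++) {
    if (passRate(current) >= threshold) {
      return { status: "success", iterations: iteration, applied, rolledBack };
    }
    let progressed = false;
    for (const fix of deps.proposeFixes()) {
      const seen = applied.includes(fix.id) || rolledBack.includes(fix.id);
      if (seen || !deps.passesSyntaxCheck(fix)) continue;
      fix.apply();
      const after = deps.runTests();
      if (after.passed < current.passed) {
        fix.rollback(); // regression: restore the previous state
        rolledBack.push(fix.id);
      } else {
        applied.push(fix.id);
        current = after;
        progressed = true;
      }
    }
    if (!progressed) {
      return { status: "stalled", iterations: iteration + 1, applied, rolledBack };
    }
  }
  const status = passRate(current) >= threshold ? "success" : "max-iterations";
  return { status, iterations: maxIterations, applied, rolledBack };
}

// Usage with an in-memory stand-in for a project: each entry in `broken`
// is one failing test out of three.
const broken = new Set(["a", "b"]);
const makeFix = (id: string, apply: () => void, rollback: () => void): Fix => ({ id, apply, rollback });
const result = iterateFixLoop(
  {
    runTests: () => ({ passed: 3 - broken.size, total: 3 }),
    proposeFixes: () => [
      makeFix("bad", () => broken.add("c"), () => broken.delete("c")), // causes a regression
      makeFix("a", () => broken.delete("a"), () => broken.add("a")),
      makeFix("b", () => broken.delete("b"), () => broken.add("b")),
    ],
    passesSyntaxCheck: () => true,
  },
  1.0,
  5,
);
```

In this run the "bad" fix lowers the pass count and is rolled back, the other two are kept, and the loop stops as soon as the pass rate reaches the threshold.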

Tools (23)

| # | Tool | Mode |
| --- | --- | --- |
| 1 | `analyze_app_structure` | real |
| 2 | `generate_scenarios` | real |
| 3 | `create_test_plan` | real |
| 4 | `run_scenario_test` | hybrid |
| 5 | `run_simulation` | simulated |
| 6 | `run_stress_test` | hybrid |
| 7 | `detect_memory_leaks` | real |
| 8 | `detect_logic_errors` | real |
| 9 | `suggest_fixes` | real |
| 10 | `confirm_fix` | real |
| 11 | `apply_fix` | real |
| 12 | `rollback_fix` | real |
| 13 | `run_full_automation` | hybrid |
| 14 | `run_iterative_fix_loop` (v3.0 headline) | hybrid |
| 15 | `generate_report` | real |
| 16 | `get_pending_fixes` | real |
| 17 | `get_test_history` | real |
| 18 | `analyze_performance` | real |
| 19 | `analyze_code_deep` | real |
| 20 | `generate_cicd_config` | real |
| 21 | `diagnose_project` (v3.1 headline, vibe-check) | real |
| 22 | `detect_race_conditions` | real |
| 23 | `detect_security_issues` | real |

Mode legend: see docs/SIMULATION_VS_REAL.md.

Plus 4 resources (test-genie://iteration-logs, …/test-history/{path}, …/iteration-logs/{loopId}, …/applied-fixes/{path}) and 3 prompts (full-test-pipeline, diagnose-failure, vibe-check).

What vibe-check catches

Race conditions (detect_race_conditions / diagnose_project):

| Pattern | Language | Severity | Auto-fixable (v3.1.1) |
| --- | --- | --- | --- |
| `useState` setter called after `await` without mount guard | TS/JS/React | high | no (structural) |
| `useEffect` with async fetch, no AbortController/cleanup | TS/JS/React | high | no (structural) |
| `arr.forEach(async ...)` (silent fire-and-forget) | TS/JS | medium | no (ordering-sensitive) |
| Adjacent fetches without `Promise.all` / sequencing | TS/JS | medium | no |
| TOCTOU: `existsSync` then `readFileSync` without lock | TS/JS Node | medium | no |
| Non-atomic counter increment in async context | TS/JS | low | no |
| `@Published` mutation outside `@MainActor` | Swift | medium | no |
| Concurrent `DispatchQueue` writes without `.barrier` | Swift | medium | no |
| `MutableStateFlow` mutated off `Dispatchers.Main` | Kotlin | medium | no |
| `Flow` collected without `flowOn` | Kotlin | low | no |
| Goroutine + shared map without `sync.Mutex` | Go | high | no |

v3.1.1 honesty audit: useEffect-no-abort and forEach-await were previously advertised as auto-fixable. They are not — wrapping with AbortController or rewriting to Promise.all(arr.map(...)) changes behavior we can't verify statically. They are now report-only. See SAFETY.md.
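For the first two patterns in the table, the manual fix usually looks like the sketch below. It is a minimal illustration, assuming a hypothetical `loadInto` helper; it is not something test-genie generates. The setter is only called if the caller has not aborted while the `await` was pending.

```typescript
// Hypothetical helper: runs an async load and calls the setter only if
// the caller has not aborted in the meantime.
async function loadInto<T>(
  load: (signal: AbortSignal) => Promise<T>,
  setValue: (value: T) => void,
  signal: AbortSignal,
): Promise<void> {
  const value = await load(signal);
  if (signal.aborted) return; // component unmounted or request superseded
  setValue(value);
}

// Inside a React component this would sit in an effect, where fetchUser
// and setUser are placeholders for your own loader and state setter:
//   useEffect(() => {
//     const controller = new AbortController();
//     void loadInto(fetchUser, setUser, controller.signal);
//     return () => controller.abort();
//   }, []);
```

Passing the same signal to `fetch` inside `load` also cancels the network request itself; the guard after the `await` covers loaders that ignore the signal.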

Security (detect_security_issues / diagnose_project):

| Pattern | Severity | CWE | Auto-fixable (v3.1.1) |
| --- | --- | --- | --- |
| Hardcoded AWS / Stripe / GitHub / Google / Slack token | critical / high | CWE-798 | no (rotate) |
| Hardcoded JWT secret literal | high | CWE-798 | no |
| API token in URL query string | high | CWE-200 | no |
| `.env` file present but not gitignored | high | CWE-538 | no (rotation must follow) |
| SQL string concat with `req.params` / `req.body` | critical | CWE-89 | no |
| `innerHTML` / `dangerouslySetInnerHTML` with dynamic value | high | CWE-79 | no |
| `eval()` / `new Function()` with non-literal | critical | CWE-95 | no |
| `Math.random()` in security-sensitive file, standalone assignment | high | CWE-338 | yes (`crypto.randomInt`) |
| `Math.random()` mixed into arithmetic | high | CWE-338 | no (semantic) |
| `createHash('md5'\|'sha1')` in security-keyword file | high | CWE-327 | yes (`'sha256'`) |
| `createHash('md5'\|'sha1')` elsewhere | medium | CWE-327 | no (below severity floor) |
| `child_process.exec` with user-input template literal | critical | CWE-78 | no |
| `fetch(req.query.url)` (SSRF) | high | CWE-918 | no |
| CORS `*` origin + `Allow-Credentials: true` | high | CWE-942 | no |
| Cookie set without `httpOnly` / `secure` / `sameSite` | low | CWE-1004 | no |
| `yaml.load` without safe schema | medium | CWE-502 | no |

v3.1.1 honesty audit: the .env, general Math.random, and yaml.load findings were previously advertised as auto-fixable. They were either too risky to rewrite blindly or had no fix strategy shipped, so they are now report-only. See SAFETY.md §5.
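To make the two auto-fixable rows concrete, here is roughly what the before and after look like in Node. This is a hand-written sketch using the standard `node:crypto` module; the function names are made up for illustration, and the exact text test-genie emits may differ.

```typescript
import { createHash, randomInt } from "node:crypto";

// Before (flagged, CWE-327): createHash("md5").update(input).digest("hex")
// After: same call shape, stronger algorithm.
function fingerprint(input: string): string {
  return createHash("sha256").update(input).digest("hex");
}

// Before (flagged, CWE-338): const code = Math.floor(Math.random() * 1_000_000);
// After: randomInt draws from a CSPRNG; the upper bound is exclusive.
function sixDigitCode(): string {
  return randomInt(0, 1_000_000).toString().padStart(6, "0");
}
```

Note that swapping the hash algorithm changes the digest (and its length), so anything that compares against stored md5 or sha1 values needs a migration step that a mechanical rewrite cannot provide.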

What vibe-check misses (honest list)

This is a "catch the obvious stuff in 30s" filter, not Snyk / Semgrep / a full SAST tool. We don't catch:

Cross-file data-flow. If user input flows through three files before reaching a db.query, the regex won't connect the dots. A real SAST traces taint across the call graph. Roadmap: ts-morph reference walking for top-N entry points.
Vulnerable transitive deps. We don't query npm advisories — that's npm audit's job, and bundling a stale advisory list would lie. Run npm audit --json in parallel if you want dep-CVE coverage.
Race conditions across processes. We catch in-process JS / Swift / Kotlin / Go races. Distributed races (lock ordering across services, DB transactions) need different tooling.
Type-correct but logic-broken code. The analyzer is syntactic, not semantic. A Math.random() named getNonce won't fool us; a properly-named crypto.randomBytes used with a tiny entropy budget will.
Custom secret formats. Internal company tokens with unique prefixes need a regex you can add to securityAnalyzer.SECRET_PATTERNS. PR welcome.
Real-time / dynamic issues. Memory leaks under load, network timeouts, slow renders mid-interaction — those need run_stress_test / run_simulation, not static analysis.

If you want deeper coverage on top of vibe-check: feed the findings into run_iterative_fix_loop for test-verified application, or escalate to Snyk / Semgrep / GitHub Advanced Security for compliance use cases.
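As an illustration of the custom secret format point above, a line-based regex scan might look like the sketch below. The `SecretPattern` shape, the `scanForSecrets` function, and the `acme_live_` token format are all assumptions made for this example; check securityAnalyzer.SECRET_PATTERNS in the source for the real structure before adding a pattern.

```typescript
// Hypothetical shape; the real securityAnalyzer.SECRET_PATTERNS may differ.
interface SecretPattern { name: string; regex: RegExp }

const SECRET_PATTERNS: SecretPattern[] = [
  // AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
  { name: "aws-access-key-id", regex: /\bAKIA[0-9A-Z]{16}\b/ },
  // Made-up internal token format, for illustration only.
  { name: "acme-internal-token", regex: /\bacme_live_[0-9a-f]{32}\b/ },
];

interface SecretFinding { name: string; line: number }

function scanForSecrets(source: string): SecretFinding[] {
  const findings: SecretFinding[] = [];
  source.split("\n").forEach((text, index) => {
    for (const { name, regex } of SECRET_PATTERNS) {
      // Patterns carry no "g" flag, so test() keeps no state between lines.
      if (regex.test(text)) findings.push({ name, line: index + 1 });
    }
  });
  return findings;
}
```

A scan like this sees one line at a time, which is exactly why it cannot follow a value across files or through string concatenation.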

vibe-check vs alternatives

| | vibe-check (test-genie) | Snyk | Semgrep | GitHub Advanced Security |
| --- | --- | --- | --- | --- |
| Runs locally | yes | hybrid (cloud) | yes | no (cloud) |
| Telemetry-free | yes (zero network calls) | no | partial | no |
| Fix loop integration | yes (`run_iterative_fix_loop`) | no | no | no |
| Race-condition detection | yes (JS/Swift/Kotlin/Go) | no | partial | partial |
| Cross-file taint flow | no (roadmap) | yes | yes | yes |
| Setup time | none (already installed if test-genie is installed) | account + auth | install + ruleset | repo-level enable |

If your goal is "before I commit, what's broken?", vibe-check wins on latency. If your goal is "compliance + supply chain audit", use the dedicated tools.

When NOT to use test-genie

Production-gate test runs. test-genie is built for the development feedback loop. For shipping decisions, use a proper CI that you control end-to-end.
Code your team must hand-review every line of. The loop's job is to propose and apply fixes; if every fix needs a human eye, leave autoApply: false (the default) and use it as a fix-proposal generator only.
No backup / no version control situations. test-genie's auto-rollback is best-effort and requires the per-file backup to exist. Always run inside a git working tree.

Comparison

| | test-genie | Detox | Maestro | xcodebuild test |
| --- | --- | --- | --- | --- |
| Runs E2E / unit tests | ✅ (via Jest/Detox/etc.) | ✅ | ✅ | ✅ |
| Detects code issues | ✅ rule + LLM | ❌ | ❌ | ❌ |
| Iterative fix loop | ✅ (`run_iterative_fix_loop`) | ❌ | ❌ | ❌ |
| Auto-rollback on test regression | ✅ inside `run_iterative_fix_loop` only | ❌ | ❌ | ❌ |
| Auto-rollback on syntax failure | ✅ all apply paths | ❌ | ❌ | ❌ |
| MCP-native (talks to Claude / agents) | ✅ | ❌ | ❌ | ❌ |
| Multi-platform | iOS+Android+Web+Flutter+RN | iOS+Android | iOS+Android | iOS only |

Scope note: diagnose_project autoFix: true rolls back on syntax-validate failure (applyFix.ts:185-202) but does not re-run tests, so it cannot detect test regressions. For test-driven rollback use run_iterative_fix_loop. See SAFETY.md §2.4.
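The backup, validate, and rollback sequence in the scope note can be sketched as follows. This is a simplified illustration, not the code in applyFix.ts: `applyWithRollback` is a hypothetical name, the validator is injected, and the backup is written next to the file as `.bak` rather than under `.test-genie-backups/`.

```typescript
import { copyFileSync, mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: the validator is injected, so this sketch makes
// no claim about how the real tool validates syntax.
function applyWithRollback(
  filePath: string,
  patched: string,
  isValid: (source: string) => boolean,
): "applied" | "rolled-back" {
  const backupPath = `${filePath}.bak`;
  copyFileSync(filePath, backupPath);       // 1. back up the original
  writeFileSync(filePath, patched, "utf8"); // 2. apply the patch
  if (isValid(readFileSync(filePath, "utf8"))) return "applied";
  copyFileSync(backupPath, filePath);       // 3. restore on validation failure
  return "rolled-back";
}

// Usage against a throwaway file, with a crude brace-count validator.
const demoFile = join(mkdtempSync(join(tmpdir(), "fix-demo-")), "a.ts");
writeFileSync(demoFile, "const a = 1;\n", "utf8");
const balanced = (s: string): boolean => s.split("{").length === s.split("}").length;
const goodOutcome = applyWithRollback(demoFile, "const a = { b: 2 };\n", balanced);
const badOutcome = applyWithRollback(demoFile, "const a = { b: 3;\n", balanced);
const finalSource = readFileSync(demoFile, "utf8");
```

Nothing in this path runs tests, which is the limitation the scope note describes: a patch that parses but changes behavior goes through.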

test-genie uses tools like Jest, Detox, and xcodebuild test under the hood — it sits at the orchestration layer, not the test-runner layer.

Known limitations

Platform syntax check downgrade. For Swift/Kotlin/Java/Dart we try the platform compiler in -typecheck mode. If the compiler isn't on PATH, we fall back to brace-balance validation and surface downgraded: true in the result. Install swiftc / kotlinc / javac / dart for real validation.
LLM is optional and gated. strategy: 'hybrid' only kicks LLM in when rule-based confidence is below threshold. Without an API key the loop is rule-based-only — no failure.
Storage is per-machine. Test history / iteration logs live under $TEST_GENIE_STORAGE_DIR (defaults to ~/.test-genie-mcp). Not synced across machines.
Simulated mode is "simulation," not magic. run_simulation returns plausible anomalies, not real ones. Use run_scenario_test (hybrid) for real-device runs.
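To show why the brace-balance fallback is a downgrade, here is a minimal sketch of such a check. The function name and result shape are assumptions modeled on the `downgraded: true` flag mentioned above, and the real fallback may be more careful (this one does not skip string literals or comments).

```typescript
interface ValidationResult { valid: boolean; downgraded: boolean }

// Hypothetical fallback for when no platform compiler is on PATH.
// It only checks that (), [] and {} nest correctly.
function braceBalanceCheck(source: string): ValidationResult {
  const closers: Record<string, string> = { ")": "(", "]": "[", "}": "{" };
  const stack: string[] = [];
  for (const ch of source) {
    if (ch === "(" || ch === "[" || ch === "{") {
      stack.push(ch);
    } else if (ch in closers) {
      // A closer must match the most recent unclosed opener.
      if (stack.pop() !== closers[ch]) return { valid: false, downgraded: true };
    }
  }
  // Leftover openers mean something was never closed.
  return { valid: stack.length === 0, downgraded: true };
}
```

A check like this accepts any text with well-nested brackets, including code with type errors or misspelled keywords, so a `downgraded` result deserves a closer manual look.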

Configuration

| Env var | Default | Purpose |
| --- | --- | --- |
| `TEST_GENIE_ALLOWED_ROOT` | `cwd` | Capability-based path safety: the server refuses to read/write outside this root. |
| `TEST_GENIE_STORAGE_DIR` | `~/.test-genie-mcp` | Where scenarios / results / iteration logs live. |
| `TEST_GENIE_LLM_PROVIDER` | auto-detect | `anthropic` / `openai` / `none`. |
| `ANTHROPIC_API_KEY` | (none) | Used when provider = `anthropic`. |
| `OPENAI_API_KEY` | (none) | Used when provider = `openai`. |
| `TEST_GENIE_ANTHROPIC_MODEL` | `claude-haiku-4-5` | Override Anthropic model. |
| `TEST_GENIE_OPENAI_MODEL` | `gpt-4o-mini` | Override OpenAI model. |
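The idea behind `TEST_GENIE_ALLOWED_ROOT` can be sketched as a containment check on resolved paths. This is an assumption about the general technique, not the server's actual code; `isInsideRoot` is a hypothetical name, and the sketch does not resolve symlinks.

```typescript
import { isAbsolute, relative, resolve, sep } from "node:path";

// Hypothetical check; not the server's actual implementation.
function isInsideRoot(allowedRoot: string, candidate: string): boolean {
  const root = resolve(allowedRoot);
  const target = resolve(root, candidate);
  const rel = relative(root, target);
  // Outside the root when the relative path climbs up ("..") or is
  // absolute (for example, a different drive on Windows).
  const climbsOut = rel === ".." || rel.startsWith(".." + sep);
  return !climbsOut && !isAbsolute(rel);
}

// Typical wiring: fall back to the working directory, as the table describes.
const allowedRoot = process.env.TEST_GENIE_ALLOWED_ROOT ?? process.cwd();
const readmeAllowed = isInsideRoot(allowedRoot, "README.md");
```

Comparing resolved paths rather than raw strings is what stops `../` tricks and sibling directories such as `/srv/app-other` from passing as children of `/srv/app`.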

Migrating from v2.x

run_full_automation still works. The confirmMode / autoFix options are kept for compatibility but autoApply: boolean is the new way — autoApply: true is equivalent to confirmMode: 'auto'.
Subprocess hardening means platform tools now reject scheme / device / package-name arguments that contain shell metacharacters. If your CI was passing weird-looking values, sanitize them first.
See CHANGELOG.md for the full breaking-change list + migration recipes.
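If your CI builds scheme, device, or package-name values dynamically, a pre-flight check along these lines can catch problems before the server rejects them. The character set and the `isSafeToolArgument` name are assumptions for illustration; the server's actual rule set may be stricter or looser.

```typescript
// Hypothetical deny-list of characters a shell would interpret.
const SHELL_METACHARACTERS = /[;&|`$<>(){}\[\]\\!*?'"\n\r]/;

function isSafeToolArgument(value: string): boolean {
  return value.length > 0 && !SHELL_METACHARACTERS.test(value);
}

// Usage: validate every dynamic value before handing it to the tool.
const candidates = ["MyApp-Debug", "com.example.app", "iPhone 15; rm -rf ~", "$(whoami)"];
const rejected = candidates.filter((value) => !isSafeToolArgument(value));
```

Here `rejected` ends up holding the last two values, which contain `;` and `$(...)` respectively.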

Roadmap

LLM-based fix-proposal voting (multiple proposals → pick the best by syntax + retest delta)
Multi-repo sync (run the loop across N repos in parallel from one MCP call)
A "watch mode" that runs the loop on file save
Better Detox / Maestro artifact ingestion (link videos into iteration logs)

Contributing

Issues, PRs, and ideas welcome — see CONTRIBUTING.md (TODO). Code lives under src/, tests under tests/. Run npm test before sending a PR.

Maintainer

@MUSE-CODE-SPACE — Yoonkyoung Gong.

License

MIT — see LICENSE.
