
audit_code_resilience

Run mutation testing on source files to find untested code by injecting logical faults and checking if your tests catch them.

Instructions

Runs on-demand, sandbox-isolated mutation testing against a single source file to identify gaps in unit test coverage. Chaos-MCP generates mutants (logical faults like changing > to >=) and checks whether the local test suite catches them. Surviving mutants indicate test coverage holes. Supports TypeScript/JavaScript (StrykerJS), Python (cosmic-ray), Rust (cargo-mutants), and PHP (Infection).
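To make "surviving mutant" concrete, here is a minimal, hand-rolled sketch of the idea in Python. It does not use Chaos-MCP or any of the listed mutation engines; `exceeds_limit`, its mutant, and the two test functions are hypothetical names made up for this illustration. The mutant applies the same kind of fault mentioned above (changing > to >=).

```python
def exceeds_limit(value: int, limit: int) -> bool:
    """Original code under test (hypothetical example)."""
    return value > limit


def exceeds_limit_mutant(value: int, limit: int) -> bool:
    """Mutant: the only change is > replaced by >=."""
    return value >= limit


def weak_test(fn) -> bool:
    """Checks only values far from the boundary, so it cannot tell > from >=."""
    return fn(10, 5) is True and fn(1, 5) is False


def boundary_test(fn) -> bool:
    """Also checks value == limit, the one input where > and >= disagree."""
    return fn(10, 5) is True and fn(5, 5) is False


# A mutant "survives" a test when the test still passes against the mutated code.
survives_weak_test = weak_test(exceeds_limit_mutant)
survives_boundary_test = boundary_test(exceeds_limit_mutant)
```

With only `weak_test` in the suite the mutant survives, which points at a missing boundary case; adding `boundary_test` kills it.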

Input Schema

Name | Required | Description | Default
runId | No | Verify mode by id: re-run against the cached survivor baseline from a prior audit (the runId it returned). Auto-scoped to the baseline lines (StrykerJS) or whole-file (other languages). Mutually exclusive with baseline, diffBase, and lineScope. Example: "a1b2c3d4". |
dryRun | No | If true, run only the dry-run phase to validate the test suite passes before mutation testing (StrykerJS only). Useful for pre-flight checks. Example: false |
enrich | No | Augment each surviving / no-coverage line with deterministic guidance: severity (high/medium/low), a "why it matters" explanation, a test-writing hint, and a source-context snippet, and rank survivors severity-first. Defaults to TRUE; pass false to disable and return the plain (unranked, unclassified) output. Richest for TypeScript; Python and PHP report severity "unknown". |
baseline | No | Verify mode: pass back the `survivors` and `noCoverage` arrays from a PRIOR run to re-test only those mutants and get a delta: which are now killed vs still surviving (plus any new regressions on the same lines). The re-run is auto-scoped to the baseline lines (StrykerJS) or whole-file (other languages). Mutually exclusive with diffBase and lineScope. Example: { "survivors": [{ "line": 42, "mutators": { "ConditionalExpression": 1 } }] } |
diffBase | No | Auto-scope mutation to only the lines changed in git. The value selects the base to diff against: "HEAD" (all uncommitted changes), "staged" (staged changes only), or any git ref/branch/SHA (e.g. "main", resolved via merge-base with HEAD). Mutually exclusive with lineScope. Line-level scoping is StrykerJS-only; Python/Rust/PHP targets run whole-file with a note. If the file has no changes vs the base, the run is skipped. Example: "HEAD" |
filePath | Yes | Workspace-relative path to the file to audit. Must end in .ts, .js, .tsx, .jsx, .py, .rs, or .php. Example: "src/utils/math.ts" |
minScore | No | Gate: if the mutation score is below this (0–100), the result reports gate.passed=false (never an error). Example: 80. |
suppress | No | Mark mutants as equivalent (unkillable) so future runs exclude them from the score and output. Appended to .chaos-mcp/suppressions.json for this file. Example: [{ "line": 42, "mutator": "ConditionalExpression", "reason": "guard unreachable" }]. |
lineScope | No | Constrain mutations to a 1-based line range (inclusive). Only supported by StrykerJS; ignored for Python, Rust, and PHP targets. Useful for surgically auditing a specific function or block. Example: { "start": 10, "end": 45 } |
timeoutMs | No | Maximum time in milliseconds for the entire mutation run. Default: 300000 (5 minutes). Increase for large files or slow test suites. Example: 120000 for a 2-minute cap. |
unsuppress | No | Remove previously-suppressed mutants for this file (undo a wrong suppress). |
concurrency | No | Number of parallel mutation workers (StrykerJS only). When omitted, StrykerJS auto-detects CPU core count. Lower this on memory-constrained machines; raise it on CI with spare cores. Must be an integer between 1 and 64. Example: 4 |
incremental | No | Enable incremental mode to reuse results from a previous run and skip unchanged mutants (StrykerJS only). Speeds up repeat audits of the same file. Example: true |
maxSurvivors | No | Cap on how many survivor (and how many no-coverage) line groups are returned, after severity ranking. Hidden groups are counted in survivorsTruncated/noCoverageTruncated. Precedence: this arg > config.defaultMaxSurvivors > 10. Example: 20 |
outputFormat | No | Output format for the result. "json" (default) returns a structured MutationResult object. "text" returns a human-readable summary. Example: "json" |
severityFloor | No | Report-time filter: drop survivor groups below this severity (requires enrichment, which is on by default). Dropped groups are counted in survivorsFiltered/noCoverageFiltered. "unknown"-severity groups are below "low" and are dropped by any floor. Ignored (with a note) when enrich is false. Example: "high" |
ignorePatterns | No | Substring patterns for files/directories to exclude from the sandbox copy, applied in addition to built-in exclusions. Any path containing the pattern string is skipped. Example: [".test.ts", "fixtures/", "snapshots/"] |
mutatorDenylist | No | Stryker mutator names to exclude; these are filtered out. StrykerJS only. Useful for skipping noisy or irrelevant mutators. Example: ["StringLiteral"] |
prebuildCommand | No | Shell command to run in the sandbox BEFORE mutation testing begins. Use this to compile/build the target; the sandbox has a full workspace copy. Essential for TypeScript projects ("npm run build") and Rust projects ("cargo build"). DISABLED BY DEFAULT: because it runs an arbitrary shell command that can reach outside the sandbox, the server must opt in via "allowPrebuild": true in its config file or the CHAOS_MCP_ALLOW_PREBUILD=1 environment variable. Counts against the overall timeoutMs budget. Example: "npm run build" |
mutatorAllowlist | No | NOT SUPPORTED in StrykerJS v9 and ignored; passing it has no effect. v9 has no way to express "only these mutators" without the full mutator list. Use mutatorDenylist to exclude noisy mutators instead, or supply your own stryker.config.json. |
perMutantTimeoutMs | No | Maximum time in milliseconds per individual mutant test (StrykerJS only). Distinct from timeoutMs (total run cap). Use this to prevent a single slow mutant from hanging the entire mutation run. Default: StrykerJS default (~5000ms). Example: 10000 for a 10-second per-mutant ceiling. |
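Several of the constraints above interact: runId, baseline, diffBase, and lineScope are documented as mutually exclusive, filePath must carry one of the listed extensions, and concurrency and minScore have numeric ranges. The sketch below shows one way a caller might check these before sending a request. `build_audit_args` is a hypothetical client-side helper, not part of Chaos-MCP, and the server may validate differently or report errors in another form.

```python
from typing import Any

# Values taken from the parameter descriptions above.
ALLOWED_EXTENSIONS = (".ts", ".js", ".tsx", ".jsx", ".py", ".rs", ".php")
SCOPING_OPTIONS = ("runId", "baseline", "diffBase", "lineScope")


def build_audit_args(file_path: str, **options: Any) -> dict[str, Any]:
    """Hypothetical helper: assemble tool arguments, rejecting documented conflicts."""
    if not file_path.endswith(ALLOWED_EXTENSIONS):
        raise ValueError(f"filePath must end in one of {ALLOWED_EXTENSIONS}")

    # The descriptions make every pair of these options mutually exclusive,
    # so at most one may be set per call.
    chosen = [name for name in SCOPING_OPTIONS if options.get(name) is not None]
    if len(chosen) > 1:
        raise ValueError(f"mutually exclusive options set together: {chosen}")

    concurrency = options.get("concurrency")
    if concurrency is not None:
        is_int = isinstance(concurrency, int) and not isinstance(concurrency, bool)
        if not is_int or not 1 <= concurrency <= 64:
            raise ValueError("concurrency must be an integer between 1 and 64")

    min_score = options.get("minScore")
    if min_score is not None and not 0 <= min_score <= 100:
        raise ValueError("minScore must be between 0 and 100")

    # Drop unset options so only explicit choices are sent.
    explicit = {key: value for key, value in options.items() if value is not None}
    return {"filePath": file_path, **explicit}


# Example: audit only uncommitted changes and gate on a score of 80.
example_args = build_audit_args("src/utils/math.ts", diffBase="HEAD", minScore=80)
```

Checking conflicts on the client keeps a mis-scoped request from spending part of the timeoutMs budget before it is rejected.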

Output Schema

Name | Required | Description | Default
gate | No | |
note | Yes | |
runId | No | |
target | Yes | |
summary | Yes | |
scopeNote | No | |
survivors | Yes | |
enrichNote | No | |
noCoverage | Yes | |
mutationScore | Yes | |
ignoredOptions | No | |
suppressedCount | No | |
suggestedTestFile | No | |
survivorsFiltered | No | |
noCoverageFiltered | No | |
survivorsTruncated | No | |
noCoverageTruncated | No | |
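The output table lists field names only, so the following is a rough sketch of how a caller might act on a result, under stated assumptions. It relies on three things the input descriptions do say: gate.passed is false when the score is below minScore, a prior run returns a runId, and the `survivors` and `noCoverage` arrays can be passed back as a baseline. The exact shape of each field is an assumption, and `gate_failed` and `verify_args` are hypothetical helper names.

```python
from typing import Any, Optional


def gate_failed(result: dict[str, Any]) -> bool:
    """True only when a gate was reported and it did not pass."""
    gate = result.get("gate")  # optional field; assumed to be an object with "passed"
    return gate is not None and gate.get("passed") is False


def verify_args(file_path: str, result: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Build arguments for a verify-mode re-run, or None if nothing survived."""
    survivors = result.get("survivors", [])
    no_coverage = result.get("noCoverage", [])
    if not survivors and not no_coverage:
        return None
    # Prefer the cached baseline by id; runId and baseline are mutually exclusive.
    if result.get("runId"):
        return {"filePath": file_path, "runId": result["runId"]}
    return {
        "filePath": file_path,
        "baseline": {"survivors": survivors, "noCoverage": no_coverage},
    }


# Illustrative result; values are made up, the survivor entry mirrors the
# baseline example in the input schema.
sample_result = {
    "gate": {"passed": False},
    "runId": "a1b2c3d4",
    "mutationScore": 72,
    "survivors": [{"line": 42, "mutators": {"ConditionalExpression": 1}}],
    "noCoverage": [],
}
followup = verify_args("src/utils/math.ts", sample_result)
```

A typical loop would run the audit, write tests for the reported survivors, then call the tool again with `followup` to see which mutants are now killed.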
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses sandbox isolation, mutant generation, test-suite checking, language support, and that surviving mutants indicate coverage holes. This is thorough for a tool without annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences and front-loads the main purpose. It is efficient overall, though a few redundant phrases could be trimmed.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 21 parameters, an output schema, and nested objects, the description covers the core function but lacks details on preconditions (e.g., the need for a test suite or a git repository) and on common failure modes. Return values are covered by the output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds no per-parameter meaning beyond what the schema already provides. It does mention language support, but that is not parameter-specific.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Runs on-demand, sandbox-isolated mutation testing against a single source file to identify gaps in unit test coverage.' It names a specific verb, resource, and outcome, distinguishing it from siblings like estimate_audit and triage_test_coverage, which likely estimate or triage rather than execute.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use the tool (on-demand mutation testing) but provides no explicit guidance on when to use this versus siblings or alternatives. No exclusions or when-not-to-use scenarios are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/AraneaDev/Chaos-MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.