triage_test_coverage

Audit files or directories to get a weakest-first ranked leaderboard of mutation scores, highlighting where the test suite is most fragile.

Instructions

Batch triage: audit a set of files and/or directories and return a weakest-first ranked leaderboard of mutation scores, so you can see where the test suite is most fragile in one call. Directories are recursively expanded to supported source files (.ts/.js/.py/.rs/.php), skipping test files. Files are audited serially. Drill into a weak file with audit_code_resilience for per-mutant survivor detail.
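The expansion and ranking behavior described above can be sketched roughly as follows. This is an illustration only, not the server's implementation: `expand_paths`, `looks_like_test_file`, and `rank_weakest_first` are hypothetical names, and the test-file heuristic in particular is an assumption, since the description does not say how test files are detected.

```python
from pathlib import Path

# Extensions listed in the tool description.
SUPPORTED_EXTENSIONS = {".ts", ".js", ".py", ".rs", ".php"}


def looks_like_test_file(relative_path: Path) -> bool:
    """Hypothetical heuristic; the tool's real test-file detection is not documented."""
    name = relative_path.name.lower()
    return (
        ".test." in name
        or ".spec." in name
        or name.startswith("test_")
        or "tests" in relative_path.parts
        or "__tests__" in relative_path.parts
    )


def expand_paths(root: Path, paths: list[str]) -> list[Path]:
    """Expand files and directories into a sorted list of auditable source files."""
    found: set[Path] = set()
    for entry in paths:
        target = root / entry
        # Directories are walked recursively; plain files are taken as-is.
        candidates = target.rglob("*") if target.is_dir() else [target]
        for candidate in candidates:
            if (
                candidate.is_file()
                and candidate.suffix in SUPPORTED_EXTENSIONS
                and not looks_like_test_file(candidate.relative_to(root))
            ):
                found.add(candidate)
    return sorted(found)


def rank_weakest_first(scores: dict[str, float]) -> list[tuple[str, float]]:
    """Order files by ascending mutation score so the weakest come first."""
    return sorted(scores.items(), key=lambda item: (item[1], item[0]))
```

Ties in the ranking are broken by file name here purely to keep the output deterministic; the real tool may order ties differently.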

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| paths | No | Workspace-relative files and/or directories to triage. Directories are recursively expanded to supported source files. Example: ["src/utils", "src/index.ts"] | |
| diffBase | No | Auto-scope the triage to files changed in git. "HEAD" (uncommitted), "staged", or any ref/branch/SHA (merge-base with HEAD). Makes "paths" optional: diffBase alone scans all changed supported source files; diffBase + paths intersects with those paths. TypeScript files are mutated only on changed lines; other languages run whole-file. Example: "main" | |
| maxFiles | No | Cap on the number of files audited (precedence: this arg > config.defaultMaxFiles > 25). Files beyond the cap are skipped (reported in the summary). Example: 25 | |
| minScore | No | Gate: if any file's mutation score is below this (0–100), the result reports gate.passed=false and lists the failing files. Never causes an error. Example: 80. | |
| timeoutMs | No | Per-file mutation-run timeout in milliseconds. Default: 300000 (5 minutes). | |
| outputFormat | No | Output format. "json" (default) or "text". | |
| fileConcurrency | No | How many files to audit in parallel. Default min(4, cpus-1). When >1, each StrykerJS run's worker count is capped so total CPU use stays near the core count. Example: 4 | |
| mutatorDenylist | No | Stryker mutator names to exclude, applied to every TypeScript/JS file. | |
| survivorsPerFile | No | How many top (severity-ranked, enriched) survivor groups to inline per ranked file. 0 (default) returns a scores-only leaderboard. Example: 3 | |
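To make the documented defaults and precedence concrete, here is a small client-side sketch. The parameter names (`paths`, `diffBase`, `minScore`, `survivorsPerFile`) come from the table above; the helper functions are hypothetical, and both the floor of 1 on the concurrency default and the client-side range check on `minScore` are assumptions rather than documented server behavior.

```python
import json
import os
from typing import Optional

BUILT_IN_MAX_FILES = 25  # final fallback named in the maxFiles description


def resolve_max_files(arg: Optional[int], config_default: Optional[int]) -> int:
    """Apply the documented precedence: argument > config.defaultMaxFiles > 25."""
    if arg is not None:
        return arg
    if config_default is not None:
        return config_default
    return BUILT_IN_MAX_FILES


def default_file_concurrency(cpus: Optional[int] = None) -> int:
    """Documented default is min(4, cpus-1); the floor of 1 is an assumption."""
    count = cpus if cpus is not None else (os.cpu_count() or 1)
    return max(1, min(4, count - 1))


def build_triage_args(paths=None, diff_base=None, min_score=None, survivors_per_file=0) -> dict:
    """Assemble an arguments object using the parameter names from the table."""
    # Assumed client-side check; the schema only states the 0-100 range.
    if min_score is not None and not 0 <= min_score <= 100:
        raise ValueError("minScore must be between 0 and 100")
    args = {"survivorsPerFile": survivors_per_file}
    if paths:
        args["paths"] = list(paths)
    if diff_base is not None:
        args["diffBase"] = diff_base
    if min_score is not None:
        args["minScore"] = min_score
    return args


# Example: triage files changed relative to "main" under two paths, gated at 80.
print(json.dumps(
    build_triage_args(paths=["src/utils", "src/index.ts"], diff_base="main", min_score=80),
    indent=2,
))
```

Since every parameter is optional, the builder only emits the keys that were actually supplied and leaves the rest to the server's defaults.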

Output Schema

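| Name | Required | Description | Default |
| --- | --- | --- | --- |
| gate | No | | |
| mode | Yes | | |
| note | Yes | | |
| errors | Yes | | |
| ranking | Yes | | |
| summary | Yes | | |
| scopeNote | No | | |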
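
The output schema lists field names but not their shapes, so the following is only a guess at how the `gate` field could relate to `ranking` and the `minScore` input. The `gate.passed` flag is named in the input schema; the entry keys `file` and `score` and the `failing` list are hypothetical.

```python
from typing import Optional


def evaluate_gate(ranking: list[dict], min_score: Optional[float]) -> Optional[dict]:
    """Return a gate result, or None when no minScore was supplied.

    Returning None mirrors the schema marking `gate` as not required.
    """
    if min_score is None:
        return None
    # A file fails the gate when its mutation score is strictly below the threshold.
    failing = [entry["file"] for entry in ranking if entry["score"] < min_score]
    return {"passed": not failing, "minScore": min_score, "failing": failing}


# Weakest-first ranking with assumed entry keys.
example_ranking = [
    {"file": "src/utils/parse.ts", "score": 42.0},
    {"file": "src/index.ts", "score": 91.5},
]
print(evaluate_gate(example_ranking, 80))
```

As the parameter description notes, a failed gate is reported in the result rather than raised as an error, so callers should inspect `gate.passed` instead of relying on exceptions.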

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It discloses key behaviors: recursive directory expansion, skipping test files, serial file auditing, and the output format (weakest-first leaderboard). It also mentions the `minScore` gate behavior. However, it does not cover every behavior, such as the effect of fileConcurrency or the response structure, although the output schema partly covers the latter.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is compact (three sentences plus a sibling reference) and front-loaded with the core purpose. Every sentence adds value, with no redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 9 parameters, no required fields, and an output schema, the description sufficiently covers the tool's functionality and usage flow. It could more explicitly state that all parameters are optional and describe the summary field, but overall it is complete enough for an agent to understand the tool's role.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds value by explaining how parameters interact (e.g., 'diffBase alone scans all changed supported source files; diffBase + paths intersects'). It also describes the `minScore` gate behavior. This elevates understanding beyond the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's function: 'Batch triage: audit a set of files and/or directories and return a weakest-first ranked leaderboard of mutation scores'. It differentiates from the sibling 'audit_code_resilience' by specifying that it provides a high-level overview, not per-mutant details.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes explicit guidance: 'Drill into a weak file with audit_code_resilience for per-mutant survivor detail.' This clearly tells the agent when to use the sibling tool. It also implies usage context (finding fragile areas) but lacks explicit exclusion scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/AraneaDev/Chaos-MCP'
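
The same request can be issued from Python's standard library. This is a minimal sketch: `build_request` and `fetch_server_info` are hypothetical helper names, and treating the response body as JSON is an assumption, since the response format is not shown here.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/AraneaDev/Chaos-MCP"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Build the GET request that the curl command above sends."""
    return urllib.request.Request(url, method="GET")


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server_info()` performs the network request; `build_request()` alone can be used to inspect the URL and method without sending anything.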

If you have feedback or need assistance with the MCP directory API, please join our Discord server.