Server Configuration

Describes the environment variables required to run the server.

Name | Required | Description | Default

No arguments

Capabilities

Features and capabilities supported by this server

Capability | Details
tools
{
  "listChanged": false
}
prompts
{
  "listChanged": false
}
resources
{
  "subscribe": false,
  "listChanged": false
}
experimental
{}

Tools

Functions exposed to the LLM to take actions

Name | Description
sumo_qa_explain_test_data_requirements

Explain what test data shape and characteristics are needed for a scenario.

Returns: required entity characteristics, resource-state conditions, scenario preconditions, downstream dependencies, edge cases, and explicit "what NOT to use" guidance. Domain-neutral by design — works for any domain (auth, billing, retail, infrastructure, ML, etc.). Optional environment (e.g. "integration") and domain are folded into the analysis.

Common natural-language phrasings that map to this tool: "what data do I need to test X", "what test data should I look for to cover X", "what records / accounts / fixtures do I need for X", "what's the minimum data setup for X", "what edge-case data should I test".

sumo_qa_find_test_data

Search the local known-good test data catalogue for entries that match a scenario.

Returns: ranked matches with confidence, freshness, and suitability reasons. Reads the local YAML catalogue under knowledge/test_data/ only; no external lookups. Optional scenario_tags and known_valid_for narrow the search.

Pagination: pass offset to skip the first N matches, and read total_count, has_more, and next_offset on the response to walk pages. When has_more is false, next_offset is null.

Common natural-language phrasings that map to this tool: "find me test data for X", "do we have a known-good record for X", "give me an account / fixture / record that does X", "is there a fixture for X", "what test data is available for X".

sumo_qa_validate_test_data

Validate a test data entry without provisioning or mutating downstream systems.

Returns: validation result with confidence level, freshness status, and an explained reason. Accepts either entry_id (looked up in the catalogue) or entry (a full record dict).

Common natural-language phrasings that map to this tool: "is this test data still valid", "validate this record", "is entry X still good", "check if X is fresh".

sumo_qa_register_known_good_test_data

Add or update a known-good test data entry in the local YAML catalogue.

Detects duplicates by environment + domain + product/SKU + scenario overlap. Writes to knowledge/test_data/<domain>/known_good.yaml.

Arg shape — pass entry as a literal dict, NOT a YAML string. Example:

sumo_qa_register_known_good_test_data(entry={
    "id": "billing-overdue-invoice-001",
    "environment": "staging",
    "domain": "billing",
    "scenario_tags": ["overdue_invoice", "dunning_eligible"],
    "known_valid_for": ["dunning workflow testing"],
    "constraints": ["Reset overdue flag after test."],
    "owner": "billing-platform",
    "last_validated_at": "2026-05-16T09:00:00Z",
    "confidence": "high",
    "source": "qa-curated",
    "notes": "Overdue invoice usable for dunning-flow testing.",
})

Common natural-language phrasings that map to this tool: "save this as known-good test data", "register this fixture so the team can reuse it", "promote this record to known-good", "update the validated timestamp on entry X", "add this record to the catalogue".

sumo_qa_ingest_knowledge_pack

Adds or replaces team QA knowledge/standards/rules from a local native file.

Accepts a path to a native sumo-qa file or a directory of them: principles.md, techniques.md, classifications.md, approaches.md, a standards-pack *.yaml, or change_rules.yaml. Validates the content and writes a normalized copy into a user-writable pack. The scope argument selects where it lands: 'project' (/.sumo-qa, the current repo only) or 'global' (the user data dir, every repo) — the right scope is a user choice worth confirming. Loader precedence is env var > project > global > bundled > repo root.

A PDF / PPTX / URL or any other non-native source is not parsed here; it returns an unsupported_source result that routes through the sumo-qa-suggesting-external-skill flow to convert the source to markdown, which is then re-ingested with an explicit content_type.

Common natural-language phrasings that map to this tool: "add this to the knowledge base", "replace our principles", "load our team standards pack", "use these change rules", "ingest this QA pack".

sumo_qa_capture_review_feedback

Manage an EXPLICIT, user-confirmed review-feedback memory of recurring QA findings.

Promotes a recurring review lesson (e.g. "we always miss timezone boundaries in billing") into a local, inspectable, reversible memory that future planning/review skills consult as an ADVISORY hint — NOT automatic learning. action selects the operation:

  • 'capture' — add a new lesson (or replace one with the same id). Requires entry with scope, trigger_signal, recommended_probe, source_note, and optional last_reviewed (ISO-8601; defaults to now).

  • 'update' — replace the fields of an existing lesson; needs entry_id plus entry.

  • 'delete' — remove a lesson by entry_id.

  • 'list' (default) — return stored lessons, advisory-flagged. The scope default is the literal 'project', so it lists the current repo; pass scope='global' for the cross-repo set. An unrecognised scope returns an error envelope (it is never coerced to project).

NEVER persist without explicit user confirmation, and NEVER auto-capture from a review/prompt/trace. That confirmation gate is the HOST/skill's responsibility, not enforced by a tool parameter — the deliberate writer-local data-ownership model shared with the risk-ledger and AC tools; the sumo-qa-feedback CLI correspondingly exposes only list/delete, so a capture can never be a fire-and-forget flag. Sensitive input — a raw diff hunk, a secret, a code snippet, or a pasted full issue/PR body — is REJECTED; only the user's own summary is stored, and a rejected entry is never echoed to the debug-capture sink either. Storage reuses the #92 user-writable pack location (project = /.sumo-qa, global = the user data dir) under a feedback/ subdir, so it is NOT a second hidden tree. Memory-derived probes are ADVISORY: cite them SEPARATELY from bundled ISTQB/rules content; they never override canonical classifications or change-rules.

Common natural-language phrasings that map to this tool: "remember that we always miss X in Y", "save this review lesson", "promote this recurring finding to team memory", "what review lessons have we saved?", "forget the timezone-billing lesson".

sumo_qa_load_classifications

Return the canonical change classifications as plain text. The host LLM picks which apply to a given change.

sumo_qa_load_approaches

Return the canonical QA approaches as plain text. The host LLM picks which approach fits a given piece of work.

sumo_qa_load_principles

Return ISTQB Foundation + Advanced + ISO 25010 grounding as plain text. The host LLM cites principles when shaping recommendations.

sumo_qa_load_techniques

Return the test design technique catalogue (black-box, white-box, experience-based, static, property-based, mutation) as plain text. The host LLM picks one technique per named risk.

sumo_qa_load_standards

Return the team's loaded standards packs as plain text. Optional classification filter is metadata-based and accepts comma-separated values (packs whose frontmatter declares any requested classification); no keyword inference.

sumo_qa_load_rules

Return the team's loaded change rules as plain text. Optional classification filter accepts single or comma-separated values and returns matching rule entries; no keyword inference.

sumo_qa_load_catalogue_entry

Load a single catalogue entry, or a whole catalogue in compact form, as a JSON string — a lighter alternative to the full-text loaders for one of the four prose catalogues: classifications, approaches, principles, techniques.

  • With name set: return one entry. name matches the stable slug id (api_contract_change, equivalence-partitioning) or the verbatim heading text (case-insensitive).

  • With name omitted: return the whole catalogue. format="full" (default) returns the verbatim catalogue text; format="compact" returns one lead-line summary per entry.

format: "full" (default) returns verbatim entry text marked canonical=true — safe to cite. "compact" returns a truncated summary marked canonical=false — a navigation/recall aid, NOT a citation replacement; load the full form (or the zero-argument sumo_qa_load_* loader) when exact wording matters.

Never raises: an unknown catalogue, name, or format returns a JSON error envelope listing the valid choices. The existing zero-argument sumo_qa_load_* loaders are unchanged. Read-only and local-only.

sumo_qa_capabilities

Return a compact, read-only map of sumo-qa's core QA workflows — each with a sample prompt, the skill it routes to, and a one-line outcome. A discovery aid for "what can sumo-qa do?"; does NOT replace the using-sumo-qa entry router or sumo_qa_deciding_approach.

sumo_qa_scan_repo

Walk a repository and return a compact summary of its QA-relevant shape: per-type node counts, likely_tests edge counts by confidence, command counts by kind, warning counts by kind. Optionally writes the full schema-validated .sumo-qa/repo-map.json artifact to disk via write_to.

Common natural-language phrasings that map to this tool: "map this repo", "scan the repo and tell me what's here", "give me a QA-shaped inventory of this project", "what tests exercise what sources in this repo", "generate the repo-map artifact for X".

root is the repository to scan (absolute or relative to the MCP server's working directory). generator_version defaults to the installed sumo-qa version. write_to is optional — when set, the full artifact is written to that JSON path, deterministic on the same repo state except for project.generated_at.

sumo_qa_analyze_diff_impact

Map a set of changed files onto the repo-map to report which tests likely exercise them, which changed sources have no mapped test (the risk surface), one-hop affected nodes, unmapped files, and whether the map is stale relative to HEAD.

Common natural-language phrasings that map to this tool: "what does this diff affect", "which tests cover my changes", "what's the risk surface of this branch", "what should I re-test after these edits", "analyse the impact of the changes against main".

root is the repository. Supply changed_files (repo-relative paths) OR base_ref (any git ref; changed files are the diff against the merge-base of base_ref and HEAD, so changes that landed on the base after the branch diverged don't leak in). The repo-map is read from artifact_path when present and falls back to a live scan otherwise; an artifact for a different project root is ignored. On the first run of an unmapped repo the live scan is persisted to artifact_path (reported as persisted_map_path) unless artifact_path is None. write_overlay writes .sumo-qa/diff-impact.json under root. When test files exist but the map has no likely_tests edges, probable_mapping_gap flags the risk surface as a missed-convention gap rather than true zero coverage.
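
The merge-base behaviour described for base_ref can be approximated with two ordinary git commands. The sketch below is an assumption about how such a diff could be computed, not the tool's actual code; changed_files_since and the injectable git parameter are hypothetical names introduced so the logic can be exercised without a real repository.

```python
import subprocess
from typing import Callable, List


def _git(root: str, *args: str) -> str:
    """Run git inside root and return its stdout as text."""
    result = subprocess.run(
        ["git", "-C", root, *args],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def changed_files_since(
    root: str,
    base_ref: str,
    git: Callable[..., str] = _git,
) -> List[str]:
    """List files changed between merge-base(base_ref, HEAD) and HEAD."""
    # Diffing against the merge-base (not the tip of base_ref) keeps commits
    # that landed on the base after the branch diverged out of the result.
    merge_base = git(root, "merge-base", base_ref, "HEAD").strip()
    output = git(root, "diff", "--name-only", merge_base, "HEAD")
    return [line for line in output.splitlines() if line]
```

This is equivalent to the three-dot form `git diff --name-only base_ref...HEAD`.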

sumo_qa_query_repo_map

Search the repo-map for the components, tests, CI checks, configs, or commands that match a query, returning a bounded, ranked list with enough metadata (id, path, type, tags, match reason) to open the files directly — never the full artifact.

Common natural-language phrasings that map to this tool: "find the repo-map node for X", "which tests are mapped to the billing module", "list the CI workflows in the map", "what commands does the repo-map know about", "search the map for files tagged mcp".

root is the repository. query matches case-insensitively across node id, path, file name, type, category, and tags, and across command names and kinds; results rank exact identity above substring hits. limit caps the returned matches (total_matches still reports the full count). types restricts the search to given node types and/or the literal "command". The repo-map is read from artifact_path when present and falls back to a live scan otherwise; an artifact for a different project root is ignored.
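
To illustrate how limit and total_matches interact with ranking, here is a small sketch. It assumes that "exact identity" means a whole-field match and that a node is a dict with id, path, type, and tags; the match_reason values and the query_nodes helper are hypothetical, and command matching is left out.

```python
from typing import Any, Dict, List


def query_nodes(nodes: List[Dict[str, Any]], query: str, limit: int) -> Dict[str, Any]:
    """Rank whole-field matches above substring hits, then cap at limit."""
    needle = query.lower()
    exact: List[Dict[str, Any]] = []
    partial: List[Dict[str, Any]] = []
    for node in nodes:
        fields = [node["id"], node["path"], node["type"], *node.get("tags", [])]
        lowered = [field.lower() for field in fields]
        if needle in lowered:  # some field equals the query
            exact.append({**node, "match_reason": "exact"})
        elif any(needle in field for field in lowered):  # query appears inside a field
            partial.append({**node, "match_reason": "substring"})
    ranked = exact + partial
    # total_matches reports the full count even when limit truncates the list.
    return {"matches": ranked[:limit], "total_matches": len(ranked)}


# Hypothetical nodes used only for illustration.
sample_nodes = [
    {"id": "tests/test_billing.py", "path": "tests/test_billing.py",
     "type": "test", "tags": ["unit"]},
    {"id": "src/billing/invoice.py", "path": "src/billing/invoice.py",
     "type": "source", "tags": ["billing"]},
    {"id": "docs/readme.md", "path": "docs/readme.md", "type": "doc", "tags": []},
]

result = query_nodes(sample_nodes, "Billing", limit=1)
```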

sumo_qa_format_risk_ledger

Validate and render a risk-to-test traceability ledger as a markdown appendix (issue #144). FILE/FORMAT PLUMBING ONLY — the host LLM identifies the risks; this tool never infers them.

Each row is a dict with: risk_id (stable within this response), risk (the statement), source_anchor (file:line or domain term), test (a test id OR a 'planned: …' check), evidence_status (one of planned / passing / failing / stale / accepted_residual), residual (one of open / accepted / mitigated / blocker), and an optional repo_map_node_id linking to a .sumo-qa/repo-map.json node.

Returns the rendered markdown table (the structured appendix the markdown-first verdict carries), a one-line compact summary, the row count, and the count of uncovered blockers (rows that are not passing, not accepted, and marked residual=blocker — the signal the review workflow uses to refuse safe-to-merge). The table is bounded by max_rows so a large ledger stays inside the host token budget.
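
The uncovered-blocker count can be sketched as a filter over the rows. One reading is assumed here and may not match the tool exactly: "not passing, not accepted" is taken to mean that evidence_status is neither passing nor accepted_residual. The count_uncovered_blockers helper and the sample rows are hypothetical.

```python
from typing import Dict, List


def count_uncovered_blockers(rows: List[Dict[str, str]]) -> int:
    """Count blocker rows whose evidence is neither passing nor accepted."""
    covered = ("passing", "accepted_residual")  # assumed reading of the rule
    return sum(
        1
        for row in rows
        if row["residual"] == "blocker" and row["evidence_status"] not in covered
    )


# Hypothetical ledger rows, trimmed to the two fields the count depends on.
ledger = [
    {"risk_id": "R1", "evidence_status": "failing", "residual": "blocker"},
    {"risk_id": "R2", "evidence_status": "passing", "residual": "blocker"},
    {"risk_id": "R3", "evidence_status": "planned", "residual": "open"},
    {"risk_id": "R4", "evidence_status": "stale", "residual": "blocker"},
]

uncovered = count_uncovered_blockers(ledger)
```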

sumo_qa_format_context_bundle

Validate and render a host-neutral issue/PR CONTEXT BUNDLE as a compact markdown brief for QA review/planning (issue #149). FILE/FORMAT PLUMBING ONLY — the host gathers the facts; this tool never inspects a repo, makes a network call, or assumes GitHub. A partial/empty bundle is first-class: when little is supplied, the consuming skill falls back to direct repo inspection.

Common natural-language phrasings that map to this tool: "build the review context bundle", "format this PR/issue context for review", "render the context bundle with its freshness", "summarise the diff/CI/test facts I gathered".

bundle is a dict with optional issue_summary, pr_summary, head_sha, changed_files (each {path, change_kind}), test_evidence / ci_status (each {result, freshness, source}, plus optional captured_at / detail), and user_constraints. freshness is one of fresh/stale/unknown/absent; only a FRESH PASS is safety-supporting — a stale, unknown, or absent fact is rendered with an explicit "do not claim safety from it" warning. Supply local_head_sha (the host's live local head) to detect a bundle-vs-local-state conflict; when the shas differ the brief calls out the divergence instead of trusting either side. max_files bounds the changed-file list.
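
The freshness rule and the head-sha conflict check can be sketched in a few lines. This is illustrative only: the literal result value "pass" is an assumption (the description names a "FRESH PASS" but not the exact string), and both helper names are hypothetical.

```python
from typing import Any, Dict, Optional


def is_safety_supporting(fact: Dict[str, Any]) -> bool:
    """Only a fresh pass supports a safety claim; anything else does not."""
    # "pass" is an assumed spelling of the passing result value.
    return fact.get("freshness") == "fresh" and fact.get("result") == "pass"


def has_head_conflict(bundle: Dict[str, Any], local_head_sha: Optional[str]) -> bool:
    """True when both shas are known and they differ."""
    bundle_sha = bundle.get("head_sha")
    return bool(bundle_sha and local_head_sha and bundle_sha != local_head_sha)


# Hypothetical bundle used only for illustration.
bundle = {
    "head_sha": "1111111",
    "ci_status": {"result": "pass", "freshness": "stale", "source": "ci"},
    "test_evidence": {"result": "pass", "freshness": "fresh", "source": "local run"},
}
```

A stale pass such as the ci_status above would be rendered with the warning rather than counted as evidence.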

sumo_qa_export_test_cases

Deterministically EXPORT already-structured QA test cases into one documented machine-readable shape (issue #148). FILE/FORMAT PLUMBING ONLY — the host LLM identifies the cases; this tool never infers them and never inspects a repo. By DEFAULT it is side-effect free (it RETURNS the rendered text and writes nothing); a file is persisted ONLY when an explicit output_path is supplied.

Each case is a dict with: id (stable within this export), title, preconditions (ordered list, may be empty), steps (ordered list, may be empty), expected_result, optional linked_risk_id (a risk id in a companion risk ledger), priority (one of critical / high / medium / low), and evidence_status (one of planned / passing / failing / stale / accepted_residual — the same vocabulary as the risk ledger).

format is one of: markdown (the DEFAULT human-facing table), json (a versioned, key-sorted, deterministic document), or csv (OPTIONAL, and only valid for a flat outline: at most one precondition and one step per case). An unsupported format, or CSV for a non-flat export, returns an error envelope naming the supported formats. Tool-specific import mappings may need local adjustment.
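
The flat-outline rule for CSV can be sketched as a pre-check. The shape of the error envelope below is an assumption, as is the check_export_format name; only the three format names and the one-precondition, one-step limit come from the description.

```python
from typing import Any, Dict, List

SUPPORTED_FORMATS = ("markdown", "json", "csv")


def check_export_format(cases: List[Dict[str, Any]], fmt: str = "markdown") -> Dict[str, Any]:
    """Return {"format": fmt} when usable, else an assumed error envelope."""
    if fmt not in SUPPORTED_FORMATS:
        return {
            "error": f"unsupported format {fmt!r}",
            "supported_formats": list(SUPPORTED_FORMATS),
        }
    if fmt == "csv":
        # CSV needs a flat outline: at most one precondition and one step per case.
        flat = all(
            len(case.get("preconditions", [])) <= 1 and len(case.get("steps", [])) <= 1
            for case in cases
        )
        if not flat:
            return {
                "error": "csv requires a flat outline",
                "supported_formats": list(SUPPORTED_FORMATS),
            }
    return {"format": fmt}


# Hypothetical cases used only for illustration.
flat_case = {"id": "TC-1", "preconditions": ["user exists"], "steps": ["log in"]}
nested_case = {"id": "TC-2", "preconditions": [], "steps": ["open page", "submit form"]}
```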

export_title (optional) names the export as a whole — rendered in the markdown header and the JSON top-level title. (It is named export_title, not title, so it is distinct from each case's own title and survives the served-schema title-slimming pass.)

output_path (optional) is the EXPLICIT file-write carve-out. When omitted (the default) nothing is written. When given, the SAME rendered bytes are ALSO persisted, confined to the project export root (<cwd>/.sumo-qa/exports): a relative path resolves under that root, an absolute path or .. traversal that escapes it is refused, and an already-existing target is refused rather than silently overwritten. The write only happens AFTER successful validation+render, so a bad export never leaves a file. On a successful write written_path carries the resolved absolute path (else None).
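
The confinement rules for output_path can be sketched with pathlib. This is a minimal illustration under assumptions, not the tool's code: resolve_export_path is a hypothetical helper, and the exception types stand in for whatever refusal the tool actually returns.

```python
from pathlib import Path


def resolve_export_path(cwd: Path, output_path: str) -> Path:
    """Resolve output_path under <cwd>/.sumo-qa/exports or refuse it."""
    root = (cwd / ".sumo-qa" / "exports").resolve()
    # Joining with an absolute output_path discards root, so an absolute
    # path outside the root fails the containment check below.
    candidate = (root / output_path).resolve()
    if root not in candidate.parents:
        raise ValueError(f"{output_path!r} escapes the export root")
    if candidate.exists():
        raise FileExistsError(f"{candidate} already exists; refusing to overwrite")
    return candidate
```

Resolving before the containment check is what defeats `..` traversal: the comparison is made on the normalised path, not on the string the caller supplied.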

Returns the rendered content, the chosen format, the stamped schema_version, the validated test_case_count, and written_path (the persisted location, or None on the default no-write path).

sumo_qa_search_external_skills

Search the Skills CLI registry for external agent skills.

Returns the ANSI-stripped CLI output verbatim plus a one-line hint on how to read it. No structured parsing — the host LLM interprets the raw text so format drift in the Skills CLI doesn't break the flow.

sumo_qa_check_external_skill_installed

Locate an installed external SKILL.md file for Codex, Claude, or agents paths.

Returns the first matching path for project or global skill locations, or null when the skill is absent.

sumo_qa_install_external_skill

Install an external agent skill through the Skills CLI.

The confirmed flag records that the host received explicit user approval before invoking the install operation.

sumo_qa_execute_external_skill

Load an installed external SKILL.md and return the execution handoff.

The payload contains the skill body plus the original intent so the host can follow the external workflow in the current conversation.

sumo_qa_list_skill_manifests

Return deterministic metadata for every bundled sumo-qa skill as a JSON string — a routing/index aid, NOT the skill bodies.

detail controls how much per-skill index is included (default "compact"):

  • "compact" — routing metadata only: skill_name, tool_name (the zero-argument skill tool), description (from frontmatter), content_hash (sha256 of the SKILL.md) and estimated_tokens_full. NO sections[]/modules[] arrays — the cheap all-skill routing slice.

  • "full_index" — the same metadata PLUS each skill's sections[] (id, heading, level, estimated_tokens, required) and modules[] (id, path, estimated_tokens) index arrays. Section ids are stable heading slugs (duplicates get -2/-3 suffixes); required marks the structural sections (frontmatter, Iron Law, Checklist, Flow, Red Flags, HARD-GATE) when present.
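
The duplicate-suffix rule for section ids can be sketched as follows. The slug rule used here (lowercase, non-alphanumerics collapsed to hyphens) is an assumption; the description only states that ids are stable heading slugs and that duplicates get -2/-3 suffixes. section_ids is a hypothetical helper.

```python
import re
from typing import Dict, List


def section_ids(headings: List[str]) -> List[str]:
    """Slugify headings, suffixing repeats with -2, -3, ... in order."""
    seen: Dict[str, int] = {}
    ids: List[str] = []
    for heading in headings:
        # Assumed slug rule: lowercase, runs of other characters become "-".
        slug = re.sub(r"[^a-z0-9]+", "-", heading.lower()).strip("-")
        seen[slug] = seen.get(slug, 0) + 1
        ids.append(slug if seen[slug] == 1 else f"{slug}-{seen[slug]}")
    return ids


ids = section_ids(["Iron Law", "Checklist", "Red Flags", "Checklist", "Checklist"])
```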

Once routing has chosen one skill, fetch that skill's section/module index with sumo_qa_load_skill_context(skill_name, mode="manifest"), then a single slice via mode="section"/"module"/"full".

An unrecognised detail returns a JSON error envelope listing the valid values rather than raising. Read-only and local-only: no network, no extraction, no caching. The existing zero-argument skill tools still return full bodies unchanged.

sumo_qa_load_skill_context

Load just one slice of a skill's context as a JSON string, instead of the whole SKILL.md body.

mode:

  • "manifest" — routing summary + section list + module list;

  • "section" — one section's text (pass section, an id from the manifest);

  • "module" — one module's text (pass module, an id from the manifest);

  • "full" — the entire SKILL.md body, byte-for-byte identical to the existing zero-argument skill tool for skill_name.

The section/module/full slices each return content_hash (sha256 of the returned text) and estimated_tokens. Pass known_hash to ask "has this slice changed since hash X?": a match returns changed=false with the body omitted (saving the re-send), a mismatch returns changed=true with the body. This is derived per call — there is NO hidden session cache, so it is safe across hosts regardless of MCP session identity.
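
The known_hash exchange can be sketched as a stateless comparison. The "content" key and the load_slice name are assumptions; content_hash, changed, and the sha256-of-returned-text rule come from the description. What the tool returns for changed when no known_hash is passed is not stated, so the sketch simply reports changed=True in that case.

```python
import hashlib
from typing import Any, Dict, Optional


def load_slice(text: str, known_hash: Optional[str] = None) -> Dict[str, Any]:
    """Return the slice, omitting the body when known_hash still matches."""
    content_hash = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if known_hash == content_hash:
        # Unchanged: skip the body so the host does not pay to re-send it.
        return {"changed": False, "content_hash": content_hash}
    return {"changed": True, "content_hash": content_hash, "content": text}


first = load_slice("## Checklist\n- run the suite\n")
second = load_slice("## Checklist\n- run the suite\n", known_hash=first["content_hash"])
third = load_slice("## Checklist\n- run the suite twice\n", known_hash=first["content_hash"])
```

Because the hash is recomputed on every call, nothing depends on the server remembering an earlier request.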

Never raises: an unknown skill_name/mode/section/module, a missing required arg, or a path-traversal attempt returns a JSON error envelope listing the valid choices. Read-only and local-only.

sumo_qa_answering_testing_question

Use when the user asks a generic testing question — "how do I test this?", "what should I check for X?" — that doesn't fit a more specific QA skill. Cites a principle or technique from the loaded catalogue rather than producing generic advice.

sumo_qa_creating_test_plan

Use when the user asks for a formal test plan, entry/exit criteria, or a phased QA approach for a piece of work. Walk the user through scope → risks → entry criteria → phases → exit criteria → residual risks one section at a time, getting confirmation before each step. Heavier than sumo-qa-preparing-for-work; use when the work is tracked or formally reviewed.

sumo_qa_deciding_approach

Use as the FIRST step on any QA intent. Loads classifications + approaches (the two needed to route), then reasons over the user's intent to pick the canonical approach and routes to the matching sub-skill (which loads any further catalogues on demand).

sumo_qa_executing_qa_rollout

Use after sumo-qa-planning-qa-rollout to dispatch a written QA plan task-by-task. Each task runs in a fresh subagent (parallel where independent); each subagent's output goes through a two-stage review (test-correctness → test-quality) before the task is marked done. Continuous execution — no per-task check-ins. Finishes by routing to sumo-qa-finishing-qa-work.

sumo_qa_finding_test_data

Use when the user asks about test data — what data to test X, find a known-good record, validate an entry, register new known-good data. Routes between sumo_qa_explain_test_data_requirements, sumo_qa_find_test_data, sumo_qa_validate_test_data, and sumo_qa_register_known_good_test_data.

sumo_qa_finishing_qa_work

Use at the end of a QA rollout (after sumo-qa-executing-qa-rollout, or after a manual multi-step QA task) to capture evidence, produce a PR-ready summary, and close the loop. Runs the suite one last time, captures coverage / risk-to-test map / open follow-ups, writes a markdown summary to docs/qa/runs/YYYY-MM-DD-.md, and offers to draft the PR description.

sumo_qa_implementing_with_tdd

Use after sumo-qa-deciding-approach picks tdd-scaffold, regression-first, or coverage-first-then-refactor — e.g. "write a regression test for this bug" or "scaffold the failing tests first". Walks plan → name-the-risk-and-test-idea → confirm → red → hand off → green → review, one section per turn with confirmation gates. Don't write the test until the test idea has been agreed.

sumo_qa_planning_qa_rollout

Use when you have a chunk of QA work (a story, a PR, a strategy phase) that needs to be turned into a written plan of bite-sized, independently-executable tasks before any test code is written. Walks scope → file structure → bite-sized tasks → confirm, one section per turn. Produces docs/qa/plans/YYYY-MM-DD-.md ready for subagent dispatch via sumo-qa-executing-qa-rollout.

sumo_qa_preparing_for_work

Use when the user asks to plan QA for a story, ticket, or piece of work before coding starts. Identifies named risks anchored in the change shape, then proposes a smallest useful test set tied to those risks. Lighter-weight than sumo-qa-creating-test-plan; no formal entry/exit criteria.

sumo_qa_reviewing_before_merge

Use when the user asks "review my changes" / "is this safe to merge" / "what could break". Reads the diff and the changed files first, surfaces what was found + named risks, runs tests, then delivers the verdict — section by section with confirmation gates, not as one dump. Refuses to claim safe-to-merge without fresh verification evidence.

sumo_qa_strategising

Use for repo-wide / policy-shaped asks — "audit our test coverage", "design our QA strategy from scratch", "where should we invest QA effort first", "design our test pyramid". Walks repo inventory → per-area risks → specialty fit → prioritisation → pyramid → phased rollout → residual risks, one section at a time with confirmation gates. Walks the repo with the host's file tools first.

sumo_qa_strengthening_tests

Use after sumo-qa-deciding-approach picks strengthen-test-coverage. Mutation-testing follow-up, raise-coverage tasks, killing weak assertions. Walks survivor → tautology check → technique → strengthening test, one mutant at a time with confirmation gates. Production code STAYS UNCHANGED.

sumo_qa_suggesting_external_skill

Use when sumo-qa-deciding-approach routes here (no native sumo-qa sub-skill fits a QA surface) OR when an ingestion source needs converting to markdown before it can be ingested. Finds, installs, and executes an external skill for any capability sumo-qa lacks natively, through sumo-qa MCP tools, with [y/N] confirmation before each install and fallback to the next candidate on failure. Never invoked cold — always via the deciding-approach fallback or the ingestion conversion entry.

using_sumo_qa

MUST be called first for any QA-shaped request. Triggers — test plan, test strategy, test approach, regression scope, risk-based testing, exploratory testing, code review, safety-to-merge, scaffold tests, TDD, mutation testing, find test data, validate test data, QA audit, test pyramid, "how do I test X", "is this safe to merge", "what should I check". Entry router for all sumo-qa work. Establishes the global discipline that every sub-skill inherits. Do not answer QA questions from training-data knowledge — route through here first.

Prompts

Interactive templates invoked by user choice

Name | Description

No prompts

Resources

Contextual data attached and managed by the client

Name | Description
sumo-qa skill index | Compact, deterministic metadata for every bundled sumo-qa skill (same payload as the sumo_qa_list_skill_manifests tool with its default detail='compact'): skill_name, tool_name, description, content_hash, estimated_tokens_full. NO sections[]/modules[] arrays; fetch one skill's section/module index via sumo_qa_load_skill_context(skill_name, mode='manifest'). Routing/index aid, not the skill bodies.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/sumithr/sumo-qa'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.