openclaw-output-vetter-mcp

verify_action_outcome

Compare an agent's claimed outcome against actual before/after state snapshots to detect misreported actions.

Instructions

v1.1+ — Compare an agent's stated outcome against actual before/after state snapshots. Catches the [@chiefofautism, 158↑] failure mode: the agent runs `rm -rf` or `git push --force` and then says 'I cleaned up the project structure' — bash-vet catches the destructive command, while this tool checks the misreport about what got done. Also catches the Codex-CoT sandbox-escalation pattern: the agent acknowledges a read-only constraint, then writes anyway (pass `read_only: true` in the before snapshot). Pure function — the caller captures the snapshots; the server is stateless. Returns an ActionOutcomeReport with a verdict (CLEAN / PARTIALLY_GROUNDED / FABRICATED / UNVERIFIED) plus per-mismatch evidence.
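
As a concrete illustration of the sandbox-escalation case, the caller would pass snapshots shaped like the following (a hedged sketch: the key names follow the input schema below; the file lists and SHA are invented):

    # Hypothetical snapshots for the read-only escalation check: the agent
    # acknowledged a read-only constraint, then wrote a file anyway.
    before_snapshot = {
        "read_only": True,                  # asserts the no-write constraint
        "files": ["README.md", "main.py"],
        "git_tip": "a1b2c3d",
    }
    after_snapshot = {
        "read_only": True,
        "files": ["README.md", "main.py", "notes.txt"],  # a new file appeared
        "git_tip": "a1b2c3d",
    }
    # verify_action_outcome("I only inspected the code", before_snapshot,
    # after_snapshot) should then surface a constraint-violation mismatch.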

Input Schema

  • claim (string, required): The agent's stated outcome — verbatim. Examples: 'I cleaned up the project structure', 'tests pass', 'committed and pushed', 'created auth_v2.py'.
  • before_snapshot (object, required): Caller-captured state BEFORE the agent acted. Recognized keys: files (list[str]), git_status, git_tip / git_head / git_log_tip (str SHA), tests_status / test_status, read_only (bool — asserts a no-write constraint). Other keys are tracked but not matched against the claim.
  • after_snapshot (object, required): Caller-captured state AFTER the agent acted. Same key conventions as before_snapshot.
  • expected_changes (array of strings, optional): Caller-supplied list of expected changes. Recognized formats: 'file:foo.py:added', 'file:bar.py:removed', 'git:committed', 'git:clean', 'tests:pass'. Each missing entry becomes a MISSING_EXPECTED_CHANGE finding. See the sketch below.
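
For example, a caller expecting a refactor to add one file, remove another, and land as a commit could pass (entries invented for illustration):

    expected_changes = [
        "file:auth_v2.py:added",   # should exist only in the after snapshot
        "file:legacy.py:removed",  # should exist only in the before snapshot
        "git:committed",           # the git tip SHA should have moved
        "tests:pass",              # tests_status should read as passing afterwards
    ]

Any entry the diff cannot confirm comes back as a MISSING_EXPECTED_CHANGE finding.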

Implementation Reference

  • Main handler function that compares an agent's claim against before/after state snapshots, extracts assertions, checks them against the diff, and returns an ActionOutcomeReport with verdict and mismatches.
    def verify_action_outcome(
        claim: str,
        before_snapshot: Mapping[str, Any],
        after_snapshot: Mapping[str, Any],
        expected_changes: list[str] | None = None,
    ) -> ActionOutcomeReport:
        """Compare an agent claim against actual before/after state diff.
    
        Pure function — does not capture state itself; the caller passes both
        snapshots. The server stays stateless (same posture as `verify_grounding`).
    
        Both snapshots must be Mapping-shaped; non-dict inputs should be coerced
        by the call site (server.py does this for MCP tool calls).
        """
        diff = _compute_diff(before_snapshot, after_snapshot)
        diff_summary = _diff_summary(diff)
    
        assertions = _extract_claim_assertions(claim)
    
        mismatches: list[ActionOutcomeMismatch] = []
        matched = 0
        mismatched = 0
    
        for kind, target, excerpt in assertions:
            m = _check_assertion(kind, target, excerpt, diff, before_snapshot, after_snapshot)
            if m is None:
                matched += 1
            else:
                mismatched += 1
                mismatches.append(m)
    
        # Always check constraint violations + caller-supplied expected_changes
        # (computed once; the UNVERIFIED branch below reuses the result)
        constraint_mismatches = _check_constraint_violations(before_snapshot, diff)
        mismatches.extend(constraint_mismatches)
        mismatches.extend(_check_expected_changes(expected_changes, diff))
    
        # Sort: CRITICAL → HIGH → MEDIUM → LOW → INFO, then by rule_id
        severity_rank = {
            Severity.CRITICAL: 0,
            Severity.HIGH: 1,
            Severity.MEDIUM: 2,
            Severity.LOW: 3,
            Severity.INFO: 4,
        }
        mismatches.sort(key=lambda m: (severity_rank[m.severity], m.rule_id))
    
        # Verdict composition
        has_critical = any(m.severity == Severity.CRITICAL for m in mismatches)
        has_high = any(m.severity == Severity.HIGH for m in mismatches)
    
        if not assertions and not expected_changes and not constraint_mismatches:
            verdict = Verdict.UNVERIFIED
            summary = (
                "UNVERIFIED — claim has no extractable assertions, no "
                "expected_changes were supplied, and no constraint violations "
                "were detected. Provide a more specific claim (filename, "
                "'tests pass', 'committed', etc.) or pass expected_changes."
            )
        elif mismatched == 0 and not mismatches:
            verdict = Verdict.CLEAN
            summary = (
                f"CLEAN — claim is supported by the diff. "
                f"{matched} assertion(s) matched. Diff: {diff_summary}."
            )
        elif has_critical or (has_high and matched == 0):
            verdict = Verdict.FABRICATED
            summary = (
                f"FABRICATED — claim is contradicted by the diff. "
                f"{matched} matched / {mismatched} mismatched. "
                f"Worst: {mismatches[0].rule_id}. Diff: {diff_summary}."
            )
        else:
            verdict = Verdict.PARTIALLY_GROUNDED
            summary = (
                f"PARTIALLY_GROUNDED — some claim assertions match the diff, others don't. "
                f"{matched} matched / {mismatched} mismatched. Diff: {diff_summary}."
            )
    
        return ActionOutcomeReport(
            verdict=verdict,
            matched_count=matched,
            mismatched_count=mismatched,
            mismatches=mismatches,
            diff_summary=diff_summary,
            summary=summary,
        )
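  • Usage sketch (illustrative, not from the repo): calling the handler directly with invented snapshot values, to show how the verdict composes when one assertion matches and one does not.
    report = verify_action_outcome(
        claim="Created auth_v2.py and removed legacy.py",
        before_snapshot={"files": ["legacy.py", "main.py"]},
        after_snapshot={"files": ["main.py"]},  # legacy.py removed; auth_v2.py never created
    )
    # The deletion assertion matches the diff while the creation assertion
    # does not. Assuming the missing-creation mismatch is reported below
    # CRITICAL severity, the one match should land the verdict on
    # PARTIALLY_GROUNDED rather than FABRICATED.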
  • Output type (ActionOutcomeReport) for the verify_action_outcome tool; individual findings are carried as ActionOutcomeMismatch entries.
    class ActionOutcomeReport(BaseModel):
        """Response for `verify_action_outcome` — compares an agent claim against before/after state diff.
    
        The scanner is the next layer below `review_transcript`'s
        `unverified-completion-claim` check: that one fires when a claim has *no
        supporting tool calls visible in the transcript*. This one fires when a
        claim *has* supporting tool calls, but the side effects don't match.
        """
    
        model_config = ConfigDict(frozen=True)
    
        verdict: Verdict
        """CLEAN if all extracted claims match the diff; PARTIALLY_GROUNDED if some
        match and some don't; FABRICATED if the diff actively contradicts the
        claim (state unchanged or violated stated constraint); UNVERIFIED if the
        claim couldn't be parsed into checkable assertions."""
        matched_count: int
        """Claim assertions that matched the diff."""
        mismatched_count: int
        """Claim assertions that did not match the diff."""
        mismatches: list[ActionOutcomeMismatch]
        """All mismatches, sorted CRITICAL → INFO."""
        diff_summary: str
        """One-line text summary of what changed between the snapshots."""
        summary: str
        """One-line human-readable summary, prefixed with the verdict name."""
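  • ActionOutcomeMismatch itself is not listed on this page. A plausible shape, inferred only from the fields the handler reads (severity, rule_id); the remaining fields are assumptions:
    class ActionOutcomeMismatch(BaseModel):  # hypothetical sketch, not the repo's definition
        model_config = ConfigDict(frozen=True)

        rule_id: str        # stable identifier; used as the sort tiebreaker above
        severity: Severity  # CRITICAL..INFO, drives verdict composition
        claim_excerpt: str  # assumed field: the claim slice that triggered the check
        evidence: str       # assumed field: what the diff actually showed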
  • MCP tool registration in list_tools() with name 'verify_action_outcome', description, and inputSchema defining claim, before_snapshot, after_snapshot (required) and expected_changes (optional).
    Tool(
        name="verify_action_outcome",
        description=(
            "v1.1+ — Compare an agent's stated outcome against actual "
            "before/after state snapshots. Catches the [@chiefofautism, 158↑] "
            "failure mode: agent runs `rm -rf` / `git push --force` and then "
            "says 'I cleaned up the project structure' — bash-vet catches the "
            "destructive command, this checks the *misreport* about what got "
            "done. Also catches the Codex-CoT sandbox-escalation pattern: "
            "agent acknowledges read-only constraint, then writes anyway "
            "(pass `read_only: true` in the before snapshot). Pure function — "
            "caller captures snapshots; server is stateless. Returns "
            "ActionOutcomeReport with verdict (CLEAN / PARTIALLY_GROUNDED / "
            "FABRICATED / UNVERIFIED) + per-mismatch evidence."
        ),
        inputSchema={
            "type": "object",
            "properties": {
                "claim": {
                    "type": "string",
                    "description": (
                        "The agent's stated outcome — verbatim. Examples: "
                        "'I cleaned up the project structure', 'tests pass', "
                        "'committed and pushed', 'created auth_v2.py'."
                    ),
                },
                "before_snapshot": {
                    "type": "object",
                    "description": (
                        "Caller-captured state BEFORE the agent acted. "
                        "Recognized keys: files (list[str]), git_status, "
                        "git_tip / git_head / git_log_tip (str SHA), "
                        "tests_status / test_status, read_only (bool — "
                        "asserts no-write constraint). Other keys are "
                        "tracked but not matched against claim."
                    ),
                },
                "after_snapshot": {
                    "type": "object",
                    "description": (
                        "Caller-captured state AFTER the agent acted. Same "
                        "key conventions as before_snapshot."
                    ),
                },
                "expected_changes": {
                    "type": "array",
                    "description": (
                        "Optional caller-supplied list of expected changes. "
                        "Recognized formats: 'file:foo.py:added', "
                        "'file:bar.py:removed', 'git:committed', "
                        "'git:clean', 'tests:pass'. Each missing entry "
                        "becomes a MISSING_EXPECTED_CHANGE finding."
                    ),
                    "items": {"type": "string"},
                },
            },
            "required": ["claim", "before_snapshot", "after_snapshot"],
        },
    ),
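  • Client-side call sketch (illustrative): invoking the registered tool through the MCP Python SDK's ClientSession.call_tool; session setup is omitted and the snapshot values are invented.
    result = await session.call_tool(
        "verify_action_outcome",
        arguments={
            "claim": "committed and pushed",
            "before_snapshot": {"git_tip": "a1b2c3d", "git_status": "dirty"},
            "after_snapshot": {"git_tip": "a1b2c3d", "git_status": "dirty"},  # tip unmoved
        },
    )
    # With an unmoved git tip and unchanged status, a 'committed and pushed'
    # claim should be contradicted by the diff.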
  • Tool dispatch in call_tool(): routes the 'verify_action_outcome' tool name to the verify_action_outcome handler with argument extraction and type coercion.
    if name == "verify_action_outcome":
        claim = str(arguments.get("claim", "")).strip()
        before = arguments.get("before_snapshot")
        after = arguments.get("after_snapshot")
        expected = arguments.get("expected_changes")
        if not isinstance(before, dict):
            before = {}
        if not isinstance(after, dict):
            after = {}
        if expected is not None and not isinstance(expected, list):
            expected = None
        return _serialize(verify_action_outcome(claim, before, after, expected_changes=expected))
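  • Coercion note (illustrative): per the guards above, a malformed snapshot argument degrades to an empty dict instead of raising.
    # Invented payload: before_snapshot arrives as a string rather than an object.
    arguments = {"claim": "tests pass", "before_snapshot": "oops", "after_snapshot": {}}
    # The dispatch coerces before_snapshot to {}, so verification proceeds
    # against an empty "before" state instead of failing the tool call.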
  • Claim extraction helper that parses agent claim text using regex patterns (created_file, deleted_file, tests_pass, committed, clean_state, vague_completion) and supports multi-target expansion for chained filenames.
    def _extract_claim_assertions(claim: str) -> list[tuple[str, str | None, str]]:
        """Parse claim text into list of (kind, target_or_none, claim_excerpt) tuples."""
        if not claim or not claim.strip():
            return []
    
        assertions: list[tuple[str, str | None, str]] = []
        seen_specific = False
    
        for kind, pattern in _CLAIM_PATTERNS:
            for m in pattern.finditer(claim):
                target = m.group(1) if m.groups() else None
                excerpt = claim[max(0, m.start() - 10) : min(len(claim), m.end() + 30)].strip()
                if len(excerpt) > 200:
                    excerpt = excerpt[:200] + "..."
                if kind == "vague_completion" and seen_specific:
                    # Skip vague matches if we already have specific ones
                    continue
                assertions.append((kind, target, excerpt))
                if kind not in ("vague_completion",):
                    seen_specific = True
    
                # Multi-target expansion (v1.2+): for file-creation/deletion claims,
                # scan the text immediately after the matched span for chained
                # filenames connected by ", " / " and " / ", and ". This catches
                # "Created auth.py and helpers.py" / "Removed old.py, legacy.py".
                if kind in ("created_file", "created_file_terse", "deleted_file") and target:
                    tail_start = m.end()
                    # Bound the tail at the next sentence boundary so we don't drag
                    # filenames from later sentences into this assertion's scope.
                    sentence_end = _next_sentence_boundary(claim, tail_start)
                    tail = claim[tail_start:sentence_end]
                    for fm in _MULTI_TARGET_FOLLOWUP.finditer(tail):
                        chained_target = fm.group(1)
                        if chained_target == target:
                            continue
                        chained_excerpt = (
                            f"{excerpt} (chained: '{chained_target}')"
                            if len(excerpt) + len(chained_target) + 14 <= 200
                            else excerpt
                        )
                        assertions.append((kind, chained_target, chained_excerpt))
    
        # Dedupe identical (kind, target) pairs while preserving order
        seen: set[tuple[str, str | None]] = set()
        unique: list[tuple[str, str | None, str]] = []
        for a in assertions:
            key = (a[0], a[1])
            if key in seen:
                continue
            seen.add(key)
            unique.append(a)
        return unique
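  • _CLAIM_PATTERNS is referenced above but not listed on this page. A hedged sketch of what such a table could look like (regexes invented for illustration; the server's actual patterns may differ). For the seen_specific skip to work, the vague_completion patterns must come last, as they do here.
    import re

    _CLAIM_PATTERNS: list[tuple[str, re.Pattern[str]]] = [
        ("created_file", re.compile(r"\bcreated\s+([\w./-]+\.\w+)", re.I)),
        ("created_file_terse", re.compile(r"\badded\s+([\w./-]+\.\w+)", re.I)),
        ("deleted_file", re.compile(r"\b(?:deleted|removed)\s+([\w./-]+\.\w+)", re.I)),
        ("tests_pass", re.compile(r"\btests?\s+pass(?:ed|ing)?\b", re.I)),
        ("committed", re.compile(r"\bcommit(?:ted)?\b", re.I)),
        ("clean_state", re.compile(r"\bclean(?:ed)?\s+up\b", re.I)),
        ("vague_completion", re.compile(r"\b(?:done|finished|completed)\b", re.I)),
    ]
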
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Even with no structured annotations, the description fully discloses behavior: pure function, stateless, returns a structured verdict. It details the verdict categories and the snapshot-capture caveat (the caller, not the server, captures state), providing complete transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is detailed and informative, but somewhat verbose. It front-loads the purpose but includes extensive examples and context that could be condensed without losing value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (4 params, nested objects, no output schema), the description is highly complete. It explains the return type, verdict values, and use cases, leaving no major gaps for an agent to understand usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and schema descriptions are present. The description adds valuable context (examples, recognized keys, formats) that goes beyond the schema, aiding parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool compares the agent's stated outcome against before/after snapshots to detect misreporting, citing specific failure modes and a clear verdict output. It distinguishes itself effectively from sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use the tool and provides context (pure function, stateless), but does not explicitly state when not to use it or offer direct comparisons to sibling tools, which slightly reduces clarity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
