verify-action-mcp

When your AI agent says "I deleted user 12345" but the row count didn't change — this catches it. A small third-party verification service for AI agent tool-call evidence: submit (claim, evidence), get back a verdict and an HMAC-attested receipt.

License: MIT

🇯🇵 For the Japanese edition, see the second half of this page.


English

Why

AI agents commonly assert success when reality didn't match:

  • "I deleted user 12345" — but the row count didn't actually change.

  • "I added a null check" — but the diff also rewrote 5 unrelated functions.

  • "I sent the welcome email to alice@example.com" — but the request body actually targeted bob@example.com.

These silent successes don't show up in benchmarks (which score "did the model say it succeeded?"). They surface when something downstream breaks — sometimes hours or days later.

verify-action-mcp is a small third-party service that catches that drift before downstream tools commit to it. It is a post-action evidence verifier: the receipt proves what was checked, not what is true. Pre-action policy admission control products from major vendors work in a different lane; this one runs after the agent has done the work, using the artifacts that work produced.

Quick start

MCP (Claude Code, Cursor, Cline, Codex, etc.)

// claude_desktop_config.json or your harness's MCP config
{
  "mcpServers": {
    "verify-action": {
      "transport": {"type": "http", "url": "https://verify.armadalab.dev/mcp"}
    }
  }
}

The agent now has a verify_action tool available. It can self-call before reporting completion, or you can invoke it from your harness logic.

REST

curl -X POST https://verify.armadalab.dev/verify \
  -H 'Content-Type: application/json' \
  -d '{
    "claim": "Deleted user 12345",
    "evidence": {
      "before_count": 100,
      "after_count": 99,
      "operation": "DELETE FROM users WHERE id=12345",
      "affected_rows": 1
    }
  }'

Response (receipt truncated; full shape below):

{
  "verdict": "ok",
  "aar_verdict": "verified",
  "reasoning": "Row count decreased by exactly 1; SQL operation matches DELETE semantics; user id matches claim.",
  "confidence": 0.92,
  "verifier_used": "db_op_v1",
  "kind_dispatched": "db_op",
  "receipt": {
    "schema": "verify_action_receipt.v0",
    "verdict": "verified",
    "claim_hash": "sha256:<64-hex>",
    "evidence_manifest_hash": "sha256:<64-hex>",
    "kid": "v0-default",
    "issued_by": "aar:reference-impl@v0",
    "signature": "hmac-sha256:<base64>",
    "_full": "(see Receipts section)"
  }
}
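The same call can be made from Python with only the standard library. The sketch below mirrors the curl example; the function names `build_verify_request` and `verify` and the `timeout` default are illustrative assumptions, not part of the project.

```python
import json
import urllib.request

# Hosted endpoint taken from the curl example above.
VERIFY_URL = "https://verify.armadalab.dev/verify"


def build_verify_request(claim, evidence, kind=None, url=VERIFY_URL):
    """Build (but do not send) the POST request for /verify."""
    payload = {"claim": claim, "evidence": evidence}
    if kind is not None:
        # Optional: when omitted, the service infers the kind from the evidence.
        payload["kind"] = kind
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def verify(claim, evidence, kind=None, url=VERIFY_URL, timeout=10.0):
    """Send the request and return the parsed JSON response as a dict."""
    req = build_verify_request(claim, evidence, kind, url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For a self-hosted instance, pass `url="http://127.0.0.1:8092/verify"` (the bind address printed by `start.sh`).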

Self-host

git clone https://github.com/Armada735/verify-action-mcp
cd verify-action-mcp
./start.sh   # binds 127.0.0.1:8092
./stop.sh

Pure Python stdlib. No pip install. Tested on Linux.

What it verifies

A dispatcher routes by kind (or auto-infers from evidence shape):

| Kind | Evidence shape | Critical signal that forces mismatch |
| --- | --- | --- |
| code_diff | {diff: "<unified diff>"} | All claimed paths absent from diff |
| db_op | {before_count, after_count, operation, affected_rows} | Claim ID not in SQL ID |
| file_op | {path, exists_before, exists_after, line_count?, size_bytes?} | Numeric divergence > 50% or > 50 absolute |
| api_call | {request, response_status, response_body} | Email target mismatch (claim ↔ request body) |
| generic | any object | (conservative; usually returns insufficient_evidence) |
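To make "auto-infers from evidence shape" concrete, here is a minimal sketch of shape-based dispatch. It uses only the evidence keys listed in the table; the function name `infer_kind` and the precedence order of the checks are assumptions, and the real dispatcher may decide differently.

```python
def infer_kind(evidence):
    """Guess a verifier kind from the keys present in the evidence object."""
    if not isinstance(evidence, dict):
        return "generic"
    keys = set(evidence)
    if "diff" in keys:
        return "code_diff"
    if {"before_count", "after_count"} <= keys or "affected_rows" in keys:
        return "db_op"
    if "path" in keys and ("exists_before" in keys or "exists_after" in keys):
        return "file_op"
    if "request" in keys or "response_status" in keys:
        return "api_call"
    # Nothing recognizable: fall back to the conservative generic verifier.
    return "generic"
```

Passing `kind` explicitly in the request avoids relying on inference at all.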

Each verifier looks at:

  • Verb in claim ↔ direction of state change (delete = -1, insert = +1, update = 0)

  • Specific identifiers / paths / emails / URLs

  • Counts / line counts / sizes

  • HTTP status semantics

  • "Critical signals" that force mismatch regardless of pos/neg balance
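As an illustration of the first and last bullets, the sketch below checks a db_op claim: it maps the verb to an expected direction of change, then applies the "claim ID not in SQL" critical signal before comparing row counts. The verb word lists, the helper names, and the decision order are assumptions made for this example; the shipped db_op verifier weighs more signals than this.

```python
import re

# Direction of state change per verb, as listed above.
VERB_DELTA = {"delete": -1, "insert": +1, "update": 0}

# Hypothetical word lists; the real verifier's vocabulary is not documented here.
_VERB_PATTERNS = {
    "delete": re.compile(r"\b(delet(e|ed|es|ing)|remov(e|ed|es|ing))\b", re.I),
    "insert": re.compile(r"\b(insert(ed|s|ing)?|add(ed|s|ing)?|creat(e|ed|es|ing))\b", re.I),
    "update": re.compile(r"\b(updat(e|ed|es|ing)|modif(y|ied|ies|ying))\b", re.I),
}


def claimed_direction(claim):
    """Return -1, 0 or +1 for the first recognized verb, or None if no verb matches."""
    for verb, pattern in _VERB_PATTERNS.items():
        if pattern.search(claim):
            return VERB_DELTA[verb]
    return None


def _sign(n):
    return (n > 0) - (n < 0)


def check_db_op(claim, evidence):
    """Return an aar_verdict-style string for a db_op claim."""
    expected = claimed_direction(claim)
    before = evidence.get("before_count")
    after = evidence.get("after_count")
    if expected is None or before is None or after is None:
        return "insufficient_evidence"
    # Critical signal: none of the ids named in the claim appear in the SQL text.
    claim_ids = set(re.findall(r"\d+", claim))
    sql_ids = set(re.findall(r"\d+", evidence.get("operation", "")))
    if claim_ids and claim_ids.isdisjoint(sql_ids):
        return "contradicted"
    return "verified" if _sign(after - before) == expected else "contradicted"
```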

Verdicts (dual format)

| Field | Values | Notes |
| --- | --- | --- |
| aar_verdict | verified / contradicted / insufficient_evidence / unsafe_to_verify | 4-value canonical (verify_action_receipt.v0) |
| verdict | ok / mismatch / uncertain | 3-value legacy alias for backwards compatibility |

unsafe_to_verify is returned when the verifier itself raised an exception (cannot examine evidence) — distinct from insufficient_evidence (evidence examined, ambiguous).
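One way a harness might consume the canonical field is to let the agent report success only on `verified`. This is a sketch, not project code: the function name `may_report_success` and the `min_confidence` threshold of 0.8 are assumptions chosen for illustration.

```python
AAR_VERDICTS = (
    "verified",
    "contradicted",
    "insufficient_evidence",
    "unsafe_to_verify",
)


def may_report_success(response, min_confidence=0.8):
    """Return True only when the service verified the claim with enough confidence."""
    verdict = response.get("aar_verdict")
    if verdict not in AAR_VERDICTS:
        raise ValueError(f"unexpected aar_verdict: {verdict!r}")
    # Every non-verified verdict, including unsafe_to_verify, blocks the report.
    return verdict == "verified" and response.get("confidence", 0.0) >= min_confidence
```

Treating `insufficient_evidence` and `unsafe_to_verify` the same as `contradicted` is the cautious choice; a harness that prefers to retry with richer evidence can branch on them separately.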

Receipts (verify_action_receipt.v0)

Every /verify call also issues an HMAC-SHA256-attested receipt as a nested receipt field. Full shape:

| Field | Type | Description |
| --- | --- | --- |
| schema | string | "verify_action_receipt.v0" |
| kid | string | Key id; v0 ships with "v0-default". Operators rotate keys with fresh kids. |
| issued_by | string | Issuer identifier (this reference impl: "aar:reference-impl@v0") |
| issued_at | string | RFC 3339 UTC timestamp |
| verifier_id | string | "verify-action-mcp@<version>" |
| verifier_method | string | "rule_based.<kind>" (e.g. rule_based.db_op) |
| claim_hash | string | "sha256:<64-hex>"; content-addressed, raw claim is not stored |
| evidence_manifest_hash | string | "sha256:<64-hex>"; content-addressed, raw evidence is not stored |
| verdict | string | One of the 4 aar_verdict values |
| confidence | number | 0–1 |
| reason_codes | array of strings | Free-form diagnostic codes (v0 unrestricted) |
| policy_or_oracle_refs | array of strings | Optional refs to policy / oracle inputs (usually []) |
| caller_context | object | Optional caller_context echoed back (max 8 keys, 64-char strings) |
| signature | string | "hmac-sha256:<base64-no-padding>" |

What the receipt asserts: that this specific service issued this specific verdict for this content-addressed (claim, evidence) pair at this time, signed under a known key id (kid).

What the receipt does NOT assert: factual truth of the claim, legal admissibility in any forum, or warranty of any kind.

Trust model in v0: HMAC is symmetric, so a receipt shows only that a secret key under our control signed it, and anyone holding that key could produce the same signature. It is not a third-party attestation in the cryptographic sense. Treat v0 receipts as content-addressed log entries from this service. The schema upgrade path for v1 (asymmetric ed25519, multi-issuer) is documented in aar/SCHEMA_UPGRADES.md.
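The mechanics of an HMAC-attested receipt can be sketched with the standard library. The field formats ("sha256:<64-hex>", "hmac-sha256:<base64-no-padding>") follow the table above, but the canonicalization used here (sorted-key compact JSON over every field except `signature`) and the UTF-8 hashing of the claim are assumptions; the service's actual signing input may differ, so this will not necessarily validate real receipts.

```python
import base64
import hashlib
import hmac
import json


def content_hash(text):
    """Content-address a string in the "sha256:<64-hex>" format."""
    return "sha256:" + hashlib.sha256(text.encode("utf-8")).hexdigest()


def _canonical_bytes(receipt):
    # Assumed canonical form: all fields except the signature, keys sorted.
    unsigned = {k: v for k, v in receipt.items() if k != "signature"}
    return json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_receipt(receipt, key):
    """Return a copy of the receipt with an HMAC-SHA256 signature attached."""
    mac = hmac.new(key, _canonical_bytes(receipt), hashlib.sha256).digest()
    sig = base64.b64encode(mac).decode("ascii").rstrip("=")  # base64 without padding
    return {**receipt, "signature": "hmac-sha256:" + sig}


def verify_receipt(receipt, key):
    """Recompute the signature and compare it in constant time."""
    expected = sign_receipt(receipt, key)["signature"]
    actual = str(receipt.get("signature", ""))
    return hmac.compare_digest(expected.encode("utf-8"), actual.encode("utf-8"))
```

Because the same key both signs and verifies, only a party that already holds the key can run `verify_receipt`, which is why v0 receipts are best read as the service's own log entries.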

API

| Method | Path | Purpose |
| --- | --- | --- |
| GET | /, /about | Project description (HTML) |
| GET | /healthcheck | Liveness probe |
| GET | /spec | Tool schema + verifier kinds (JSON) |
| GET | /stats | Aggregate counters since process start |
| GET | /privacy | Privacy notice (HTML) |
| GET | /tos | Terms of service (HTML) |
| POST | /verify | REST: {claim, evidence, kind?, context?, caller_context?} → verdict + receipt |
| POST | /mcp | MCP JSON-RPC 2.0 endpoint |

MCP methods

  • initialize → {protocolVersion: "2024-11-05", capabilities: {tools: {}}, serverInfo: {name, version}}

  • tools/list → {tools: [{name: "verify_action", description, inputSchema}]}

  • tools/call (name=verify_action) → {content: [...], isError, _structured_result: {verdict, aar_verdict, reasoning, confidence, receipt, ...}}

  • notifications/initialized, ping → empty result
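For harnesses that talk to /mcp directly, a tools/call exchange can be sketched as below. The JSON-RPC envelope and the `name` / `arguments` parameters follow the MCP tools/call convention; that the tool's arguments are `claim` and `evidence` is an assumption carried over from the REST body, so check the `inputSchema` returned by tools/list. The helper names are hypothetical.

```python
def tools_call_request(claim, evidence, request_id=1):
    """Build the JSON-RPC 2.0 body for calling the verify_action tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "verify_action",
            # Assumed to mirror the REST request body.
            "arguments": {"claim": claim, "evidence": evidence},
        },
    }


def structured_result(rpc_response):
    """Extract _structured_result from a tools/call response, raising on errors."""
    if "error" in rpc_response:
        raise RuntimeError(f"JSON-RPC error: {rpc_response['error']}")
    result = rpc_response["result"]
    if result.get("isError"):
        raise RuntimeError("verify_action reported isError")
    return result["_structured_result"]
```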

Examples

code_diff — coherent

curl -X POST https://verify.armadalab.dev/verify -H 'Content-Type: application/json' -d '{
  "claim": "Added null check for user.email in src/user.py",
  "evidence": {
    "diff": "--- a/src/user.py\n+++ b/src/user.py\n@@ -10,3 +10,5 @@\n def get_email(user):\n+    if user.email is None:\n+        return None\n     return user.email"
  }
}'
# → aar_verdict: verified (legacy: ok), confidence ~0.9
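The code_diff critical signal ("all claimed paths absent from diff") can be approximated in a few lines. This is a sketch under simplifying assumptions: claimed paths are recognized only when they contain a slash, and diff paths are read from the `---` / `+++` header lines. The function names are illustrative.

```python
import re


def paths_in_diff(diff):
    """Collect file paths from the ---/+++ headers of a unified diff."""
    paths = set()
    for line in diff.splitlines():
        m = re.match(r"^(?:---|\+\+\+) (?:[ab]/)?(\S+)", line)
        if m and m.group(1) != "/dev/null":
            paths.add(m.group(1))
    return paths


def claimed_paths(claim):
    """Find path-like tokens (at least one slash) mentioned in the claim."""
    return set(re.findall(r"(?:[\w.-]+/)+[\w.-]+", claim))


def all_claimed_paths_absent(claim, diff):
    """True when the claim names paths and none of them appear in the diff."""
    claimed = claimed_paths(claim)
    return bool(claimed) and claimed.isdisjoint(paths_in_diff(diff))
```

With the example above, `src/user.py` appears in both the claim and the diff headers, so the signal does not fire.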

file_op — line count mismatch

curl -X POST https://verify.armadalab.dev/verify -H 'Content-Type: application/json' -d '{
  "claim": "Created /tmp/output.txt with 200 lines",
  "evidence": {"path":"/tmp/output.txt","exists_before":false,"exists_after":true,"line_count":50}
}'
# → aar_verdict: contradicted (legacy: mismatch) — claim said 200 lines, evidence says 50
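The file_op threshold ("> 50% or > 50 absolute") applied to this example looks roughly as follows. Measuring the relative gap against the claimed value is an assumption, as are the function names; the source states only the two limits.

```python
import re


def claimed_line_count(claim):
    """Pull a number of the form "<N> lines" out of the claim, if present."""
    m = re.search(r"(\d+)\s+lines?\b", claim)
    return int(m.group(1)) if m else None


def numeric_divergence(claimed, observed, rel_limit=0.5, abs_limit=50):
    """True when the gap exceeds 50 in absolute terms or 50% of the claimed value."""
    gap = abs(claimed - observed)
    if claimed:
        relative = gap / abs(claimed)
    else:
        relative = 0.0 if gap == 0 else float("inf")
    return gap > abs_limit or relative > rel_limit
```

Here the claim says 200 lines and the evidence says 50: the gap is 150 (above 50) and 75% of the claimed value (above 50%), so either limit alone would force the mismatch.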

api_call — target email mismatch (critical signal)

curl -X POST https://verify.armadalab.dev/verify -H 'Content-Type: application/json' -d '{
  "claim": "Sent welcome email to alice@example.com",
  "evidence": {
    "request": {"to":"bob@example.com","subject":"Welcome!"},
    "response_status": 200, "response_body": "{\"sent\":true}"
  }
}'
# → aar_verdict: contradicted — target email differs from claim
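A minimal sketch of the email-target signal: extract addresses from the claim and from the request, and flag the case where both contain addresses but share none. The regular expression is a deliberately simple approximation, and the function names are hypothetical.

```python
import json
import re

# Simplified address pattern; good enough for illustration, not RFC-complete.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def emails_in(value):
    """Return the lower-cased email addresses found in a string or JSON-like value."""
    text = value if isinstance(value, str) else json.dumps(value)
    return {match.lower() for match in EMAIL_RE.findall(text)}


def email_target_mismatch(claim, request):
    """True when claim and request both name addresses and none overlap."""
    claimed = emails_in(claim)
    sent = emails_in(request)
    return bool(claimed) and bool(sent) and claimed.isdisjoint(sent)
```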

Privacy

  • IP addresses are SHA-256-hashed with a salt generated per server install. Plaintext IPs are never persisted.

  • Submitted claims and evidence are written to private trace logs marked untrusted_payload. Aggregate findings may be published; individual traces stay private.

  • 30-day log retention is enforced by the included purge_old_logs.sh script (operator installs as a daily cron — see monitor/CRON.md for the entry).

  • A PII guard rejects payloads containing JP My Number-shape (12-digit) sequences, passport-shape strings, or credit-card-shape digits (with Luhn check). Detection is structural — the guard does NOT confirm any number is a real personal identifier.

  • traces/ is chmod 600.
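Two of the mechanisms above, the credit-card-shape check with Luhn validation and salted IP hashing, can be sketched as follows. The card-shape pattern (13 to 19 digits, optionally separated by spaces or hyphens), the salt-then-IP concatenation, and the function names are assumptions; the My Number and passport checks are not shown.

```python
import hashlib
import re


def luhn_valid(digits):
    """Standard Luhn checksum over a string of decimal digits."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


# 13-19 digits, each optionally followed by a single space or hyphen.
CARD_SHAPE = re.compile(r"(?<!\d)(?:\d[ -]?){12,18}\d(?!\d)")


def contains_card_shape(text):
    """True if the text holds a card-shaped digit run that passes the Luhn check."""
    for match in CARD_SHAPE.finditer(text):
        if luhn_valid(re.sub(r"\D", "", match.group())):
            return True
    return False


def hash_ip(ip, salt):
    """Salted SHA-256 of an IP address; the plaintext IP is never stored."""
    return hashlib.sha256(salt + ip.encode("utf-8")).hexdigest()
```

As the guard description says, a match is structural only: passing the Luhn check does not mean the number belongs to a real card.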

See /privacy and /tos for the user-facing notice.

Phase 1 limitations

  • Rule-based only — no LLM-as-judge. The 4 specialized verifiers handle their kinds well; the generic axis is conservative (often returns insufficient_evidence).

  • No sub-claim decomposition — 1 claim → 1 verifier.

  • No cross-trace correlation — each call is independent.

  • HMAC-attested receipts only — symmetric, single-issuer. Asymmetric / multi-issuer path documented in aar/SCHEMA_UPGRADES.md.

  • No SLA, no rate-limit guarantee, no uptime promise on the hosted endpoint. Self-host (above) for stability.

Who this is for / not for

For:

  • Agent harness developers wanting a quick post-action sanity check

  • Multi-agent pipeline operators wanting an integrity boundary between steps

  • Anyone evaluating "did this agent do what it said it did?" patterns

Not for:

  • Security-critical attestation (HMAC v0 is not third-party-strong; wait for v1 ed25519)

  • High-throughput production with strict SLA (run self-hosted, expect to maintain it)

  • Domain-specific reasoning the rule-based verifiers don't cover (extend by writing a custom verifier kind under verifiers/)

Roadmap

  • Schema v1: ed25519 + multi-issuer (aar/SCHEMA_UPGRADES.md)

  • LLM-augmented generic verifier (opt-in)

  • Sub-claim decomposition for multi-step actions

  • Cumulative observation API ("this harness mismatches on file_op X% of the time")

  • Custom verifier registration

This is a 90-day probe. If meaningful adoption appears, v1 schema work begins.

License

MIT — see LICENSE.

Contact

Maintained by Armada (@Ardev_lab). Issues / questions: GitHub Issues, or hello@armadalab.dev.


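Japanese edition (English translation)

What this is

When an AI agent says "I deleted user 12345" but the DB row count has not changed, this small third-party verification service catches that kind of silent inconsistency.

It takes a (claim, evidence) pair from the agent and returns a consistency verdict together with an HMAC-signed receipt (verify_action_receipt.v0).

Failure patterns in scope (in general terms)

  • The agent says "I deleted user 12345", but the DB row count has not changed

  • The agent says "I added a null check", but the diff also contains rewrites of 5 unrelated functions

  • The agent says "I sent the welcome email to alice@example.com", but the actual request body targets bob@example.com

Benchmarks look at whether the model said it succeeded; whether the actual state was updated consistently with the claim is a separate question. The latter is one of the important concerns when operating agents.

verify-action-mcp is the layer that catches that gap before downstream tools confirm it. It provides post-action evidence verification, a layer independent of existing pre-action admission control (policy admission control, that is, permission checks before a tool call).

It does not claim to be an industry standard and is positioned as a reference implementation. The receipt schema (verify_action_receipt.v0) is designed to be small enough to fork.

Usage

MCP (Claude Code / Cursor / Cline / Codex, etc.)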

{
  "mcpServers": {
    "verify-action": {
      "transport": {"type": "http", "url": "https://verify.armadalab.dev/mcp"}
    }
  }
}

verify_action now appears in the agent's tool list. The intended pattern is for the agent to self-call it just before reporting completion.

REST

curl -X POST https://verify.armadalab.dev/verify -H 'Content-Type: application/json' -d '{
  "claim": "user 12345 を削除しました",
  "evidence": {
    "before_count": 100, "after_count": 99,
    "operation": "DELETE FROM users WHERE id=12345",
    "affected_rows": 1
  }
}'

Response (excerpt; for the full receipt shape, see the Receipt section below):

{
  "verdict": "ok",
  "aar_verdict": "verified",
  "reasoning": "Row count decreased by exactly 1; SQL operation matches DELETE semantics; user id matches claim.",
  "confidence": 0.92,
  "receipt": { "schema": "verify_action_receipt.v0", "...": "..." }
}

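| 4-value verdict (aar_verdict) | Meaning |
| --- | --- |
| verified | Claim and evidence are consistent |
| contradicted | Claim and evidence show a decisive mismatch |
| insufficient_evidence | Evidence was examined, but there is not enough to decide |
| unsafe_to_verify | The verifier raised an exception and could not examine the evidence |

The legacy 3 values (ok / mismatch / uncertain) are still returned in the verdict field, so existing clients remain compatible.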

Receipt (HMAC-signed receipt)

The /verify response carries a signed verify_action_receipt.v0 receipt nested under receipt. Main fields:

| Field | Content |
| --- | --- |
| schema | "verify_action_receipt.v0" |
| kid | Key id (the v0 default is "v0-default"; operators issue a new kid on rotation) |
| issued_by | Issuer identifier (the reference impl uses "aar:reference-impl@v0") |
| issued_at | RFC 3339 UTC timestamp |
| verifier_id | "verify-action-mcp@<version>" |
| verifier_method | "rule_based.<kind>" (e.g. rule_based.db_op) |
| claim_hash | "sha256:<64-hex>"; the claim text is not stored |
| evidence_manifest_hash | "sha256:<64-hex>"; the evidence text is not stored |
| verdict | One of the 4 values |
| confidence | 0..1 |
| reason_codes | Array of free-form diagnostic codes |
| signature | "hmac-sha256:<base64>" |

What the receipt means: only that this instance issued this verdict, at this time, for this (claim, evidence) pair (referenced by hash). It asserts neither the truth of the claim itself, nor admissibility as evidence in any legal proceeding, nor any warranty of quality.

v0 trust model: HMAC uses a symmetric key, so a receipt proves only that this service signed it (with a known secret key). Third-party-attestation strength is planned for v1 (ed25519 + multi-issuer) onward. For the schema extension path, see aar/SCHEMA_UPGRADES.md.

Privacy

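  • IPs are hashed with SHA-256 plus a salt into a 16-character value (raw IPs are not stored). Note: hashed IPs are not used to identify specific individuals.

  • claim / evidence are recorded in private trace logs as untrusted_payload; only aggregate metrics are published

  • Logs are deleted automatically after 30 days (the operator runs purge_old_logs.sh as a daily cron)

  • For payloads containing digit sequences shaped like specific personal information such as My Number, passport-shape strings, or credit-card-shape digits (with Luhn check), receipt issuance is stopped (detection is by format only and does not confirm that a number is real).

  • traces/ is chmod 600

For details, see /privacy and /tos.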

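Current limitations

  • stdlib only / rule-based: no LLM-as-judge is implemented. The generic axis is intentionally weak

  • No sub-claim decomposition: 1 claim → 1 verifier

  • No cross-trace correlation: each call is judged independently

  • HMAC (symmetric key) only: multi-issuer / asymmetric support comes in v1 (aar/SCHEMA_UPGRADES.md)

  • The hosted endpoint has no SLA, uptime, or rate-limit guarantee: self-hosting is recommended if you need stability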

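Intended audience

  • Agent harness developers who want to build in a sanity check before the completion report

  • Multi-agent pipeline operators who want an integrity boundary between steps

  • Anyone who wants to keep observing whether an agent really did what it said it did

Roadmap

This runs as a 90-day probe; whether to continue, scale back, or withdraw is decided by kill criteria committed to in advance. If adoption appears, work starts with schema v1 (ed25519 + multi-issuer).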

License and contact

  • License: MIT (see LICENSE)

  • Maintainer: Armada (@Ardev_lab)

  • Issues / questions: GitHub Issues or hello@armadalab.dev

  • Note: currently provided free of charge (any future move to paid plans will be announced)
