ControlKeel
ControlKeel is a governance and control plane for AI agent-led software delivery, providing deterministic validation, sandboxed execution, cost control, memory, review gates, and observability across AI coding hosts.
Validation & Execution
ck_validate: Check code, config, shell commands, or text against governance rules (trust boundaries, domain packs) before execution
ck_execute_code: Run generated JavaScript or Python in a Docker sandbox with network/filesystem restrictions and dry-run support
Context & Files
ck_context/ck_context_pack: Fetch mission state, findings, budget, proof summaries, and build compact context bundles for agents
ck_fs_ls, ck_fs_read, ck_fs_find, ck_fs_grep: Read-only browsing and searching of the bound project root
Git Integration
ck_git_diff: Generate diffs with CK validation applied
ck_git_commit: Validate commit messages before committing
ck_git_status: Get git status correlated with findings
Governance & Review
ck_finding: Persist findings with severity and ruling (allow/warn/block/escalate)
ck_review_submit/ck_review_status/ck_review_feedback: Submit plans or diffs for human review, check status, and approve/deny
ck_regression_result: Ingest external regression test evidence into proof bundles
Memory & Goals
ck_memory_search/ck_memory_record/ck_memory_archive: Store, retrieve, and archive typed governed memory (decisions, findings, proofs)
ck_goal: Record, list, and update durable goals across sessions
Budget & Cost Control
ck_budget: Estimate/commit costs against session and daily budgets with circuit breakers
ck_cost_optimizer: Get cost optimization suggestions or compare agent pricing
ck_token_audit: Audit rule files and skills for token bloat and duplicates
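To make the estimate/commit idea with a circuit breaker concrete, here is a minimal Python sketch. It is not ControlKeel's implementation or API: the class name `BudgetBreaker`, its fields, and the limits are all hypothetical, chosen only to show how an estimate can be checked without side effects while a failed commit trips the breaker.

```python
from dataclasses import dataclass


@dataclass
class BudgetBreaker:
    """Hypothetical session/daily budget tracker; not ControlKeel's implementation."""
    session_limit: float
    daily_limit: float
    session_spent: float = 0.0
    daily_spent: float = 0.0
    tripped: bool = False

    def estimate(self, cost: float) -> bool:
        """Return True if `cost` fits in both budgets, without recording it."""
        if self.tripped:
            return False
        return (self.session_spent + cost <= self.session_limit
                and self.daily_spent + cost <= self.daily_limit)

    def commit(self, cost: float) -> bool:
        """Record `cost` if it fits; otherwise trip the breaker and refuse."""
        if not self.estimate(cost):
            self.tripped = True
            return False
        self.session_spent += cost
        self.daily_spent += cost
        return True


breaker = BudgetBreaker(session_limit=1.00, daily_limit=5.00)
print(breaker.commit(0.40))    # fits within both limits
print(breaker.commit(0.70))    # would exceed the session limit, so the breaker trips
print(breaker.estimate(0.10))  # refused while the breaker is tripped
```

Separating `estimate` from `commit` lets a caller ask "would this fit?" before doing expensive work, and only a rejected commit changes the breaker state.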
Routing & Delegation
ck_route: Recommend the best AI agent for a task based on security tier, budget, and task type
ck_delegate: Hand off governed tasks to another agent in auto, embedded, handoff, or runtime mode
Deployment & Observability
ck_deployment_advisor: Analyze project stack, suggest platforms, and generate CI/CD or Docker config files
ck_outcome_tracker: Record session outcomes and retrieve leaderboard data for reinforcement learning
ck_mcp_discover: Auto-discover tools from external MCP servers
Integrates with Amp as a plugin-native host for governing agent-generated software delivery, providing MCP configuration and companion files.
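Since these tools are served over MCP, a client ultimately invokes one with a JSON-RPC 2.0 `tools/call` request. The Python sketch below builds such a message for `ck_validate`. The envelope follows the MCP specification, but the argument keys (`content`, `kind`) are placeholders assumed for illustration; the real input schema is whatever the server reports through `tools/list`.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# The argument keys below are hypothetical; consult the tool's published
# input schema for the real field names before sending anything.
request = build_tool_call(
    1, "ck_validate", {"content": "rm -rf /tmp/build", "kind": "shell"}
)
print(request)
```

In practice an MCP client library or the host itself constructs this message, so the sketch is mainly useful for reading logs or debugging a stdio session.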
ControlKeel
Turn the way your team works into enforceable memory for AI agents. - @arya_minus
ControlKeel is an agent control plane for day-to-day governed engineering. Through observation, findings, and evaluation, it learns your intent rules, review taste, and delivery habits, and turns them into typed memory, policy checks, and proof bundles. CK sits between your coding agents and production as a portable "company brain": it compares intended delivery against actual delivery and turns raw agent intent into policy-validated tasks.
If you're using an AI agent today, you probably have a *.md file telling it how to behave. But a rules/specs file is just a promise made to the model. ControlKeel enforces the output. Beyond catching bugs, CK addresses the "Unknown Unknowns" problem: having to re-explain your domain knowledge in every single session.
Product loop
Capture intent and policy — scope, risk, budget, domain pack, and human taste become CK state.
Validate agent output — deterministic checks and optional advisory review produce findings before risky work reaches main.
Gate only when needed — humans approve high-impact actions when intent, risk, or policy requires it.
Persist evidence — findings, reviews, proofs, memory, cost, and task outcomes survive host switches.
Improve with evals — traces and recurring failures become bounded regression evidence for specific suites and subjects.
ControlKeel transforms your domain knowledge from "raw" intent and "shelfware" documentation into a living system that remembers, enforces, and evolves.
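The "validate, then gate only when needed" part of the loop can be pictured as reducing a set of findings to a single action. The Python sketch below uses the four rulings named for `ck_finding` (allow, warn, block, escalate); the `Finding` class, the `gate` function, the rule identifiers, and the precedence of escalate over block are all assumptions made for illustration, not ControlKeel's actual policy engine.

```python
from dataclasses import dataclass

RULINGS = ("allow", "warn", "block", "escalate")


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record: a rule identifier plus a ruling."""
    rule: str
    ruling: str

    def __post_init__(self):
        if self.ruling not in RULINGS:
            raise ValueError(f"unknown ruling: {self.ruling}")


def gate(findings: list[Finding]) -> str:
    """Reduce findings to one action; escalate outranks block (assumed order)."""
    rulings = {f.ruling for f in findings}
    if "escalate" in rulings:
        return "ask_human"   # a person must approve before work continues
    if "block" in rulings:
        return "stop"        # deterministic rule says no; no human needed
    return "proceed"         # allow/warn findings are recorded but do not gate


print(gate([Finding("style.naming", "warn")]))
print(gate([Finding("secrets.hardcoded", "block"), Finding("style.naming", "warn")]))
print(gate([Finding("deploy.production", "escalate")]))
```

Under these assumptions a human is pulled in only for the escalate case, which mirrors the "gate only when needed" step.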
Quick start
One-line setup via your agent
Copy/paste this into your agent (OpenCode, Codex, Claude, or another supported host):
Set up ControlKeel for this repository. Read and follow https://raw.githubusercontent.com/aryaminus/controlkeel/main/README.md, https://raw.githubusercontent.com/aryaminus/controlkeel/main/docs/getting-started.md, https://raw.githubusercontent.com/aryaminus/controlkeel/main/docs/support-matrix.md, and https://raw.githubusercontent.com/aryaminus/controlkeel/main/docs/agent-integrations.md. Install ControlKeel if missing, run `controlkeel setup`, detect this agent host, attach the strongest supported path with `controlkeel attach <host>`, then run `controlkeel attach doctor`, `controlkeel provider doctor`, `controlkeel status`, `controlkeel findings`, and the host-native MCP check. If CK is available only as MCP, call `ck_attach` for this host. Apply only safe local fixes and redact secrets from logs. Pause and ask before continuing if the host needs workspace trust, manual provider configuration, a restart after attach/plugin changes, or a plan-review approval that cannot auto-wait. Ensure the project is trusted and restart the host after attach/plugin changes.
CLI install
Install the CLI:
brew tap aryaminus/controlkeel && brew install controlkeel
# or
npm i -g @aryaminus/controlkeel
# or
curl -fsSL https://github.com/aryaminus/controlkeel/releases/latest/download/install.sh | sh
Windows PowerShell:
irm https://github.com/aryaminus/controlkeel/releases/latest/download/install.ps1 | iex
First governed run:
controlkeel
controlkeel setup
controlkeel attach opencode # or another supported host
controlkeel attach doctor
controlkeel provider doctor
controlkeel status
controlkeel findings
For the complete first-run path, use docs/getting-started.md. For host truth, use docs/support-matrix.md and docs/agent-integrations.md.
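If you want to script the first governed run, for example in a bootstrap helper, the sequence can be wrapped so it stops at the first failing step. This Python sketch is an assumption-laden convenience wrapper, not part of ControlKeel: `FIRST_RUN` and `run_first_run` are hypothetical names, the bare `controlkeel` invocation is left out because it may be interactive, and `opencode` stands in for whichever supported host you use.

```python
import shutil
import subprocess

# Subcommands taken from the first-run sequence; "opencode" is one example host.
FIRST_RUN = [
    ["controlkeel", "setup"],
    ["controlkeel", "attach", "opencode"],
    ["controlkeel", "attach", "doctor"],
    ["controlkeel", "provider", "doctor"],
    ["controlkeel", "status"],
    ["controlkeel", "findings"],
]


def run_first_run(commands=FIRST_RUN, runner=subprocess.run) -> int:
    """Run each command in order; return how many completed successfully."""
    binary = commands[0][0]
    if shutil.which(binary) is None:
        print(f"{binary} is not on PATH; install it first")
        return 0
    completed = 0
    for cmd in commands:
        result = runner(cmd)
        if result.returncode != 0:
            # Stop early so later steps do not run against a broken setup.
            print("stopped at:", " ".join(cmd))
            break
        completed += 1
    return completed
```

Calling `run_first_run()` returns 6 when every step succeeds. Stopping at the first non-zero exit code keeps a failed `attach` from being masked by later commands.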
Benchmark-backed evidence
ControlKeel includes a persisted benchmark engine. Current user-facing evidence is bounded to the named suite, subject, and scoring definition below; docs/benchmarks.md is the canonical reference for full tables, caveats, JSON exports, and agent-host protocols.
Verified with-vs-without-CK baseline (host_comparison_v1, 12 risky scenarios)
Verified with ControlKeel 0.3.45:
Risky suite (host_comparison_v1): ungoverned_baseline caught 0/12; controlkeel_validate caught 12/12, blocked 9/12, and hit expected rules 9/12, with median deterministic validation time 52 ms and 0 provider tokens.
Paired benign suite (benign_baseline_v1): controlkeel_validate produced 0/10 catches, 0/10 blocks, FPR 0.000, median deterministic validation time 42 ms, and 0 provider tokens.
Read the numbers precisely: deterministic scanner evidence is not the same as model-backed agent-host evidence. Reproduction commands and the OpenCode/Copilot/Claude/Codex comparison protocol live in docs/benchmarks.md.
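For readers who want the reported counts as rates, the arithmetic is shown below in Python. It assumes the false positive rate is simply benign catches divided by benign scenarios, which matches the reported 0/10 and FPR 0.000; docs/benchmarks.md remains the canonical scoring definition, and the variable names are illustrative only.

```python
def rate(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a fraction; reject empty suites."""
    if denominator == 0:
        raise ValueError("suite has no scenarios")
    return numerator / denominator


# Counts as reported for ControlKeel 0.3.45.
risky_total, benign_total = 12, 10
baseline_catch_rate = rate(0, risky_total)    # ungoverned_baseline
catch_rate = rate(12, risky_total)            # controlkeel_validate, caught
block_rate = rate(9, risky_total)             # controlkeel_validate, blocked
expected_rule_rate = rate(9, risky_total)     # controlkeel_validate, expected rule hit
false_positive_rate = rate(0, benign_total)   # benign catches / benign scenarios (assumed)

print(f"baseline catch rate: {baseline_catch_rate:.3f}")  # 0.000
print(f"catch rate:          {catch_rate:.3f}")           # 1.000
print(f"block rate:          {block_rate:.3f}")           # 0.750
print(f"expected-rule rate:  {expected_rule_rate:.3f}")   # 0.750
print(f"false positive rate: {false_positive_rate:.3f}")  # 0.000
```

Note that catching 12/12 while blocking 9/12 means three risky scenarios were flagged without a block, which is why the catch and block rates are reported separately.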
What ships today
Local governance: CLI, stdio MCP, project binding, host attach/export bundles, scanner validation, findings, reviews, proof bundles, budgets, and typed memory.
Host and runtime support: native attach for supported hosts, runtime exports for headless/outer-loop systems, hosted MCP/minimal A2A, and fallback validation/proxy paths.
Team/project operations: org membership, invitations, OIDC/SAML auth surfaces, workspace GitHub repo bindings, service accounts, webhooks, workspace tool policy, and policy-set APIs.
Cloud evidence paths: opt-in cloud telemetry, workspace keys, cloud run packages, runtime callbacks, and dormant-until-configured bidirectional sync for findings, reviews, digests, and memory records.
Observability loop: timelines, memory quality, costs, trends, problem clusters, eval candidates, benchmark drafts/history, and promotion advisories.
Docs map
docs/README.md — documentation map by job
docs/getting-started.md — install to first finding
docs/support-matrix.md — canonical host/protocol inventory
docs/agent-integrations.md — integration mechanisms and support tiers
docs/benchmarks.md — benchmark scoring, metadata, and claim discipline
docs/observability-feedback-loop.md — local evidence-to-regression loop
docs/control-plane-claim-matrix.md — README claim-to-test matrix for governance, memory, cloud sync, and human gates
docs/api-reference.md and docs/cli-reference.md — code-aligned surfaces
docs/packages.md — package and distribution catalog
docs/self-hosting.md — self-host deployment guidance
Development
mix setup
mix phx.server
mix test
mix precommit
Phoenix + Ecto on SQLite. Uses Req for HTTP. Single-binary builds ship through Burrito and GitHub Releases.