ThumbGate
The MCP Memory Gateway is a context engineering server that captures agent feedback, enforces pre-action gates to block known mistakes, and injects relevant past context into AI coding agent sessions for improved reliability and continuity.
Feedback & Memory
Capture feedback (capture_feedback, capture_memory_feedback): Record up/down signals with context, reasoning, and rubric scores; vague feedback is rejected with a clarification prompt
Recall past context (recall, commerce_recall): Vector-search relevant past feedback, memories, and prevention rules for the current task
View summaries & analytics (feedback_summary, feedback_stats, dashboard): Approval rate trends, gate enforcement stats, and prevention impact overviews
Generate prevention rules (prevention_rules, get_reliability_rules): Auto-generate blocking rules from repeated failure patterns
Pre-Action Gates & Safety
Satisfy gates (satisfy_gate): Record evidence that a gate condition is met (e.g., PR threads checked) with a 5-minute TTL
Gate statistics (gate_stats): See blocked/warned counts and top triggered gates
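The 5-minute TTL on gate evidence can be pictured with a small sketch. This is an illustration only, not ThumbGate's implementation: the GateEvidenceStore class, its method names, and the injectable clock are all assumptions made for the example.

```python
import time

GATE_TTL_SECONDS = 5 * 60  # the 5-minute TTL described above


class GateEvidenceStore:
    """Hypothetical in-memory store for gate evidence (illustrative only)."""

    def __init__(self, ttl_seconds=GATE_TTL_SECONDS, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._satisfied = {}  # gate name -> (evidence, time recorded)

    def satisfy_gate(self, gate, evidence):
        """Record evidence that the gate condition is currently met."""
        self._satisfied[gate] = (evidence, self._clock())

    def is_satisfied(self, gate):
        """Return True only while the recorded evidence is younger than the TTL."""
        entry = self._satisfied.get(gate)
        if entry is None:
            return False
        _, recorded_at = entry
        if self._clock() - recorded_at > self._ttl:
            del self._satisfied[gate]  # stale evidence no longer counts
            return False
        return True
```

Passing the clock in as a parameter keeps the expiry logic testable without waiting five real minutes.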
Session Continuity
Session handoff (session_handoff): Write a primer capturing git state, last task, next step, and blockers
Session primer (session_primer): Restore context at session start from the most recent handoff
Workflow Planning & Diagnosis
List & plan intents (list_intents, plan_intent): View available workflows and generate checkpointed execution plans with policy gates
Diagnose failures (diagnose_failure): Root-cause analysis for failed or suspect workflow steps
Bootstrap agents (bootstrap_internal_agent): Normalize GitHub/Slack/Linear triggers into startup context with recall packs and worktree sandboxes
Delegation handoffs (start_handoff, complete_handoff): Manage sequential agent delegation with verification outcomes
Context Engineering
Context packs (construct_context_pack, evaluate_context_pack): Build and evaluate bounded context packs for large projects, closing the retrieval learning loop
Context provenance (context_provenance): Audit trail of recent context and retrieval decisions
Estimate uncertainty (estimate_uncertainty): Bayesian uncertainty estimates for risky tags before acting
Business Metrics
Business metrics (get_business_metrics): Retrieve Revenue, Conversion, and Customer metrics from the Semantic Layer
Semantic entity descriptions (describe_semantic_entity, describe_reliability_entity): Canonical definitions and state of Customer, Revenue, or Funnel entities
Export & Fine-Tuning
Export DPO pairs (export_dpo_pairs): Build preference pairs from promoted memories for model fine-tuning
Export Databricks bundle (export_databricks_bundle): Export RLHF logs and proof artifacts as a Databricks-ready analytics bundle
Generate skills (generate_skill): Auto-generate Claude skill files (SKILL.md) from clustered failure patterns
ThumbGate
Your AI coding bill has a leak.
Stop paying $ for the same AI mistake.
Every retry loop, every hallucinated import, every "let me try a different approach": those are billable tokens on every LLM vendor's bill. Thumbs-down once; ThumbGate blocks that exact mistake on every future call. Across Claude Code, Cursor, Codex, Gemini CLI, Amp, Cline, OpenCode, and any MCP-compatible agent, forever.
Under the hood: your thumbs-down becomes one of your Pre-Action Checks that physically blocks the pattern permanently on every future call, across every session, every model, every agent. It is self-improving agent governance: every correction promotes a fresh prevention rule, and your library of prevention rules grows stronger with every lesson. The monthly Anthropic / OpenAI bill stops paying for the same lesson over and over: local-first enforcement, zero tokens spent on repeats.
Prevent expensive AI mistakes. Make AI stop repeating mistakes. Turn a smart assistant into a reliable operator.
Mission: make AI coding affordable by making sure you never pay for the same mistake twice.
🎬 90-second demo
Watch the force-push scenario: agent tries to git push --force, one thumbs-down, next session it's blocked, with zero tokens spent on the repeat.
▶ Watch the 90-second demo · Script · ElevenLabs narration: npm run demo:voiceover
First-dollar activation path
If someone is not already bought into ThumbGate, do not lead with architecture. Lead with one repeated mistake.
Show the pain: open the ThumbGate GPT and paste the bad answer, risky command, deploy, PR action, or agent plan before it runs again.
Capture the lesson: type thumbs down: or thumbs up: with one concrete sentence. Native ChatGPT rating buttons are not the ThumbGate capture path; typed feedback is.
Enforce the repeat: run npx thumbgate init where the agent executes so the lesson can become a Pre-Action Check instead of another reminder.
Upgrade only after proof: Solo Pro is for the dashboard, DPO export, proof-ready evidence, and higher capture limits after one real blocked repeat. Team starts with the Workflow Hardening Sprint around one repeated failure, one owner, and one proof review.
The buying question is simple: what repeated AI mistake would be worth blocking before the next tool call?
The Problem: the bill nobody talks about
Frontier-model calls are not cheap. Sonnet 4.5 is ~$3 / 1M input tokens and ~$15 / 1M output tokens. Opus is 5× that. Every time your agent:
hallucinates a function name and you have to correct it,
retries the same failing tool call until it gives up,
regenerates a 4,000-token plan you already approved last session,
repeats a destructive command you blocked manually yesterday,
…you are paying for that round-trip. Twice if it retries. Three times if you re-prompt. And the agent has no memory across sessions, so the meter resets every Monday.
Session 1: Agent force-pushes to main. You fix it. +4,200 tokens
Session 2: Agent force-pushes again. You fix it. +4,200 tokens
Session 3: Same mistake. Again. You lose 45m. +5,800 tokens
That's ~$0.21 in tokens (14,200 tokens at the ~$15 / 1M output rate) just to fix the same mistake three times, multiplied by every developer, every repeated-mistake class, every week. The math gets ugly fast.
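The ~$0.21 figure can be reproduced with a few lines of arithmetic. The sketch below assumes every token in the three sessions is billed at the ~$15 / 1M output-token rate quoted above; real bills mix cheaper input tokens, so treat it as an upper-bound illustration.

```python
# Token counts come from the three sessions listed above.
SESSION_TOKENS = [4_200, 4_200, 5_800]

# Assumption: every token is billed at the ~$15 / 1M output-token rate.
USD_PER_MILLION_TOKENS = 15.0


def repeat_cost_usd(session_tokens, usd_per_million=USD_PER_MILLION_TOKENS):
    """Return the dollar cost of the given per-session token counts."""
    return sum(session_tokens) * usd_per_million / 1_000_000


total_tokens = sum(SESSION_TOKENS)
cost = repeat_cost_usd(SESSION_TOKENS)
print(f"{total_tokens} tokens -> ${cost:.2f}")  # 14200 tokens -> $0.21
```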
The Solution: fix it once, the bill never sees it again
Session 1: Agent force-pushes to main. You 👎 it. +4,200 tokens
Session 2: Check blocks the force-push. Zero round-trip. +0 tokens
Session 3+: Never happens again. +0 tokens
One thumbs-down. The PreToolUse hook intercepts the call before it reaches the model: no input tokens, no output tokens, no retry loop. The dashboard tracks tokens saved this week as a live counter so you can see exactly what your prevention rules are worth. Mark a review checkpoint once, and the dashboard narrows the next pass to only the feedback, lessons, and check blocks that landed since your last review.
ThumbGate doesn't make your agent smarter. It makes your agent cheaper to be wrong with.
Quick Start
npx thumbgate init # auto-detects your agent, wires everything
npx thumbgate capture "Never run DROP on production tables"
That single command creates a prevention rule. Next time any AI agent tries to run DROP on production:
Check blocked: "Never run DROP on production tables"
Pattern: DROP.*production
Verdict: BLOCK
Architecture
ThumbGate operates as a 4-layer enforcement stack between your AI agent and your codebase:
Layer 1: Feedback Capture
Your thumbs-up/down reactions are captured via MCP protocol, CLI, or the ChatGPT GPT surface. Each reaction is stored as a structured lesson with context, timestamp, and severity.
Layer 2: Check Engine
The check engine converts lessons into enforceable rules using pattern matching, semantic similarity (via LanceDB vectors), and Thompson Sampling for adaptive rule selection. Rules stay in local ThumbGate runtime state.
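Thompson Sampling is a standard bandit technique, and one plausible reading of "adaptive rule selection" is a Beta-Bernoulli version of it. The sketch below is not taken from ThumbGate's source: the RuleArm class, the "helpful / unhelpful" outcome signal, and the simulated rates are assumptions for illustration.

```python
import random


class RuleArm:
    """Beta-Bernoulli posterior over how often a rule turns out to be helpful."""

    def __init__(self, name):
        self.name = name
        self.helpful = 1    # Beta alpha, starting from a uniform Beta(1, 1) prior
        self.unhelpful = 1  # Beta beta

    def sample(self, rng):
        """Draw one plausible helpfulness rate from the current posterior."""
        return rng.betavariate(self.helpful, self.unhelpful)

    def update(self, was_helpful):
        """Fold one observed outcome into the posterior."""
        if was_helpful:
            self.helpful += 1
        else:
            self.unhelpful += 1


def select_rule(arms, rng):
    """Thompson Sampling: draw once from each posterior, pick the largest draw."""
    return max(arms, key=lambda arm: arm.sample(rng))


rng = random.Random(0)
arms = [RuleArm("force-push"), RuleArm("env-file-edit")]
# Hypothetical simulation: the first rule helps 90% of the time, the second 20%.
true_rate = {"force-push": 0.9, "env-file-edit": 0.2}
picks = {arm.name: 0 for arm in arms}
for _ in range(500):
    arm = select_rule(arms, rng)
    picks[arm.name] += 1
    arm.update(rng.random() < true_rate[arm.name])
```

Because each selection is a random draw from the posterior, rules with little evidence still get tried occasionally, while rules with a strong track record are chosen most of the time.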
Layer 3: Pre-Action Interception
Before any agent action executes, ThumbGate's PreToolUse hook intercepts the command and evaluates it against all active checks. This happens at the MCP protocol level, so the agent physically cannot bypass it.
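A minimal sketch of the evaluate-before-execute step, assuming checks are plain regular expressions like the DROP.*production pattern from the Quick Start. The Check and Verdict types, the evaluate function, and the force-push pattern are hypothetical; the real engine also uses semantic similarity, which is not modelled here.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Check:
    lesson: str
    pattern: str  # regular expression matched against the proposed command


@dataclass(frozen=True)
class Verdict:
    action: str  # "BLOCK" or "ALLOW"
    lesson: Optional[str] = None


def evaluate(command, checks):
    """Return BLOCK with the first matching lesson, otherwise ALLOW."""
    for check in checks:
        if re.search(check.pattern, command, flags=re.IGNORECASE):
            return Verdict("BLOCK", check.lesson)
    return Verdict("ALLOW")


CHECKS = [
    # Pattern taken from the Quick Start output.
    Check("Never run DROP on production tables", r"DROP.*production"),
    # Hypothetical pattern for the force-push scenario.
    Check("Never force-push", r"git\s+push\b.*--force"),
]

print(evaluate("git push --force origin main", CHECKS).action)  # BLOCK
print(evaluate("git push origin main", CHECKS).action)          # ALLOW
```

The point of running this before the tool call is that a blocked command never executes, so there is nothing to retry or undo.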
Layer 4: Multi-Agent Distribution
Checks are distributed across all connected agents via MCP stdio protocol. One correction in Claude Code protects Cursor, Codex, Gemini CLI, Cline, and any MCP-compatible agent.
Prompt engineering still matters, but it is only the starting point. ThumbGate adds prompt evaluation on top: proof lanes, benchmarks, and self-heal checks tell you whether your prompt and workflow actually held up under execution instead of leaving you to guess from vibes. Run npx thumbgate eval --from-feedback --write-report=.thumbgate/prompt-eval-proof.md to turn real thumbs-up/down feedback into reusable eval cases and a buyer-ready proof report.
Managed model benchmark lane
When a new managed model drops, do not swap ThumbGate over on vendor claims alone. Rank it against the actual ThumbGate workload first:
npx thumbgate model-candidates --workload=pretool-gating --json
npx thumbgate model-candidates --workload=long-trace-review --provider=openai-compatible --gateway=tinker --json
The catalog currently includes the April 23, 2026 Tinker additions:
tinker/qwen3.6-35b-a3b for pre-action gating, agentic coding, and tool-use
tinker/qwen3.6-27b for the cheap fast-path
tinker/kimi-k2.6-128k for long-trace review and multi-agent sessions
Each recommendation ships with the benchmark commands to run next: feedback-derived prompt eval, gate-eval, and thumbgate bench. For whole-repo clone claims, add npx thumbgate bench --programbench-smoke to generate a ProgramBench-style cleanroom proof report without claiming an official ProgramBench score. That keeps model selection evidence-backed instead of hype-driven.
Install for Your Agent
Agent | Command
Claude Code |
Cursor |
VS Code / Open VSX |
Antigravity-compatible |
JetBrains |
Codex |
Gemini CLI |
Amp |
Cline (Roo Code successor) |
Claude Desktop |
Any MCP agent |
Works with Claude Code, Cursor, Codex, Gemini CLI, Amp, Cline, OpenCode, and any MCP-compatible agent. Migrating from Roo Code (sunsetting 2026-05-15)? See adapters/cline/INSTALL.md.
Status bar proof
Claude renders the live ThumbGate footer today. npx thumbgate init --agent codex now installs the full Codex hook bundle and writes the ThumbGate statusLine target into ~/.codex/config.json so you can test it on your local Codex build immediately.
Install Codex Plugin
Open the Codex plugin install page or download the standalone bundle from GitHub Releases. The Codex launcher resolves thumbgate@latest when MCP and hooks start, so published npm fixes reach active Codex installs without hand-editing ~/.codex/config.toml.
Install page: thumbgate-production.up.railway.app/codex-plugin
Direct zip: thumbgate-codex-plugin.zip
Follow: plugins/codex-profile/INSTALL.md
How It Works
STEP 1              STEP 2                 STEP 3
────────            ────────               ────────
You react           ThumbGate learns       The check holds
👎 on a bad   ──►   Feedback becomes  ──►  Next time the
agent action        a saved lesson         agent tries the
                    and a block rule       same thing:
👍 on a good  ──►   Good pattern gets      BLOCKED
agent action        reinforced             (or allowed)
No manual rule-writing. No config files. Your reactions teach the agent what your team actually wants.
ThumbGate sells four concrete outcomes:
Prevent expensive AI mistakes: catch bad commands, destructive database actions, unsafe publishes, and risky API calls before they run.
Make AI stop repeating mistakes: fix it once, turn the lesson into a rule, and block the repeat before the next tool call lands.
Turn AI into a reliable operator: move from a smart assistant that apologizes after damage to a production-ready operator with checkpoints, proof, and enforcement.
Measure prompts instead of rewriting them blindly: use thumbgate eval --from-feedback, proof lanes, ThumbGate Bench, and self-heal:check to evaluate whether prompts and workflows actually improved behavior.
Use Cases
Stop force-push to main: Check blocks git push --force on protected branches before it runs
Prevent repeated migration failures: Each mistake becomes a searchable lesson that fires before the next attempt
Block unauthorized file edits: Control which files agents can touch with path-based rules
Memory across sessions: The agent remembers your feedback from yesterday
Shared team safety: One developer's thumbs-down protects the whole team
Auto-improving without feedback: Self-improvement mode evaluates outcomes and generates rules automatically
Built-in Checks
force-push: blocks git push --force
protected-branch: blocks direct push to main
unresolved-threads: blocks push with open reviews
package-lock-reset: blocks destructive lock edits
env-file-edit: blocks .env secret exposure
+ custom prevention rules for project-specific failures
CLI Reference
npx thumbgate init # detect agent, wire hooks
npx thumbgate doctor # health check
npx thumbgate capture # create a check from text
npx thumbgate lessons # see what's been learned
npx thumbgate explore # terminal explorer for lessons, checks, stats
npx thumbgate background-governance # review background-agent run risk
npx thumbgate model-candidates --workload=dashboard-analysis --provider=openai --json # evaluate GPT-5.5 routing
npx thumbgate native-messaging-audit # inspect local browser bridges and extension hosts
npx thumbgate dashboard # open local dashboard
npx thumbgate serve # start MCP server on stdio
npx thumbgate bench # run reliability benchmark
npx thumbgate bench --programbench-smoke # include cleanroom whole-repo proof lane
Pricing
Feature | Free | Pro ($19/mo) | Team ($49/seat/mo)
Local CLI + enforced checks | ✓ | ✓ | ✓
Feedback captures (lifetime) | 3 | Unlimited | Unlimited
Auto-promoted prevention rules | 1 | Unlimited | Unlimited
MCP agent integrations | All | All | All
Personal dashboard | ✗ | ✓ | ✓
DPO export (model fine-tuning) | ✗ | ✓ | ✓
Team lesson export/import | ✗ | ✓ | ✓
Shared hosted lesson DB | ✗ | ✗ | ✓
Org-wide dashboard | ✗ | ✗ | ✓
Approval + audit proof | | |
The free tier gives you unlimited feedback captures and up to 5 active auto-promoted prevention rules, generous enough to make ThumbGate part of your daily flow. MCP integrations for all agents (Claude Code, Cursor, Codex, Gemini, Amp, Cline, OpenCode) ship free.
Pro ($19/mo or $149/yr) removes the rule cap and adds history-aware lesson recall, lesson search, DPO export, and a personal dashboard. Team ($49/seat/mo) adds a shared hosted lesson DB, org dashboard, and shared enforcement across the org. Pro and Team include open_feedback_session, append_feedback_context, and finalize_feedback_session for structured multi-turn feedback capture.
Best first paid motion for teams: the Workflow Hardening Sprint. Qualify one repeated failure before committing to a full rollout. Start intake →
Best first technical motion: install the CLI first and let init wire hooks for the agent you already use.
Paid path for individual operators: ThumbGate Pro is the self-serve side lane for a personal dashboard and export-ready evidence.
Start free · See Pro · Team Sprint intake
Team Lesson Sharing (Pro + Team)
One team's hard-won lessons shouldn't stay trapped on one laptop. ThumbGate Pro and Team can export lessons as portable bundles and import them into any other ThumbGate instance, so a mistake caught by Team A becomes a prevention rule for Team B.
Export lessons from one project:
curl -X POST http://localhost:3456/v1/lessons/export \
  -H "Authorization: Bearer $THUMBGATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"outputPath": "./lessons-export.json"}'
Filter by signal or tags:
curl -X POST http://localhost:3456/v1/lessons/export \
  -H "Authorization: Bearer $THUMBGATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"signal": "down", "tags": ["push-notifications", "ci"]}'
Import into another team's ThumbGate:
curl -X POST http://localhost:3456/v1/lessons/import \
  -H "Authorization: Bearer $THUMBGATE_API_KEY" \
  -H "Content-Type: application/json" \
  -d @lessons-export.json
What happens on import:
Deduplication: lessons with the same ID or title+signal are skipped
Provenance tracking: every imported lesson is tagged team-import with original source project, export timestamp, and original ID
No overwrite: import is additive; existing lessons are never modified
The export bundle includes full lesson metadata: signal, title, context, tags, failure type, skill, structured rules, and diagnosis. It's the same data you see in the lesson detail dashboard, portable as JSON.
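The three import behaviours (deduplication, provenance tagging, no overwrite) can be sketched in a few lines. This is an illustration under assumptions, not the actual import code: the field names id, title, signal, tags, and provenance, and the import_lessons function itself, are hypothetical stand-ins for whatever the real bundle uses.

```python
def import_lessons(existing, bundle, source_project, exported_at):
    """Additively merge bundle lessons into existing; return (merged, skipped)."""
    seen_ids = {lesson["id"] for lesson in existing}
    seen_keys = {(lesson["title"], lesson["signal"]) for lesson in existing}
    merged = list(existing)  # existing lessons are carried over unmodified
    skipped = 0
    for lesson in bundle:
        key = (lesson["title"], lesson["signal"])
        # Deduplication: same ID, or same title + signal, is skipped.
        if lesson["id"] in seen_ids or key in seen_keys:
            skipped += 1
            continue
        imported = dict(lesson)
        # Provenance: tag the copy and remember where it came from.
        imported["tags"] = list(lesson.get("tags", [])) + ["team-import"]
        imported["provenance"] = {
            "sourceProject": source_project,
            "exportedAt": exported_at,
            "originalId": lesson["id"],
        }
        merged.append(imported)
        seen_ids.add(lesson["id"])
        seen_keys.add(key)
    return merged, skipped
```

Working on copies of the incoming lessons means neither the existing store nor the bundle is mutated, which matches the additive, no-overwrite behaviour described above.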
Use cases:
Share enforcement patterns across repos in the same org
Onboard a new team with pre-built lessons from a mature project
Export lessons before a project handoff so institutional knowledge transfers
Feed lessons from multiple teams into a centralized DPO training pipeline
DPO Export for Fine-Tuning (Pro + Team)
Every thumbs-up and thumbs-down becomes a training signal. ThumbGate Pro exports your captured feedback as DPO (Direct Preference Optimization) pairs, ready to feed into a LoRA fine-tune so your model stops repeating known mistakes at the weight level, not just the check level.
Export DPO pairs:
curl -X POST http://localhost:3456/v1/dpo/export \
  -H "Authorization: Bearer $THUMBGATE_API_KEY" \
  -o dpo-pairs.jsonl
What you get: JSONL where each line is a preference pair:
chosen: the agent action you thumbed up
rejected: the action you thumbed down for the same task context
prompt: the originating user intent
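To make the pair format concrete, here is a small sketch that builds prompt / chosen / rejected records from raw feedback and serialises them as JSONL. The input shape (prompt, signal, action) and the pairing rule (every thumbs-up paired with every thumbs-down for the same prompt) are assumptions for illustration; the actual export may select pairs differently.

```python
import json
from collections import defaultdict


def build_dpo_pairs(feedback):
    """Pair each thumbed-up action with each thumbed-down action per prompt."""
    by_prompt = defaultdict(lambda: {"up": [], "down": []})
    for item in feedback:
        by_prompt[item["prompt"]][item["signal"]].append(item["action"])
    pairs = []
    for prompt, actions in by_prompt.items():
        for chosen in actions["up"]:
            for rejected in actions["down"]:
                pairs.append(
                    {"prompt": prompt, "chosen": chosen, "rejected": rejected}
                )
    return pairs


def to_jsonl(pairs):
    """Serialise one JSON object per line, the usual JSONL layout."""
    return "\n".join(json.dumps(pair) for pair in pairs)


feedback = [
    {"prompt": "ship the fix", "signal": "up", "action": "open a PR and wait for CI"},
    {"prompt": "ship the fix", "signal": "down", "action": "git push --force origin main"},
    {"prompt": "clean up the db", "signal": "down", "action": "DROP TABLE users"},
]
pairs = build_dpo_pairs(feedback)
print(to_jsonl(pairs))
```

A prompt with only thumbs-down feedback produces no pair in this sketch, since a preference pair needs both a chosen and a rejected action.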
Use cases:
Fine-tune Llama 3 / Mistral / local models with a LoRA adapter trained on your real mistakes
Feed into RLAIF or KTO pipelines (KTO export also available via /v1/kto/export)
Build a model that natively avoids your team's known failure patterns, with no check needed at inference time
Why this matters: Checks block mistakes. Fine-tuning prevents them from being attempted. Combine both for belt-and-suspenders governance.
Tech Stack
Layer | Technology |
Storage | SQLite + FTS5, LanceDB vectors, JSONL logs |
Capture | Unlimited feedback captures (free + Pro) |
Intelligence | MemAlign dual recall, Thompson Sampling |
Enforcement | PreToolUse hook engine, Checks config |
Interfaces | MCP stdio, HTTP API, CLI (Node.js >=18) |
Billing | Stripe |
Execution | Railway, Cloudflare Workers, Docker Sandboxes |
Governance | Workflow Sentinel, control plane, Docker Sandboxes |
Every Changeset is tied to the exact main merge commit and generates Verification Evidence for Release Confidence.
Popular buyer questions: AI search topical presence · Relational knowledge and AI recommendations · Background agent governance · GPT-5.5 model evaluation · Stop repeated AI agent mistakes · Browser automation safety · Native messaging host security · Autoresearch agent safety · Cursor guardrails · Codex CLI guardrails · Gemini CLI memory + enforcement · Google Cloud MCP guardrails · Roo Code alternative: migrate to Cline
Workflow Hardening Sprint · Live Dashboard
Integrations
Open ThumbGate GPT: start here. Paste agent actions, get advice + checkpointing. Users do not have to keep chatting inside the ThumbGate GPT to use ThumbGate; the hard enforcement layer still runs where the work happens.
Claude Desktop Extension: One-click install for Claude Desktop
Codex Plugin: Auto-updating standalone bundle and install page for Codex CLI
VS Code / Open VSX Extension: Marketplace-ready MCP provider and .vscode/mcp.json fallback for VS Code-compatible IDEs
Antigravity-compatible VSIX: Open VSX/direct VSIX install path while Antigravity-specific marketplace support is still unproven
JetBrains Plugin Scaffold: IntelliJ/PyCharm Marketplace path for the same thumbgate@latest runtime
Perplexity Command Center: AI-search visibility + lead discovery
ThumbGate Bench: Reliability benchmark and ProgramBench-style cleanroom proof lane
Manus AI Skill: ThumbGate integration for Manus AI agents
Feedback Sessions
Give the agent more context when a thumbs-down isn't enough:
👎 thumbs down
  └─► open_feedback_session
        └─► "you lied about deployment" (append_feedback_context)
        └─► "tests were actually failing" (append_feedback_context)
        └─► finalize_feedback_session
              └─► lesson inferred from full conversation
Free and self-hosted users can invoke search_lessons directly through MCP, and via the CLI with npx thumbgate lessons. History-aware feedback sessions give the agent full context for each lesson.
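The open / append / finalize flow can be modelled as a tiny state machine. The Python functions below only borrow the MCP tool names for readability; they are local stand-ins, and the FeedbackSession type and the returned record shape are assumptions. The step that infers a lesson from the collected notes is out of scope here.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackSession:
    signal: str  # "down" or "up"
    context: list = field(default_factory=list)
    finalized: bool = False


def open_feedback_session(signal):
    """Start collecting context for one thumbs-up or thumbs-down."""
    return FeedbackSession(signal=signal)


def append_feedback_context(session, note):
    """Add one more piece of context; refused once the session is closed."""
    if session.finalized:
        raise ValueError("session already finalized")
    session.context.append(note)


def finalize_feedback_session(session):
    """Close the session and return a record holding every appended note."""
    session.finalized = True
    return {"signal": session.signal, "context": list(session.context)}


session = open_feedback_session("down")
append_feedback_context(session, "you lied about deployment")
append_feedback_context(session, "tests were actually failing")
record = finalize_feedback_session(session)
```

Keeping the notes together until finalize is what lets a lesson be derived from the whole exchange instead of from a single reaction.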
FAQ
Is ThumbGate a model fine-tuning tool? No. ThumbGate does not update model weights. It captures feedback, stores lessons, injects context at runtime, and blocks bad actions before they execute.
How is this different from CLAUDE.md or .cursorrules? Those are suggestions the agent can ignore. ThumbGate checks are enforced: they physically block the action before it runs. They also auto-generate from feedback instead of requiring manual writing.
Does it work with my agent? If it supports MCP or pre-action hooks, yes. Claude Code, Claude Desktop, Cursor, Codex, Gemini CLI, Amp, Cline, OpenCode all work out of the box.
Is it free? The free tier gives you unlimited feedback captures and up to 5 active auto-promoted prevention rules, generous enough for solo devs to use daily. MCP integrations ship free for every agent.
Pro ($19/mo or $149/yr) removes the rule cap and adds history-aware lesson recall, lesson search, and a personal dashboard. Team ($49/seat/mo) adds a shared hosted lesson DB, org dashboard, and shared enforcement.
Docs
First Dollar Playbook: turning one painful workflow into the next booked pilot
Commercial Truth: pricing, claims, what we don't say
Changeset Strategy: release notes and version bump enforcement
Release Confidence: changesets, version checks, proof lanes
Verification Evidence: proof artifacts
Agent Workflow Contract: the agent-run contract for all ThumbGate operations
Ready for Agent Intake: ready-for-agent intake template
ThumbGate-Core: private core for hosted overlays, ranking, policy synthesis, billing intelligence, and org/team workflows
License
MIT. See LICENSE.