claude-engram
Claude Engram is an MCP server providing persistent memory, session intelligence, and code quality assurance for AI coding assistants.
Memory Management — Store, search, recall, archive, and deduplicate memories (discoveries, rules, mistakes, notes) with hybrid semantic/keyword search, LLM-powered clustering/consolidation, tiered hot/archive storage, and batch operations. Rules and mistakes never archive.
Session Lifecycle — Load full context at session start (memories, checkpoints, decisions, deduplication) and generate summaries at session end.
Work Tracking — Log decisions (with reasoning and alternatives) and mistakes (with affected files and prevention strategies).
Pre-Edit Safety — Before editing, check past mistakes for a file, detect edit loop risk, and verify scope compliance.
Scope Guard — Declare task scope with allowed files/globs, check file inclusion, expand scope with justification, and track violations.
Loop Detection — Record edits and test results; warn when a file has been edited 3+ times without progress.
Context Protection — Save/restore task checkpoints, verify task completion with evidence, register critical instructions, and create session handoff documents.
Convention Tracking — Add, retrieve, and check code/filenames against project conventions (naming, architecture, style, patterns, avoidances).
Output Validation — Validate generated code for fake/silent failures and check command output against expected formats.
Codebase Intelligence
scout_search: semantic search across codebase
scout_analyze: LLM-based code analysis
file_summarize: quick or detailed file summaries
deps_map: map imports and reverse dependencies
impact_analyze: change impact with risk level
code_quality_check: detect long functions, deep nesting, vague names
code_pattern_check: check code against stored conventions
audit_batch: batch audit files with severity filtering
find_similar_issues: regex-based bug pattern search
Session Mining — Search past conversations, find past decisions, replay file histories, detect recurring struggles and error patterns, correlate co-edited files, generate project timelines, predict needed context before edits, and reflect with LLM-powered analysis across sessions or projects.
Health Check — Query server status, model, and memory statistics.
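The Scope Guard behaviour listed above (allowed files/globs, inclusion checks, expansion with justification, violation tracking) can be sketched in a few lines. This is a minimal illustration, not engram's implementation: the ScopeGuard class, its method names, and the use of fnmatch-style patterns are all assumptions.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


class ScopeGuard:
    """Hypothetical in-memory scope guard; names and matching rules are assumed."""

    def __init__(self, allowed_patterns):
        self.allowed = list(allowed_patterns)
        self.violations = []  # paths that were checked while out of scope

    def in_scope(self, path):
        # Normalize to forward slashes so patterns behave the same on every OS.
        normalized = PurePosixPath(path.replace("\\", "/")).as_posix()
        return any(fnmatch(normalized, pattern) for pattern in self.allowed)

    def check(self, path):
        ok = self.in_scope(path)
        if not ok:
            self.violations.append(path)
        return ok

    def expand(self, pattern, justification):
        # Expansion requires a non-empty reason.
        if not justification.strip():
            raise ValueError("scope expansion needs a justification")
        self.allowed.append(pattern)


guard = ScopeGuard(["src/auth/*.py", "tests/test_auth.py"])
print(guard.check("src/auth/login.py"))   # True
print(guard.check("src/billing/api.py"))  # False, recorded as a violation
guard.expand("src/billing/*.py", "login flow calls the billing client")
print(guard.check("src/billing/api.py"))  # True after expansion
```

Note that fnmatch lets `*` match path separators, so a stricter matcher may be preferable if patterns must stay within one directory.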
Claude Engram
Persistent memory and session intelligence for AI coding assistants. Hooks into Claude Code's lifecycle to auto-track mistakes, decisions, and context — then mines your full session history to surface patterns, predict what you'll need, and search across everything you've ever discussed.
Zero manual effort. Works with any MCP-compatible client.
What It Does
Automatic (hooks — zero invocation):
Tracks every edit, error, test result, and session event
Auto-captures decisions from your prompts ("let's use X", "switch to Y")
Injects the 3 most relevant memories before every file edit
Orients before file reads: code-index summary (what the module is, who imports it) + that file's past mistakes, once per file per session
Warns when you're about to repeat a past mistake
Error deja-vu: when a failure matches a known recurring error, the past fix is injected inline at failure time ("you hit this in 3 sessions — fix was X")
Surfaces the project's known-good test commands at session start (tracked from runs that actually passed)
Opt-in lessons bridge: dated entries in your curated notes files sync as protected memories that resurface when you edit related code (set lessons_globs in ~/.claude_engram/config.json, e.g. ["docs/lessons/*.md"]; no default path)
TDD-aware error capture: failing test runs are tracked as test results, not logged as mistakes, so deliberate RED-phase failures stop polluting the mistake store
Detects edit loops (same file 3+ times without progress) — tracked in per-session hook state, so two concurrent sessions never cross-contaminate
Survives context compaction — checkpoints before, re-injects after
Mines your session history in the background after every session — and live during it: debounced ticks at turn end keep search, extractions, and code indexes fresh mid-session
Verifies imports in proposed edits against a per-project code index (AST, no LLM): <engram-precheck> with closest-name suggestions
Shows blast radius before editing a shared module: lists its importers (<engram-blast-radius>)
Measures injection precision: tracks which injected context precedes passing tests (view via session_mine(reflect)) and feeds it back, with a bounded per-kind multiplier (0.8-1.2) tuning how eagerly memories inject
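The import check described in the list above (AST only, no LLM, with closest-name suggestions) could look roughly like the sketch below. The check_imports function and the shape of the index (module name mapped to a set of exported names) are assumptions for illustration; engram's real code index is not documented here.

```python
import ast
import difflib


def check_imports(source, code_index):
    """Return (module, name, suggestions) for each `from X import Y` missing from the index.

    `code_index` maps module name -> set of exported names (hypothetical shape).
    """
    problems = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ImportFrom) or node.module not in code_index:
            continue  # only verify modules the project index knows about
        known = code_index[node.module]
        for alias in node.names:
            if alias.name != "*" and alias.name not in known:
                close = difflib.get_close_matches(alias.name, sorted(known), n=3)
                problems.append((node.module, alias.name, close))
    return problems


index = {"app.storage": {"load_memory", "save_memory", "archive_memory"}}
proposed = "from app.storage import load_memroy, save_memory\n"
for module, name, suggestions in check_imports(proposed, index):
    print(f"{module}.{name} not found; closest: {suggestions}")
```

Here the misspelled load_memroy is flagged and load_memory comes back as the best suggestion, while the valid save_memory import passes silently.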
Session Mining (automatic, background):
Parses Claude Code's full conversation logs (JSONL) after every session — including subagent conversations (Explore, Plan, code-reviewer, etc.)
Extracts decisions, mistakes, approaches, and user corrections using structural analysis + semantic scoring (typo-tolerant)
Builds a searchable index across all past conversations (20k+ chunks with subagents)
Detects recurring struggles, error patterns, and file edit correlations — attributed to sub-projects (a session in project A never sees project B's errors at startup), recency-decayed (errors quiet for 30 days drop out), and causally attributed (a "struggle" requires errors traced to the file, not just edits in error-containing sessions)
Predicts what files and context you'll need before edits
Logs which injected context precedes passing tests; session_mine(reflect) reports that precision + LLM-synthesized patterns
On first install, retroactively mines your entire session history
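The sub-project attribution and 30-day recency decay for recurring errors, described in the list above, amount to a filter over error records. The sketch below assumes a record shape (project, signature, sessions, last_seen) and a two-session minimum for "recurring"; both are illustrative guesses, and only the 30-day window comes from the text.

```python
from datetime import datetime, timedelta


def active_recurring_errors(errors, project, now, quiet_days=30, min_sessions=2):
    """Keep errors for `project` seen in >= min_sessions sessions and active within quiet_days.

    The record keys and min_sessions are assumptions, not engram's schema.
    """
    cutoff = now - timedelta(days=quiet_days)
    return [
        e for e in errors
        if e["project"] == project               # never leak another project's errors
        and len(e["sessions"]) >= min_sessions   # "recurring" means more than one session
        and e["last_seen"] >= cutoff             # quiet for too long: drops out
    ]


now = datetime(2025, 6, 30)
errors = [
    {"project": "A", "signature": "ImportError: foo", "sessions": {"s1", "s2", "s3"},
     "last_seen": datetime(2025, 6, 25)},
    {"project": "A", "signature": "KeyError: 'id'", "sessions": {"s1", "s4"},
     "last_seen": datetime(2025, 4, 1)},   # quiet for more than 30 days
    {"project": "B", "signature": "TimeoutError", "sessions": {"s5", "s6"},
     "last_seen": datetime(2025, 6, 29)},  # different project
]
print([e["signature"] for e in active_recurring_errors(errors, "A", now)])
# ['ImportError: foo']
```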
On-demand (MCP tools):
memory: store, search, archive, and manage memories
session_mine: search past conversations (taggable by kind: decision / next-step / error), find decisions, replay file history, detect patterns, and surface what you said you'd do this session (commitments)
work: log decisions and mistakes with reasoning
Plus: scope guard, context checkpoints (checkpoint_save/restore/list; handoff_* are deprecated aliases), convention tracking, impact analysis, symbol lookup (deps_map(symbol="X"): defining file, signature, importers from the code index)
All MCP tools carry annotations (read-only / idempotent hints + a title), so clients and permission systems know which are safe to call without a prompt.
A Note From the Author
How I actually use it, since I built it:
Mostly it just works in the background — you don't have to think about it. The few things worth doing on purpose:
Pull /engram when you want Claude to actively reach for the tools: the command loads the reference so Claude knows what's there and uses it. (Background tracking happens either way; this is for the on-demand stuff.)
When you half-remember something from a while back ("what did we decide about X?"), ask Claude to mine the sessions for it: it searches everything you've ever discussed, not just what's in context.
If there's something it should never forget, save it as a rule. Rules are scoped: a per-project rule applies to that project; a global one (saved at your workspace root) cascades down to every project under it. Broad conventions → global, project-specific → per-project.
Before compacting, it auto-saves a checkpoint — but I make one with what I'm doing + what's left and ask it to pull that back up after. Resumes a lot cleaner.
When you come back, ask what you said you'd do this session — it skims the live conversation for open loops vs. what's done. It's a best-effort read (not a perfect list), but a quick way to reorient.
The less you poke at it, the better it works.
This is a work in progress — if something's off or you hit a bug, please open an issue.
How It Works
Claude Code
|
+-- Hooks (remind.py) <- Intercepts every tool call
| SessionStart / Edit / Bash / Error / Compact / Stop
|
+-- Session Mining (mining/) <- Background intelligence
| JSONL parser -> Extractors -> Search index -> Pattern detection
|
+-- MCP Server (server.py) <- Tools for manual operations
| memory, session_mine, work, scope, context, ...
|
+-- Scorer/Hook Daemon (scorer_server.py) <- Persistent encoder + warm hook dispatch
TCP localhost, cpu-resident (~1.1GB RAM, zero VRAM parked); bulk
embedding jobs run in a transient GPU worker (embed_worker.py)
that loads, encodes, exits - full VRAM release
High-frequency hooks run as thin clients (one round trip, full
in-process fallback when the daemon is down)
Hooks fire on every tool call (1-2s budget each). Heavy processing happens in a background subprocess after session end. The scorer server stays loaded in memory for fast semantic scoring.
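The "thin client with in-process fallback" pattern from the diagram can be sketched as follows. This is an assumption-heavy illustration: the newline-delimited JSON protocol, the score_with_fallback function, and the port handling are invented for the example and are not scorer_server.py's actual wire format.

```python
import json
import socket


def score_with_fallback(payload, port, local_scorer, host="127.0.0.1", timeout=0.5):
    """Try one round trip to a local daemon; fall back to in-process scoring."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(json.dumps(payload).encode("utf-8") + b"\n")
            reply = conn.makefile("r", encoding="utf-8").readline()
        return json.loads(reply), "daemon"
    except (OSError, ValueError):
        # Daemon down, timed out, or sent an unparsable reply: do the work in-process.
        return local_scorer(payload), "fallback"


def free_port():
    """Find a localhost port that nothing is listening on (for the demo only)."""
    with socket.socket() as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]


result, source = score_with_fallback(
    {"text": "retry logic"}, free_port(), lambda p: {"score": 0.0}
)
print(source)  # "fallback", since no daemon is listening on that port
```

A short timeout matters here because the hook budget is only 1-2 seconds; a slow daemon should lose to the fallback rather than stall the tool call.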
Benchmarks
Retrieval (recall@k): LongMemEval 0.966 R@5 / 0.982 R@10 (500 questions), ConvoMem 0.960 (250 items), LoCoMo 0.649 R@10 (~2k questions); ~43ms/query, 112ms cross-session over 7,310 chunks.
Product behavior: eight integration suites green — decision capture (97.8% precision), error auto-capture (100% recall), compaction survival (6/6), multi-project isolation (11/11), edit-loop detection (12/12), session mining (27/27), Obsidian-vault compat (25/25).
Full tables and the tests/bench_*.py reproduction commands are in the library-book.
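For readers unfamiliar with the metric, recall@k can be computed as below. This uses one common definition (fraction of relevant items found in the top k, averaged over queries); the benchmark scripts may define it differently, for example as a hit rate, so treat this as a sketch and not as the code behind the numbers above. The chunk IDs are made up.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of relevant items that appear in the top-k results for one query."""
    if not relevant_ids:
        raise ValueError("need at least one relevant item")
    top_k = set(ranked_ids[:k])
    return len(top_k & set(relevant_ids)) / len(set(relevant_ids))


def mean_recall_at_k(queries, k):
    """Average recall@k over (ranked_ids, relevant_ids) pairs."""
    return sum(recall_at_k(r, rel, k) for r, rel in queries) / len(queries)


queries = [
    (["c7", "c2", "c9", "c1"], {"c2"}),        # hit at rank 2
    (["c4", "c8", "c3", "c5"], {"c5", "c6"}),  # one of two found, at rank 4
]
print(mean_recall_at_k(queries, k=2))  # 0.5
print(mean_recall_at_k(queries, k=4))  # 0.75
```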
Compatibility
Platform | What Works | Auto-Capture |
Claude Code (CLI, desktop, VS Code, JetBrains) | Everything | Full — hooks + session mining |
Cursor | MCP tools (memory, search, etc.) | No hooks |
Windsurf | MCP tools | No hooks |
Continue.dev | MCP tools | No hooks |
Zed | MCP tools | No hooks |
Any MCP client | MCP tools | No hooks |
Obsidian vaults | Full (with CLAUDE.md at root) | Full with Claude Code |
Install
git clone https://github.com/20alexl/claude-engram.git
cd claude-engram
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
pip install -e . # Core
pip install -e ".[semantic]" # + embedding model for vector search and semantic scoring
python install.py # Configure hooks, MCP server, and /engram skill
Per-Project Setup
python install.py --setup /path/to/your/project
Or copy .mcp.json to your project root.
Note: The CLAUDE.md in this repo is engram-specific documentation — it's not required for engram to work. Hooks fire automatically and the /engram skill provides a quick reference on demand. If you already have a CLAUDE.md for your project, keep it as-is and don't copy ours over it. If you want engram docs alongside your project rules, rename it to CLAUDE-ENGRAM.md (or similar) so it doesn't clobber your existing file — Claude will see it when relevant.
Updating
cd claude-engram
git pull
pip install -e ".[semantic]" # Reinstall if dependencies changed
python install.py # Re-run to update hooks and /engram skill
Hooks and MCP tools pick up code changes immediately (editable install). Reconnect the MCP server in Claude Code (/mcp) to reload the server process.
Data migrations run automatically: a cheap inline check fires on the next SessionStart, and a full migration runs in the background. install.py also runs migrations synchronously (step 9). Migrations are forward-only, idempotent, and downgrade-safe — no data is lost.
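A forward-only, idempotent migration runner can be sketched as below. The run_migrations function, the schema_version key, and the two example migrations are hypothetical. Reading "downgrade-safe" as "changes are additive, so older code ignores new fields" is an interpretation, not something the source spells out.

```python
def run_migrations(store, migrations):
    """Apply pending migrations in order. `store` is a dict with a 'schema_version' key."""
    current = store.get("schema_version", 0)
    for version, migrate in sorted(migrations.items()):
        if version <= current:
            continue  # already applied: re-running is a no-op (idempotent)
        migrate(store)
        store["schema_version"] = version  # forward-only: never decremented
    return store


def add_tags(store):
    for memory in store["memories"]:
        memory.setdefault("tags", [])  # additive change; older code can ignore the field


def add_tier(store):
    for memory in store["memories"]:
        memory.setdefault("tier", "hot")


MIGRATIONS = {1: add_tags, 2: add_tier}
store = {"memories": [{"text": "use pytest -x"}]}
run_migrations(store, MIGRATIONS)
run_migrations(store, MIGRATIONS)  # second run changes nothing
print(store["schema_version"], store["memories"][0])
```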
Mid-Project Adoption
Already deep in a project? Install normally. On first session, engram auto-detects your existing Claude Code session history and mines it in the background — extracting decisions, mistakes, and patterns from all past conversations. No manual effort.
Key Features
Memory — hybrid search (keyword + vector + rerank, no ChromaDB); path-aware scored injection (top 3 by file/tags/recency/importance, with age shown); tiered hot/cold storage (rules and manual mistakes never archive; stale auto-captured one-off mistakes self-archive to keep banners high-signal); per-sub-project scoping with cascading workspace rules.
Session mining — structural extraction (conversation flow, not template matching) over conversation and tool content; cross-session semantic/keyword/hybrid search, typed by kind (decision / next-step / error) and filterable; session_mine(commitments) reads the live transcript for open loops the post-session index can't see; pattern detection, predictive context, cross-project learning; retroactive bootstrap on first install.
Lifecycle — auto-captured decisions + mistakes; survives compaction (per-project checkpoints in a durable ring); edit-loop detection; subagent-aware; automatic, idempotent, downgrade-safe migrations on upgrade.
Internals, the full feature list, gotchas, and API reference live in the library-book.
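As an illustration of the edit-loop detection mentioned under Lifecycle (same file edited 3+ times without progress, with state kept per session), here is a minimal sketch. The EditLoopDetector class is hypothetical, and treating a passing test run as "progress" is an assumption about how the counter resets.

```python
from collections import defaultdict


class EditLoopDetector:
    """Per-session counter: edits to a file since the last passing test run."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        # session_id -> path -> edits since progress; sessions never share counters
        self.edits_since_progress = defaultdict(lambda: defaultdict(int))

    def record_edit(self, session_id, path):
        counts = self.edits_since_progress[session_id]
        counts[path] += 1
        return counts[path] >= self.threshold  # True means "warn: possible loop"

    def record_test_result(self, session_id, passed):
        if passed:
            self.edits_since_progress[session_id].clear()


detector = EditLoopDetector()
warnings = [detector.record_edit("s1", "api.py") for _ in range(3)]
print(warnings)                              # [False, False, True]
print(detector.record_edit("s2", "api.py"))  # False: the other session has its own state
detector.record_test_result("s1", passed=True)
print(detector.record_edit("s1", "api.py"))  # False: counter reset after progress
```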
Configuration
Variable | Default | Description |
| | Ollama model, optional. Used only by |
| | sentence-transformers embedding model (~440MB on first use, ~1.1GB scorer RAM). Decision-capture semantic F1 measured 37.7% ( |
| model native | Matryoshka truncation dim (e.g. |
| smart | Embedding device policy. Unset: the resident daemon and in-process fallbacks stay on |
| | Job size (texts) at which bulk embedding routes to the transient GPU worker instead of the resident daemon |
| | Live mining tick interval in seconds: at most one incremental mine per interval, triggered at turn end, keeps search/extractions/code-index fresh during long sessions. |
| | Days until inactive memories archive |
| | Embedding server idle timeout (seconds) |
| | Override the storage location (also the test-isolation seam) |
| | Prune session-search embedding shards older than N days |
| unset | If set, the Read hook mirrors the last-read file path to this file (statusline integration) |
| unset | Set to |
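The live mining tick in the table ("at most one incremental mine per interval, triggered at turn end") behaves like a simple rate limiter. The sketch below is illustrative only: the MiningTicker class is hypothetical, and the 60-second interval is an arbitrary example value, since the actual default is not shown above.

```python
import time


class MiningTicker:
    """Run `mine` at turn end, but at most once per `interval_seconds`."""

    def __init__(self, interval_seconds, mine, clock=time.monotonic):
        self.interval = interval_seconds
        self.mine = mine
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self.last_run = None

    def on_turn_end(self):
        now = self.clock()
        if self.last_run is not None and now - self.last_run < self.interval:
            return False  # too soon since the last mine: skip this turn
        self.last_run = now
        self.mine()
        return True


fake_now = [0.0]
runs = []
ticker = MiningTicker(60, lambda: runs.append(fake_now[0]), clock=lambda: fake_now[0])
for t in (0, 10, 59, 60, 200):
    fake_now[0] = float(t)
    ticker.on_turn_end()
print(runs)  # [0.0, 60.0, 200.0]
```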
Reindexing
If search quality is poor or you want to rebuild after an update:
python scripts/reindex.py "/path/to/your/workspace" --force # rebuild search index
python scripts/reindex.py "/path/to/your/workspace" --force --extract # also re-extract decisions/mistakes
Or via MCP: session_mine(operation="reindex", mode="bootstrap")
Beyond environment variables, ~/.claude_engram/config.json accepts: embed_model, embed_dim, and lessons_globs (opt-in lessons bridge — list of globs relative to each project root whose dated markdown entries sync as protected memories).
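To make the lessons_globs setting concrete, the sketch below reads a config file and expands the globs relative to one project root. The lessons_files function is hypothetical and the demo writes its own temporary config file instead of touching ~/.claude_engram/config.json; only the lessons_globs key and the example glob come from the text above.

```python
import json
import tempfile
from pathlib import Path


def lessons_files(project_root, config_path):
    """Expand lessons_globs from a config file against one project root."""
    config_path = Path(config_path)
    config = json.loads(config_path.read_text(encoding="utf-8")) if config_path.exists() else {}
    globs = config.get("lessons_globs", [])  # opt-in: no globs means no files
    found = set()
    for pattern in globs:
        found.update(p for p in Path(project_root).glob(pattern) if p.is_file())
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "project"
    (root / "docs" / "lessons").mkdir(parents=True)
    (root / "docs" / "lessons" / "2024-retries.md").write_text("## 2024-05-01\nUse backoff.\n")
    (root / "docs" / "README.md").write_text("not a lesson\n")
    config = Path(tmp) / "config.json"
    config.write_text(json.dumps({"lessons_globs": ["docs/lessons/*.md"]}))
    print([p.name for p in lessons_files(root, config)])  # ['2024-retries.md']
```

With no config file, or no lessons_globs key, the function returns an empty list, which matches the opt-in, no-default-path behaviour.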
Documentation
Library Book — design philosophy, internals, full usage guide, API reference, gotchas, and changelog.
/engram — slash command with quick tool reference (installed by install.py).
License
MIT