# Ember MCP — Local, Temporal, Persistent Memory for Claude and Codex, with Fewer Hallucinations
**Local-first memory server for LLMs that uses tiered storage and temporal intelligence to manage context and prevent hallucinations caused by stale data.** Stop re-explaining your stack every time you open a new chat window — Ember gives your AI a permanent memory that follows you from Claude to Cursor, automatically discarding outdated decisions so you never get code based on the architecture you abandoned last month.


<!-- mcp-name: io.github.timolabsai/ember-mcp -->
Ember MCP is a Model Context Protocol server that provides LLMs with long-term memory without cloud dependencies. Unlike standard vector stores that simply retrieve the most similar chunks, Ember actively manages knowledge density, freshness, and relevance through a 4-tier memory system with automatic promotion and decay. If you built an authentication system using JWTs six months ago but migrated to OAuth last week, Ember ensures the AI suggests OAuth patterns, not the deprecated JWT code. When you dump months of meeting transcripts into the chat, it distinguishes between the feature requirements you set in January and the pivot you decided on in March. Best of all, this context follows you everywhere: debug a backend issue in Claude Desktop during the day, and continue the same refactor, with the exact same context, in Cursor at night.
## Key Features
- **Cross-Session:** Memories persist across conversations and different MCP clients. Close your laptop on Friday, open a completely fresh chat session on Monday, and the AI picks up exactly where you left off without needing a summary.
- **100% Privacy:** Runs locally on CPU (~300MB disk, ~150MB RAM). No API keys or cloud vector DBs required. Paste client NDAs, proprietary algorithms, or financial records without worry — your data never leaves your machine or touches a cloud vector store.
- **4-Tier Memory:** Memories flow through four tiers — working, session, relational, and glacier — with automatic promotion based on access frequency. Frequently used knowledge stays hot; stale data decays to cold storage without manual cleanup.
- **Temporal Intelligence:** Memories are ranked by HESTIA scoring — a composite of recency, access frequency, importance, and shadow decay. Stale data is explicitly penalized. Tell the AI you're using React 17 today, and when you upgrade to React 19 next month, it understands the old syntax is history.
- **Dual Search:** Semantic similarity search with automatic FTS5 full-text fallback. You never get empty results — if the embedding doesn't match, the keywords will.
- **CORAL Checkpointing:** Save and resume complex task state mid-session. When context limits hit during a multi-step refactor, checkpoint the work and pick up exactly where you left off.
- **Knowledge Graph:** Typed edges between memories (depends_on, child_of, context_for) enable graph traversal. Ask about a deployment decision and Ember surfaces the infrastructure choices it depends on.
- **Source Linking:** Memories trace back to their origin files, allowing deep recall of source context. When the AI claims "we decided to use Kubernetes," it points you to the specific meeting note or architecture doc where that decision was recorded.
- **Zero-Config:** Works automatically via embedded server instructions. No vector databases to spin up, no embeddings to configure, and no "memory dashboard" to manually prune — it just runs in the background.
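The promote-on-access / decay-on-idle behavior behind the 4-tier feature can be sketched in a few lines of Python. This is a toy model, not Ember's implementation: the tier names come from the feature list above, but the `promote_every` threshold and the TTL handling are assumptions for illustration.

```python
from dataclasses import dataclass, field
import time

# Hot → cold; tier names from the feature list, thresholds illustrative.
TIERS = ["working", "session", "relational", "glacier"]

@dataclass
class Memory:
    content: str
    tier: str = "working"
    access_count: int = 0
    last_access: float = field(default_factory=time.time)

def on_access(mem: Memory, promote_every: int = 3) -> None:
    """Promote a memory one tier hotter after every few accesses."""
    mem.access_count += 1
    mem.last_access = time.time()
    idx = TIERS.index(mem.tier)
    if mem.access_count % promote_every == 0 and idx > 0:
        mem.tier = TIERS[idx - 1]

def decay(mem: Memory, ttl_seconds: float) -> None:
    """Demote a memory one tier colder once its TTL has expired."""
    idx = TIERS.index(mem.tier)
    if time.time() - mem.last_access > ttl_seconds and idx < len(TIERS) - 1:
        mem.tier = TIERS[idx + 1]
```

The point is simply that hot/cold placement is a side effect of normal use: reads pull a memory toward `working`, idleness pushes it toward `glacier`, and nothing requires manual cleanup.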
## Why Temporal Intelligence Matters
Standard vector databases suffer from "semantic collision" when a project evolves. Old, obsolete memories often have high semantic similarity to new queries, causing the LLM to hallucinate based on past states. Ember solves this through Shadow-Decay — automatically detecting and penalizing outdated information.
### What This Means for You
You've been working on a project for months, and inevitably, requirements have changed and tech stacks have evolved. Without temporal intelligence, an AI sees your entire history as equally valid, confidently suggesting the library you abandoned two months ago. Ember detects when information has been superseded — like a migration from Redux to Zustand — and automatically shadows the old data. You get answers based on your current architecture, not the ghosts of your previous decisions, without ever having to manually "clean up" the AI's memory.
### The Scenario
**January:** You tell Claude your project uses **PostgreSQL**. Ember stores memories about schemas and drivers.
**April:** You migrate to **MongoDB**. You store new memories about documents and collections.
### Without Ember (Standard Vector Store)
The old "we use PostgreSQL" memory remains semantically similar to questions about "database queries."
**Result:** Claude confidently provides PostgreSQL SQL syntax, hallucinating based on stale memory despite your migration.
### With Ember Shadow-Decay
1. **Detection:** When you store MongoDB memories that contradict PostgreSQL ones, Ember marks the older memories as shadowed.
2. **Scoring:** The HESTIA scoring formula penalizes shadowed memories — they rank 10x lower than active knowledge.
3. **Result:** When you ask about databases, the shadowed PostgreSQL memory is suppressed. Claude retrieves only the active MongoDB context.
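The ranking effect in the scenario above can be sketched as follows. The real HESTIA formula combines more terms; here only the similarity and the shadow penalty are modeled, and the 0.1 multiplier is just one way to realize the "10x lower" claim from step 2.

```python
from dataclasses import dataclass

SHADOW_PENALTY = 0.1  # shadowed memories rank ~10x lower

@dataclass
class Ember:
    content: str
    similarity: float  # cosine similarity to the current query
    shadowed: bool = False

def score(e: Ember) -> float:
    """Similarity, multiplied down hard when the memory is shadowed."""
    return e.similarity * (SHADOW_PENALTY if e.shadowed else 1.0)

def contradict(old: Ember) -> None:
    """Storing a contradicting memory shadows the old one."""
    old.shadowed = True

# January: PostgreSQL memory. April: a MongoDB memory contradicts it.
pg = Ember("we use PostgreSQL", similarity=0.82)
mongo = Ember("we migrated to MongoDB", similarity=0.79)
contradict(pg)

ranked = sorted([pg, mongo], key=score, reverse=True)
```

Even though the stale PostgreSQL memory is slightly *more* similar to a generic "database queries" question, the shadow penalty drops it below the active MongoDB memory, so it is never deleted but no longer wins retrieval.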
## Get Started
→ **[ember.timolabs.dev](https://ember.timolabs.dev/)**
The installer automatically:
- Detects which MCP clients you have installed (Claude Desktop, Claude Code, Cursor, Windsurf)
- Registers Ember with each one — no JSON editing required
- Creates the local storage directory
- Downloads the embedding model
Restart your AI client and you're ready to go. Your AI now has persistent memory.
### Verify
```bash
ember-mcp status
```
Shows which clients are registered and how many memories are stored.
### Manual Configuration
If you use an MCP client that isn't auto-detected, add this to its config:
```json
{
  "mcpServers": {
    "ember": {
      "command": "ember-mcp",
      "args": []
    }
  }
}
```
## How It Works
Ember combines temporal scoring with tiered storage to manage knowledge relevance automatically:
### What This Means for You
You don't need to learn a new syntax or manage a "knowledge base." You just talk to your AI as you normally would. Behind the scenes, Ember organizes your conversations into a 4-tier memory system, tracks which information is fresh versus stale, and injects only the relevant context into your current session.
1. **Local Embeddings:** Uses `all-MiniLM-L6-v2` to generate 384-dimensional vectors locally on CPU.
2. **4-Tier Storage:** Memories flow through working → session → relational → glacier tiers with TTL-based expiry and access-based promotion.
3. **HESTIA Scoring:** Ranks memories by a composite of cosine similarity, shadow penalty, importance weight, recency decay, and access frequency.
4. **Dual Search:** Semantic similarity search with automatic FTS5 full-text fallback — ensures you always get results.
5. **Shadow-Decay:** When new information contradicts old, the old memory is shadowed and penalized in rankings without being deleted.
6. **Knowledge Graph:** Typed edges (depends_on, child_of, context_for, related, shadow, supersedes) enable BFS traversal across connected memories.
7. **Consolidation Engine:** Periodically promotes frequently accessed memories to higher tiers and demotes unused ones.
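Step 6's graph traversal can be illustrated with a small BFS over typed edges. The edge types come from the list above; the memory IDs, the adjacency structure, and the `max_depth` cutoff are illustrative, not Ember's actual schema.

```python
from collections import deque

# memory id -> list of (edge_type, target id); edge types from the README
EDGES = {
    "deploy-on-k8s": [("depends_on", "cluster-sizing"), ("context_for", "q3-roadmap")],
    "cluster-sizing": [("depends_on", "cloud-budget")],
    "cloud-budget": [],
    "q3-roadmap": [],
}

def graph_search(start: str, edge_types: set[str], max_depth: int = 2) -> list[str]:
    """BFS from a seed memory, following only the requested edge types."""
    seen, order = {start}, []
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        order.append(node)
        if depth == max_depth:
            continue
        for etype, target in EDGES.get(node, []):
            if etype in edge_types and target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order

result = graph_search("deploy-on-k8s", {"depends_on"})
```

Asking about the deployment decision pulls in the sizing and budget memories it transitively depends on, while unrelated `context_for` neighbors stay out of the result.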
## Tools
Ember exposes **19 tools** to the LLM:
| Tool | Description |
|------|-------------|
| `ember_store` | Save a named memory with importance level, optional tags, status, and typed graph edges |
| `ember_recall` | Semantic search with temporal scoring across all memories |
| `ember_read` | Read the full content of a specific ember by ID |
| `ember_deep_recall` | Recall + automatically read source files behind the embers |
| `ember_learn` | Auto-capture key information from conversation (facts, preferences, decisions) |
| `ember_contradict` | Mark outdated memory stale and store corrected version |
| `ember_list` | List all stored memories, optionally filtered by tag |
| `ember_delete` | Remove a memory by ID |
| `ember_inspect` | View tier distribution, memory statistics, and density |
| `ember_auto` | Auto-retrieve relevant context at conversation start with temporal ranking |
| `ember_save_session` | Save session summary, decisions, and next steps with source linking |
| `ember_drift_check` | Run drift detection — flag stale memories in shifting knowledge regions |
| `ember_graph_search` | Vector search + BFS traversal via typed knowledge graph edges |
| `ember_actionable` | List embers with active task status (open or in_progress) |
| `ember_set_status` | Update the task status of an existing ember |
| `ember_compact` | AI-powered compaction of stale/shadowed embers — analyze candidates or apply LLM-generated summary |
| `ember_health` | Compute hallucination risk metrics across the memory store |
| `ember_recompute_shadows` | Full recalculation of shadow_load for every ember |
| `ember_explain` | Return HESTIA score breakdown for a specific ember |
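As one illustration of how these tools compose, a `ember_store` call might carry arguments shaped roughly like the following. The field names here are inferred from the table's description (name, importance, tags, status, typed edges) and are **not** the tool's documented schema — check the actual tool definition your client receives.

```json
{
  "name": "db-migration",
  "content": "Migrated from PostgreSQL to MongoDB in April; use the MongoDB driver.",
  "importance": "high",
  "tags": ["database", "architecture"],
  "edges": [{"type": "supersedes", "target": "postgres-setup"}]
}
```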
## Prompts
| Prompt | Description |
|--------|-------------|
| `start_session` | Load memory at conversation start |
| `end_session` | Save context before ending |
| `remember` | Store something user wants persisted |
## Storage
All data is stored locally in a single SQLite database under `~/.ember-v3/`:
```
~/.ember-v3/
└── ember.db # SQLite: memories, edges, FTS5 index, checkpoints
```
To reset: `rm -rf ~/.ember-v3`
To back up: copy the `~/.ember-v3` directory.
### Migrating from v1/v2
If you have existing memories in `~/.ember/`, run:
```bash
ember-mcp migrate
```
This migrates all memories and edges, and regenerates embeddings for the new storage format.
## Requirements
- **Python:** 3.11+
- **Disk:** ~300MB (embedding model + dependencies)
- **RAM:** ~150MB overhead
- **OS:** macOS, Linux, Windows (WSL)
## License
MIT — Built by [Timo Labs](https://github.com/TimoLabsAI)