Context Fabric
Local-first MCP memory for AI coding agents.
Your agent remembers decisions, patterns, project context, and what changed while you were away — across sessions, projects, and tools.
Pre-1.0, but built for daily use. Context Fabric is actively used, tested, and released regularly. APIs and storage formats may still evolve before 1.0, so pin versions and review the CHANGELOG before upgrading.
Start Here
Want to try it fast? Go to Quick Start
Need a client config? Open CLI Setup
Want the full tool surface? See Tools Reference
Prefer guided docs? Browse the Wiki
Why it exists
Coding agents are great in-session and forgetful between sessions. Important context disappears when the terminal closes: decisions, debugging discoveries, codebase conventions, partial work, and the answer to "what changed since I was last here?"
Context Fabric gives MCP-compatible coding agents a persistent memory layer that stays local, searchable, and useful.
Who it's for
Developers using MCP-capable coding tools like Claude Code, Cursor, Codex CLI, Gemini CLI, OpenCode, or Kimi
Teams that want persistent agent memory without sending code and context to a hosted memory service
Builders who want a lightweight local memory substrate instead of wiring up a separate vector database stack
Why Context Fabric
Local-first by design — SQLite storage, local embeddings, Docker/local deployment, zero cloud dependency.
Built for coding agents — remembers decisions, bug fixes, conventions, code patterns, and current project state.
MCP-native — works as a real MCP server with Tools, Resources, and Prompts.
Code-aware and time-aware — semantic code search, symbol indexing, and orientation around offline gaps.
Practical to adopt — no external vector database, no API key, no hosted control plane required.
What you get
Memory & retrieval
Three-layer memory — Working (L1), Project (L2), Semantic (L3). Memories auto-route to the right layer.
Hybrid search — FTS5 BM25 + vector cosine + Reciprocal Rank Fusion (see the fusion sketch after this list). Query-side instruction prefixes are applied automatically per embedder family (BGE, E5, MiniLM). Optional explanations expose component scores and boosts without changing default ranking.
Semantic recall — in-process vector embeddings via ONNX + fastembed (bge-small-en-v1.5 by default, with a one-env-var swap to larger models or GPU). No API keys needed.
Bundled ANN — sqlite-vec ships as a regular dependency (since v0.13). KNN over the full corpus, with graceful fallback if the loadable extension fails to attach.
Optional CUDA inference — set CONTEXT_FABRIC_EMBED_EP=cuda and run scripts/setup-gpu.sh for ~30× ingest throughput on NVIDIA hardware.
Local code indexing — scans source files, extracts symbols, stays fresh via file watching, and can inspect/repair stale or corrupted index state.
Time-aware orientation — "What changed while I was away?" with offline-gap detection and timezone support.
Ghost messages — relevant memories surface via context.getCurrent without cluttering the main workflow.
Public-benchmark harness — reproducible BEIR SciFact / FiQA and LongMemEval_S runners in benchmarks/public/.
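To make the fusion step concrete, here is a minimal Reciprocal Rank Fusion sketch in TypeScript. It illustrates the technique rather than Context Fabric's actual code; the constant k = 60 comes from the original RRF paper, and the function name is made up.

```typescript
// Reciprocal Rank Fusion: combine two rankings by rank position, not raw score,
// so BM25 and cosine scores never need to be calibrated against each other.
function rrfFuse(bm25Ids: string[], vectorIds: string[], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ids of [bm25Ids, vectorIds]) {
    ids.forEach((id, rank) => {
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

// "b" (ranked #2 by BM25, #1 by cosine) edges out "a" (#1 and #3).
rrfFuse(["a", "b", "c"], ["b", "d", "a"]); // => ["b", "a", "d", "c"]
```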
Memory intelligence
Provenance — structured citation blocks on memories (sessionId, eventId, filePath, commitSha, sourceUrl, and more).
Dedup-on-store — cosine near-duplicate detection for L3 with skip, merge, or allow strategies.
Bi-temporal memory — supersedes, validFrom, and validUntil support for "what was true at that time?" reasoning (see the query sketch after this list).
Scoped fabric graph — temporal entities and relationships connect projects, sessions, files, symbols, memories, decisions, errors, and skills for lineage/path queries.
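A minimal sketch of what bi-temporal recall enables, using better-sqlite3; the table and column names (memories, valid_from, valid_until) are assumptions for illustration, and the real schema may differ.

```typescript
import Database from "better-sqlite3";

const db = new Database("context-fabric.db");

// "What was true at time T?" — select memories whose validity interval
// covers T, instead of only the latest version of each fact.
function memoriesAsOf(isoTime: string) {
  return db
    .prepare(
      `SELECT id, content
         FROM memories
        WHERE valid_from <= :t
          AND (valid_until IS NULL OR valid_until > :t)`
    )
    .all({ t: isoTime });
}

// e.g. recover the validation convention as it stood last June
memoriesAsOf("2025-06-01T00:00:00Z");
```

A supersedes link would additionally let you walk forward from that answer to whatever replaced it.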
Agent ergonomics
Skills — procedural memory with slugged, invokable instruction blocks and usage tracking.
MCP Resources — browseable memory://skills, memory://recent, memory://conventions, memory://decisions, and templated resource views.
MCP Prompts — slash-command workflows like cf-orient, cf-capture-decision, cf-review-session, cf-search-code, and cf-invoke-skill.
context.importDocs — one-shot seeding from common onboarding docs like README.md, CHANGELOG.md, CONTRIBUTING.md, and AGENTS.md (see the example call after this list).
Recall-quality harness — benchmark recall@k and MRR with npm run bench:quality.
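As an example, a raw MCP tools/call for doc seeding might look like the following. The argument object here is an illustrative guess — check the Tools Reference for the real parameter names.

```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "tools/call",
  "params": {
    "name": "context.importDocs",
    "arguments": { "projectPath": "/home/user/myapp" }
  }
}
```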
Operations & DX
29 MCP tools — memory CRUD, recall/orientation, code search, code-index repair, graph query/import/export, docs import, backup/export/import, metrics/health, and 6 skill tools.
Graceful shutdown — drains in-flight calls, checkpoints WAL, and closes cleanly (a minimal sketch follows this list).
Data integrity — startup checks, explicit multi-row transactions, and online backups.
Observability — structured logging plus context.metrics and context.health.
Self-setup — context.setup can install Context Fabric into supported CLIs.
Docker-first — easy docker run --rm -i transport with persistent named-volume storage.
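A minimal sketch of the shutdown sequence described above, assuming better-sqlite3 and a simple in-flight counter; the paths and names are illustrative, not Context Fabric's internals.

```typescript
import Database from "better-sqlite3";

const db = new Database("/data/.context-fabric/memory.db"); // illustrative path
let inFlight = 0; // incremented/decremented around each MCP tool call

async function shutdown(): Promise<never> {
  // Drain: wait for in-flight tool calls to finish (simplified polling).
  while (inFlight > 0) await new Promise((r) => setTimeout(r, 50));
  // Checkpoint the WAL so the main database file is self-contained on disk.
  db.pragma("wal_checkpoint(TRUNCATE)");
  db.close();
  process.exit(0);
}

process.on("SIGINT", shutdown);
process.on("SIGTERM", shutdown);
```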
Quick Start
Get running in a few minutes:
# 1. Clone and build the Docker image
git clone https://github.com/Abaddollyon/context-fabric.git
cd context-fabric
docker build -t context-fabric .
# 2. Verify the server responds
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' \
| docker run --rm -i context-fabric

# 3. Add it to your CLI with the Docker transport:
docker run --rm -i -v context-fabric-data:/data/.context-fabric context-fabric

See CLI Setup for copy-paste configs for all supported CLIs, or let your AI do it once Context Fabric is reachable:
"Install and configure Context Fabric for Cursor using Docker"
Prefer running from source instead of Docker? Requires Node.js 22.5+:
git clone https://github.com/Abaddollyon/context-fabric.git
cd context-fabric
npm install
npm run build

The server is at dist/server.js. Point your CLI MCP config at node dist/server.js.
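The same smoke test used for the Docker image works against the local build:

```sh
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' \
  | node dist/server.js
```

A JSON response listing the 29 tools means the server is wired up correctly.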
Supported CLIs
Claude Code · Kimi · OpenCode · Codex CLI · Gemini CLI · Cursor · Claude Desktop

Per-CLI setup instructions and docs links live in CLI Setup.
Once Context Fabric is running in one MCP-capable tool, it can usually install itself into the others through context.setup.
What it feels like
Start a session and the agent orients itself:
It is 9:15 AM on Wednesday, Feb 25 (America/New_York).
Project: /home/user/myapp.
Last session: 14 hours ago. 3 new memories were added while you were away.

Store a decision once:

{ "type": "decision", "content": "Use Zod for all API validation. Schemas live in src/schemas/." }

Recall it naturally later:
{ "query": "how do we validate inputs?" }
// => "Use Zod for all API validation. Schemas live in src/schemas/."

No cloud account. No hidden service dependency. No need to re-explain the codebase every session.
Performance
Numbers on a commodity dev box (Ryzen 7 5800H + RTX 3060 12 GB, warm run, 2026-04-28/29):
Retrieval quality — public benchmarks
Benchmark | Metric | Context Fabric (v0.14 rerun, GPU) | v0.13 published | OpenAI | bge-base-en-v1.5 (dense-only) |
BEIR SciFact | nDCG@10 | 0.7456 | 0.7439 | 0.774 | 0.740 |
BEIR SciFact | Recall@100 | 0.9633 | 0.9667 | ~0.93 | — |
BEIR FiQA-2018 | nDCG@10 | 0.3809 | 0.3801 | 0.397 | 0.406 |
BEIR FiQA-2018 | Recall@100 | 0.7360 | 0.7361 | ~0.69 | — |
LongMemEval_S (500 q, 25K sessions) | Hit@5 | 0.9200 | 0.9520 | — | — |
LongMemEval_S | Recall@10 | 0.9210 | 0.9472 | — | — |
Reading this: v0.14 keeps the v0.13 low-latency local retrieval path while adding explanation/artifact tooling for ranking diagnostics. BEIR top-k quality improved slightly in the rerun; LongMemEval's historical v0.13 number did not reproduce under the current cached runtime/dataset environment, so docs/benchmarks.md now documents both the published baseline and the v0.14 rerun with artifact output for regression analysis.
Latency and throughput
Workload | Result |
BEIR SciFact query p50 (bge-base, GPU + sqlite-vec) | 20 ms |
BEIR FiQA query p50 (bge-base, GPU + sqlite-vec) | 87 ms |
LongMemEval_S query p50 (embedding-only, artifact-capable) | 10.8 ms |
L3 recall latency | ~8 ms p50, <100 ms p99 |
Ingest throughput (bge-base, RTX 3060 CUDA EP) | ~170 docs/s (≈32× the CPU single-core baseline) |
Full test suite | 747 tests passing |
Incremental | ~0.8 s |
Server cold start (with L3 warm) | < 1 s |
Benchmark scripts: benchmarks/recall-latency.ts (npm run bench), benchmarks/public/ (npm run bench:beir:scifact, npm run bench:beir:fiqa, npm run bench:longmemeval:s).
Architecture at a glance
CLI (Claude Code, Cursor, Codex, etc.)
|
| MCP protocol (stdio / Docker)
v
Context Fabric Server
|-- Smart Router -----> L1: Working Memory (in-memory, session-scoped)
|-- Time Service L2: Project Memory (SQLite, per-project)
|-- Code Index          L3: Semantic Memory (SQLite + embeddings, cross-project)

Memories auto-route to the right layer. Scratchpad notes go to L1. Decisions and bug fixes go to L2. Reusable patterns and conventions go to L3. See Architecture for the full deep dive.
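A toy version of that routing rule, keyed off the type field shown in the examples above; the non-decision type names are assumptions, and the real Smart Router presumably weighs more signals than a single string.

```typescript
type Layer = "L1" | "L2" | "L3";

// Rule of thumb from above: scratchpad → L1, decisions and bug
// fixes → L2, reusable patterns and conventions → L3.
function routeMemory(type: string): Layer {
  switch (type) {
    case "scratchpad":
      return "L1";
    case "decision":
    case "bugfix":
      return "L2";
    case "pattern":
    case "convention":
      return "L3";
    default:
      return "L2"; // illustrative default: keep unknowns project-scoped
  }
}

routeMemory("decision"); // => "L2"
```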
Documentation
Resource | Description |
Quick Start | Installation, first run, Docker and local setup |
CLI Setup | Per-CLI configuration for all 7 supported CLIs |
Tools Reference | Full docs for all 29 MCP tools |
Skills | Procedural memory — create, invoke, and compose reusable skills |
Resources & Prompts | MCP Resources (memory:// views) and Prompts |
Memory Model | Type system, layers, routing, decay, provenance, and dedup |
Configuration | Storage paths, TTL, embedding notes, and environment variables |
Prompting Guide | System-prompt guidance for automatic tool usage |
Architecture | Retrieval pipeline, internals, and performance design |
Benchmarks | Public-benchmark results (BEIR SciFact / FiQA, LongMemEval_S) with reproduction commands |
CHANGELOG | Version history and upgrade notes |
Wiki | Launch-friendly guides, FAQ, troubleshooting, and setup walkthroughs |
Contributing
Contributions are welcome. See CONTRIBUTING.md for how to get started.
License
Stop re-explaining your codebase every session.
Get Started · Configure a CLI · Browse the Wiki · Report a Bug