Context Fabric

Local-first MCP memory for AI coding agents.

Your agent remembers decisions, patterns, project context, and what changed while you were away — across sessions, projects, and tools.


NOTE

Pre-1.0, but built for daily use. Context Fabric is actively used, tested, and released regularly. APIs and storage formats may still evolve before 1.0, so pin versions and review the CHANGELOG before upgrading.

Start Here

Why it exists

Coding agents are great in-session and forgetful between sessions. Important context disappears when the terminal closes: decisions, debugging discoveries, codebase conventions, partial work, and the answer to "what changed since I was last here?"

Context Fabric gives MCP-compatible coding agents a persistent memory layer that stays local, searchable, and useful.

Who it's for

  • Developers using MCP-capable coding tools like Claude Code, Cursor, Codex CLI, Gemini CLI, OpenCode, or Kimi

  • Teams that want persistent agent memory without sending code and context to a hosted memory service

  • Builders who want a lightweight local memory substrate instead of wiring up a separate vector database stack

Why Context Fabric

  • Local-first by design — SQLite storage, local embeddings, Docker/local deployment, zero cloud dependency.

  • Built for coding agents — remembers decisions, bug fixes, conventions, code patterns, and current project state.

  • MCP-native — works as a real MCP server with Tools, Resources, and Prompts.

  • Code-aware and time-aware — semantic code search, symbol indexing, and orientation around offline gaps.

  • Practical to adopt — no external vector database, no API key, no hosted control plane required.

What you get

Memory & retrieval

  • Three-layer memory — Working (L1), Project (L2), Semantic (L3). Memories auto-route to the right layer.

  • Hybrid search — FTS5 BM25 + vector cosine + Reciprocal Rank Fusion. Query-side instruction prefixes are applied automatically per embedder family (BGE, E5, MiniLM). Optional explanations expose component scores and boosts without changing the default ranking (see the sketch after this list).

  • Semantic recall — in-process vector embeddings via ONNX + fastembed (bge-small-en-v1.5 by default, with one-env-var swap to larger models or GPU). No API keys needed.

  • Bundled ANN — sqlite-vec ships as a regular dependency (since v0.13). KNN over the full corpus, graceful fallback if the loadable extension fails to attach.

  • Optional CUDA inference — set CONTEXT_FABRIC_EMBED_EP=cuda + run scripts/setup-gpu.sh for ~30× ingest throughput on NVIDIA hardware.

  • Local code indexing — scans source files, extracts symbols, stays fresh via file watching, and can inspect/repair stale or corrupted index state.

  • Time-aware orientation — "What changed while I was away?" with offline-gap detection and timezone support.

  • Ghost messages — relevant memories surface via context.getCurrent without cluttering the main workflow.

  • Public-benchmark harness — reproducible BEIR SciFact / FiQA and LongMemEval_S runners in benchmarks/public/.
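
To make the hybrid recall concrete: in its standard form, Reciprocal Rank Fusion scores each candidate as the sum of 1/(k + rank) across the BM25 and vector result lists (k is a small constant, commonly 60), so a memory that ranks well in either list surfaces near the top. Below is a hedged sketch of a recall call with explanations enabled; the query field mirrors the example later on this page, while the explain flag and the response fields are assumptions for illustration, not the documented API:

{ "query": "how do we validate inputs?", "explain": true }
// => top hit plus per-component diagnostics, e.g. BM25 rank, vector rank, and the fused RRF score
//    (field names hypothetical; see the Tools Reference for the real shapes)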

Memory intelligence

  • Provenance — structured citation blocks on memories (sessionId, eventId, filePath, commitSha, sourceUrl, and more).

  • Dedup-on-store — cosine near-duplicate detection for L3 with skip, merge, or allow strategies.

  • Bi-temporal memory — supersedes, validFrom, and validUntil support for "what was true at that time?" reasoning (see the sketch after this list).

  • Scoped fabric graph — temporal entities and relationships connect projects, sessions, files, symbols, memories, decisions, errors, and skills for lineage/path queries.
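
A hedged sketch of how provenance and bi-temporal fields might combine on a single stored decision; the field names (supersedes, validFrom, filePath, commitSha) come from the bullets above, but the exact nesting and the placeholder values are assumptions for illustration:

{
  "type": "decision",
  "content": "Switch API validation from Joi to Zod.",
  "supersedes": "<id of the earlier decision>",
  "validFrom": "2026-02-25T09:15:00-05:00",
  "citation": { "filePath": "src/schemas/user.ts", "commitSha": "<commit>" }
}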

Agent ergonomics

  • Skills — procedural memory with slugged, invokable instruction blocks and usage tracking.

  • MCP Resources — browseable memory://skills, memory://recent, memory://conventions, memory://decisions, and templated resource views.

  • MCP Prompts — slash-command workflows like cf-orient, cf-capture-decision, cf-review-session, cf-search-code, and cf-invoke-skill.

  • context.importDocs — one-shot seeding from common onboarding docs like README.md, CHANGELOG.md, CONTRIBUTING.md, and AGENTS.md (example after this list).

  • Recall-quality harness — benchmark recall@k and MRR with npm run bench:quality.
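
For instance, seeding a fresh project might be a single context.importDocs call; the tool name is real, but the parameter shown here is a hypothetical shape for illustration:

{ "files": ["README.md", "CHANGELOG.md", "CONTRIBUTING.md", "AGENTS.md"] }
// context.importDocs: the "files" parameter name is hypothetical; see the Tools Reference for the actual schema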

Operations & DX

  • 29 MCP tools — memory CRUD, recall/orientation, code search, code-index repair, graph query/import/export, docs import, backup/export/import, metrics/health, and 6 skill tools.

  • Graceful shutdown — drains in-flight calls, checkpoints WAL, and closes cleanly.

  • Data integrity — startup checks, explicit multi-row transactions, and online backups.

  • Observability — structured logging plus context.metrics and context.health.

  • Self-setup — context.setup can install Context Fabric into supported CLIs.

  • Docker-first — easy docker run --rm -i transport with persistent named-volume storage.

Quick Start

Get running in a few minutes:

# 1. Clone and build the Docker image

git clone https://github.com/Abaddollyon/context-fabric.git
cd context-fabric
docker build -t context-fabric .

# 2. Verify the server responds

echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' \
  | docker run --rm -i context-fabric

# 3. Add it to your CLI with the Docker transport:

docker run --rm -i -v context-fabric-data:/data/.context-fabric context-fabric
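
Most MCP-capable CLIs accept a JSON server entry along these lines (the mcpServers shape used by Claude Desktop and Cursor); the exact file location and top-level key vary by tool, so treat this as a sketch and use the per-CLI configs linked below:

{
  "mcpServers": {
    "context-fabric": {
      "command": "docker",
      "args": ["run", "--rm", "-i", "-v", "context-fabric-data:/data/.context-fabric", "context-fabric"]
    }
  }
}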

See CLI Setup for copy-paste configs for all supported CLIs, or let your AI do it once Context Fabric is reachable:

"Install and configure Context Fabric for Cursor using Docker"

Alternatively, build and run locally without Docker (requires Node.js 22.5+):

git clone https://github.com/Abaddollyon/context-fabric.git
cd context-fabric
npm install
npm run build

The server is at dist/server.js. Point your CLI MCP config at node dist/server.js.
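
The equivalent MCP config entry for the local build just swaps the command; the absolute path below is a placeholder for wherever you cloned the repo:

{
  "mcpServers": {
    "context-fabric": {
      "command": "node",
      "args": ["/absolute/path/to/context-fabric/dist/server.js"]
    }
  }
}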

Supported CLIs

CLI              Setup                                     Docs
Claude Code      context.setup({ cli: "claude-code" })     Guide
Kimi             context.setup({ cli: "kimi" })            Guide
OpenCode         context.setup({ cli: "opencode" })        Guide
Codex CLI        context.setup({ cli: "codex" })           Guide
Gemini CLI       context.setup({ cli: "gemini" })          Guide
Cursor           context.setup({ cli: "cursor" })          Guide
Claude Desktop   context.setup({ cli: "claude" })          Guide

TIP

Once Context Fabric is running in one MCP-capable tool, it can usually install itself into the others through context.setup.

What it feels like

Start a session and the agent orients itself:

It is 9:15 AM on Wednesday, Feb 25 (America/New_York).
Project: /home/user/myapp.
Last session: 14 hours ago. 3 new memories were added while you were away.

Store a decision once:

{ "type": "decision", "content": "Use Zod for all API validation. Schemas live in src/schemas/." }

Recall it naturally later:

{ "query": "how do we validate inputs?" }
// => "Use Zod for all API validation. Schemas live in src/schemas/."

No cloud account. No hidden service dependency. No need to re-explain the codebase every session.

Performance

Numbers on a commodity dev box (Ryzen 7 5800H + RTX 3060 12 GB, warm run, 2026-04-28/29):

Retrieval quality — public benchmarks

Benchmark                             Metric       Context Fabric (v0.14 rerun, GPU)   v0.13 published   OpenAI text-embedding-3-small   bge-base-en-v1.5 (dense-only)
BEIR SciFact                          nDCG@10      0.7456                              0.7439            0.774                           0.740
BEIR SciFact                          Recall@100   0.9633                              0.9667            ~0.93
BEIR FiQA-2018                        nDCG@10      0.3809                              0.3801            0.397                           0.406
BEIR FiQA-2018                        Recall@100   0.7360                              0.7361            ~0.69
LongMemEval_S (500 q, 25K sessions)   Hit@5        0.9200                              0.9520
LongMemEval_S                         Recall@10    0.9210                              0.9472
How to read this: v0.14 keeps the v0.13 low-latency local retrieval path while adding explanation/artifact tooling for ranking diagnostics. BEIR top-k quality improved slightly in the rerun. LongMemEval's published v0.13 number did not reproduce under the current cached runtime/dataset environment, so docs/benchmarks.md documents both the published baseline and the v0.14 rerun, with artifact output for regression analysis.

Latency and throughput

Workload                                                      Result
BEIR SciFact query p50 (bge-base, GPU + sqlite-vec)           20 ms
BEIR FiQA query p50 (bge-base, GPU + sqlite-vec)              87 ms
LongMemEval_S query p50 (embedding-only, artifact-capable)    10.8 ms
L3 recall() @ 10K memories (FTS5 prefilter, CPU)              ~8 ms p50, <100 ms p99
Ingest throughput (bge-base, RTX 3060 CUDA EP)                ~170 docs/s (≈32× the CPU single-core baseline)
Full test suite                                               747 tests passing
Incremental tsc rebuild                                       ~0.8 s
Server cold start (with L3 warm)                              < 1 s

Benchmark scripts: benchmarks/recall-latency.ts (npm run bench), benchmarks/public/ (npm run bench:beir:scifact, npm run bench:beir:fiqa, npm run bench:longmemeval:s).

Architecture at a glance

CLI (Claude Code, Cursor, Codex, etc.)
  |
  | MCP protocol (stdio / Docker)
  v
Context Fabric Server
  |-- Smart Router -----> L1: Working Memory   (in-memory, session-scoped)
  |-- Time Service        L2: Project Memory   (SQLite, per-project)
  |-- Code Index          L3: Semantic Memory  (SQLite + embeddings, cross-project)

Memories auto-route to the right layer. Scratchpad notes go to L1. Decisions and bug fixes go to L2. Reusable patterns and conventions go to L3. See Architecture for the full deep dive.
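
As a rough illustration of that routing (the decision type appears elsewhere on this page; note and convention are assumed type names):

{ "type": "note", "content": "Left off mid-refactor of the retry logic" }
// => L1: working memory, session-scoped

{ "type": "decision", "content": "Use Zod for all API validation" }
// => L2: project memory for this repo

{ "type": "convention", "content": "Prefer named exports across src/" }
// => L3: semantic memory, cross-project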

Documentation

Resource            Description
Getting Started     Installation, first run, Docker and local setup
CLI Setup           Per-CLI configuration for all 7 supported CLIs
Tools Reference     Full docs for all 29 MCP tools
Skills              Procedural memory — create, invoke, and compose reusable skills
MCP Primitives      Resources (memory://...) and Prompts (cf-*)
Memory Types        Type system, layers, routing, decay, provenance, and dedup
Configuration       Storage paths, TTL, embedding notes, and environment variables
Agent Integration   System-prompt guidance for automatic tool usage
Architecture        Retrieval pipeline, internals, and performance design
Benchmarks          Public-benchmark results (BEIR SciFact / FiQA, LongMemEval_S) with reproduction commands
Changelog           Version history and upgrade notes
Wiki                Launch-friendly guides, FAQ, troubleshooting, and setup walkthroughs

Contributing

Contributions are welcome. See CONTRIBUTING.md for how to get started.

License

MIT


Stop re-explaining your codebase every session.

Get Started · Configure a CLI · Browse the Wiki · Report a Bug
