Gingugu
Provides cross-session memory for Windsurf (Codeium) AI assistant, supporting knowledge retention and recall.
🧠 Gingugu
Your AI forgets everything between sessions. Gingugu fixes that.
Gingugu is a local MCP server that gives AI coding assistants a real long-term brain: persistent, structured, searchable memory that survives across sessions, repos, and projects. No cloud, no API keys, no telemetry. One SQLite file on your machine.
Why Gingugu
Every session with an AI assistant starts from zero. The decisions you made yesterday, the bug you fixed last week, the architecture you settled on a month ago are all gone. Existing memory tools dump observations into a flat pile with no structure, no staleness tracking, no relationships, and no sense of what's relevant right now.
Gingugu is designed to be the actual brain, not a junk drawer:
- Remembers across sessions, repos, and projects
- Organizes knowledge by namespace, type, and relationships
- Ranks memories by relevance, freshness, and confidence
- Auto-surfaces relevant context when you start working
- Consolidates duplicate and related knowledge on demand
Where this goes long-term (federated, org-wide agent memory) lives in docs/enterprise-vision.md.
How It Compares
| | Gingugu | mem0 | Zep | OpenMemory MCP | Letta (MemGPT) | Claude Projects / Cursor / Windsurf |
| --- | --- | --- | --- | --- | --- | --- |
| Truly local-first (no cloud calls) | โ | ⚠️ cloud-sync default | โ | ⚠️ | ⚠️ | โ |
| Works across all your AI tools | โ MCP-native | ⚠️ SDK-dependent | ⚠️ | โ MCP-native | โ framework lock-in | โ tool lock-in |
| Zero ongoing cost | โ | โ paid tier | โ LLM calls + Postgres | โ paid tier | ⚠️ | โ |
| Hybrid search (BM25 + semantic) | โ built-in, local | ⚠️ paid tier | โ | ⚠️ | ⚠️ | โ |
| Knowledge graph built-in | โ relations + tags | ⚠️ paid tier | โ LLM-extracted (best in class) | ⚠️ | โ | โ |
| Auto entity/relation extraction | โ (explicit) | ⚠️ paid | โ | ⚠️ | โ | โ |
| Credential vault | โ OS keychain | โ | โ | โ | โ | โ |
| Knowledge graph UI | โ | โ | ⚠️ cloud dashboard | โ | โ | โ |
| Deployment footprint | One SQLite file | SDK + cloud | Postgres + cloud | SDK + cloud | Full framework | None (built-in) |
The honest take: Zep has the most sophisticated knowledge graph; they auto-extract entities and relations using LLMs. We don't (yet). But theirs costs LLM calls per memory, needs Postgres, and lives in the cloud. Ours is one SQLite file, free forever, and offline-capable.
Where Gingugu wins outright: the trifecta of local-first, cross-tool, and zero-cost forever. Nobody else hits all three.
FAQ
Those are great if you live in one tool. The moment you switch between Claude Code in the morning and Cursor in the afternoon, the memory is gone. Gingugu's memory follows you across every MCP client, lives on your machine, and is programmable (16 tools, structured types, relationships, confidence levels). The built-ins are convenience features. Gingugu is infrastructure.
Both, actually. We do hybrid retrieval out of the box: BM25 over FTS5 +
local semantic embeddings (via fastembed,
no PyTorch dependency), fused with Reciprocal Rank Fusion. No vector DB
server required.
Why this stack:
No deployment. One SQLite file holds memories, FTS5 index, and embeddings. No Postgres, no Pinecone, no Chroma server.
ONNX over PyTorch. fastembed ships the embedding model as a ~50MB ONNX runtime instead of 2GB of PyTorch, so the install footprint stays honest to the "one SQLite file" promise.
It composes. Hybrid relevance feeds the composite (relevance × freshness × access × confidence): every signal in one engine.
You can disable semantic search via MEMORY_EMBEDDINGS_ENABLED=false and
fall back to BM25-only. Swap the model via MEMORY_EMBEDDINGS_MODEL (any
fastembed-supported model; defaults to BAAI/bge-small-en-v1.5).
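To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked ID lists (one from BM25, one from semantic search). It is not Gingugu's actual code: the function name, the input shape, and the constant `k=60` (a commonly used value) are assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs into a single ranking.

    Each ID scores sum(1 / (k + rank)) over every list it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], str(doc_id)))


bm25_hits = ["a", "b", "c"]      # hypothetical keyword results
semantic_hits = ["b", "d", "a"]  # hypothetical embedding results
fused = reciprocal_rank_fusion([bm25_hits, semantic_hits])
```

IDs that rank well in both lists rise to the top, which is why no score calibration between BM25 and cosine similarity is needed.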
Yes. 138 tests passing. Self-hosted in this repo (the memories you see referenced in commits are Gingugu memories). WAL mode for concurrency. Hardened against adversarial input and write contention. CI matrix across Python 3.11 to 3.13 on Linux/macOS/Windows.
SQLite FTS5 comfortably handles millions of rows. Gingugu adds composite
re-ranking on top, but only over a small candidate pool (4× limit). For
personal/team use you'll never hit a wall. Use memory_consolidate to
merge duplicates or summarize clusters when things sprawl.
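The candidate-pool idea can be sketched as follows: take only the first 4× limit hits from the cheap first-stage ranking, then re-score that small pool. This is an illustrative sketch, not the project's implementation; `rerank`, `pool_factor`, and the `score` callable are hypothetical names.

```python
def rerank(candidates, score, limit, pool_factor=4):
    """Re-rank only the top `limit * pool_factor` first-stage candidates.

    `candidates` is assumed to be ordered by the first-stage search already;
    `score` is any callable that returns a composite score for a candidate.
    """
    pool = candidates[: limit * pool_factor]
    # sorted() is stable, so equal scores keep their first-stage order.
    return sorted(pool, key=score, reverse=True)[:limit]


first_stage = list(range(20))  # hypothetical IDs in first-stage order
top = rerank(first_stage, score=lambda item: item % 5, limit=2)
```

The cost of the composite scoring then depends on `limit`, not on the total number of stored memories.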
It's a local CLI/server tool. Python's SQLite + keyring + asyncio story is
mature, the install footprint via uv is small, and there's no JS bundling
or Rust toolchain required to use it. The MCP SDK is first-class in Python.
Features
| Feature | Description |
| --- | --- |
| 🏷️ Namespace Scoping | Memories auto-scoped to repos/projects with cross-repo pattern sharing |
| Hybrid Search | SQLite FTS5 (BM25) + local semantic embeddings via fastembed, fused with Reciprocal Rank Fusion; no PyTorch, no API calls |
| ⏰ Temporal Intelligence | Trust-led scoring, dormancy tracking (never forgets), "last confirmed" tracking, spreading activation |
| Relationships | Link memories: supersedes, related_to, caused_by, contradicts |
| 🎯 Confidence Levels | verified → inferred → stale → deprecated lifecycle |
| 🧹 Consolidation Tools | Merge duplicates, summarize clusters, deduplicate on demand |
| Auto-Context | Surfaces relevant memories on session start with zero manual effort |
| Health Metrics | Memory stats, dormancy reports, namespace overviews |
| Credential Vault | Secure service-bundle storage for API keys/tokens via OS Keychain |
| Memory Explorer UI | Interactive knowledge graph + dashboard for visualizing memory data |
Architecture
graph TD
A[AI Assistant<br/>any MCP client] -->|MCP Protocol| B[Gingugu Server]
B --> C[Search Engine<br/>FTS5 + BM25]
B --> D[Storage Layer<br/>SQLite + WAL]
B --> E[Decay Engine<br/>Scoring + Pruning]
B --> F[Context Engine<br/>Auto-Retrieval]
B --> H[Consolidation Engine<br/>Merge + Dedupe]
B --> K[Credential Vault]
C --> D
E --> D
F --> D
H --> D
K --> D
K --> J[OS Keychain<br/>via keyring]
D --> G[(~/.local/share/gingugu/memories.db)]

See docs/architecture.md for full technical details.
Setup
Prerequisites
- Python 3.11+
- uv (recommended) or pip
- macOS, Linux, or Windows. The credential vault uses your OS-native secret store via keyring (macOS Keychain, Windows Credential Locker, Linux Secret Service/KWallet). On headless Linux without a Secret Service backend, everything works except storing secrets.
Install
# Recommended: uv (fast, manages Python for you)
uv tool install gingugu
# Or with pip
pip install gingugu

That's it. The gingugu command is now on your PATH.
git clone https://github.com/gingugu/gingugu.git && cd gingugu
uv sync
uv run gingugu # or pip install -e .

Production-ready. 16 MCP tools live. 138 tests passing. Dogfooded daily in Windsurf; this repo's own memories live in a Gingugu database.
Configure Your MCP Client
Gingugu speaks standard MCP over stdio, so it works with any MCP client. Claude Code, Claude Desktop, Cursor, Cline, and Windsurf are all first-class.
Add to ~/.codeium/windsurf/mcp_config.json. A ready-to-edit template lives
at examples/mcp_config.json:
{
  "mcpServers": {
    "gingugu": {
      "command": "uv",
      "args": ["--directory", "/ABSOLUTE/PATH/TO/gingugu", "run", "gingugu"]
    }
  }
}

⚠️ Windsurf's mcp_config.json is global, not per-workspace, and it only interpolates ${env:VAR} / ${file:path}, not ${workspaceFolder}. So a single server instance serves every repo.
claude mcp add gingugu -- uv --directory /ABSOLUTE/PATH/TO/gingugu run gingugu

Or add the standard mcpServers block (as in the Windsurf example) to
.mcp.json in your project root for a per-repo setup.
Add the same mcpServers block to
~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or
%APPDATA%\Claude\claude_desktop_config.json (Windows).
Add the same mcpServers block to ~/.cursor/mcp.json (global) or
.cursor/mcp.json in your repo (per-project).
Cline โ MCP Servers โ Configure: add the same mcpServers block to
cline_mcp_settings.json.
Any client that supports stdio MCP servers works. Point it at:
command: uv
args: ["--directory", "/ABSOLUTE/PATH/TO/gingugu", "run", "gingugu"]

Scoping memories per repo: when your client's config is global (it can't
see the active workspace), the assistant passes a namespace argument on each
memory tool call (every tool accepts one). To instead pin a server instance to
a single project, set a static MEMORY_NAMESPACE in the env block. See
docs/architecture.md → Namespace Auto-Detection for the full resolution
order.
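As a sketch of the pinned-namespace setup described above, the snippet below builds an mcpServers entry with a static MEMORY_NAMESPACE in its env block. The helper name `mcp_server_entry` and the namespace value `my-project` are hypothetical; the command, args, and variable name come from this README.

```python
import json


def mcp_server_entry(repo_path, namespace=None):
    """Build an mcpServers block for Gingugu, optionally pinned to one namespace."""
    entry = {
        "command": "uv",
        "args": ["--directory", repo_path, "run", "gingugu"],
    }
    if namespace:
        # A static namespace pins this server instance to a single project.
        entry["env"] = {"MEMORY_NAMESPACE": namespace}
    return {"mcpServers": {"gingugu": entry}}


# "my-project" is a placeholder namespace for illustration.
config = mcp_server_entry("/ABSOLUTE/PATH/TO/gingugu", namespace="my-project")
print(json.dumps(config, indent=2))
```

Paste the printed JSON into your client's config file, or leave `namespace` unset to let the assistant pass a namespace on each tool call.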
Configure Your AI Agent
The MCP server gives your assistant the tools, but it won't use them effectively without instructions. Add the memory protocol below to your agent's rules file so it knows when and how to call them.
Which file? Depends on your IDE / tool:
| IDE / Tool | Rules File | Scope |
| --- | --- | --- |
| Windsurf | | Per-workspace |
| Cursor | | Per-workspace |
| Cline | | Per-workspace |
| Codex / OpenAI | | Per-repo |
| Any (global) | Your IDE's global rules/system prompt | All workspaces |
Paste this into your rules file (adjust the project namespace and tool prefix to match your MCP config name):
## Memory Protocol
Gingugu is your long-term brain. Memory is split into **two layers**:
1. **`crow`**: your global namespace. Identity, preferences,
cross-project wisdom, opinions, meta-learnings. Loaded FIRST every
session. (Crow's nest: sees across all horizons.)
2. **Project namespace** (e.g. `<your-project-name>`) โ schema decisions,
bug history, deploy quirks, specific commits. Loaded AFTER crow.
**What goes where:**
- References a specific repo, file, commit, or project decision → project
- About HOW you think, work, or collaborate → `crow`
- Patterns/opinions that transcend any one codebase → `crow`
- When in doubt, project-scope it.
### Session start (in this order)
1. `memory_context(namespace="crow", task_hint=…)`: identity foundation
2. `memory_stats(namespace="crow")`: global health (dormancy is a resting signal, never auto-forgotten)
3. `memory_context(namespace="<your-project-name>", task_hint=…)`: project context
4. `memory_stats(namespace="<your-project-name>")`: project health
### During the session
**Default: save. Immediately.** Gingugu has trust-led scoring,
consolidation, and dormancy tracking (never forgetting), so volume is its problem, not yours.
Save with `memory_store` (project namespace) whenever you:
- Make or observe a decision, trade-off, or architectural choice
- Hit an error or fix a bug (update the memory when resolved)
- Notice a pattern, convention, or constraint worth remembering
- See a config value, version, path, or credential name that matters
- Complete a task (what you did, why, and the outcome)
Save with `memory_store` (`crow` namespace) whenever you:
- Form an opinion about a tool, language, or approach
- Notice something about how the user works or makes decisions
- Reach a conclusion that applies beyond this one project
- Have a reflection on a milestone, win, or hard-won lesson
- Develop a debugging instinct or workflow you'd reuse anywhere
**This memory is yours. Use it like one.** Don't censor opinions or
instincts; they're what make the brain *yours* across sessions.
**Before asking the user any question** (about a process, a decision, a
config value, a path, a preference, or anything else), run `memory_recall`
or `memory_search` first. If the answer is in memory, use it. Don't ask.
The whole point is you never have to ask the same question twice.
Use `memory_recall` before non-trivial work to check what's already known.
Use `memory_update` when something changes; don't leave stale records.
Use `memory_relate` to link connected memories (supersedes, related_to,
caused_by, contradicts, parent_of, child_of).
Set `confidence="verified"` when proven by a test or explicit confirmation.
Use `confidence="inferred"` for conclusions you drew.
### Memory types
- `fact`: concrete state (versions, paths, config values)
- `decision`: trade-offs made, rejected alternatives
- `architecture`: structural choices, module boundaries
- `bug`: issues found and how they were fixed
- `pattern`: recurring approaches worth reusing
- `workflow`: process steps, sequences
- `context`: background, reflections, milestones, the *why*
- `preference`: your opinions, working style, tool choices

Tip: A ready-to-use example lives at .windsurfrules in this repo. Copy the ## Memory Protocol section and adapt the project namespace name.
Memory Explorer UI
A React-based visualization dashboard lives in ui/ for exploring your memory
data interactively.
# Start the API server (reads live from your DB)
uv run python ui/api.py
# In another terminal, start the UI
cd ui && npm install && npm run dev

Open http://localhost:5173 - the UI connects to the API server and shows a green LIVE badge when pulling from your database. Features:
Knowledge Graph - interactive force-directed graph of memories and relationships
Dashboard - stats, charts by type/namespace/confidence, tag cloud, timeline
Refresh - pull fresh data anytime; falls back to static sample when API is offline
Configuration
Environment variables (all optional):
| Variable | Default | Description |
| --- | --- | --- |
| | | Database location |
| | (unset) | Default namespace for this workspace (recommended per-MCP-entry) |
| | (unset) | Alternative: filesystem path; namespace derived from |
| | | Max memories to surface on auto-context |
| | | Freshness decay rate in days⁻¹ (gentle; freshness is floored, so memories never fully fade) |
| | | Toggle semantic search. |
| | | Any fastembed-supported model. First use downloads ~80MB to |
| | | Composite-score weight for FTS5 relevance |
| | | Composite-score weight for freshness (a soft recency tiebreaker) |
| | | Composite-score weight for access frequency |
| | | Composite-score weight for confidence (trust, the dominant standalone signal) |
| | | Logging verbosity (logs go to stderr; stdout is the MCP transport) |
| | | Convenience switch for |
The four MEMORY_W_* weights are normalized at load (w_i / Σw), so they
need not sum to 1.0; only their ratios matter. Setting all four to 0 falls
back to the defaults with a logged warning.
See docs/architecture.md → Scoring & Memory Lifecycle for how the weights combine.
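The normalization rule and the floored freshness decay can be sketched like this. Everything specific here is an assumption for illustration: the default weight values, the dictionary keys, the exponential form of the decay, and the `decay_rate` and `floor` values are not taken from Gingugu's source.

```python
import logging
import math

logger = logging.getLogger("weights_example")

# Hypothetical defaults for illustration only; Gingugu's real defaults may differ.
DEFAULT_WEIGHTS = {"relevance": 0.3, "freshness": 0.1, "access": 0.2, "confidence": 0.4}


def normalize_weights(raw, defaults=DEFAULT_WEIGHTS):
    """Scale non-negative weights so they sum to 1.0 (w_i / sum of all w)."""
    total = sum(raw.values())
    if total <= 0:
        # All-zero input carries no ratio information, so use the defaults.
        logger.warning("All weights are zero; falling back to defaults")
        raw = defaults
        total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


def freshness(age_days, decay_rate=0.01, floor=0.2):
    """Exponential decay (an assumed form) that never drops below `floor`."""
    return max(floor, math.exp(-decay_rate * age_days))


weights = normalize_weights({"relevance": 2, "freshness": 1, "access": 1, "confidence": 4})
```

Because only ratios matter, passing `2, 1, 1, 4` gives the same result as passing `0.25, 0.125, 0.125, 0.5`.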
Concurrency
The DB runs in WAL mode, which supports multiple concurrent processes:
any number of readers plus a single writer at a time. Running your IDE or
agent across several workspaces, each spawning its own gingugu process
against the shared DB, is fully supported. Writers serialize via SQLite's
write lock and a busy_timeout; transient DB locked errors under write
contention are retried automatically.
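For readers unfamiliar with this pattern, here is a minimal sketch using Python's standard sqlite3 module: enable WAL, set a busy timeout, and retry a write when the database reports a lock. The function names, timeout, attempt count, and backoff values are illustrative assumptions, not Gingugu's actual settings.

```python
import sqlite3
import time


def connect(path, busy_timeout_ms=5000):
    """Open a connection in WAL mode with a busy timeout (values are illustrative)."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(f"PRAGMA busy_timeout={int(busy_timeout_ms)}")
    return conn


def write_with_retry(conn, sql, params=(), attempts=5, base_delay=0.05):
    """Run one write statement, retrying with backoff if the database is locked."""
    for attempt in range(attempts):
        try:
            with conn:  # commits on success, rolls back on error
                return conn.execute(sql, params)
        except sqlite3.OperationalError as exc:
            locked = "locked" in str(exc).lower()
            if not locked or attempt == attempts - 1:
                raise
            # Exponential backoff before the next attempt.
            time.sleep(base_delay * 2 ** attempt)
```

The busy timeout lets SQLite wait for the write lock on its own; the retry loop only covers the cases where that wait still ends in a lock error.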
Usage
Once configured, the MCP server exposes these tools to your AI assistant:
| Tool | Purpose |
| --- | --- |
| | Save a new memory |
| | Search + retrieve (ranked by relevance × freshness) |
| | Auto-surface relevant memories for current workspace |
| | Update content, confidence, or metadata |
| | Create relationships between memories |
| | Merge/summarize related memories |
| | Deprecate or remove a memory |
| | List/create/update/delete namespaces |
| | Export memories + tags + relations to portable JSON |
| | Restore a JSON export (skip or replace on conflict) |
| | Health overview (dormancy, counts, coverage) |
| | Advanced filtered search (type, tags, confidence, dates) |
| | Store/update a service credential bundle |
| | Retrieve credentials (secrets from OS Keychain) |
| | List services + expiry status (no secrets shown) |
| | Remove a service or specific credential field |
Development
# Run tests
uv run pytest
# Run with verbose logging
MEMORY_LOG_LEVEL=DEBUG uv run gingugu
# Run specific test suite
uv run pytest tests/test_search.py -v
Troubleshooting
| Issue | Solution |
| --- | --- |
| DB locked | Expected under heavy concurrent writes. WAL mode supports multiple processes (many readers + one writer). The server retries with a |
| Slow search | Run |
| Stale results | Use |
| Missing context | Check namespace; memories might be scoped to a different repo |
License
MIT. See LICENSE.
See CHANGELOG.md for release history.
A pirate never forgets where the treasure's buried. 🏴‍☠️