Thoth-Mem
Persistent memory for AI coding agents
Give your AI coding agent a brain that survives across sessions, compactions, and context resets.
Thoth-Mem is an MCP server with an optional HTTP REST API that stores what your agent learns — architecture decisions, bug fixes, patterns, preferences — in a local SQLite database with full-text search. When a new session starts, the agent picks up right where it left off.
  Agent Session 1                     Agent Session 2
┌──────────────────┐              ┌───────────────────┐
│ discovers auth   │──── save ───▶│ recalls auth      │
│ uses JWT+refresh │              │ pattern instantly │
│ fixes edge case  │──── save ───▶│ avoids same bug   │
└──────────────────┘              └───────────────────┘
         │                                  ▲
         └─────── thoth.db (SQLite) ────────┘
Features
6 compact MCP tools — workflow-level tools instead of one tool per internal view
Dashboard v2 operations console served by the HTTP bridge at /, with OpenAPI docs preserved at /docs
CLI + MCP dual mode: use as a server or directly from the terminal
SQLite + FTS5 full-text search (fast, zero external dependencies)
Git-friendly sync — export memory as gzipped chunks for version control
JSON export/import — portable memory backup and transfer
Project migration — rename projects across all entities in one operation
Knowledge Graph Ledger rebuild — backfill derived KG/ledger facts for existing memories; legacy graph endpoint compatibility is preserved
MCP Server Instructions — built-in protocol guidance for connected agents
Observation versioning — full history preserved on topic_key upserts
Session enrichment — sessions auto-fill missing project/directory on reconnect
Normalized deduplication — whitespace/formatting-insensitive duplicate detection
Strict type taxonomy — observation types enforced at the database level
Paginated retrieval — large observations served in chunks via offset/max_length
Privacy defense: <private> tags stripped before storage
Token-efficient recall: compact fused evidence first, context expansion only when needed
Retrieval and KG eval baselines — deterministic hybrid retrieval and graph-quality benchmarks (lexical, semantic raw/HyDE, KG, compression, lineage, forbidden triples, optional LLM KG acceptance)
Agent-first MCP tools — recall, save, context, project navigation, session lifecycle, and full-content fetch
Admin tools via CLI & HTTP — export, import, sync, and migration available without cluttering the MCP tool surface
Operation trace logging — MCP and HTTP calls persist sanitized request/response traces for dashboard inspection
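Two of the features above, privacy stripping and normalized deduplication, can be illustrated with a minimal Python sketch. This is not Thoth-Mem's implementation: the closing `</private>` tag form, the helper names `strip_private` and `dedup_key`, and the choice of SHA-256 over whitespace-collapsed text are all assumptions made for illustration.

```python
import hashlib
import re

# Assumption: private spans are wrapped as <private>...</private>.
PRIVATE_SPAN = re.compile(r"<private>.*?</private>", re.DOTALL | re.IGNORECASE)


def strip_private(text: str) -> str:
    """Remove every <private>...</private> span before the text is stored."""
    return PRIVATE_SPAN.sub("", text)


def dedup_key(text: str) -> str:
    """Hash a whitespace-normalized form of the text (hypothetical scheme)."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


raw = "Auth uses JWT. <private>token=abc123</private> Refresh every 15 min."
stored = strip_private(raw)
```

Two saves whose content differs only in spacing or line breaks produce the same `dedup_key`, which is the property a formatting-insensitive duplicate check needs.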
Quick Start
# Run the MCP server + HTTP bridge directly (no install needed)
npx -y thoth-mem@latest mcp
# Or install globally
pnpm add -g thoth-mem
thoth-mem mcp
Requires Node.js >= 18.
For backwards compatibility, running thoth-mem with no subcommand also starts the MCP server and HTTP bridge. New MCP client configs should prefer the explicit mcp subcommand.
MCP Configuration
Claude Code
claude mcp add thoth-mem -- npx -y thoth-mem@latest mcp
OpenCode
Add to ~/.config/opencode/config.json:
{
  "mcp": {
    "thoth": {
      "type": "local",
      "command": [
        "npx",
        "-y",
        "thoth-mem@latest",
        "mcp"
      ]
    }
  }
}
Gemini CLI
Add to ~/.gemini/settings.json:
{
  "mcpServers": {
    "thoth": {
      "command": "npx",
      "args": ["-y", "thoth-mem@latest", "mcp"]
    }
  }
}
CLI Commands
Thoth-Mem also works as a standalone CLI. When no subcommand is given, it starts the MCP server (and HTTP bridge by default).
thoth-mem # Start MCP server + HTTP bridge (default)
thoth-mem mcp # Start MCP server (explicit)
thoth-mem mcp --no-http # Start MCP server without HTTP bridge
thoth-mem search <query> # Search memories
thoth-mem save <title> <content> # Save a memory
thoth-mem timeline <observation_id> # Chronological context around an observation
thoth-mem context # Recent session context
thoth-mem stats # Memory statistics
thoth-mem export [file] # Export to JSON (stdout if no file)
thoth-mem import <file> # Import from JSON
thoth-mem sync [--sync-dir=<path>] # Git sync export
thoth-mem sync-import [--sync-dir=<path>] # Git sync import from another instance
thoth-mem migrate-project <old> <new> # Rename a project across all entities
thoth-mem delete-project <project> # Delete a project and its related data
thoth-mem rebuild-graph --project <name> # Rebuild graph facts for one project
thoth-mem rebuild-graph --all # Rebuild graph facts for every project
thoth-mem rebuild-index --project <name> # Queue semantic index rebuild for one project
thoth-mem rebuild-index --all # Queue semantic index rebuild for all projects
thoth-mem rebuild-index --all --process 500 # Queue and process up to 500 jobs
thoth-mem rebuild-index --status # Show semantic index queue/coverage progress
thoth-mem version # Show version
thoth-mem help # Show help
Common CLI path/filter flags:
thoth-mem stats --data-dir=/custom/path
thoth-mem search "auth pattern" -p my-project
--no-http is a server startup flag, so use it with MCP/server mode:
thoth-mem mcp --no-http
HTTP REST API
Thoth-Mem runs an HTTP REST API bridge alongside the MCP server by default. The bridge listens on port 7438, serves Dashboard v2 at / when dashboard assets are built, and provides full access to memory operations via standard HTTP.
Local dashboard:
Dashboard: http://localhost:7438/
Build assets locally with pnpm run dashboard:build during development or release packaging.
If dist/dashboard/index.html is missing, / returns a clear local build message while /docs, /openapi.json, and REST APIs remain available.
Dashboard v2 is a local operations console for retrieval lanes, operation traces, indexing/background state, graph exploration, and HTTP/CLI-equivalent commands.
Dashboard deep links live under /console/* so API routes such as /operations and /graph/rebuild are never shadowed.
The console can create observations and queue rebuild operations through the same REST contracts documented in OpenAPI.

Memory universe D3 graph view with clustered nodes and relationship edges.
Interactive Documentation:
OpenAPI spec: http://localhost:7438/openapi.json
Interactive docs: http://localhost:7438/docs
Disable the HTTP bridge:
thoth-mem mcp --no-http
# or
THOTH_HTTP_DISABLED=true thoth-mem mcp
Example: Search memories via HTTP
curl "http://localhost:7438/observations/search?query=auth+pattern"
Example: Get memory statistics
curl http://localhost:7438/stats
Example: Inspect operation traces
curl "http://localhost:7438/operation-traces?origin=http&status=error&limit=20"
Example: Queue an index rebuild
curl -X POST http://localhost:7438/index/rebuild \
  -H "content-type: application/json" \
  -d '{"project":"my-project","reason":"manual","process_limit":0}'
The HTTP API supports sessions, observations, prompts, search, export/import, sync, operation traces, index status/rebuild, graph rebuild, operation catalog, and version inspection. See the interactive /docs interface for the full API reference.
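The search call above can also be made from a script. The following Python sketch uses only the standard library and the `/observations/search` route and `query` parameter shown in the curl example. The helper names `search_url` and `search` are hypothetical, and the JSON response body is an assumption; check /docs for the actual response schema.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:7438"  # default bridge port from this README


def search_url(query: str, base_url: str = BASE_URL) -> str:
    """Build the search URL with the query safely percent-encoded."""
    params = urllib.parse.urlencode({"query": query})
    return f"{base_url}/observations/search?{params}"


def search(query: str, base_url: str = BASE_URL, timeout: float = 5.0):
    """Call the search endpoint; requires a running Thoth-Mem HTTP bridge."""
    with urllib.request.urlopen(search_url(query, base_url), timeout=timeout) as resp:
        # Assumption: the endpoint answers with a JSON body.
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL, so it runs without a server.
    print(search_url("auth pattern"))
```

Building the URL through `urlencode` avoids the shell-quoting and escaping problems that come up when a query contains spaces or `&`.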
Development Checks
pnpm run build
pnpm run dashboard:typecheck
pnpm run dashboard:build
pnpm test
pnpm run eval:retrieval
pnpm run eval:kg
pnpm run eval:retrieval runs a deterministic in-memory hybrid retrieval eval against seeded observations, curated non-synthetic project-documentation examples, and synthetic distractors. It reports hybrid recall under noise, corpus size, direct vs rephrased vs non-synthetic case mix, measured surgical compression, HyDE lift, pending/degraded fallback, lexical prefix behavior, semantic raw vs HyDE contribution, sentence-first small-to-big promotion, KG enrichment, KG-as-primary lane rate, and evidence lineage coverage without requiring model downloads or remote APIs. The default gate requires every eval case to land at rank 1.
Scale the retrieval eval with THOTH_RETRIEVAL_EVAL_NOISE when you want hundreds or thousands of synthetic distractors. In PowerShell:
$env:THOTH_RETRIEVAL_EVAL_NOISE='250'; pnpm run eval:retrieval
pnpm run eval:kg runs a deterministic KG quality eval for subject-relation-object extraction. It reports expected triple recall, forbidden triple rate, long-conversation cases where deterministic extraction should be paired with optional LLM enrichment, and acceptance of validated LLM triples while rejecting unknown relations.
MCP Tools (6)
Tool | Purpose |
mem_save | Save observations, prompts, session summaries, or passive learnings |
mem_recall | Primary fused hybrid recall across semantic, KG, and lexical lanes |
| Get recent context: sessions, prompts, observations, stats |
mem_get | Retrieve an observation or prompt by ID, with optional timeline or pagination |
mem_project | List projects, summarize one project, inspect graph facts/topics |
mem_session | Start, checkpoint, or summarize a memory session |
Current tool/action map:
Tool | Current shape |
| Recent sessions/prompts/observations, optionally with |
Legacy one-tool-per-view names are intentionally obsolete and are not registered. Use mem_recall instead of mem_search, mem_get instead of mem_get_observation or mem_timeline, mem_project instead of mem_project_summary, mem_project_graph, or mem_topic_keys, mem_session instead of mem_session_start or mem_session_summary, and mem_save(kind="prompt") instead of mem_save_prompt.
Admin operations (export, import, sync, sync-import, migrate-project, delete-project, rebuild-graph, rebuild-index) are available via the CLI. Export, import, sync, migration, operation traces, index status/rebuild, graph rebuild, operation catalog, and version inspection are also available through the HTTP REST API. They are not registered as MCP tools to keep the agent's tool surface lean.
Retrieval and Embeddings
mem_recall is the primary retrieval tool. Use mode=compact first, then mode=context for the strongest hits, and mem_get only when full content is needed.
mem_recall accepts precision filters for project, session_id, scope, topic_key, type, time_from, and time_to; these pass through to all retrieval lanes.
Hybrid retrieval defaults use tuned core lane fusion: sentence top-k 100, chunk top-k 20, lexical limit 20, min semantic score 0.3, and lane order sentence > kg > chunk > lexical. Knowledge-graph facts participate as a first-class ranking lane and also enrich returned hits with supporting graph evidence.
Lexical ranking filters low-signal query stopwords and scores prefix matches by content-term coverage, so a broad one-word overlap cannot outrank stronger semantic/KG evidence under noisy corpora.
Surgical trimming is explicit in mem_recall mode=context: sentence hits return a primary_sentence and, when the score clears the small-to-big threshold, a labeled surrounding_parent_chunk. Lexical hits return matching sentences instead of whole observations. Each context hit includes retrieval_contract, compression_ratio, evidence_chars, and full_chars so noise reduction is measured rather than claimed.
Semantic indexing is eventual and non-blocking. Save/update operations can return while indexing stays pending in the background. Terminal job failures keep last_error and finished_at, stale running leases are recoverable, and later queued jobs continue processing instead of being starved by failed work.
Semantic lane state is reconciled from queue health plus vector coverage on startup/status reads, so completed lanes do not remain stuck in pending after a restart when queue and coverage are clean.
/viz/health, /observatory/health, and /index/status include product telemetry for semantic lanes, job totals, job-kind breakdowns, queue lag, vector coverage ratios, and recent indexing/KG warnings. Optional KG LLM failures are recorded as job telemetry while deterministic KG extraction still completes.
Automatic rebuild is triggered when the embedding configuration hash changes; manual rebuild is available through thoth-mem rebuild-index --project <name>, thoth-mem rebuild-index --all, and POST /index/rebuild. Use thoth-mem rebuild-index --status or GET /index/status to inspect queue progress, lane state, recent errors, and vector coverage.
When semantic lanes are pending or unavailable, retrieval degrades safely to lexical recall with graph enrichment where matching facts exist, and reports fallback metadata (pending, degraded_fallback) instead of failing.
sqlite-vec is optional at runtime: if unavailable, Thoth-Mem marks semantic lanes degraded and continues serving lexical retrieval with KG enrichment.
Local embeddings default to provider transformers_local and model nomic-ai/nomic-embed-text-v1.5 unless overridden.
HyDE is enabled by default. The local fallback uses Transformers.js text generation with onnx-community/Qwen2.5-Coder-0.5B-Instruct; remote HyDE can use Ollama or an OpenAI-compatible LM Studio server.
KG extraction is deterministic-first. Optional LLM enrichment can be enabled for long conversations with Ollama or LM Studio; generated triples are filtered through the same relation taxonomy and merged with deterministic triples. If the remote extractor is disabled or unavailable, deterministic KG extraction still completes.
Recommended Embedding Models
Model choice affects vector dimensions, quality, memory use, and index compatibility. Keep the same provider/model/dimensions for an existing semantic index; changing them marks embeddings stale and queues a rebuild.
Use case | Ollama model | LM Studio model to look for | Notes |
Lightweight local default | Good first choice for local RAG. Small download, mature support, 768-dimensional embeddings in the upstream model card. | ||
Strong general retrieval | Good quality/performance balance for English and technical notes. Use the exact loaded model id shown by LM Studio. | ||
Multilingual / Spanish-heavy memory | Strong multilingual option. The upstream model card highlights 100+ languages and inputs up to 8192 tokens. | ||
Higher-quality modern option | Better multilingual/code retrieval when you can spend more RAM/CPU than |
Ollama example:
ollama pull bge-m3
THOTH_EMBEDDING_PROVIDER=ollama \
THOTH_EMBEDDING_BASE_URL=http://127.0.0.1:11434 \
THOTH_EMBEDDING_MODEL=bge-m3 \
thoth-mem
LM Studio example:
# In LM Studio, load an embedding-capable model and start the local server.
# Use the exact model id shown by LM Studio for THOTH_EMBEDDING_MODEL.
THOTH_EMBEDDING_PROVIDER=lmstudio \
THOTH_EMBEDDING_BASE_URL=http://127.0.0.1:1234 \
THOTH_EMBEDDING_MODEL=nomic-embed-text-v1.5 \
thoth-mem
THOTH_EMBEDDING_DIMENSIONS is inferred for known models such as nomic-ai/nomic-embed-text-v1.5 and nomic-embed-text. Set it explicitly when using a custom model or when the selected runtime supports a stable dimension override and you want to force a specific sqlite-vec table shape.
Recommended HyDE Models
HyDE needs a generative/instruct model, not an embedding model. It writes a short hypothetical answer to the recall query; Thoth-Mem embeds both the raw query and the HyDE answer as separate semantic inputs.
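The dual-input idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Thoth-Mem's code: `semantic_inputs` is a hypothetical helper, `generate` stands in for whichever HyDE provider is configured, and a failed or timed-out generation is modeled as a raised `TimeoutError` or `ConnectionError`.

```python
from typing import Callable, List


def semantic_inputs(query: str, generate: Callable[[str], str]) -> List[str]:
    """Return the texts to embed: the raw query plus a HyDE answer if available."""
    inputs = [query]
    try:
        hypothetical = generate(query).strip()
    except (TimeoutError, ConnectionError):
        # Generation failed or timed out: fall back to the raw query only.
        return inputs
    if hypothetical:
        inputs.append(hypothetical)
    return inputs
```

Each returned string would then be embedded separately, so a slow or unavailable generator costs recall quality but never blocks the raw-query search.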
Use case | Provider/model | Notes |
Default local fallback |
| Small ONNX model for local text generation. Good enough for short retrieval hints and code-heavy memories; loaded with |
Ollama code-heavy memory | Recommended 7B-class local model for coding-agent memory and technical HyDE prompts. | |
Ollama general/multilingual | Better general-purpose choice when memory is not mostly code. | |
LM Studio code-heavy memory | In LM Studio, use the exact model id shown in the Developer panel, often from an | |
LM Studio general fallback | Strong general 8B-class option if you already have Llama available locally. |
Example with LM Studio embeddings and LM Studio HyDE:
{
  "embedding": {
    "provider": "lmstudio",
    "model": "text-embedding-nomic-embed-text-v1.5@q8_0",
    "baseUrl": "http://127.0.0.1:1234",
    "dimensions": 768
  },
  "hyde": {
    "enabled": true,
    "provider": "lmstudio",
    "model": "loaded_model",
    "baseUrl": "http://127.0.0.1:1234/v1",
    "timeoutMs": 4000
  }
}
Optional KG LLM Enrichment
KG extraction defaults to the deterministic extractor. To enrich long observations, enable the local Transformers.js provider or a remote local model provider and set the minimum content length that should trigger the LLM pass:
THOTH_KG_LLM_ENABLED=true \
THOTH_KG_LLM_PROVIDER=transformers_local \
THOTH_KG_LLM_MODEL=onnx-community/Qwen2.5-Coder-0.5B-Instruct \
THOTH_KG_LLM_MIN_CONTENT_CHARS=12000 \
thoth-mem
Ollama remains available for remote local generation:
THOTH_KG_LLM_ENABLED=true \
THOTH_KG_LLM_PROVIDER=ollama \
THOTH_KG_LLM_BASE_URL=http://127.0.0.1:11434 \
THOTH_KG_LLM_MODEL=qwen2.5:7b-instruct \
THOTH_KG_LLM_MIN_CONTENT_CHARS=12000 \
thoth-mem
LM Studio uses the OpenAI-compatible chat completions endpoint:
THOTH_KG_LLM_ENABLED=true \
THOTH_KG_LLM_PROVIDER=lmstudio \
THOTH_KG_LLM_BASE_URL=http://127.0.0.1:1234/v1 \
THOTH_KG_LLM_MODEL=loaded_model \
thoth-mem
The LLM pass is an enrichment step, not the source of truth: invalid relation names are discarded, duplicate triples are deduped, and KG jobs continue with deterministic triples if the remote request fails.
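The filter-and-merge behavior described here can be sketched as follows. The relation names in `ALLOWED_RELATIONS` are hypothetical placeholders (the real taxonomy is not listed in this README), `merge_triples` is an illustrative helper, and the case-insensitive duplicate check is an assumption.

```python
from typing import List, Tuple

# Hypothetical relation taxonomy; the real relation names are not listed here.
ALLOWED_RELATIONS = {"uses", "depends_on", "fixes"}

Triple = Tuple[str, str, str]  # (subject, relation, object)


def merge_triples(deterministic: List[Triple], llm: List[Triple]) -> List[Triple]:
    """Merge LLM triples into deterministic ones, dropping invalid and duplicate rows."""
    merged: List[Triple] = []
    seen = set()
    for subject, relation, obj in [*deterministic, *llm]:
        if relation not in ALLOWED_RELATIONS:
            continue  # unknown relation names are discarded
        key = (subject.lower(), relation, obj.lower())  # assumed dedup key
        if key in seen:
            continue
        seen.add(key)
        merged.append((subject, relation, obj))
    return merged
```

Because deterministic triples are processed first, they survive unchanged when the LLM list is empty, which is the case when the remote request fails or enrichment is disabled.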
Sync & Portability
JSON Export/Import
Full memory backup in a single JSON file:
# Export everything
thoth-mem export backup.json
# Export one project
thoth-mem export --project=my-app backup.json
# Import (duplicates are skipped via sync_id)
thoth-mem import backup.json
Git Sync
Incremental, append-only gzipped chunks designed for version control — no merge conflicts:
# Export a chunk to the sync directory
thoth-mem sync --sync-dir=.thoth-sync
# Structure created:
# .thoth-sync/
#   manifest.json              ← ordered chunk list
#   chunks/
#     <timestamp>.json.gz      ← compressed memory chunk
Import on another machine:
thoth-mem sync-import --sync-dir=.thoth-sync
Each observation and prompt carries a sync_id (UUID) that prevents duplicates on re-import.
Incremental exports: Only changes since the last sync are exported, tracked via mutation journal for efficiency.
Tombstones: Deleted observations propagate correctly across synced instances, ensuring consistency.
Replay safety: Re-importing the same data is safe; duplicates are detected and skipped automatically via sync_id.
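Replay safety through a unique sync_id can be demonstrated with Python's built-in sqlite3 module. The table layout and the `import_chunk` helper are hypothetical and much simpler than Thoth-Mem's real schema; only the idea of a UNIQUE sync_id combined with `INSERT OR IGNORE` is being shown.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
# Hypothetical table; only the UNIQUE sync_id column matters for the idea.
conn.execute(
    "CREATE TABLE observations ("
    "id INTEGER PRIMARY KEY, sync_id TEXT UNIQUE NOT NULL, title TEXT NOT NULL)"
)


def count_rows() -> int:
    return conn.execute("SELECT COUNT(*) FROM observations").fetchone()[0]


def import_chunk(rows: list) -> int:
    """Insert rows, skipping any sync_id already present; return rows added."""
    before = count_rows()
    with conn:  # one transaction per chunk
        for row in rows:
            conn.execute(
                "INSERT OR IGNORE INTO observations (sync_id, title) VALUES (?, ?)",
                (row["sync_id"], row["title"]),
            )
    return count_rows() - before


chunk = [{"sync_id": str(uuid.uuid4()), "title": "JWT refresh edge case"}]
first = import_chunk(chunk)
second = import_chunk(chunk)  # replaying the same chunk adds nothing
```

Letting the database enforce uniqueness keeps the import idempotent even if the same chunk is applied from several machines.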
Project Migration
Rename a project across every entity in one transaction:
thoth-mem migrate-project old-name new-name
Updates sessions, observations, and prompts atomically.
Project Deletion
Delete a project and its related data safely:
thoth-mem delete-project project-name
This runs as a transaction, blocks deletion if shared sessions or data are detected in another project, and keeps sync tombstones consistent.
Configuration
On startup, Thoth-Mem creates ~/.thoth/config.json if it does not exist and backfills missing keys when the file is partial. The config starts with a public $schema URL served from the published package on unpkg; the schema source lives in this repo at config.schema.json. Environment variables override config file values at runtime, but they are not written back to the file.
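The precedence described above (defaults backfilled into a partial file, then environment overrides applied in memory only) can be sketched in Python. The helpers `backfill` and `effective_config` are hypothetical, `DEFAULTS` is a small excerpt of the default config, and parsing `"true"` as the enabling value for THOTH_HTTP_DISABLED is an assumption based on the usage shown earlier.

```python
import copy
import os

# Excerpt of the default config; the full set of keys is larger.
DEFAULTS = {
    "http": {"port": 7438, "disabled": False},
    "embedding": {
        "provider": "transformers_local",
        "model": "nomic-ai/nomic-embed-text-v1.5",
    },
}


def backfill(partial: dict, defaults: dict) -> dict:
    """Return a copy of `partial` with missing keys filled in from `defaults`."""
    merged = copy.deepcopy(partial)
    for key, value in defaults.items():
        if key not in merged:
            merged[key] = copy.deepcopy(value)
        elif isinstance(value, dict) and isinstance(merged[key], dict):
            merged[key] = backfill(merged[key], value)
    return merged


def effective_config(file_config: dict, env: dict) -> dict:
    """Apply environment overrides on top of the backfilled file config."""
    config = backfill(file_config, DEFAULTS)
    if "THOTH_HTTP_DISABLED" in env:
        # Assumption: the string "true" (any case) disables the bridge.
        config["http"]["disabled"] = env["THOTH_HTTP_DISABLED"].lower() == "true"
    if "THOTH_EMBEDDING_MODEL" in env:
        config["embedding"]["model"] = env["THOTH_EMBEDDING_MODEL"]
    return config


config = effective_config({"http": {"port": 9000}}, dict(os.environ))
```

The file dictionary is never mutated, which matches the stated rule that environment values are not written back to config.json.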
Default editable config:
{
  "$schema": "https://unpkg.com/thoth-mem@0.3.1/config.schema.json",
  "version": 1,
  "maxContentLength": 100000,
  "maxContextResults": 20,
  "maxSearchResults": 20,
  "dedupeWindowMinutes": 15,
  "previewLength": 300,
  "http": {
    "port": 7438,
    "disabled": false
  },
  "retrievalDefaults": {
    "sentenceTopK": 100,
    "chunkTopK": 20,
    "lexicalLimit": 20,
    "minSemanticScore": 0.3,
    "l2DistanceScale": 20
  },
  "embedding": {
    "provider": "transformers_local",
    "model": "nomic-ai/nomic-embed-text-v1.5",
    "baseUrl": null,
    "dimensions": 768
  },
  "hyde": {
    "enabled": true,
    "provider": "transformers_local",
    "model": "onnx-community/Qwen2.5-Coder-0.5B-Instruct",
    "baseUrl": null,
    "timeoutMs": 4000
  },
  "kgLlm": {
    "enabled": false,
    "provider": "transformers_local",
    "model": "onnx-community/Qwen2.5-Coder-0.5B-Instruct",
    "baseUrl": null,
    "timeoutMs": 8000,
    "minContentChars": 12000
  }
}
Environment Variable | Default | Description |
|
| Data directory for SQLite database |
|
| Max content length (warns, never truncates) |
|
| Max observations in context response |
|
| Max search results returned |
|
| Rolling deduplication window |
|
| Search result preview length |
|
| HTTP REST API port |
|
| Disable HTTP REST API bridge |
|
| Embedding provider ( |
|
| Embedding model id |
| provider-specific | Base URL for remote/local API providers |
| inferred for known models | Optional embedding dimensions override |
|
| Enable HyDE dual-input semantic query expansion |
|
| HyDE generation provider ( |
|
| HyDE generation model id |
| unset | Optional HyDE provider base URL |
|
| HyDE timeout before raw-query-only fallback |
|
| Enable optional LLM KG enrichment for long observations |
|
| KG LLM provider ( |
|
| KG LLM model id |
| unset | KG LLM provider base URL for remote providers |
|
| KG LLM timeout before deterministic-only fallback |
|
| Minimum observation size that triggers LLM enrichment |
Storage
All data lives in a single SQLite database at ~/.thoth/thoth.db (configurable via THOTH_DATA_DIR or --data-dir).
WAL journal mode for concurrent read performance
FTS5 full-text search over observations and prompts
Foreign keys + CHECK constraints for data integrity
Automatic schema migrations for seamless upgrades
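The combination of FTS5 search and a CHECK-enforced type taxonomy can be tried out with Python's sqlite3 module. The schema, the type names `decision` and `bugfix`, and the `save`/`search` helpers are hypothetical and only illustrate the technique; the sketch also assumes the local SQLite build includes FTS5, and it omits WAL mode because that does not apply to in-memory databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and type names, for illustration only.
conn.executescript(
    """
    CREATE TABLE observations (
        id INTEGER PRIMARY KEY,
        type TEXT NOT NULL CHECK (type IN ('decision', 'bugfix')),
        title TEXT NOT NULL,
        content TEXT NOT NULL
    );
    CREATE VIRTUAL TABLE observations_fts USING fts5(title, content);
    """
)


def save(obs_type: str, title: str, content: str) -> int:
    """Insert an observation and index it for full-text search."""
    with conn:  # commit both inserts together, or roll both back
        cur = conn.execute(
            "INSERT INTO observations (type, title, content) VALUES (?, ?, ?)",
            (obs_type, title, content),
        )
        conn.execute(
            "INSERT INTO observations_fts (rowid, title, content) VALUES (?, ?, ?)",
            (cur.lastrowid, title, content),
        )
    return cur.lastrowid


def search(query: str) -> list:
    """Return titles of matching observations, best match first."""
    rows = conn.execute(
        "SELECT title FROM observations_fts "
        "WHERE observations_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return [title for (title,) in rows]


save("bugfix", "JWT refresh edge case", "Refresh token expired during rotation")
save("decision", "Use SQLite", "Local storage with zero external dependencies")
```

An insert with a type outside the CHECK list raises `sqlite3.IntegrityError`, so an invalid observation type is rejected by the database itself instead of by application code.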
Observation Types
Observations are categorized with an enforced taxonomy:
Type | Use for |
| Architecture or design choices |
| System structure and patterns |
| Bug fixes and root causes |
| Established conventions |
| Configuration and environment setup |
| Non-obvious findings about the codebase |
| General learnings and gotchas |
| End-of-session summaries |
| Anything that doesn't fit above |
License
MIT