What can you do with this server?

M3 Memory is a persistent, bitemporal, hybrid-search knowledge base MCP server for AI agents to store, retrieve, and manage memories, chat logs, files, tasks, and agents across sessions.

Memory Management

* Store facts, preferences, and configurations with automatic contradiction detection and supersession (memory_write)
* Hybrid semantic (vector) + keyword (BM25/FTS5) search with filtering by scope, type, user, and time (memory_search)
* Retrieve by UUID or prefix (memory_get), or explicitly replace memories while retaining history (memory_supersede)
* Bitemporal as_of queries to reconstruct past system state

Chat Log Management

* Append chat turns with full provenance (agent, model, provider, cost) via chatlog_write
* Search logs by keyword, agent, model, provider, or time range (chatlog_search)
* Health/status summary of the chat subsystem (chatlog_status)

File Ingestion & Search

* Ingest, chunk, and index files into corpora; search with hybrid FTS5+vector (files_search)
* File-level summaries (files_index), corpus statistics (files_stats), integrity checks (files_health), and corpus listing (files_corpus_list)

Agent & Task Tracking

* List registered agents filtered by status/role (agent_list)
* List tasks filtered by state, owner, or parent (task_list)

Tool Discovery & Meta-Operations

* Browse the full tool catalog (m3_index), discover capabilities by domain or keyword (m3_help_capabilities)
* Dynamically load tool domains on demand (tools_load_domain); only ~18 core tools load at startup to minimize token usage
* Invoke any catalog tool by name without pre-loading its domain, including batch calls up to 100 (m3_call)

Key Features

* Hybrid retrieval: vector similarity, BM25, MMR diversity, and reranking
* Automatic contradiction resolution on write
* Multi-database/multi-tenant support via a database parameter
* Supports offline/air-gapped deployments with local embeddings; storage via SQLite or PostgreSQL/ChromaDB

Which integrations are available for this server?

* Ollama: optionally loads a small chat model for auto-classification, summarization, and consolidation of memories.
* PostgreSQL: syncs memory data across devices, so you can continue working on a different machine.
* SQLite: local databases are the primary storage for memories, chat logs, files, and the knowledge graph, ensuring sovereign data storage.

How do I use M3 Memory?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@M3 Memory save this conversation about project requirements".

That's it! The server will respond to your query, and you can continue using it as needed. Here is a step-by-step guide with screenshots.

🧠 M3 Memory

M3 treats agent memory as a distributed-systems infrastructure problem, not a simple retrieval feature.

Instead of every tool keeping its own throwaway context, M3 is a shared, evolving, bitemporal knowledge base that multiple heterogeneous agents and machines read and write. It is designed to solve a fundamental challenge: How do agents maintain a consistent, evolving, and temporal knowledge base over months and years?

Plugs into your stack — framework and database. M3 brings contradiction-aware, bitemporal, locally-embedded memory to the tools you already use, and scales from a zero-setup file to a shared server:
🖥️ Web dashboard — a built-in, backend-agnostic control panel (default http://127.0.0.1:8088): browse memory, explore the interactive knowledge graph, and watch system health / load. pip install m3-memory[dashboard] then m3 dashboard. (See Dashboard Guide)
🐘 PostgreSQL — run M3 on a first-class PostgreSQL primary backend (M3_DB_BACKEND=postgres) for a shared, server-hosted store, with cross-device sync to a PostgreSQL warehouse. SQLite stays the zero-infrastructure default. (See Architecture · Sync)
🦜 LangChain & LangGraph — drop-in Mem0 replacement (one-line import swap) and fully LangMem-compatible (store=M3Store()): pip install m3-memory[langchain]. (See LangChain Guide)
👥 CrewAI — a drop-in StorageBackend for CrewAI's unified memory: pip install m3-memory[crewai]. (See CrewAI Guide)
🧩 PydanticAI — wire M3 in as the agent's memory layer: pip install m3-memory[pydantic-ai]. (See PydanticAI Guide)
All four framework integrations (LangChain, LangGraph, CrewAI, and PydanticAI) gain automatic contradiction supersession, bitemporal historical queries, local sovereign embedding, and the full 100+ MCP tool set.

🚀 Quick Links & Badges

💡 Get Started Quickly:
🚀 5-Minute "Human-First" Guide
🖥️ OS Installation: Windows Setup · macOS Setup · Linux Setup

⚡ M3 at a Glance

Feature	Details
Works With	Claude Code · Gemini CLI · Aider · Google Antigravity · OpenCode · Hermes · LangChain/LangGraph · CrewAI · PydanticAI · Any MCP Agent
M3 Is	A persistent memory layer · An MCP server · A hybrid retrieval engine · A bitemporal knowledge base
M3 Is Not	An LLM · A chatbot · A plain vector database · A RAG framework · An IDE
Core Promise	Private, offline-capable, locally owned memory shared securely across all your developer tools — with FIPS 140-3-ready crypto and atomic multi-agent writes for regulated and multi-agent environments.
Retrieval Accuracy	State-of-the-art for a local-first substrate — 99.2% session-hit-rate @ k=10, 100% @ k=20 on LongMemEval-S (no oracle routing), with the correct session as the #1 result for ~92% of questions. See Benchmarks.
Context Efficiency	Exposes 100+ tools but occupies just ~1.8% of a 200K context window at startup — lazy domain-gating loads the rest on demand.
Maturity	Stable, battle-tested core engine (2,179 tests) that's safe to build on today; new features and integrations are added actively. SQLite by default; PostgreSQL as a first-class primary backend (`M3_DB_BACKEND=postgres`) via a pluggable SQL storage seam. (See features.json)

🧠 Memory Model at a Glance

M3 is a typed, bitemporal, confidence-scored, self-maintaining knowledge base. Every feature listed below is implemented natively (see Memory Model Details):

Structured Metadata: Every memory contains a type, source, confidence, scope, provenance (change_agent), and salience (importance, decay_rate).
Verbatim, Non-Destructive Storage: Memory content is stored exactly as written and never altered in place — the raw text is always retrievable byte-for-byte. Corrections don't overwrite: a superseded fact is closed (its validity interval ends) and the new fact is linked to it, so both the original wording and its full edit history stay queryable. You get true verbatim recall and an audit trail, not one or the other.
Bitemporal History: Distinguishes valid-time from transaction-time. Because superseded facts are closed rather than deleted, you can query what the agent believed at any specific point in time.
Contradiction Management: Conflicting facts are resolved automatically on write. The stale fact is marked as superseded, and confidence values are updated dynamically via Bayesian confidence posteriors.
Self-Maintaining Lifecycle: Implements memory decay, deduplication, automatic consolidation into higher-order beliefs, TTL expiry, and GDPR erasure.
Procedural Memory: A first-class procedure type (skill / runbook / how-to / checklist) that is auto-distilled from successful task runs — the background loop rolls up a completed task and its step/result memories into a reusable, step-by-step procedure, preserved with distills_from provenance back to its sources. A "how do I…" query surfaces it via a procedural retrieval boost.
Write-Gating & Content Safety: Filters out low-signal noise via an enrichment queue and content safety guardrails before storage.
Explainable Retrieval: Hybrid engine combining vector similarity, BM25 (FTS5), MMR diversity, and reranking. memory_suggest returns the exact score breakdown per result. (See Confidence and Trust Guide).
Proven Accuracy: On LongMemEval-S, M3 delivers state-of-the-art retrieval for a local-first substrate — 99.2% session-hit-rate @ k=10 and 100% @ k=20 (no oracle routing), with the correct session as the #1 result for ~92% of questions. End-to-end QA accuracy is 92.0% with no oracle metadata (see Benchmarking Report).
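
The non-destructive supersession described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not M3's schema or API: the `Fact` and `FactLog` names and all field names are hypothetical, and only the valid-time axis is modeled (transaction-time is left out for brevity).

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Fact:
    """Hypothetical record; field names are illustrative, not M3's schema."""
    key: str
    content: str
    valid_from: datetime
    valid_to: Optional[datetime] = None   # None means "still current"
    superseded_by: Optional[int] = None   # index of the replacing fact


class FactLog:
    """Append-only store: corrections close old facts instead of editing them."""

    def __init__(self) -> None:
        self.facts: list[Fact] = []

    def write(self, key: str, content: str, at: datetime) -> int:
        new_index = len(self.facts)
        # Close the validity interval of the current fact and link it forward.
        for fact in self.facts:
            if fact.key == key and fact.valid_to is None:
                fact.valid_to = at
                fact.superseded_by = new_index
        self.facts.append(Fact(key, content, valid_from=at))
        return new_index

    def as_of(self, key: str, at: datetime) -> Optional[str]:
        # Return whatever was valid for this key at the given instant.
        for fact in self.facts:
            if fact.key != key:
                continue
            if fact.valid_from <= at and (fact.valid_to is None or at < fact.valid_to):
                return fact.content
        return None


log = FactLog()
log.write("db.backend", "SQLite", datetime(2025, 1, 1))
log.write("db.backend", "PostgreSQL", datetime(2025, 6, 1))

past = log.as_of("db.backend", datetime(2025, 3, 1))
present = log.as_of("db.backend", datetime(2025, 7, 1))
print(past, present)
```

The original wording stays in `log.facts[0]` after the correction, which is what makes both verbatim recall and point-in-time queries possible in this model.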

📦 Installation

The One-Liner (macOS & Linux)

curl -fsSL https://raw.githubusercontent.com/skynetcmd/m3-memory/main/install.sh | bash

For Windows, please follow the Windows Manual Installation Guide.
To install manually on any platform, refer to the OS-Specific Install Instructions or examine the installer script.

Developer Setup Wizard

If you are working in a Python environment:

pip install m3-memory
m3 setup

The m3 setup wizard automatically scans your PATH for active agents (Claude Code, Gemini CLI, OpenCode, OpenClaw), installs settings files/hooks, provisions the sovereign CPU embedder, and performs a system diagnostic.

Integrating with AI Coding Tools

🤖 Claude Code

Install as a plugin to unlock /m3:* slash commands, curation subagents, and automatic hooks:

/plugin marketplace add skynetcmd/m3-memory
/plugin install m3@skynetcmd

See Claude Code Plugin Reference and Claude.ai Connector Guide.

🪐 Google Antigravity

Install the plugin directly:

agy plugin install https://github.com/skynetcmd/m3-memory

See Antigravity Plugin Reference.

🦊 Hermes Agent

Run the wizard to automatically wire up optimal memory providers:

m3 setup

See Hermes Plugin Integration Guide.

🐍 Python / LangChain & LangGraph

Use M3 as a drop-in Mem0 replacement or LangMem backend:

pip install m3-memory[langchain]

See LangChain Integration Guide.

👥 CrewAI (v1.x)

A drop-in StorageBackend for CrewAI's unified memory:

pip install m3-memory[crewai]   # crewai>=1.10,<2 · Python 3.10–3.13 (a 3.14 escape hatch is documented)

See CrewAI Integration Guide.

🧩 PydanticAI

Use M3 as drop-in tools plus auto-recall, or as a formal M3MemoryToolset. Built on Pydantic v2, so it runs natively on Python 3.14:

pip install m3-memory[pydantic-ai]   # pydantic-ai-slim>=2,<3

See PydanticAI Integration Guide.

Manual MCP Server Configuration

To expose M3 to any Model Context Protocol host, add it to your configuration file:

{
  "mcpServers": {
    "memory": {
      "command": "m3"
    }
  }
}
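
If the host already has a configuration file, the entry needs to be merged in rather than pasted over existing servers. The following is a minimal sketch of such a merge using only the Python standard library. The helper name `add_m3_server` is hypothetical, and the demo writes to a temporary file because the real config path depends on your MCP host.

```python
import json
import tempfile
from pathlib import Path


def add_m3_server(config_path: Path, name: str = "memory") -> dict:
    """Merge an M3 entry into an MCP host config, keeping existing servers."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8").strip()
        if text:
            config = json.loads(text)
    # Only the "mcpServers" -> name -> {"command": "m3"} shape comes from the README.
    config.setdefault("mcpServers", {})[name] = {"command": "m3"}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file rather than a real host config.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "settings.json"
    path.write_text(
        '{"mcpServers": {"other": {"command": "other-server"}}}',
        encoding="utf-8",
    )
    merged = add_m3_server(path)
    print(json.dumps(merged, indent=2))
```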

🎚️ Domain Gating: the Full Catalog Without the Context Cost

M3 gives you the full 100+ tool surface while occupying just 1.8% of a 200K context window at startup — most MCP servers make you pay for every tool in every prompt. Tools are grouped into 9 domains (memory, chatlog, files, entity, agent, tasks, conversations, diagnostics, admin) and loaded lazily.

Only the essential core set (~18, ~3,540 tokens) registers at startup. When your agent needs advanced functionality, it calls tools_load_domain(domain="...") to fetch the rest on demand — so a large catalog costs near-zero context until you actually use a domain.

Gating Mode	Registered Tools	Tokens in Schema	% of 200K Window
Lazy (Default)	~18	~3,540	1.8%
Typical Active Session	64	~17,975	9.0%
Eager Mode (`M3_TOOLS_LAZY=0`)	109	~24,918	12.5%

🛠️ Note: If your client does not support dynamic tool registration, set the environment variable M3_TOOLS_LAZY=0 to register all tools eagerly.
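
The percentages in the table follow directly from the token counts and a 200K-token window. A short check of that arithmetic, using only the figures quoted above:

```python
CONTEXT_WINDOW = 200_000  # tokens, as assumed by the table above

# Schema token counts per gating mode, copied from the table.
schema_tokens = {
    "Lazy (default)": 3_540,
    "Typical active session": 17_975,
    "Eager (M3_TOOLS_LAZY=0)": 24_918,
}


def window_share(tokens: int, window: int = CONTEXT_WINDOW) -> float:
    """Return the share of the context window used, in percent."""
    return 100 * tokens / window


shares = {mode: window_share(tokens) for mode, tokens in schema_tokens.items()}
for mode, share in shares.items():
    print(f"{mode}: {share:.1f}%")

# Tokens kept out of the prompt at startup by lazy loading versus eager mode.
saved = schema_tokens["Eager (M3_TOOLS_LAZY=0)"] - schema_tokens["Lazy (default)"]
print(f"Saved at startup: {saved} tokens")
```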

🛡️ Sovereign & Air-Gapped Deployments

M3 operates completely offline by default.

Sovereign Local Embedder

A high-performance BGE-M3 embedder runs locally after installation.

Default: in-process via the m3-core-rs native module (llama.cpp linked in-process, zero IPC — not a separate service you have to run or monitor). CPU execution using GGUF format (_assets/models/bge-m3-Q4_K_M.gguf). A local HTTP embed server on 127.0.0.1:8082 exists only as an automatic fallback if the in-process path can't load.
Hardware Acceleration (GPU): Execute m3 embedder install-gpu to compile with CUDA, Vulkan, or Metal.
External Provider Fallback: Set EMBED_BASE_URL to route requests to Ollama, LM Studio, or vLLM.

Rust-Oxidized Performance Core

M3 includes an optional Rust performance module (m3_core_rs) that speeds up MMR re-ranking, batch cosine distance calculations, and FTS compilations by 90× to 800×. If absent, M3 falls back to pure Python execution automatically. Disable with M3_CORE_RS_DISABLE=1. (See Oxidation Benchmarks).
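
To see which path a given environment would likely take, you can check for the module yourself. This is a minimal sketch, assuming the module is importable under the name `m3_core_rs` and that `M3_CORE_RS_DISABLE=1` is the only disabling value. The helper `rust_core_enabled` is hypothetical and is not how M3 itself makes the decision.

```python
import importlib.util
import os
from typing import Mapping, Optional


def rust_core_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Guess whether the Rust module would be used in this environment."""
    env = os.environ if env is None else env
    # The opt-out switch named in the README takes priority.
    if env.get("M3_CORE_RS_DISABLE") == "1":
        return False
    # find_spec returns None when a top-level module is not installed.
    return importlib.util.find_spec("m3_core_rs") is not None


print("Rust core available:", rust_core_enabled())
```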

Enterprise Security & Compliance

FIPS 140-3 Ready: Standardized encryption pathways allow routing through validated cryptographic modules (e.g., wolfSSL via M3_FIPS_MODE=1).
Air-Gapped Install: Supports installation without internet access via pre-compiled Python wheels. (See Sovereign Deployment Guide & FIPS Boundary Reference).
Storage Location: All config and data files reside under ~/.m3-memory (configurable via M3_MEMORY_ROOT).
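
For scripts that need to locate M3's files (backups, for example), the storage rule above can be mirrored in a few lines. This is a sketch under the assumption that `M3_MEMORY_ROOT`, when set, simply replaces the default directory. The helper name `memory_root` is hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def memory_root(env: Optional[Mapping[str, str]] = None) -> Path:
    """Resolve the M3 data directory: the env override, else ~/.m3-memory."""
    env = os.environ if env is None else env
    override = env.get("M3_MEMORY_ROOT")
    if override:
        return Path(override).expanduser()
    return Path.home() / ".m3-memory"


print(memory_root())
```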

🔮 What M3 Does

Memory Persistence: Saves system architecture, project decisions, and preferences across tool boundaries using a local SQLite database.
Autonomous Cognitive Loop: Background worker (m3_cognitive_loop.py) that periodically sweeps chat logs to extract facts, reconcile contradictions, and construct an entity relationship graph.
Hybrid Vector & Keyword Search: Seamlessly merges vector space, Full-Text Search (FTS5 BM25), and MMR diversity.
Hierarchical File Ingestion: A dedicated 26-tool files domain reads directories, chunks files, extracts facts, and reviews staleness — with ~4× faster incremental re-ingest (unchanged sections reuse cached embeddings).
Verbatim Chatlog Capture: A dedicated 10-tool chatlog domain records conversation turns before compaction, so prior Claude/Gemini sessions stay searchable and nothing is lost to context-window truncation.
Pluggable Storage Backend: SQLite by default; select PostgreSQL as a first-class primary store with M3_DB_BACKEND=postgres. Same semantics on either backend — the choice doesn't change behavior.
Cross-Device Sync: Optionally sync/federate to a PostgreSQL warehouse tier. Access the same memories on your laptop, desktop, or cloud environments.
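
The MMR diversity step mentioned in the hybrid search item can be illustrated in isolation. The following is a textbook Maximal Marginal Relevance sketch in pure Python, not M3's implementation: the function names, the `lam` trade-off parameter, and the toy vectors are all assumptions, and the BM25 fusion and reranking stages are not shown.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr(query, candidates, k, lam=0.7):
    """Pick k ids, trading relevance (lam) against redundancy (1 - lam)."""
    remaining = dict(candidates)
    selected = []
    while remaining and len(selected) < k:
        def score(item_id):
            relevance = cosine(query, remaining[item_id])
            # Redundancy is the similarity to the closest already-chosen item.
            redundancy = max(
                (cosine(remaining[item_id], candidates[s]) for s in selected),
                default=0.0,
            )
            return lam * relevance - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        del remaining[best]
    return selected


query = [1.0, 1.0]
candidates = {
    "a": [1.0, 0.9],   # most relevant
    "b": [1.0, 0.8],   # near-duplicate of "a"
    "c": [0.2, 1.0],   # less relevant, but different
}

diverse = mmr(query, candidates, k=2, lam=0.5)
relevance_only = mmr(query, candidates, k=2, lam=1.0)
print(diverse, relevance_only)
```

With `lam=1.0` the selection reduces to plain relevance ranking and returns the near-duplicate pair. Lowering `lam` penalizes the duplicate and lets the distinct result through.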

📚 Documentation Index

Quick & Core	Advanced & Architecture	Integrations & Compliance
🚀 Getting Started Guide	🏗️ System Architecture	🧩 LangChain/LangGraph
✨ Core Features	🔧 Technical Implementation	🧩 Hermes Agent
⚙️ Environment Variables	🧠 Memory Model Guide	🛡️ Compliance Guide (GDPR, FISMA)
🛠️ Operations Playbook	⚡ Rust Oxidation benchmarks	🛡️ FIPS Cryptographic Boundary
🤖 Agent Instructions & Rules	🔍 Myths & Facts Guide	🏠 Homelab Patterns
🧩 Tool Capability Matrix	🤖 AI Context Injection Profile	🔢 Machine-Readable Features

🎯 Who This Is For

M3 is a great fit if...

You use multiple desktop coding agents: Claude Code, Gemini, and Aider can all read and write one shared local history.
You build with LangChain/LangGraph: A drop-in Mem0 replacement and LangMem-compatible store that adds bitemporal queries, contradiction management, and local embeddings.
You build with CrewAI (v1.10–1.x): A drop-in StorageBackend (Memory(storage=M3StorageBackend(user_id="crew-alpha"))) that gives CrewAI bitemporal recall, contradiction-aware supersession, and local embeddings — plus the thing single-vector stores can't do: a CrewAI-written memory can also be searchable by every other m3 agent (Claude Code, Gemini, LangChain) if you want. pip install m3-memory[crewai]. See the CrewAI integration guide.
You build with PydanticAI: m3-backed memory as either drop-in tools + auto-recall (register_m3_tools, m3_recall_processor) or a formal M3MemoryToolset (a real PydanticAI AbstractToolset). Built on Pydantic v2, so it runs on Python 3.14 with a plain pip install m3-memory[pydantic-ai]. See the PydanticAI integration guide.
You need security and compliance: Built-in gdpr_forget and gdpr_export tools, air-gapped support, and audit logs.
You value privacy: Zero external cloud requests or subscriptions required.

M3 is NOT a fit if...

You need a hosted SaaS dashboard with managed infrastructure (use Letta).
You only want transient in-session chat context that resets when you exit the terminal (rely on your agent's defaults).
Your need is only contextual retrieval + a little user state: if plain conversation history, RAG over a knowledge base, and a small structured user profile cover you, that's simpler to build and operate — persistent evolving memory earns its keep when users interact repeatedly over time and benefit from accumulated context.
You want a hosted/managed database as the system of record: M3 is local-first. It can use PostgreSQL as its primary store (M3_DB_BACKEND=postgres) for scale or multi-user deployments, but it's designed to run on your own infrastructure (a local SQLite file by default, or a Postgres you operate) — not against a managed cloud DB you don't control.

🛡️ Why Trust This

Benchmarked Retrieval: State-of-the-art for a local-first substrate — 99.2% session-hit-rate @ k=10, 100% @ k=20 on LongMemEval-S — with a published, reproducible methodology and no oracle routing. See Benchmarks.
Robust Coverage: Verified with 2,179 tests across 180 test files spanning search, sync, GDPR lifecycle, and files ingestion — run with warnings-as-errors, so a new warning fails the suite.
Audit Reports: Regular vulnerability reports (Bandit, secrets scans, pip-audit) published directly under docs/audits/.
Explainable Retrieval: No black-box queries; the retrieval math is open and readable, and scoring parameters are returned with each result.
Open Source: Apache 2.0 licensed, free, with no SaaS walls or usage limits.

📊 Benchmarks

Retrieval Recall (Session Hit-Rate @ k)

Evaluated on the 500-question LongMemEval-S dataset under default server configurations:

Retrieve Depth (k)	Session Hit-Rate (SHR)	Success Count	vs. Prior Version
5	98.2%	491 / 500	+2.0pp
10 (Default)	99.2%	496 / 500	+2.4pp
20	100.0%	500 / 500	First Report

End-to-End QA Accuracy

92.0% accuracy (460/500 correct responses) with zero oracle metadata routing:

Question Domain	Count (n)	Accuracy
single-session-user	70	94.3%
single-session-assistant	56	96.4%
single-session-preference	30	80.0%
multi-session	133	87.2%
temporal-reasoning	133	95.5%
knowledge-update	78	93.6%
Overall Summary	500	92.0%

Methodology and reproducibility details are located in the LongMemEval-S Benchmarking Report.
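
The overall figure can be cross-checked against the per-domain rows. In the sketch below, the per-domain correct counts are not stated in the table; they are inferred by rounding count × accuracy, which is an assumption.

```python
# (question count, reported accuracy in percent), copied from the table above.
domains = {
    "single-session-user": (70, 94.3),
    "single-session-assistant": (56, 96.4),
    "single-session-preference": (30, 80.0),
    "multi-session": (133, 87.2),
    "temporal-reasoning": (133, 95.5),
    "knowledge-update": (78, 93.6),
}

total = sum(n for n, _ in domains.values())
# Inferred, not reported: nearest whole number of correct answers per domain.
correct = sum(round(n * pct / 100) for n, pct in domains.values())
overall = 100 * correct / total

print(f"{correct} / {total} = {overall:.1f}%")
```

The inferred counts sum to the 460 correct responses quoted in the heading, so the per-domain rows and the overall 92.0% are mutually consistent.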

🧰 Core Tools

While M3 features 100+ tools, these five serve as your primary interface:

Tool Name	Operation Description
`memory_write`	Save a specific fact, project preference, or technical configuration.
`memory_search`	Run hybrid keyword (BM25) and semantic vector search.
`memory_update`	Edit existing facts to keep memory accurate.
`memory_suggest`	Query memories alongside a mathematically explicit score breakdown.
`memory_get`	Fetch details of a single memory using its unique ID.

Refer to the Agent Instructions Guide and Full MCP Tool Catalog for complete parameter definitions.

🤖 For AI Agents

You can drop the agent ruleset file examples/AGENT_RULES.md into your workspace to teach your agent best practices (e.g., query before writing, update existing records instead of duplicating).

Command Installation Prompts

Copy and paste these prompts into your terminal client to let your agent set up M3 for you:

Claude Code Prompt

Install m3-memory for persistent memory. Run: pip install m3-memory
Then add {"mcpServers":{"memory":{"command":"m3"}}} to my
~/.claude/settings.json under "mcpServers". For best retrieval, ensure
Ollama is running with qwen3-embedding:0.6b (optional, falls back
to keyword search without it). Then use /mcp to verify the memory server loaded.

Gemini CLI Prompt

Install m3-memory for persistent memory. Run: pip install m3-memory
Then add {"mcpServers":{"memory":{"command":"m3"}}} to my
~/.gemini/settings.json under "mcpServers". For best retrieval, ensure
Ollama is running with qwen3-embedding:0.6b (optional, falls back
to keyword search without it).

Active Chatlog Capture Plugin

To configure instant conversation logging and backup, tell your active coding agent:

Install the m3-memory chat log subsystem.

The agent executes bin/chatlog_init.py and configures execution triggers (see Chat Log Architecture Guide).

🎬 See it in action

Contradiction Detection & Automatic Resolution

Hybrid Search Scoring Details

Multi-Device Database Sync

💬 Community

How to Contribute · Good First Issues

📜 License & Attributions

This project is licensed under the Apache License 2.0. See LICENSE for details.

Asset & Icon Credits

The provider badges under docs/badges/ embed small logo glyphs:

OpenClaw & OpenCode icons are from the MIT-licensed LobeHub icon set (lobe-icons).
The Hermes badge uses a generic caduceus glyph.

See NOTICE for the full third-party attribution list.

⭐ Star History

Chart regenerated on a schedule by .github/workflows/star-history.yml using the repo's own token — no third-party embed. Click through for the live interactive version.


Guide	Guide	Guide
🗺️ Roadmap	🔄 Cross-Device Sync	👥 Multi-Agent Orchestration
⚖️ Comparison vs Alternatives	❓ FAQ	🔐 Security Policy
🩹 Troubleshooting	⌨️ CLI Reference	📖 API Reference
📁 Files Memory	💬 Chat Log Subsystem	✨ Enrichment Guide
⬆️ Upgrade Guide	🩺 Health FAQ	🧬 Dual Embedding
📜 Changelog	🤝 Code of Conduct	🏗️ Build Wheels