LLMs forget everything. Every conversation starts from zero. Existing memory solutions — Mem0, Zep, Cognee — require LLM calls for basic recall, adding latency, cloud dependency, and cost to every operation.
Aura gives your AI agent persistent, hierarchical memory that decays, consolidates, and evolves — like a human brain. One pip install, works fully offline.
```bash
pip install aura-memory
```

```python
from aura import Aura, Level

brain = Aura("./agent_memory")
brain.store("User prefers dark mode", level=Level.Identity, tags=["ui"])
brain.store("Deploy to staging first", level=Level.Decisions, tags=["workflow"])

context = brain.recall("user preferences")  # <1ms — inject into any LLM prompt
```

Your agent now remembers. No API keys. No embeddings. No config.
⭐ If Aura is useful to you, a star on GitHub helps others find it.
## Why Aura?
| | Aura | Mem0 | Zep | Cognee | Letta/MemGPT |
|---|---|---|---|---|---|
| LLM required | No | Yes | Yes | Yes | Yes |
| Recall latency | <1 ms | ~200 ms+ | ~200 ms | LLM-bound | LLM-bound |
| Works offline | Fully | Partial | No | No | With local LLM |
| Cost per operation | $0 | API billing | Credit-based | LLM + DB cost | LLM cost |
| Binary size | ~3 MB | ~50 MB+ | Cloud service | Heavy (Neo4j+) | Python pkg |
| Memory decay & promotion | Built-in | Via LLM | Via LLM | No | Via LLM |
| Trust & provenance | Built-in | No | No | No | No |
| Encryption at rest | ChaCha20 + Argon2 | No | No | No | No |
| Language | Rust | Python | Proprietary | Python | Python |
## Performance
Benchmarked on 1,000 records (Windows 10 / Ryzen 7):
| Operation | Latency | vs Mem0 |
|---|---|---|
| Store | 0.09 ms | ~same |
| Recall (structured) | 0.74 ms | ~270× faster |
| Recall (cached) | 0.48 µs | ~400,000× faster |
| Maintenance cycle | 1.1 ms | No equivalent |
Mem0 recall requires an embedding API call (~200ms+) + vector search. Aura recall is pure local computation.
## How Memory Works
Aura organizes memories into 4 levels across 2 tiers. Important memories persist, trivial ones decay naturally:
**CORE TIER** (slow decay — weeks to months)

- `Identity` [0.99] — who the user is: preferences, personality
- `Domain` [0.95] — learned facts, domain knowledge

**COGNITIVE TIER** (fast decay — hours to days)

- `Decisions` [0.90] — choices made, action items
- `Working` [0.80] — current tasks, recent context

One call runs the full lifecycle — decay, promote, merge duplicates, archive expired:

```python
report = brain.run_maintenance()  # 8 phases, <1ms
```

## Key Features
### Core Memory Engine
- **RRF Fusion Recall** — multi-signal ranking: SDR + MinHash + Tag Jaccard (+ optional embeddings)
- **Two-Tier Memory** — Cognitive (ephemeral) + Core (permanent) with decay, promotion, and archival
- **Background Maintenance** — 8-phase lifecycle: decay, reflect, insights, consolidation, archival
- **Namespace Isolation** — `namespace="sandbox"` keeps test data invisible to production recall
- **Pluggable Embeddings** — optional 4th RRF signal: bring your own embedding function
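To make the fusion step concrete, here is a minimal, self-contained sketch of Reciprocal Rank Fusion merging several recall signals. It is an illustration of the general RRF technique, not Aura's internal implementation; the memory IDs and toy rankings are invented for the example.

```python
# Reciprocal Rank Fusion (RRF): merge several ranked lists into one.
# Illustrative sketch only — not Aura's internal code.
K = 60  # standard RRF damping constant (Aura's architecture notes cite k=60)

def rrf_fuse(rankings, k=K):
    """Fuse ranked lists of memory IDs: score(id) = sum over lists of 1/(k + rank)."""
    scores = {}
    for ranked in rankings:
        for rank, mem_id in enumerate(ranked, start=1):
            scores[mem_id] = scores.get(mem_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Three toy signals rank three memories differently:
sdr_rank     = ["m1", "m2", "m3"]   # SDR bit-overlap similarity
minhash_rank = ["m2", "m1", "m3"]   # MinHash n-gram similarity
jaccard_rank = ["m1", "m3", "m2"]   # tag Jaccard overlap

fused = rrf_fuse([sdr_rank, minhash_rank, jaccard_rank])
print(fused)  # ['m1', 'm2', 'm3'] — "m1" leads: two of three signals rank it first
```

Because RRF works on ranks rather than raw scores, signals with very different score scales (bit overlap vs. Jaccard vs. cosine similarity) can be combined without normalization, which is why an optional embedding signal can be bolted on as a fourth list.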
### Trust & Safety
- **Trust & Provenance** — source authority scoring: user input outranks web scrapes, automatically
- **Source Type Tracking** — every memory carries provenance: `recorded`, `retrieved`, `inferred`, `generated`
- **Auto-Protect Guards** — detects phone numbers, emails, wallets, API keys automatically
- **Encryption** — ChaCha20-Poly1305 with Argon2id key derivation
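For intuition, a toy version of auto-protect style PII detection is sketched below. The regex patterns, category names, and behavior are assumptions for illustration only, not Aura's actual guard implementation.

```python
import re

# Toy PII detector — the patterns here are illustrative assumptions,
# not the rules Aura's auto-protect guards actually use.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
    "api_key": re.compile(r"\b(?:sk|pk)-[A-Za-z0-9]{16,}\b"),
}

def detect_pii(text):
    """Return the sorted list of PII categories found in `text`."""
    return sorted(name for name, pat in PII_PATTERNS.items() if pat.search(text))

print(detect_pii("Mail me at alice@example.com, key sk-AbCdEfGhIjKlMnOpQr"))
# -> ['api_key', 'email']
```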
### Adaptive Memory
- **Feedback Learning** — `brain.feedback(id, useful=True)` boosts useful memories, weakens noise
- **Semantic Versioning** — `brain.supersede(old_id, new_content)` with full version chains
- **Snapshots & Rollback** — `brain.snapshot("v1")` / `brain.rollback("v1")` / `brain.diff("v1", "v2")`
- **Agent-to-Agent Sharing** — `export_context()` / `import_context()` with trust metadata
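As a mental model for the boost/weaken mechanic, here is a toy strength update. The 10% step size and clamping to [0, 1] are assumptions for illustration; Aura's actual update rule is not specified here.

```python
# Toy feedback loop: useful memories gain strength, noisy ones lose it.
# The 0.10 step and [0, 1] clamping are illustrative assumptions only.
def apply_feedback(strength, useful, step=0.10):
    """Nudge a memory's strength up or down and clamp to [0, 1]."""
    nudged = strength + step if useful else strength - step
    return max(0.0, min(1.0, nudged))

s = 0.80
s = apply_feedback(s, useful=True)   # boosted toward 0.90
s = apply_feedback(s, useful=False)  # weakened back toward 0.80
print(round(s, 2))  # 0.8
```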
### Enterprise & Integrations
- **Multimodal Stubs** — `store_image()` / `store_audio_transcript()` with media provenance
- **Prometheus Metrics** — `/metrics` endpoint with 10+ business-level counters and histograms
- **OpenTelemetry** — `telemetry` feature flag with OTLP export and 17 instrumented spans
- **MCP Server** — Claude Desktop integration out of the box
- **WASM-Ready** — `StorageBackend` trait abstraction (`FsBackend` + `MemoryBackend`)
- **Pure Rust Core** — no Python dependencies, no external services
## Quick Start
### Trust & Provenance
```python
from aura import Aura, TrustConfig

brain = Aura("./data")
tc = TrustConfig()
tc.source_trust = {"user": 1.0, "api": 0.8, "web_scrape": 0.5}
brain.set_trust_config(tc)

# User facts always rank higher than scraped data in recall
brain.store("User is vegan", channel="user")
brain.store("User might like steak restaurants", channel="web_scrape")
results = brain.recall_structured("food preferences", top_k=5)
# -> "User is vegan" scores higher, always
```

### Pluggable Embeddings (Optional)
```python
from aura import Aura

brain = Aura("./data")

# Plug in any embedding function: OpenAI, Ollama, sentence-transformers, etc.
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")
brain.set_embedding_fn(lambda text: model.encode(text).tolist())

# Now "login problems" matches "Authentication failed" via semantic similarity
brain.store("Authentication failed for user admin")
results = brain.recall_structured("login problems", top_k=5)
```

Without embeddings, Aura falls back to SDR + MinHash + Tag Jaccard — still fast, still effective.
### Encryption
```python
brain = Aura("./secret_data", password="my-secure-password")
brain.store("Top secret information")
assert brain.is_encrypted()  # ChaCha20-Poly1305 + Argon2id
```

### Namespace Isolation
```python
brain = Aura("./data")
brain.store("Real preference: dark mode", namespace="default")
brain.store("Test: user likes light mode", namespace="sandbox")

# Recall only sees the "default" namespace — sandbox is invisible
results = brain.recall_structured("user preference", top_k=5)
```

## Cookbook: Personal Assistant That Remembers
The killer use case: an agent that remembers your preferences after a week offline, with zero API calls.
See examples/personal_assistant.py for the full runnable script.
```python
from aura import Aura, Level

brain = Aura("./assistant_memory")

# Day 1: User tells the agent about themselves
brain.store("User is vegan", level=Level.Identity, tags=["diet"])
brain.store("User loves jazz music", level=Level.Identity, tags=["music"])
brain.store("User works 10am-6pm", level=Level.Identity, tags=["schedule"])
brain.store("Discuss quarterly report tomorrow", level=Level.Working, tags=["task"])

# Simulate a week passing — run maintenance cycles
for _ in range(7):
    brain.run_maintenance()  # decay + reflect + consolidate + archive

# Day 8: What does the agent remember?
context = brain.recall("user preferences and personality")
# -> Still remembers: vegan, jazz, schedule (Identity, strength ~0.93)
# -> "quarterly report" decayed heavily (Working, strength ~0.21)
```

Identity persists. Tasks fade. Important patterns get promoted. Like a real brain.
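Those strengths are consistent with simple per-cycle multiplicative decay using each level's retention factor from the level table above. The multiplicative rule itself is an assumption for illustration, not necessarily Aura's exact decay formula:

```python
# Toy decay model: strength *= retention on each maintenance cycle.
# Retention factors are the level weights from the tier table; the
# multiplicative rule is an illustrative assumption.
RETENTION = {"Identity": 0.99, "Domain": 0.95, "Decisions": 0.90, "Working": 0.80}

def decay(level, cycles, strength=1.0):
    """Strength remaining after `cycles` maintenance passes."""
    return strength * RETENTION[level] ** cycles

print(round(decay("Identity", 7), 2))  # 0.93 — persists after a week
print(round(decay("Working", 7), 2))   # 0.21 — mostly gone
```

Under this model, 0.99^7 ≈ 0.93 and 0.80^7 ≈ 0.21, matching the strengths the cookbook reports after seven maintenance cycles.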
## MCP Server (Claude Desktop)
Give Claude persistent memory across conversations:
```bash
pip install aura-memory
```

Add to Claude Desktop config (Settings → Developer → Edit Config):
```json
{
  "mcpServers": {
    "aura": {
      "command": "python",
      "args": ["-m", "aura", "mcp", "C:\\Users\\YOUR_NAME\\aura_brain"]
    }
  }
}
```

Provides 8 tools: `recall`, `recall_structured`, `store`, `store_code`, `store_decision`, `search`, `insights`, `consolidate`.
## Dashboard UI
Aura includes a standalone web dashboard for visual memory management. Download from GitHub Releases.
```bash
./aura-dashboard ./my_brain --port 8000
```

Features: Analytics · Memory Explorer with filtering · Recall Console with live scoring · Batch ingest
Prebuilt binaries are available for Windows x64, Linux x64, macOS ARM, and macOS x64 on GitHub Releases.
## Integrations & Examples
Try it now in the browser — zero install.
| Integration | Description |
|---|---|
| Ollama | Fully local AI assistant, no API key needed |
| LangChain | Drop-in Memory class + prompt injection |
| LlamaIndex | Chat engine with persistent memory recall |
| OpenAI Agents | Dynamic instructions with persistent memory |
| Claude SDK | System prompt injection + tool use patterns |
| CrewAI | Tool-based recall/store for crew agents |
| AutoGen | Memory protocol implementation |
| FastAPI | Per-user memory middleware with namespace isolation |
FFI (C/Go/C#): aura.h · go/main.go · csharp/Program.cs
More examples: basic_usage.py · encryption.py · agent_memory.py · edge_device.py · maintenance_daemon.py · research_bot.py
## Architecture
52 Rust modules · ~23,500 lines · 272 Rust + 347 Python = 619 tests
```
Python ── from aura import Aura ──▶ aura._core (PyO3)
                                          │
Rust ─────────────────────────────────────┘
┌─────────────────────────────────────────────┐
│                 Aura Engine                 │
│                                             │
│  Two-Tier Memory                            │
│  ├── Cognitive Tier (Working + Decisions)   │
│  └── Core Tier (Domain + Identity)          │
│                                             │
│  Recall Engine (RRF Fusion, k=60)           │
│  ├── SDR similarity (256k bit)              │
│  ├── MinHash N-gram                         │
│  ├── Tag Jaccard                            │
│  └── Embedding (optional, pluggable)        │
│                                             │
│  Adaptive Memory                            │
│  ├── Feedback learning (boost/weaken)       │
│  ├── Snapshots & rollback                   │
│  ├── Supersede (version chains)             │
│  └── Agent-to-agent sharing protocol        │
│                                             │
│  Knowledge Graph · Living Memory            │
│  Trust & Provenance · PII Guards            │
│  Encryption (ChaCha20 + Argon2id)           │
│  StorageBackend (Fs / Memory / WASM)        │
│  Telemetry (Prometheus + OpenTelemetry)     │
└─────────────────────────────────────────────┘
```

## API Reference
See docs/API.md for the complete API reference (40+ methods).
## Roadmap
See docs/ROADMAP.md for the full development roadmap.
Completed (6 phases):
- **Phase 1 — Community & Trust**: benchmarks, CONTRIBUTING.md, issue templates
- **Phase 2 — Ecosystem Gaps**: LlamaIndex, temporal queries, event callbacks
- **Phase 3 — Drop-in Adoption**: LangChain Memory class, FastAPI middleware, Claude SDK
- **Phase 4 — New Markets**: C FFI + Go/C# examples, WASM storage abstraction
- **Phase 5 — Enterprise**: Prometheus + OpenTelemetry, multimodal stubs, stress tests (100K/1M)
- **Phase 6 — Competitive Moat**: adaptive recall, snapshots, agent sharing, semantic versioning
Remaining:
- TypeScript/WASM build via `wasm-pack` + NPM package (storage abstraction done)
- Cloudflare Workers edge runtime (depends on WASM)
- Java FFI example, PyPI publish, benchmark CI
## Resources
- Demo Video (30s) — quick overview
- API Reference — complete API docs
- Examples — ready-to-run scripts
- Roadmap — development plan
- Landing Page — project overview
## Contributing
Contributions welcome! See CONTRIBUTING.md for setup instructions and guidelines, or check the open issues.
⭐ If Aura saves you time, a star on GitHub helps others find it.
## License & Intellectual Property
Code License: MIT — see LICENSE.
Patent Notice: The core cognitive architecture (DNA Layering, Cognitive Crystallization, SDR Indexing, Synaptic Synthesis) is Patent Pending (US Provisional Application No. 63/969,703). See PATENT for details. Commercial integration of these architectural concepts into enterprise products requires a commercial license. The open-source SDK is freely available under MIT for non-commercial, academic, and standard agent integrations.