aura-memory

by teolex2020

LLMs forget everything. Every conversation starts from zero. Existing memory solutions — Mem0, Zep, Cognee — require LLM calls for basic recall, adding latency, cloud dependency, and cost to every operation.

Aura gives your AI agent persistent, hierarchical memory that decays, consolidates, and evolves — like a human brain. One pip install, works fully offline.

pip install aura-memory
from aura import Aura, Level

brain = Aura("./agent_memory")

brain.store("User prefers dark mode", level=Level.Identity, tags=["ui"])
brain.store("Deploy to staging first", level=Level.Decisions, tags=["workflow"])

context = brain.recall("user preferences")  # <1ms — inject into any LLM prompt

Your agent now remembers. No API keys. No embeddings. No config.

⭐ If AuraSDK is useful to you, a star on GitHub is appreciated.


Why Aura?

| | Aura | Mem0 | Zep | Cognee | Letta/MemGPT |
| --- | --- | --- | --- | --- | --- |
| LLM required | No | Yes | Yes | Yes | Yes |
| Recall latency | <1ms | ~200ms+ | ~200ms | LLM-bound | LLM-bound |
| Works offline | Fully | Partial | No | No | With local LLM |
| Cost per operation | $0 | API billing | Credit-based | LLM + DB cost | LLM cost |
| Binary size | ~3 MB | ~50 MB+ | Cloud service | Heavy (Neo4j+) | Python pkg |
| Memory decay & promotion | Built-in | Via LLM | Via LLM | No | Via LLM |
| Trust & provenance | Built-in | No | No | No | No |
| Encryption at rest | ChaCha20 + Argon2 | No | No | No | No |
| Language | Rust | Python | Proprietary | Python | Python |

Performance

Benchmarked on 1,000 records (Windows 10 / Ryzen 7):

| Operation | Latency | vs Mem0 |
| --- | --- | --- |
| Store | 0.09 ms | ~same |
| Recall (structured) | 0.74 ms | ~270× faster |
| Recall (cached) | 0.48 µs | ~400,000× faster |
| Maintenance cycle | 1.1 ms | No equivalent |

Mem0 recall requires an embedding API call (~200ms+) + vector search. Aura recall is pure local computation.


How Memory Works

Aura organizes memories into 4 levels across 2 tiers. Important memories persist, trivial ones decay naturally:

CORE TIER (slow decay — weeks to months)
  Identity  [0.99]  Who the user is. Preferences. Personality.
  Domain    [0.95]  Learned facts. Domain knowledge.

COGNITIVE TIER (fast decay — hours to days)
  Decisions [0.90]  Choices made. Action items.
  Working   [0.80]  Current tasks. Recent context.
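One way to read the bracketed numbers is as per-cycle retention factors; under that assumption (which matches the ~0.93 and ~0.21 strengths in the cookbook example later in this README), the tier behavior is simple exponential decay. A toy sketch, not Aura's internal code:

```python
# Assumption: the bracketed numbers above are per-cycle retention factors.
retention = {"Identity": 0.99, "Domain": 0.95, "Decisions": 0.90, "Working": 0.80}

def strength_after(level: str, cycles: int, initial: float = 1.0) -> float:
    """Memory strength after N maintenance cycles of exponential decay."""
    return initial * retention[level] ** cycles

strength_after("Identity", 7)  # ~0.93: identity facts persist for weeks
strength_after("Working", 7)   # ~0.21: working context fades in days
```

This is why promotion matters: a memory that keeps proving useful can move to a slower-decaying tier instead of fading with its original one.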

One call runs the full lifecycle — decay, promote, merge duplicates, archive expired:

report = brain.run_maintenance()  # 8 phases, <1ms

Key Features

Core Memory Engine

  • RRF Fusion Recall — Multi-signal ranking: SDR + MinHash + Tag Jaccard (+ optional embeddings)

  • Two-Tier Memory — Cognitive (ephemeral) + Core (permanent) with decay, promotion, and archival

  • Background Maintenance — 8-phase lifecycle: decay, reflect, insights, consolidation, archival

  • Namespace Isolation — namespace="sandbox" keeps test data invisible to production recall

  • Pluggable Embeddings — Optional 4th RRF signal: bring your own embedding function
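RRF itself is a simple rank-combination rule: each signal ranks the candidates independently, and a memory's fused score is the sum of 1/(k + rank) across signals. A generic sketch of the technique (illustrative, not Aura's internal code; the memory ids are made up):

```python
from collections import defaultdict

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Reciprocal Rank Fusion: each signal's ranked list contributes
    1 / (k + rank) to a candidate's fused score."""
    scores: dict[str, float] = defaultdict(float)
    for ranked_ids in rankings:
        for rank, mem_id in enumerate(ranked_ids, start=1):
            scores[mem_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical rankings from three signals (e.g. SDR, MinHash, tag Jaccard):
fused = rrf_fuse([["m2", "m1", "m3"], ["m1", "m2"], ["m1", "m3", "m2"]])
# "m1" wins: two of the three signals rank it first
```

Because RRF only consumes ranks, not raw scores, signals on very different scales (bit overlap, set similarity, cosine distance) can be fused without normalization.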

Trust & Safety

  • Trust & Provenance — Source authority scoring: user input outranks web scrapes, automatically

  • Source Type Tracking — Every memory carries provenance: recorded, retrieved, inferred, generated

  • Auto-Protect Guards — Detects phone numbers, emails, wallets, API keys automatically

  • Encryption — ChaCha20-Poly1305 with Argon2id key derivation

Adaptive Memory

  • Feedback Learning — brain.feedback(id, useful=True) boosts useful memories, weakens noise

  • Semantic Versioning — brain.supersede(old_id, new_content) with full version chains

  • Snapshots & Rollback — brain.snapshot("v1") / brain.rollback("v1") / brain.diff("v1","v2")

  • Agent-to-Agent Sharing — export_context() / import_context() with trust metadata

Enterprise & Integrations

  • Multimodal Stubs — store_image() / store_audio_transcript() with media provenance

  • Prometheus Metrics — /metrics endpoint with 10+ business-level counters and histograms

  • OpenTelemetry — telemetry feature flag with OTLP export and 17 instrumented spans

  • MCP Server — Claude Desktop integration out of the box

  • WASM-Ready — StorageBackend trait abstraction (FsBackend + MemoryBackend)

  • Pure Rust Core — No Python dependencies, no external services


Quick Start

Trust & Provenance

from aura import Aura, TrustConfig

brain = Aura("./data")

tc = TrustConfig()
tc.source_trust = {"user": 1.0, "api": 0.8, "web_scrape": 0.5}
brain.set_trust_config(tc)

# User facts always rank higher than scraped data in recall
brain.store("User is vegan", channel="user")
brain.store("User might like steak restaurants", channel="web_scrape")

results = brain.recall_structured("food preferences", top_k=5)
# -> "User is vegan" scores higher, always
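The ranking effect is easy to see with plain arithmetic. A hypothetical sketch of source-authority weighting (not Aura's internals): scale each candidate's relevance by the trust of its source channel, and a confident scrape loses to a direct user statement.

```python
# Hypothetical illustration: relevance scaled by per-source trust.
source_trust = {"user": 1.0, "api": 0.8, "web_scrape": 0.5}

candidates = [
    {"text": "User is vegan", "source": "user", "relevance": 0.70},
    {"text": "User might like steak restaurants", "source": "web_scrape", "relevance": 0.90},
]

ranked = sorted(
    candidates,
    key=lambda c: c["relevance"] * source_trust[c["source"]],
    reverse=True,
)
# 0.70 * 1.0 = 0.70 beats 0.90 * 0.5 = 0.45 -> the user fact ranks first
```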

Pluggable Embeddings (Optional)

from aura import Aura

brain = Aura("./data")

# Plug in any embedding function: OpenAI, Ollama, sentence-transformers, etc.
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("all-MiniLM-L6-v2")
brain.set_embedding_fn(lambda text: model.encode(text).tolist())

# Now "login problems" matches "Authentication failed" via semantic similarity
brain.store("Authentication failed for user admin")
results = brain.recall_structured("login problems", top_k=5)

Without embeddings, Aura falls back to SDR + MinHash + Tag Jaccard — still fast, still effective.
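Lexical overlap does useful work on its own. A toy sketch of n-gram Jaccard similarity, the quantity that MinHash approximates cheaply (illustrative only, not Aura's implementation):

```python
def char_ngrams(text: str, n: int = 3) -> set[str]:
    """Character n-grams of a lowercased string."""
    text = text.lower()
    return {text[i:i + n] for i in range(max(len(text) - n + 1, 0))}

def jaccard(a: set[str], b: set[str]) -> float:
    """Set-overlap similarity: |A ∩ B| / |A ∪ B|."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

sim = jaccard(char_ngrams("Authentication failed"),
              char_ngrams("authentication error"))
# nonzero: the shared "authentication" n-grams match despite different wording
```

The limitation is also visible here: "login problems" shares almost no n-grams with "Authentication failed", which is exactly the gap the optional embedding signal closes.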

Encryption

brain = Aura("./secret_data", password="my-secure-password")
brain.store("Top secret information")
assert brain.is_encrypted()  # ChaCha20-Poly1305 + Argon2id

Namespace Isolation

brain = Aura("./data")

brain.store("Real preference: dark mode", namespace="default")
brain.store("Test: user likes light mode", namespace="sandbox")

# Recall only sees "default" namespace — sandbox is invisible
results = brain.recall_structured("user preference", top_k=5)

Cookbook: Personal Assistant That Remembers

The killer use case: an agent that remembers your preferences after a week offline, with zero API calls.

See examples/personal_assistant.py for the full runnable script.

from aura import Aura, Level

brain = Aura("./assistant_memory")

# Day 1: User tells the agent about themselves
brain.store("User is vegan", level=Level.Identity, tags=["diet"])
brain.store("User loves jazz music", level=Level.Identity, tags=["music"])
brain.store("User works 10am-6pm", level=Level.Identity, tags=["schedule"])
brain.store("Discuss quarterly report tomorrow", level=Level.Working, tags=["task"])

# Simulate a week passing — run maintenance cycles
for _ in range(7):
    brain.run_maintenance()  # decay + reflect + consolidate + archive

# Day 8: What does the agent remember?
context = brain.recall("user preferences and personality")
# -> Still remembers: vegan, jazz, schedule (Identity, strength ~0.93)
# -> "quarterly report" decayed heavily (Working, strength ~0.21)

Identity persists. Tasks fade. Important patterns get promoted. Like a real brain.


MCP Server (Claude Desktop)

Give Claude persistent memory across conversations:

pip install aura-memory

Add to Claude Desktop config (Settings → Developer → Edit Config):

{
  "mcpServers": {
    "aura": {
      "command": "python",
      "args": ["-m", "aura", "mcp", "C:\\Users\\YOUR_NAME\\aura_brain"]
    }
  }
}

Provides 8 tools: recall, recall_structured, store, store_code, store_decision, search, insights, consolidate.


Dashboard UI

Aura includes a standalone web dashboard for visual memory management. Download from GitHub Releases.

./aura-dashboard ./my_brain --port 8000

Features: Analytics · Memory Explorer with filtering · Recall Console with live scoring · Batch ingest

| Platform | Binary |
| --- | --- |
| Windows x64 | aura-dashboard-windows-x64.exe |
| Linux x64 | aura-dashboard-linux-x64 |
| macOS ARM | aura-dashboard-macos-arm64 |
| macOS x64 | aura-dashboard-macos-x64 |


Integrations & Examples

Try now: Open In Colab — zero install, runs in browser

| Integration | Description | Link |
| --- | --- | --- |
| Ollama | Fully local AI assistant, no API key needed | ollama_agent.py |
| LangChain | Drop-in Memory class + prompt injection | langchain_agent.py |
| LlamaIndex | Chat engine with persistent memory recall | llamaindex_agent.py |
| OpenAI Agents | Dynamic instructions with persistent memory | openai_agents.py |
| Claude SDK | System prompt injection + tool use patterns | claude_sdk_agent.py |
| CrewAI | Tool-based recall/store for crew agents | crewai_agent.py |
| AutoGen | Memory protocol implementation | autogen_agent.py |
| FastAPI | Per-user memory middleware with namespace isolation | fastapi_middleware.py |

FFI (C/Go/C#): aura.h · go/main.go · csharp/Program.cs

More examples: basic_usage.py · encryption.py · agent_memory.py · edge_device.py · maintenance_daemon.py · research_bot.py


Architecture

52 Rust modules · ~23,500 lines · 272 Rust + 347 Python = 619 tests

Python  ──  from aura import Aura  ──▶  aura._core (PyO3)
                                              │
Rust    ──────────────────────────────────────┘
        ┌─────────────────────────────────────────────┐
        │  Aura Engine                                │
        │                                             │
        │  Two-Tier Memory                            │
        │  ├── Cognitive Tier (Working + Decisions)   │
        │  └── Core Tier (Domain + Identity)          │
        │                                             │
        │  Recall Engine (RRF Fusion, k=60)           │
        │  ├── SDR similarity (256k bit)              │
        │  ├── MinHash N-gram                         │
        │  ├── Tag Jaccard                            │
        │  └── Embedding (optional, pluggable)        │
        │                                             │
        │  Adaptive Memory                            │
        │  ├── Feedback learning (boost/weaken)       │
        │  ├── Snapshots & rollback                   │
        │  ├── Supersede (version chains)             │
        │  └── Agent-to-agent sharing protocol        │
        │                                             │
        │  Knowledge Graph · Living Memory            │
        │  Trust & Provenance · PII Guards            │
        │  Encryption (ChaCha20 + Argon2id)           │
        │  StorageBackend (Fs / Memory / WASM)        │
        │  Telemetry (Prometheus + OpenTelemetry)     │
        └─────────────────────────────────────────────┘

API Reference

See docs/API.md for the complete API reference (40+ methods).

Roadmap

See docs/ROADMAP.md for the full development roadmap.

Completed (6 phases):

  • Phase 1 — Community & Trust: benchmarks, CONTRIBUTING.md, issue templates

  • Phase 2 — Ecosystem Gaps: LlamaIndex, temporal queries, event callbacks

  • Phase 3 — Drop-in Adoption: LangChain Memory class, FastAPI middleware, Claude SDK

  • Phase 4 — New Markets: C FFI + Go/C# examples, WASM storage abstraction

  • Phase 5 — Enterprise: Prometheus + OpenTelemetry, multimodal stubs, stress tests (100K/1M)

  • Phase 6 — Competitive Moat: adaptive recall, snapshots, agent sharing, semantic versioning

Remaining:

  • TypeScript/WASM build via wasm-pack + NPM package (storage abstraction done)

  • Cloudflare Workers edge runtime (depends on WASM)

  • Java FFI example, PyPI publish, benchmark CI

Resources


Contributing

Contributions welcome! See CONTRIBUTING.md for setup instructions and guidelines, or check the open issues.

If Aura saves you time, a star on GitHub is appreciated.


License & Intellectual Property

  • Code License: MIT — see LICENSE.

  • Patent Notice: The core cognitive architecture (DNA Layering, Cognitive Crystallization, SDR Indexing, Synaptic Synthesis) is Patent Pending (US Provisional Application No. 63/969,703). See PATENT for details. Commercial integration of these architectural concepts into enterprise products requires a commercial license. The open-source SDK is freely available under MIT for non-commercial, academic, and standard agent integrations.

