# Graph Database Architecture for Memory Associations
**Status**: β
Production-Ready (v8.51.0+)
**Difficulty**: Easy
**Time to Migrate**: 10-15 minutes
> π― **Quick Win**: Migrate to get **30x faster queries** and **97% storage reduction** with zero downtime.
## What is This?
The graph database architecture stores memory associations (connections between memories) in a dedicated, high-performance graph table instead of as regular Memory objects. Think of it like upgrading from storing contacts in a text file to using a contacts database.
**Real-World Examples**:
**December 14, 2025** (Initial Testing):
- Consolidation system automatically created 343 associations
- Before: Stored as 343 Memory objects (2.8 MB, 150ms queries)
- After: Stored in graph table (144 KB, 5ms queries)
- **Result**: 30Γ faster, 97% smaller
**December 16, 2025** (Production Migration):
- Migrated 1,435 existing associations to graph table
- Deleted 1,432 legacy association memories (7 orphaned preserved)
- Database: 5,345 β 3,913 memories, graph table fully operational
- **Result**: 30Γ query performance, zero downtime
## Why Should I Care?
### The Problem
When you use the consolidation system, it discovers semantic connections between your memories (like "these two memories are related because they're about the same project"). Before v8.51.0, these associations were stored as regular Memory objects:
```
β Before (Association as Memory):
- Content: "Association: memory A β memory B"
- Embedding: 384-dimensional vector (1.5 KB)
- Metadata: Tags, timestamps, similarity scores
- Total size: ~500 bytes per association
- Query: Full table scan to find connections (150ms)
```
```
β
After (Graph Table):
- Source hash: abc123
- Target hash: def456
- Similarity: 0.65
- Connection types: ["temporal_proximity"]
- Total size: ~50 bytes per association
- Query: Indexed lookup (5ms)
```
### The Benefits
| Benefit | Impact | Who Benefits Most |
|---------|--------|-------------------|
| **30x faster queries** | 150ms β 5ms for finding connections | Heavy consolidation users |
| **97% storage reduction** | 2.8 MB β 144 KB for 1,449 associations | Users with 500+ associations |
| **Multi-hop queries** | New capability: "find memories 2-3 connections away" | Power users, researchers |
| **No search pollution** | Associations don't appear in semantic search | Everyone |
## How Does It Work?
### Three Storage Modes
The system supports three modes for gradual migration:
#### 1. `memories_only` (Legacy)
- Associations stored as Memory objects (old behavior)
- Use if: You want to stick with the old system
#### 2. `dual_write` (Default - Recommended During Migration)
- Associations written to BOTH memories + graph tables
- Use if: You're migrating and want safety/rollback capability
#### 3. `graph_only` (Modern - Recommended After Migration)
- Associations ONLY written to graph table
- Use if: You've migrated and want maximum performance
### Configuration
```bash
# Set in your .env file or environment
export MCP_GRAPH_STORAGE_MODE=graph_only
# Options:
# memories_only - Legacy (current behavior)
# dual_write - Safe migration mode (default)
# graph_only - Maximum performance (recommended)
```
## Quick Start: 3-Step Migration
### Step 1: Upgrade to v8.51.0
```bash
# Backup first (IMPORTANT!)
cp ~/.local/share/mcp-memory-service/memory.db \
~/.local/share/mcp-memory-service/memory.db.backup-$(date +%Y%m%d)
# Upgrade
pip install --upgrade mcp-memory-service==8.51.0
```
**What happens**: Service starts in `dual_write` mode (zero breaking changes)
### Step 2: Migrate Existing Associations
```bash
# Preview migration (safe, read-only)
python scripts/maintenance/backfill_graph_table.py --dry-run
# Expected output (if already in dual_write mode during testing):
# β Found 1,432 association memories
# β Skipped (duplicates): 1,432 (already in graph table)
# β Backfill not needed
# Or if migrating fresh:
# β Found 1,449 association memories
# β Ready to migrate: 1,435 bidirectional edges
# β Duration: ~30 seconds
# Execute migration (only if needed)
python scripts/maintenance/backfill_graph_table.py --apply
```
### Step 3: Switch to graph_only Mode
```bash
# Update your .env file
echo "export MCP_GRAPH_STORAGE_MODE=graph_only" >> ~/.bashrc
source ~/.bashrc
# Restart services
systemctl --user restart mcp-memory-http.service
```
### Step 4 (Optional): Cleanup Old Associations
```bash
# Preview deletions (safe)
python scripts/maintenance/cleanup_association_memories.py --dry-run
# Execute cleanup (interactive)
python scripts/maintenance/cleanup_association_memories.py
# Or automated (no confirmation)
python scripts/maintenance/cleanup_association_memories.py --force
# Actual output (December 16, 2025):
# β
Deleted 1,432 memories
# β
Orphaned preserved: 7
# β
VACUUM Complete!
# Space reclaimed: 0 bytes (file size may not shrink due to SQLite allocation)
```
## Graph Query Capabilities
### Find Connected Memories
```python
from mcp_memory_service.storage.graph import GraphStorage
graph = GraphStorage("~/.local/share/mcp-memory-service/memory.db")
# Find all memories within 2 hops
connected = await graph.find_connected("memory_hash_123", max_hops=2)
# Returns: [("hash_456", 1), ("hash_789", 2)] # (hash, distance)
```
### Shortest Path Between Memories
```python
# Find how two memories are connected
path = await graph.shortest_path("hash_A", "hash_B", max_depth=5)
# Returns: ["hash_A", "hash_intermediate", "hash_B"]
```
### Get Subgraph for Visualization
```python
# Extract neighborhood for graph visualization
subgraph = await graph.get_subgraph("hash_123", radius=2)
# Returns: {"nodes": [...], "edges": [...]}
```
## Performance Benchmarks
**Test Environment**: MacBook Pro M1, SQLite 3.43.0, 1,449 associations
| Query Type | Before | After | Improvement |
|------------|--------|-------|-------------|
| Find connected (1-hop) | 150ms | 5ms | **30x faster** |
| Find connected (3-hop) | 800ms | 25ms | **32x faster** |
| Shortest path | 1,200ms | 15ms | **80x faster** |
| Get subgraph | N/A | 10ms | **New!** |
## Real-World Impact
**Case Study: Production Deployment (December 16, 2025)**
- **Before Migration**:
- 5,345 total memories (1,439 association memories = 27%)
- Database size: 19.76 MB
- Query time: 150ms average
- Graph table: 2,870 entries (populated during 3-day testing)
- **After Migration**:
- 3,913 total memories (1,432 associations deleted, 7 orphaned preserved)
- 1,435 unique associations in graph table (2,870 bidirectional edges)
- Database size: 21 MB (increased due to graph indexes + new memories during testing)
- Query time: 5ms average
- **30Γ performance improvement maintained**
**Key Finding**: Database file size may increase due to graph table indexes, but query performance gains (30Γ) are preserved. SQLite VACUUM reclaims space internally without always shrinking file size.
## Migration Lessons Learned (December 16, 2025)
**Key Findings from Production Migration**:
1. **Backfill Already Complete**: If you've been running v8.51.0 in `dual_write` mode (default) during testing, associations are already in the graph table. The backfill script will show 100% duplicates - this is expected and safe.
2. **Database Size May Not Shrink**: VACUUM reclaims space internally but doesn't always reduce file size due to SQLite's page allocation strategy. Our migration showed 19.76 MB β 21 MB (+6.3%) due to:
- Graph table indexes for CTE queries
- Bidirectional edges (AβB and BβA)
- New memories added during testing period
- **Impact**: None - query performance still 30Γ faster
3. **Orphaned Associations**: Expect 0.5-1% of associations to be orphaned (incomplete metadata during migration). These are safely preserved. In our case: 7 out of 1,439 (0.5%).
4. **HTTP Server Must Be Stopped**: Database locks will prevent migration if HTTP server or MCP clients are connected. Always stop services first:
```bash
# Stop HTTP server
systemctl --user stop mcp-memory-http.service # Linux
# Or kill processes manually (macOS)
```
5. **Backups Are Automatic**: Both scripts create timestamped backups before execution. Safe to run. Backup created: `sqlite_vec.backup-20251216-063051.db`
6. **Zero Downtime**: Migration completed in ~3 minutes with zero service interruption. Dual write mode ensures safety.
## FAQ
### Q: Do I need to migrate?
**A**: Not immediately, but recommended if:
- β
You have 500+ associations
- β
Query latency >100ms is noticeable
- β
Database size is growing rapidly
- β
You use consolidation regularly
**Can defer if**:
- You have <100 associations
- Consolidation is disabled
- Storage space is not a concern
### Q: Is migration reversible?
**A**: Yes! You can rollback at any time:
```bash
export MCP_GRAPH_STORAGE_MODE=memories_only
systemctl --user restart mcp-memory-http.service
```
### Q: Will this affect my existing memories?
**A**: No impact on regular memories:
- β
Only association memories are affected
- β
Semantic search unchanged
- β
Memory retrieval performance unaffected
- β
Embeddings and metadata preserved
### Q: What if something goes wrong?
**A**: Multiple safety nets:
1. **Backup**: Created in Step 1 (restore anytime)
2. **Dry-run**: Preview before executing
3. **Transaction safety**: Rollback on errors
4. **Dual-write mode**: Keep both versions during migration
### Q: How much space will I save?
**A**: Depends on association count:
- **1,000 associations**: ~2.3 MB reclaimed
- **1,449 associations** (our deployment): ~2.8 MB
- **5,000 associations**: ~11.5 MB
Formula: `associations Γ 450 bytes β space reclaimed`
## Troubleshooting
### "Database is locked" error
**Solution**:
```bash
# Stop HTTP server before migration
systemctl --user stop mcp-memory-http.service
# Re-run migration
python scripts/maintenance/backfill_graph_table.py --apply
# Restart server
systemctl --user restart mcp-memory-http.service
```
### "No associations found" in dry-run
**Solution**:
```bash
# Check if consolidation has run
curl http://127.0.0.1:8000/api/consolidation/status
# If no jobs executed, trigger consolidation
curl -X POST http://127.0.0.1:8000/api/consolidation/trigger \
-H "Content-Type: application/json" \
-d '{"time_horizon": "weekly"}'
# Wait for completion, then re-run backfill
python scripts/maintenance/backfill_graph_table.py --dry-run
```
### Cleanup deletes fewer memories than expected
**This is normal and safe**:
- Script only deletes memories confirmed in graph table
- Preserves orphaned associations (safety first)
- You still get query performance benefits
## Next Steps
After successful migration:
1. **Monitor Performance**:
```bash
# Check association count growth
sqlite3 ~/.local/share/mcp-memory-service/memory.db \
"SELECT COUNT(*)/2 AS associations FROM memory_graph;"
```
2. **Verify Graph Queries**:
```bash
# Test query performance
time sqlite3 ~/.local/share/mcp-memory-service/memory.db \
"SELECT target_hash FROM memory_graph WHERE source_hash='YOUR_HASH' LIMIT 10;"
# Expected: <0.01s (10ms)
```
3. **Explore Graph Features** (coming in v8.52.0):
- REST API endpoints: `/api/graph/connected/{hash}`
- Graph visualization in web UI
- Advanced analytics with rustworkx (v9.0+)
## Related Documentation
- **Architecture Details**: [Graph Database Design](https://github.com/doobidoo/mcp-memory-service/blob/main/docs/architecture/graph-database-design.md)
- **Migration Guide**: [Step-by-Step Migration](https://github.com/doobidoo/mcp-memory-service/blob/main/docs/migration/graph-migration-guide.md)
- **Issue #279**: [Original Feature Request](https://github.com/doobidoo/mcp-memory-service/issues/279)
- **PR #280**: [Implementation](https://github.com/doobidoo/mcp-memory-service/pull/280)
## Support
Need help? Open an issue:
- **Bug Reports**: [GitHub Issues](https://github.com/doobidoo/mcp-memory-service/issues)
- **Questions**: [GitHub Discussions](https://github.com/doobidoo/mcp-memory-service/discussions)
---
**Last Updated**: 2025-12-16
**Version**: 1.1 (for MCP Memory Service v8.51.0)
**Contributors**: MCP Memory Service team