sqlite-vec-backend.md•17 kB
# SQLite-vec Backend Guide
## Overview
The MCP Memory Service now supports SQLite-vec as an alternative storage backend. SQLite-vec provides a lightweight, high-performance vector database solution that offers several advantages over ChromaDB:
- **Lightweight**: Single file database with no external dependencies
- **Fast**: Optimized vector operations with efficient indexing
- **Portable**: Easy to backup, copy, and share memory databases
- **Reliable**: Built on SQLite's proven reliability and ACID compliance
- **Memory Efficient**: Lower memory footprint for smaller memory collections
## Installation
### Prerequisites
The sqlite-vec backend requires the `sqlite-vec` Python package:
```bash
# Install sqlite-vec
pip install sqlite-vec
# Or with UV (recommended)
uv add sqlite-vec
```
### Verification
You can verify sqlite-vec is available by running:
```python
try:
    import sqlite_vec
    print("✅ sqlite-vec is available")
except ImportError:
    print("❌ sqlite-vec is not installed")
```
## Configuration
### Environment Variables
To use the sqlite-vec backend, set the storage backend environment variable:
```bash
# Primary configuration
export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
# Optional: Custom database path
export MCP_MEMORY_SQLITE_PATH=/path/to/your/memory.db
```
### Platform-Specific Setup
#### macOS (Bash/Zsh)
```bash
# Add to ~/.bashrc or ~/.zshrc
export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
export MCP_MEMORY_SQLITE_PATH="$HOME/Library/Application Support/mcp-memory/sqlite_vec.db"
```
#### Windows (PowerShell)
```powershell
# Add to PowerShell profile
$env:MCP_MEMORY_STORAGE_BACKEND = "sqlite_vec"
$env:MCP_MEMORY_SQLITE_PATH = "$env:LOCALAPPDATA\mcp-memory\sqlite_vec.db"
```
#### Windows (Command Prompt)
```cmd
set MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
set MCP_MEMORY_SQLITE_PATH=%LOCALAPPDATA%\mcp-memory\sqlite_vec.db
```
#### Linux
```bash
# Add to ~/.bashrc
export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
export MCP_MEMORY_SQLITE_PATH="$HOME/.local/share/mcp-memory/sqlite_vec.db"
```
### Claude Desktop Configuration
Update your Claude Desktop MCP configuration:
```json
{
  "mcpServers": {
    "memory": {
      "command": "uv",
      "args": ["--directory", "/path/to/mcp-memory-service", "run", "memory"],
      "env": {
        "MCP_MEMORY_STORAGE_BACKEND": "sqlite_vec"
      }
    }
  }
}
```
## Migration from ChromaDB
### Automatic Migration
Use the provided migration script for easy migration:
```bash
# Simple migration with default paths
python migrate_to_sqlite_vec.py
# Custom migration
python scripts/migrate_storage.py \
  --from chroma \
  --to sqlite_vec \
  --source-path /path/to/chroma_db \
  --target-path /path/to/sqlite_vec.db \
  --backup
```
### Manual Migration Steps
1. **Stop the MCP Memory Service**
   ```bash
   # Stop Claude Desktop or any running instances
   ```
2. **Create a backup** (recommended)
   ```bash
   python scripts/migrate_storage.py \
     --from chroma \
     --to sqlite_vec \
     --source-path ~/.local/share/mcp-memory/chroma_db \
     --target-path ~/.local/share/mcp-memory/sqlite_vec.db \
     --backup \
     --backup-path memory_backup.json
   ```
3. **Set environment variables**
   ```bash
   export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
   ```
4. **Restart Claude Desktop**
### Migration Verification
After migration, verify your memories are accessible:
```bash
# Test the new backend
python scripts/verify_environment.py
# Check database statistics
python -c "
import asyncio
from src.mcp_memory_service.storage.sqlite_vec import SqliteVecMemoryStorage
async def check_stats():
    storage = SqliteVecMemoryStorage('path/to/your/db')
    await storage.initialize()
    stats = storage.get_stats()
    print(f'Total memories: {stats[\"total_memories\"]}')
    print(f'Database size: {stats[\"database_size_mb\"]} MB')
    storage.close()
asyncio.run(check_stats())
"
```
## Performance Characteristics
### Memory Usage
| Collection Size | ChromaDB RAM | SQLite-vec RAM | Difference |
|----------------|--------------|----------------|------------|
| 1,000 memories | ~200 MB | ~50 MB | -75% |
| 10,000 memories | ~800 MB | ~200 MB | -75% |
| 100,000 memories | ~4 GB | ~1 GB | -75% |
### Query Performance
- **Semantic Search**: Similar performance to ChromaDB for most use cases
- **Tag Search**: Faster due to SQL indexing
- **Metadata Queries**: Significantly faster with SQL WHERE clauses
- **Startup Time**: 2-3x faster initialization
### Storage Characteristics
- **Database File**: Single `.db` file (easy backup/restore)
- **Disk Usage**: ~30% smaller than ChromaDB for same data
- **Concurrent Access**: SQLite-level locking (single writer, multiple readers)
## Advanced Configuration
### Custom Embedding Models
```python
# Initialize with custom model
storage = SqliteVecMemoryStorage(
    db_path="memory.db",
    embedding_model="all-mpnet-base-v2"  # Higher quality, slower
)
```
### Multi-Client Access Configuration
SQLite-vec supports advanced multi-client access through **two complementary approaches**:
1. **Phase 1: WAL Mode** - Direct SQLite access with Write-Ahead Logging
2. **Phase 2: HTTP Coordination** - Automatic HTTP server coordination for seamless multi-client access
#### Phase 1: WAL Mode (Default)
The backend automatically enables WAL mode with these default settings:
- **WAL Mode**: Enables multiple readers + single writer
- **Busy Timeout**: 5 seconds (prevents immediate lock errors)  
- **Synchronous**: NORMAL (balanced performance/safety)
#### Phase 2: HTTP Server Auto-Detection (Advanced)
The system automatically detects the optimal coordination mode:
**Auto-Detection Modes:**
- **`http_client`**: Existing HTTP server detected → Connect as client
- **`http_server`**: No server found, port available → Start HTTP server
- **`direct`**: Port in use by other service → Fall back to WAL mode
**Coordination Flow:**
1. Check if MCP Memory Service HTTP server is running
2. If found → Use HTTP client to connect to existing server
3. If not found and port available → Auto-start HTTP server (optional)
4. If port busy → Fall back to direct SQLite with WAL mode
#### Custom SQLite Pragmas
You can customize SQLite behavior using environment variables:
```bash
# Recommended configuration (v8.9.0+) - For concurrent HTTP + MCP access
export MCP_MEMORY_SQLITE_PRAGMAS="busy_timeout=15000,cache_size=20000"
# Example configurations for different scenarios:
# High concurrency setup (longer timeout)
export MCP_MEMORY_SQLITE_PRAGMAS="busy_timeout=30000,cache_size=20000,wal_autocheckpoint=1000"
# Performance optimized (use with caution - trades safety for speed)
export MCP_MEMORY_SQLITE_PRAGMAS="synchronous=NORMAL,temp_store=MEMORY,cache_size=50000,busy_timeout=15000"
# Conservative/safe mode (maximum data safety)
export MCP_MEMORY_SQLITE_PRAGMAS="synchronous=FULL,busy_timeout=60000,cache_size=20000"
```
#### HTTP Coordination Configuration
Enable automatic HTTP server coordination for optimal multi-client access:
```bash
# Enable HTTP server auto-start
export MCP_HTTP_ENABLED=true
# Configure HTTP server settings (optional)
export MCP_HTTP_PORT=8000
export MCP_HTTP_HOST=localhost
# Combine with SQLite-vec backend
export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
```
**Coordination Modes Explained:**
1. **Automatic Mode (Recommended)**
   ```bash
   # No configuration needed - auto-detects best mode
   export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
   ```
2. **Forced HTTP Client Mode**
   ```bash
   # Always connect to existing server (fails if none running)
   export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
   export MCP_HTTP_ENABLED=false
   # Requires running: python scripts/run_http_server.py
   ```
3. **Direct WAL Mode Only**
   ```bash
   # Disable HTTP coordination entirely
   export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
   export MCP_HTTP_ENABLED=false
   export MCP_HTTP_ENABLED=false
   ```
#### Multi-Client Claude Desktop Configuration
**Option 1: Automatic Coordination (Recommended)**
```json
{
  "mcpServers": {
    "memory": {
      "command": "uv",
      "args": ["--directory", "/path/to/mcp-memory-service", "run", "memory"],
      "env": {
        "MCP_MEMORY_STORAGE_BACKEND": "sqlite_vec",
        "MCP_HTTP_ENABLED": "true"
      }
    }
  }
}
```
**Option 2: Manual HTTP Server + Client Mode**
```json
{
  "mcpServers": {
    "memory": {
      "command": "uv", 
      "args": ["--directory", "/path/to/mcp-memory-service", "run", "memory"],
      "env": {
        "MCP_MEMORY_STORAGE_BACKEND": "sqlite_vec",
        "MCP_HTTP_ENABLED": "false"
      }
    }
  }
}
```
*Note: Requires manually running `python scripts/run_http_server.py` first*
**Option 3: WAL Mode Only (Simple)**
```json
{
  "mcpServers": {
    "memory": {
      "command": "uv",
      "args": ["--directory", "/path/to/mcp-memory-service", "run", "memory"],
      "env": {
        "MCP_MEMORY_STORAGE_BACKEND": "sqlite_vec",
        "MCP_MEMORY_SQLITE_PRAGMAS": "busy_timeout=10000"
      }
    }
  }
}
```
### Database Optimization
```bash
# Optimize database periodically
python -c "
import asyncio
from src.mcp_memory_service.storage.sqlite_vec import SqliteVecMemoryStorage
async def optimize():
    storage = SqliteVecMemoryStorage('path/to/db')
    await storage.initialize()
    
    # Clean up duplicates
    count, msg = await storage.cleanup_duplicates()
    print(f'Cleaned up {count} duplicates')
    
    # Vacuum database
    storage.conn.execute('VACUUM')
    print('Database vacuumed')
    
    storage.close()
asyncio.run(optimize())
"
```
### Backup and Restore
```bash
# Create backup
python scripts/migrate_storage.py \
  --from sqlite_vec \
  --to sqlite_vec \
  --source-path memory.db \
  --target-path backup.db
# Or simple file copy
cp memory.db memory_backup.db
# Restore from JSON backup
python scripts/migrate_storage.py \
  --restore backup.json \
  --to sqlite_vec \
  --target-path restored_memory.db
```
## Troubleshooting
### Common Issues
#### 1. sqlite-vec Not Found
```
ImportError: No module named 'sqlite_vec'
```
**Solution**: Install sqlite-vec package
```bash
pip install sqlite-vec
# or
uv add sqlite-vec
```
#### 2. Database Lock Errors
```
sqlite3.OperationalError: database is locked
```
**✅ Fixed in v8.9.0** - Proper SQLite pragmas now automatically configured by installer
**For Single Client Issues:**
```bash
# Kill existing processes
pkill -f "mcp-memory-service"
# Restart Claude Desktop
```
**For Multi-Client Setup (Claude Desktop + Claude Code + HTTP Server):**
```bash
# v8.9.0+ Solution: Configure recommended pragma values
export MCP_MEMORY_SQLITE_PRAGMAS="busy_timeout=15000,cache_size=20000"
# Restart all services to apply changes
# Note: Installer automatically sets these for hybrid/sqlite_vec backends
# If issues persist, try longer timeout:
export MCP_MEMORY_SQLITE_PRAGMAS="busy_timeout=30000,cache_size=20000"
# Check for stale lock files (rare)
ls -la /path/to/your/database-wal
ls -la /path/to/your/database-shm
# If stale locks exist (no active processes), remove them
rm /path/to/your/database-wal
rm /path/to/your/database-shm
# 4. Restart all MCP clients
```
**Prevention Tips:**
- Always use WAL mode (enabled by default)
- Configure appropriate busy timeouts for your use case
- Ensure proper shutdown of MCP clients
- Use connection retry logic (built-in)
#### 5. HTTP Coordination Issues
```
Failed to initialize HTTP client storage: Connection refused
```
**Solutions:**
**Auto-Detection Problems:**
```bash
# Check if HTTP server auto-start is working
export LOG_LEVEL=DEBUG
export MCP_HTTP_ENABLED=true
# Check coordination mode detection
python -c "
import asyncio
from src.mcp_memory_service.utils.port_detection import detect_server_coordination_mode
print(asyncio.run(detect_server_coordination_mode()))
"
```
**Manual HTTP Server Setup:**
```bash
# Start HTTP server manually in separate terminal
python scripts/run_http_server.py
# Then start MCP clients (they'll auto-detect the running server)
```
**Port Conflicts:**
```bash
# Check what's using the port
netstat -an | grep :8000  # Linux/macOS
netstat -an | findstr :8000  # Windows
# Use different port
export MCP_HTTP_PORT=8001
```
**Fallback to WAL Mode:**
```bash
# Force WAL mode if HTTP coordination fails
export MCP_HTTP_ENABLED=false
export MCP_HTTP_ENABLED=false
```
#### 3. Permission Errors
```
PermissionError: [Errno 13] Permission denied
```
**Solution**: Check database file permissions
```bash
# Fix permissions
chmod 644 /path/to/sqlite_vec.db
chmod 755 /path/to/directory
```
#### 4. Migration Failures
```
Migration failed: No memories found
```
**Solution**: Verify source path and initialize if needed
```bash
# Check source exists
ls -la /path/to/chroma_db
# Use absolute paths in migration
```
### Debug Mode
Enable debug logging for troubleshooting:
```bash
export LOG_LEVEL=DEBUG
export DEBUG_MODE=1
# Run your MCP client
```
### Health Checks
```python
# Check backend health
import asyncio
from src.mcp_memory_service.storage.sqlite_vec import SqliteVecMemoryStorage
async def health_check():
    storage = SqliteVecMemoryStorage('path/to/db')
    await storage.initialize()
    
    stats = storage.get_stats()
    print(f"Backend: {stats['backend']}")
    print(f"Total memories: {stats['total_memories']}")
    print(f"Database size: {stats['database_size_mb']} MB")
    print(f"Embedding model: {stats['embedding_model']}")
    
    storage.close()
asyncio.run(health_check())
```
## Comparison: ChromaDB vs SQLite-vec
| Feature | ChromaDB | SQLite-vec | Winner |
|---------|----------|------------|--------|
| Setup Complexity | Medium | Low | SQLite-vec |
| Memory Usage | High | Low | SQLite-vec |
| Query Performance | Excellent | Very Good | ChromaDB |
| Portability | Poor | Excellent | SQLite-vec |
| Backup/Restore | Complex | Simple | SQLite-vec |
| Concurrent Access | Good | Excellent (HTTP + WAL) | SQLite-vec |
| Multi-Client Support | Good | Excellent (HTTP + WAL) | SQLite-vec |
| Ecosystem | Rich | Growing | ChromaDB |
| Reliability | Good | Excellent | SQLite-vec |
## Best Practices
### When to Use SQLite-vec
✅ **Use SQLite-vec when:**
- Memory collections < 100,000 entries
- Multi-client access needed (Claude Desktop + Claude Code + others)
- Seamless setup and coordination required (auto-detection)
- Portability and backup simplicity are important
- Limited system resources
- Simple deployment requirements
- Want both HTTP and direct access capabilities
### When to Use ChromaDB
✅ **Use ChromaDB when:**
- Memory collections > 100,000 entries
- Heavy concurrent usage
- Maximum query performance is critical
- Rich ecosystem features needed
- Distributed setups
### Multi-Client Coordination Tips
1. **Automatic Mode (Recommended)**
   ```bash
   # Let the system choose the best coordination method
   export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
   export MCP_HTTP_ENABLED=true
   ```
2. **Monitoring Coordination Mode**
   ```bash
   # Check which mode is being used
   export LOG_LEVEL=INFO
   # Look for "Detected coordination mode: ..." in logs
   ```
3. **HTTP Server Management**
   ```bash
   # Manual server control
   python scripts/run_http_server.py  # Start manually
   
   # Check server health
   curl http://localhost:8000/health
   ```
4. **Fallback Strategy**
   ```bash
   # If HTTP coordination fails, system falls back to WAL mode
   # No manual intervention needed - fully automatic
   ```
### Performance Tips
1. **Regular Optimization**
   ```bash
   # Run monthly
   python scripts/optimize_sqlite_vec.py
   ```
2. **Batch Operations**
   ```python
   # Store memories in batches for better performance
   for batch in chunk_memories(all_memories, 100):
       for memory in batch:
           await storage.store(memory)
   ```
3. **Index Maintenance**
   ```sql
   -- Rebuild indexes periodically
   REINDEX;
   VACUUM;
   ```
## API Reference
The sqlite-vec backend implements the same `MemoryStorage` interface as ChromaDB:
```python
# All standard operations work identically
await storage.store(memory)
results = await storage.retrieve(query, n_results=5)
memories = await storage.search_by_tag(["tag1", "tag2"])
success, msg = await storage.delete(content_hash)
success, msg = await storage.update_memory_metadata(hash, updates)
```
See the main API documentation for complete method signatures.
## Contributing
To contribute to sqlite-vec backend development:
1. Run tests: `pytest tests/test_sqlite_vec_storage.py`
2. Check performance: `python tests/performance/test_sqlite_vec_perf.py`
3. Add features following the `MemoryStorage` interface
4. Update this documentation
## Support
For sqlite-vec backend issues:
1. Check [sqlite-vec documentation](https://github.com/asg017/sqlite-vec)
2. Review this guide's troubleshooting section
3. Open an issue on the [MCP Memory Service repository](https://github.com/user/mcp-memory-service/issues)