# Background Relation Resolution
**Status**: Performance Enhancement
**PR**: #319
**Impact**: Faster MCP server startup, no blocking on cold start
## What Changed
v0.15.0 moves **entity relation resolution to background threads**, eliminating startup blocking when the MCP server initializes. This provides instant responsiveness even with large knowledge bases.
## The Problem (Before v0.15.0)
### Cold Start Blocking
**Previous behavior:**
```python
# MCP server initialization
async def init():
    # Load all entities
    entities = await load_entities()

    # BLOCKING: Resolve all relations synchronously
    for entity in entities:
        await resolve_relations(entity)  # ← Blocks startup

    # Finally ready
    return "Ready"
```
**Impact:**
- Large knowledge bases (1000+ entities) took **10-30 seconds** to start
- MCP server unresponsive during initialization
- Claude Desktop showed "connecting..." for an extended period
- Poor user experience on cold start
### Example Timeline (Before)
```
0s: MCP server starts
0s: Load 2000 entities (fast)
1s: Start resolving relations...
25s: Still resolving...
30s: Finally ready!
30s: Accept first request
```
## The Solution (v0.15.0+)
### Non-Blocking Background Resolution
**New behavior:**
```python
# MCP server initialization
async def init():
    # Load all entities (fast)
    entities = await load_entities()

    # NON-BLOCKING: Queue relations for background resolution
    queue_background_resolution(entities)  # ← Returns immediately

    # Ready instantly!
    return "Ready"
```
**Background worker:**
```python
# Separate thread pool processes relations
async def background_worker():
    while True:
        entity = await relation_queue.get()
        await resolve_relations(entity)  # ← In background
```
### Example Timeline (After)
```
0s: MCP server starts
0s: Load 2000 entities
0s: Queue for background resolution
0s: Ready! Accept requests
0s: (Background: resolving relations...)
5s: (Background: 50% complete...)
10s: (Background: 100% complete)
```
**Result:** Server ready in **<1 second** instead of 30 seconds
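To make the pattern concrete, here is a minimal, self-contained asyncio sketch of the same idea; `load_entities` and `resolve_relations` are hypothetical stand-ins rather than Basic Memory's actual functions. "Ready" is printed as soon as the queue is filled, while workers drain it in the background.

```python
import asyncio

# Hypothetical stand-ins for illustration only
async def load_entities() -> list[str]:
    return [f"entity_{i}" for i in range(2000)]

async def resolve_relations(entity: str) -> None:
    await asyncio.sleep(0.001)  # simulate per-entity resolution work

async def main() -> None:
    queue: asyncio.Queue[str] = asyncio.Queue()

    async def worker() -> None:
        while True:
            entity = await queue.get()
            await resolve_relations(entity)
            queue.task_done()

    # "Startup": load and enqueue, but never wait on resolution
    for entity in await load_entities():
        queue.put_nowait(entity)
    workers = [asyncio.create_task(worker()) for _ in range(4)]
    print("Ready")  # reached immediately, before any relation is resolved

    # Background completion is independent of readiness
    await queue.join()
    for w in workers:
        w.cancel()

asyncio.run(main())
```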
## How It Works
### Architecture
```
┌─────────────────┐
│   MCP Server    │
│ Initialization  │
└────────┬────────┘
         │
         │ 1. Load entities (fast)
         │
         ▼
┌────────────────────┐
│   Relation Queue   │  ← 2. Queue for processing
└────────┬───────────┘
         │
         │ 3. Return immediately
         │
         ▼
┌────────────────────┐
│ Background Workers │  ← 4. Process in parallel
│   (Thread Pool)    │     (non-blocking)
└────────────────────┘
```
### Thread Pool Configuration
```python
# Configurable thread pool size
sync_thread_pool_size: int = Field(
    default=4,
    description="Number of threads for background sync operations",
)
```
**Default:** 4 worker threads
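How the JSON setting and the `BASIC_MEMORY_SYNC_THREAD_POOL_SIZE` environment variable shown later could map onto one field is sketched below with pydantic-settings; the `SyncSettings` class name, and the assumption that pydantic-settings is used at all, are illustrative rather than Basic Memory's actual config code.

```python
from pydantic import Field
from pydantic_settings import BaseSettings, SettingsConfigDict

class SyncSettings(BaseSettings):
    """Illustrative settings model, not the real config class."""
    model_config = SettingsConfigDict(env_prefix="BASIC_MEMORY_")

    sync_thread_pool_size: int = Field(
        default=4,
        description="Number of threads for background sync operations",
    )

# BASIC_MEMORY_SYNC_THREAD_POOL_SIZE=8 in the environment overrides the default
print(SyncSettings().sync_thread_pool_size)
```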
### Processing Queue
```python
import asyncio

# Background processing queue
relation_queue = asyncio.Queue()

# Add entities for processing
for entity in entities:
    await relation_queue.put(entity)

# Workers process queue
async def worker():
    while True:
        entity = await relation_queue.get()
        await resolve_entity_relations(entity)
        relation_queue.task_done()

# Start the configured number of workers
workers = [asyncio.create_task(worker()) for _ in range(4)]
```
## Performance Impact
### Startup Time
**Before (blocking):**

| Knowledge Base Size | Startup Time |
|---------------------|--------------|
| 100 entities        | 2 seconds    |
| 500 entities        | 8 seconds    |
| 1000 entities       | 18 seconds   |
| 2000 entities       | 35 seconds   |
| 5000 entities       | 90+ seconds  |
**After (non-blocking):**

| Knowledge Base Size | Startup Time | Background Completion |
|---------------------|--------------|-----------------------|
| 100 entities        | <1 second    | 1 second              |
| 500 entities        | <1 second    | 3 seconds             |
| 1000 entities       | <1 second    | 5 seconds             |
| 2000 entities       | <1 second    | 10 seconds            |
| 5000 entities       | <1 second    | 25 seconds            |
### First Request Latency
**Before:**
- Cold start: **Wait for full initialization (10-90s)**
- First request: After initialization completes
**After:**
- Cold start: **Instant (<1s)**
- First request: Immediate (relations resolved on-demand if needed)
## User Experience Improvements
### Claude Desktop Integration
**Before:**
```
User: Ask Claude a question using Basic Memory
Claude: [Connecting... 30 seconds]
Claude: [Finally responds]
```
**After:**
```
User: Ask Claude a question using Basic Memory
Claude: [Instantly responds]
Claude: [Relations resolve in background]
```
### MCP Inspector
**Before:**
```bash
$ bm mcp inspect
Connecting...
Waiting...
Still waiting...
Connected! (after 25 seconds)
```
**After:**
```bash
$ bm mcp inspect
Connected! (instant)
> list_tools
[Tools listed immediately]
```
### Large Knowledge Bases
**Scenario:** 5000-note knowledge base
**Before:**
- 90+ second startup
- Unresponsive during init
- Timeouts on slow machines
**After:**
- <1 second startup
- Instant responsiveness
- Relations resolve while working
## Configuration
### Thread Pool Size
```json
// ~/.basic-memory/config.json
{
"sync_thread_pool_size": 4 // Number of background workers
}
```
**Recommendations:**
| Knowledge Base Size | Recommended Threads |
|---------------------|---------------------|
| < 1000 entities | 2-4 threads |
| 1000-5000 entities | 4-8 threads |
| 5000+ entities | 8-16 threads |
### Environment Variable
```bash
# Override thread pool size
export BASIC_MEMORY_SYNC_THREAD_POOL_SIZE=8
# Use more threads for large KB
bm mcp
```
### Disable Background Processing (Not Recommended)
```bash
# For debugging only - blocks startup (synchronous, slow)
export BASIC_MEMORY_SYNC_THREAD_POOL_SIZE=0
```
## On-Demand Resolution
### Lazy Relation Loading
If relations aren't resolved yet, they're resolved on first access:
```python
# Request for entity with unresolved relations
entity = await read_note("My Note")

if not entity.relations_resolved:
    # Resolve on-demand (fast, single entity)
    await resolve_entity_relations(entity)

return entity
```
**Result:** Fast queries even before background processing completes
### Cache-Aware Resolution
```python
# Check if already resolved
if entity.id in resolved_cache:
    return entity  # ← Fast: already resolved

# Resolve if needed
await resolve_entity_relations(entity)
resolved_cache.add(entity.id)
```
## Monitoring
### Background Processing Status
```python
from basic_memory.sync import sync_service
# Check background queue status
status = await sync_service.get_resolution_status()
print(f"Queued: {status.queued}")
print(f"Completed: {status.completed}")
print(f"In progress: {status.in_progress}")
```
### Logging
Enable debug logging to see background processing:
```bash
export BASIC_MEMORY_LOG_LEVEL=DEBUG
bm mcp
# Output:
# [DEBUG] Queued 2000 entities for background resolution
# [DEBUG] Background worker 1: processing entity_123
# [DEBUG] Background worker 2: processing entity_456
# [DEBUG] Completed 500/2000 entities
# [DEBUG] Background resolution complete
```
## Edge Cases
### Circular Relations
**Handled gracefully:**
```python
# Entity A → Entity B → Entity A (circular)

# Detection
visited = set()

if entity.id in visited:
    # Skip to avoid infinite loop
    return
visited.add(entity.id)
```
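A runnable sketch of that guard, using an illustrative entity shape rather than Basic Memory's models: each traversal carries a `visited` set, so a cycle terminates instead of recursing forever.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    id: str
    relations: list["Entity"] = field(default_factory=list)

def resolve_with_cycle_guard(entity: Entity, visited: set[str] | None = None) -> None:
    visited = set() if visited is None else visited
    if entity.id in visited:
        return  # already seen: circular reference, stop here
    visited.add(entity.id)
    for related in entity.relations:
        resolve_with_cycle_guard(related, visited)

# A → B → A terminates cleanly
a = Entity("a")
b = Entity("b", relations=[a])
a.relations.append(b)
resolve_with_cycle_guard(a)
```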
### Missing Targets
**Forward references are resolved once their targets exist:**
```python
# Entity A references Entity B (not yet created)
# Now: Forward reference (unresolved)
relation.target_id = None
# Later: Entity B created
# Background: Re-resolve Entity A
relation.target_id = entity_b.id # ← Now resolved
```
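The same idea in a small, self-contained form; the `Relation` shape, the `unresolved` list, and `on_entity_created` are assumptions for illustration, not the real persistence layer.

```python
from dataclasses import dataclass

@dataclass
class Relation:
    source_id: str
    target_title: str
    target_id: str | None = None  # None until the target exists

# Relations that could not be resolved yet (illustrative in-memory store)
unresolved: list[Relation] = [Relation(source_id="entity_a", target_title="Entity B")]

def on_entity_created(entity_id: str, title: str) -> None:
    """Re-resolve forward references once their target appears."""
    for relation in unresolved[:]:
        if relation.target_title == title:
            relation.target_id = entity_id  # ← now resolved
            unresolved.remove(relation)

on_entity_created("entity_b", "Entity B")
```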
### Concurrent Updates
**Thread-safe processing:**
```python
# Multiple workers process safely
async with entity_lock:
    await resolve_entity_relations(entity)
```
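One way to get per-entity locking (the `entity_locks` map and `resolve_safely` helper below are illustrative, not Basic Memory's internals): keep one `asyncio.Lock` per entity id so two workers never resolve the same entity at once, while different entities still proceed in parallel.

```python
import asyncio
from collections import defaultdict

# One lock per entity id, created lazily
entity_locks: dict[str, asyncio.Lock] = defaultdict(asyncio.Lock)

async def resolve_safely(entity_id: str) -> None:
    async with entity_locks[entity_id]:
        await asyncio.sleep(0.01)  # stand-in for resolve_entity_relations(entity)

async def main() -> None:
    # Calls on the same entity serialize; the third call runs concurrently
    await asyncio.gather(
        resolve_safely("entity_1"),
        resolve_safely("entity_1"),
        resolve_safely("entity_2"),
    )

asyncio.run(main())
```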
## Troubleshooting
### Slow Background Processing
**Problem:** Background resolution taking too long
**Solutions:**
1. **Increase thread pool size:**
```json
{"sync_thread_pool_size": 8}
```
2. **Check system resources:**
```bash
# Monitor CPU/memory
top
# Look for basic-memory processes
```
3. **Optimize database:**
```bash
# Ensure WAL mode enabled
sqlite3 ~/.basic-memory/memory.db "PRAGMA journal_mode;"
```
### Relations Not Resolving
**Problem:** Relations still unresolved after startup
**Check:**
```python
# Verify background processing running
from basic_memory.sync import sync_service
status = await sync_service.get_resolution_status()
print(status)
```
**Solution:** Restart the MCP server; background resolution resumes automatically on the next startup.
### Memory Usage
**Problem:** High memory with large knowledge base
**Monitor:**
```bash
# Check memory usage
ps aux | grep basic-memory
# If high, reduce thread pool
export BASIC_MEMORY_SYNC_THREAD_POOL_SIZE=2
```
## Best Practices
### 1. Set Appropriate Thread Pool Size
```json
// For typical use (1000-5000 notes)
{"sync_thread_pool_size": 4}
// For large knowledge bases (5000+ notes)
{"sync_thread_pool_size": 8}
```
### 2. Don't Block on Resolution
```python
# ✓ Good: Let background processing happen
entity = await read_note("Note")
# Relations resolve automatically
# ✗ Bad: Don't wait for background queue
await wait_for_all_relations() # Defeats the purpose
```
### 3. Monitor Background Status
```python
# Check status for large operations
if knowledge_base_size > 1000:
    status = await get_resolution_status()
    logger.info(f"Background: {status.completed}/{status.total}")
```
### 4. Use Appropriate Logging
```bash
# Development: Debug logging
export BASIC_MEMORY_LOG_LEVEL=DEBUG
# Production: Info logging
export BASIC_MEMORY_LOG_LEVEL=INFO
```
## Technical Implementation
### Queue-Based Architecture
```python
import asyncio

class RelationResolutionService:
    def __init__(self, thread_pool_size: int = 4):
        self.queue = asyncio.Queue()
        self.workers = []

        # Start background workers
        for i in range(thread_pool_size):
            worker = asyncio.create_task(self._worker(i))
            self.workers.append(worker)

    async def _worker(self, worker_id: int):
        while True:
            entity = await self.queue.get()
            try:
                await self._resolve_entity(entity)
            finally:
                self.queue.task_done()

    async def queue_entity(self, entity):
        await self.queue.put(entity)

    async def wait_completion(self):
        await self.queue.join()
```
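The workers above loop forever; a small helper along these lines (an assumption, not part of the class as shown) drains the queue and stops them cleanly at shutdown.

```python
async def shutdown(service: RelationResolutionService) -> None:
    """Drain the queue, then stop the background workers (illustrative helper)."""
    await service.wait_completion()  # let queued entities finish
    for worker in service.workers:
        worker.cancel()              # stop the infinite worker loops
    await asyncio.gather(*service.workers, return_exceptions=True)
```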
### Integration Points
**MCP Server Initialization:**
```python
async def initialize_mcp_server():
    # Load entities
    entities = await load_all_entities()

    # Queue for background resolution
    for entity in entities:
        await resolution_service.queue_entity(entity)

    # Return immediately (don't wait)
    return server
```
**On-Demand Resolution:**
```python
async def get_entity_with_relations(entity_id: str):
    entity = await get_entity(entity_id)

    if not entity.relations_resolved:
        # Resolve on-demand if not done yet
        await resolution_service.resolve_entity(entity)

    return entity
```
## See Also
- `sqlite-performance.md` - Database-level optimizations
- `api-performance.md` - API-level optimizations (SPEC-11)
- Thread pool configuration documentation
- MCP server architecture documentation