pruning-implementation.md•7.96 kB
# Smart Memory Pruning Implementation
This document describes the implementation of Step 6: Smart Pruning with Information-Gain in the Titan Memory Architecture.
## Overview
The smart pruning system implements an information-gain based approach to intelligently reduce memory usage while maintaining high-quality memories. The system uses a composite scoring function that combines entropy, surprise, and redundancy measurements to determine which memories to keep.
## Architecture
### Core Components
1. **MemoryPruner Class** (`src/pruning.ts`)
- Main pruning engine with configurable parameters
- Implements information-gain scoring algorithm
- Handles memory distillation and validation
2. **MCP Integration** (`src/index.ts`)
- `prune_memory` tool endpoint for manual pruning
- Automatic pruning based on capacity thresholds
- Statistics reporting and monitoring
3. **Model Integration** (`src/model.ts`)
- `pruneMemoryByInformationGain()` method
- Automatic pruning checks during memory operations
- Configuration and statistics access
## Scoring Algorithm
The pruning system uses a composite score for each memory:
```
Score = (entropy × entropy_weight) + (surprise × surprise_weight) - (redundancy × redundancy_weight)
```
### Components
#### 1. Entropy Calculation
- Converts memory vectors to probability distributions using softmax
- Calculates Shannon entropy: `-∑(p * log(p))`
- Higher entropy indicates more information content
#### 2. Surprise Measurement
- Uses stored surprise history from memory updates
- Represents how unexpected the memory was when first stored
- Higher surprise indicates more valuable memories
#### 3. Redundancy Detection
- Computes pairwise cosine similarity between all memories
- Takes maximum similarity with other memories as redundancy score
- Higher redundancy indicates less unique information
## Configuration
The system supports extensive configuration through the `PruningConfig` interface:
```typescript
interface PruningConfig {
keepPercentage: number; // 0.7 = keep 70% of memories
minMemoriesToKeep: number; // Never prune below this threshold
maxCapacity: number; // Trigger automatic pruning at this size
entropyWeight: number; // Weight for entropy in scoring (1.0)
surpriseWeight: number; // Weight for surprise in scoring (1.2)
redundancyWeight: number; // Weight for redundancy penalty (0.8)
redundancyThreshold: number; // Similarity threshold for redundancy (0.85)
enableDistillation: boolean; // Whether to distill pruned memories (true)
}
```
## Memory Distillation
When enabled, pruned memories are not simply discarded but are distilled into long-term memory:
1. **Weighted Averaging**: Pruned memories are combined using access counts as weights
2. **Long-term Integration**: Distilled vectors are added to long-term memory
3. **Capacity Management**: Long-term memory is truncated if it exceeds 30% of total capacity
## Triggers
Pruning can be triggered in multiple ways:
### 1. Automatic Capacity-Based
- Triggers when memory usage exceeds `maxCapacity`
- Checked during memory store operations
- Maintains system performance and memory limits
### 2. Manual MCP Command
- `prune_memory` tool with optional threshold parameter
- Allows fine-grained control over pruning intensity
- Returns detailed statistics about the pruning operation
### 3. Programmatic API
- `pruneMemoryByInformationGain(threshold?)` method
- Direct integration with other system components
- Supports custom thresholds and configurations
## Validation and Quality Assurance
The system includes comprehensive validation:
### State Validation
- Checks tensor shape consistency after pruning
- Validates that all tensors have matching dimensions
- Detects NaN and infinite values in memory state
### Quality Analysis
- Compares entropy and surprise metrics before/after pruning
- Calculates improvement ratios for quality assessment
- Provides feedback on pruning effectiveness
## Performance Characteristics
### Time Complexity
- Information scoring: O(n²) for redundancy calculation
- Sorting and selection: O(n log n)
- Memory creation: O(n × d) where d is embedding dimension
### Memory Usage
- Temporary tensors for calculations are managed with `tf.tidy()`
- Pruned memories are properly disposed to prevent memory leaks
- Configurable long-term memory limits prevent unbounded growth
## Statistics and Monitoring
The system provides comprehensive statistics:
```typescript
interface PruningStats {
totalPrunings: number; // Total number of pruning operations
averageReduction: number; // Average reduction ratio
lastPruningTime: number; // Timestamp of last pruning
timeSinceLastPruning: number; // Time since last pruning
shouldPrune: boolean; // Whether pruning is recommended
currentMemorySize: number; // Current active memory count
maxCapacity: number; // Maximum capacity threshold
}
```
## Testing
The implementation includes comprehensive unit tests covering:
- Basic pruning operations and logic
- Information gain calculations (entropy, surprise, redundancy)
- State validation and error handling
- Configuration management and statistics
- Edge cases (empty states, single memories, extreme configurations)
- Performance testing with large memory states
### Test Coverage
- 26 test cases covering all major functionality
- Tests for both normal operation and edge cases
- Performance benchmarks for large-scale pruning
- Memory quality improvement validation
## Usage Examples
### Basic Pruning
```typescript
const pruner = new MemoryPruner({
keepPercentage: 0.8,
enableDistillation: true
});
const result = await pruner.pruneMemory(memoryState);
console.log(`Reduced from ${result.originalCount} to ${result.finalCount} memories`);
```
### MCP Command
```bash
# Prune memory keeping 60% of memories
prune_memory --threshold 0.6
# Force pruning regardless of current capacity
prune_memory --force true
```
### Model Integration
```typescript
// Automatic pruning when needed
const result = await model.pruneMemoryByInformationGain();
// Get pruning statistics
const stats = model.getPruningStats();
```
## Benefits
1. **Quality Preservation**: Keeps high-information memories while removing redundant ones
2. **Configurable**: Extensive configuration options for different use cases
3. **Efficient**: Optimized algorithms with proper memory management
4. **Validated**: Comprehensive testing and state validation
5. **Integrated**: Seamless integration with existing memory architecture
6. **Monitored**: Detailed statistics and quality metrics
## Future Enhancements
Potential improvements for the pruning system:
1. **Adaptive Thresholds**: Dynamic adjustment of scoring weights based on memory usage patterns
2. **Hierarchical Pruning**: Different pruning strategies for different memory tiers
3. **Temporal Considerations**: Age-based weighting in the scoring function
4. **Semantic Clustering**: Group similar memories before pruning decisions
5. **Performance Optimization**: GPU acceleration for large-scale pruning operations
## Conclusion
The smart pruning implementation successfully addresses the requirements of Step 6, providing:
- ✅ Information-gain based scoring (entropy × surprise - redundancy)
- ✅ Configurable retention percentage with quality-based selection
- ✅ Memory distillation into long-term storage
- ✅ Multiple trigger mechanisms (capacity-based and MCP command)
- ✅ Comprehensive unit tests with slot reduction and recall quality verification
- ✅ Performance optimization and memory management
- ✅ Statistics and monitoring capabilities
The system maintains the balance between memory efficiency and information retention, ensuring that the most valuable memories are preserved while redundant information is efficiently removed.