# MCP-Titan Detailed Implementation Guide

**Version:** 3.0.0
**Date:** October 2025
**Status:** Remaining To-Dos with Full Context

---

## 🎯 Research Paper Foundation

This guide implements the complete **Titans (Learning to Memorize at Test Time)** architecture as described in the research paper. All implementations are backed by specific equations, theorems, and architectural decisions from the paper.

### Core Titans Concepts (from research_paper_source.md)

**What Makes Titans Unique (lines 381-386, 447-456):**

1. **Neural Memory Module (not matrix-valued)** - Expressive power to compress complex information
2. **Forgetting Mechanism** - Prevents fast memory overflow
3. **Token Flow Tracking** - Beyond momentary surprise to capture information flow
4. **Momentum-Based Updates** - Combines past and momentary surprise
5. **Deep Memory** - Multi-layer neural memory vs. shallow linear memory
6. **Non-Linear Recurrence** - Inter-chunk non-linear, intra-chunk linear

**Key Differentiators from Prior Work (lines 381-404):**

- vs. RMT: Neural memory module instead of a small vector-valued memory
- vs. DeltaNet/Gated DeltaNet: Momentum-based rule + deep memory + non-linear recurrence + forget gate
- vs. Longhorn: Forgetting gate for better memory management
- vs. TTT Layers: Forgetting mechanism + momentum-based updates

---

## Table of Contents

1. [Momentum Integration (Priority 1)](#1-momentum-integration) - **Research Paper: Appendix C, Equations 32-33**
2. [Token Flow Integration (Priority 1)](#2-token-flow-integration) - **Research Paper: Section 3.1, Lines 364-366**
3. [Forgetting Gate (Priority 1)](#3-forgetting-gate-implementation) - **Research Paper: Lines 472-476**
4. [Deep Neural Memory (Priority 2)](#4-deep-neural-memory-module) - **Research Paper: Lines 22-26, 450-452**
5. [Hierarchical Memory (Priority 2)](#5-hierarchical-memory-activation) - **Research Paper: Lines 381-386**
6. 
[Health Check Endpoint (Priority 1)](#6-health-check-endpoint)
7. [Structured Logging System (Priority 1)](#7-structured-logging-system)
8. [Performance Optimization (Priority 2)](#8-performance-optimization) - **Research Paper: Lines 256-270**
9. [Workflow Components Cleanup (Priority 2)](#9-workflow-components-cleanup)
10. [Response Caching (Priority 3)](#10-response-caching)
11. [Advanced Security Features (Priority 3)](#11-advanced-security-features)

---

## 1. Momentum Integration (Priority 1)

### Research Paper Reference

**File:** `research_paper_source.md` lines 426-489

**Key Equations:**

```
M_t = diag(1 - α_t) M_{t-1} + S_t                                    (Equation 32)
S_t = diag(η_t) S_{t-1} - diag(θ_t)(M_{t-1} k_t^T k_t - v_t^T k_t)   (Equation 33)
```

**Paper Context (lines 447-450):**

> "Momentum-based Rule: The Delta Rule is based on momentary surprise, meaning that the flow of tokens cannot affect the memory update rule. LMM, however, is based on a momentum rule, which considers both past and momentary surprise."

### Current Status

**File:** `src/types.ts` lines 95-99
**Status:** ✅ Infrastructure ready, ⏳ Integration pending

```typescript
// Already defined in IMemoryState:
momentumState?: ITensor;   // S_t
momentumDecay?: number;    // η_t
forgettingGate?: ITensor;  // α_t
```

### Implementation Steps

#### Step 1.1: Add Momentum Update to trainStep()

**File to Modify:** `src/model.ts`
**Method:** `trainStep()` (search for "public trainStep" or "trainStep(")
**Current Location:** Lines ~2000-2100 (approximate)

**Implementation:**

```typescript
public trainStep(x_t: ITensor, x_next: ITensor, state: IMemoryState): {
  loss: ITensor;
  gradients: IModelGradients;
  memoryUpdate: IMemoryUpdateResult;
} {
  return this.withErrorHandling('trainStep', () => {
    return tf.tidy(() => {
      const currentInput = unwrapTensor(x_t);
      const nextInput = unwrapTensor(x_next);

      // Current forward pass
      const { predicted, memoryUpdate } = this.forward(x_t, state);

      // Compute loss
      const predictionLoss = tf.mean(
        tf.squaredDifference(unwrapTensor(predicted), nextInput)
      );

      // NEW: Momentum-based memory update (Equations 32-33)
      let updatedMemoryState = memoryUpdate.newState;
      if (this.config.enableMomentum && state.momentumState) {
        const momentum = this.computeMomentumUpdate(
          state,
          currentInput,
          nextInput,
          memoryUpdate
        );
        updatedMemoryState = this.applyMomentumToMemory(
          memoryUpdate.newState,
          momentum
        );
      }

      // NEW: Apply forgetting gate if enabled
      if (this.config.enableForgettingGate && state.forgettingGate) {
        updatedMemoryState = this.applyForgettingGate(
          updatedMemoryState,
          state.forgettingGate
        );
      }

      // Continue with existing training logic...
      const optimizer = this.optimizer;
      const gradients = tf.variableGrads(() => {
        return predictionLoss;
      });

      // Apply gradients
      optimizer.applyGradients(gradients.grads);

      return {
        loss: wrapTensor(predictionLoss),
        gradients: {
          shortTerm: wrapTensor(tf.zeros([1])), // Placeholder
          longTerm: wrapTensor(tf.zeros([1])),
          meta: wrapTensor(tf.zeros([1]))
        },
        memoryUpdate: {
          ...memoryUpdate,
          newState: updatedMemoryState
        }
      };
    });
  });
}
```

#### Step 1.2: Implement computeMomentumUpdate()

**Add to:** `src/model.ts` (private method section)
**Reference:** `research_paper_source.md` lines 432-434

```typescript
/**
 * Computes the momentum update term S_t according to Equation 33:
 * S_t = diag(η_t)S_{t-1} - diag(θ_t)(M_{t-1}k_t^T k_t - v_t^T k_t)
 */
private computeMomentumUpdate(
  state: IMemoryState,
  currentInput: tf.Tensor,
  nextInput: tf.Tensor,
  memoryUpdate: IMemoryUpdateResult
): tf.Tensor {
  return tf.tidy(() => {
    // Get momentum state S_{t-1}
    const S_prev = state.momentumState
      ? unwrapTensor(state.momentumState)
      : tf.zeros(unwrapTensor(state.shortTerm).shape);

    // Get momentum decay η_t
    const eta = state.momentumDecay ??
this.config.momentumDecayRate;

    // Get learning rate θ_t
    const theta = this.config.learningRate || 0.001;

    // Compute keys and values from attention
    const keys = memoryUpdate.attention.keys;
    const values = memoryUpdate.attention.values;

    // Compute M_{t-1}k_t^T k_t
    const M_prev = unwrapTensor(state.shortTerm);
    const k_t = unwrapTensor(keys);
    const v_t = unwrapTensor(values);

    // k_t^T k_t (outer product approximation)
    const k_outer = tf.mul(k_t, k_t);
    const M_k = tf.matMul(M_prev, k_outer.reshape([k_outer.shape[0], 1]));

    // v_t^T k_t (dot product)
    const v_k = tf.mul(v_t, k_t);

    // Gradient term: M_{t-1}k_t^T k_t - v_t^T k_t
    const gradient = tf.sub(M_k, v_k.reshape(M_k.shape));

    // Momentum update: η_t * S_{t-1} - θ_t * gradient
    const momentum_decay = tf.mul(S_prev, eta);
    const gradient_term = tf.mul(gradient, -theta);
    const S_t = tf.add(momentum_decay, gradient_term);

    return S_t as tf.Tensor2D;
  });
}
```

#### Step 1.3: Implement applyMomentumToMemory()

**Add to:** `src/model.ts` (private method section)
**Reference:** `research_paper_source.md` line 432

```typescript
/**
 * Applies the momentum update to the memory state according to Equation 32:
 * M_t = diag(1 - α_t)M_{t-1} + S_t
 */
private applyMomentumToMemory(
  memoryState: IMemoryState,
  momentum: tf.Tensor
): IMemoryState {
  return tf.tidy(() => {
    const M_current = unwrapTensor(memoryState.shortTerm);
    const S_t = momentum;

    // Default forgetting: no forgetting (α_t = 0).
    // If the forgetting gate is enabled, it is applied separately.
    const M_t = tf.add(M_current, S_t);

    return {
      ...memoryState,
      shortTerm: wrapTensor(M_t),
      momentumState: wrapTensor(S_t) // Store S_t for the next iteration
    };
  });
}
```

#### Step 1.4: Implement applyForgettingGate()

**Add to:** `src/model.ts` (private method section)
**Reference:** `research_paper_source.md` lines 472-476

```typescript
/**
 * Applies the learnable forgetting gate to memory:
 * M_t = diag(1 - α_t)M_t
 */
private applyForgettingGate(
  memoryState: IMemoryState,
  forgettingGate: ITensor
): IMemoryState {
  return tf.tidy(() => {
    const M = unwrapTensor(memoryState.shortTerm);
    const alpha = unwrapTensor(forgettingGate);

    // Expand alpha to match the memory shape if needed
    let alphaExpanded = alpha;
    if (alpha.rank === 1 && M.rank === 2) {
      alphaExpanded = alpha.reshape([alpha.shape[0], 1]);
      alphaExpanded = tf.tile(alphaExpanded, [1, M.shape[1]]);
    }

    // M_t = (1 - α_t) * M_t
    const oneMinusAlpha = tf.sub(1, alphaExpanded);
    const M_forgotten = tf.mul(M, oneMinusAlpha);

    return {
      ...memoryState,
      shortTerm: wrapTensor(M_forgotten),
      longTerm: memoryState.longTerm // Long-term memory is not affected by forgetting
    };
  });
}
```

#### Step 1.5: Make Forgetting Gate Trainable (Optional Advanced)

**File:** `src/model.ts` constructor or initialize method
**Reference:** `research_paper_source.md` lines 472-476

```typescript
// In initialize() or initializeMemoryState()
if (this.config.enableForgettingGate) {
  // Make the forgetting gate a trainable variable
  const alphaInit = tf.fill([this.config.memorySlots], this.config.forgettingGateInit);
  this.forgettingGateVariable = tf.variable(alphaInit, true, 'forgetting_gate');

  // Update the memory state to use the trainable variable
  this.memoryState.forgettingGate = wrapTensor(this.forgettingGateVariable);
}
```

#### Step 1.6: Testing Momentum Integration

**File:** `src/__tests__/momentum.test.ts` (create new file)

```typescript
import * as tf from '@tensorflow/tfjs-node';
import { TitanMemoryModel } from '../model.js';
import { wrapTensor } from '../types.js';

describe('Momentum-Based Memory Updates', () => {
  let model: TitanMemoryModel;

  beforeEach(async () => {
    model = new TitanMemoryModel();
    await model.initialize({
      inputDim: 128,
      memorySlots: 100,
      enableMomentum: true,
      momentumDecayRate: 0.9,
      enableForgettingGate: false
    });
  });

  test('should initialize momentum state', () => {
    const state = model.getMemoryState();
    expect(state.momentumState).toBeDefined();
    expect(state.momentumDecay).toBe(0.9);
  });

  test('should update momentum during training', async () => {
    const x_t = 
wrapTensor(tf.randomNormal([128]));
    const x_next = wrapTensor(tf.randomNormal([128]));
    const state = model.getMemoryState();

    const result = model.trainStep(x_t, x_next, state);

    expect(result.memoryUpdate.newState.momentumState).toBeDefined();

    // Momentum should change after training
    const newMomentum = result.memoryUpdate.newState.momentumState;
    expect(tf.equal(newMomentum, state.momentumState).all().dataSync()[0]).toBe(0);
  });

  test('should apply momentum decay correctly', () => {
    // Test that the η_t parameter affects momentum
    // Implementation specific to your computeMomentumUpdate logic
  });
});
```

---

## 2. Token Flow Integration (Priority 1)

### Research Paper Reference

**File:** `research_paper_source.md` lines 477-479

**Key Quote:**

> "Momentum-based Update Rule: TTT layers are based on momentary surprise, meaning that the flow of tokens cannot affect the memory update rule. LMM, however, is based on a momentum rule, which considers both past and momentary surprise."

### Audit Reference

**File:** `mcp-titan-system-audit.plan.md` lines 16-20

### Current Status

**File:** `src/types.ts` lines 100-103
**Status:** ✅ Infrastructure ready, ⏳ Integration pending

```typescript
// Already defined:
tokenFlowHistory?: ITensor; // Sequential token tracking
flowWeights?: ITensor;      // Flow contribution weights
```

### Implementation Steps

#### Step 2.1: Update Token Flow in Forward Pass

**File to Modify:** `src/model.ts`
**Method:** `forward()` (search for "public forward")

```typescript
public forward(input: ITensor, state?: IMemoryState): {
  predicted: ITensor;
  memoryUpdate: IMemoryUpdateResult;
} {
  return this.withErrorHandling('forward', () => {
    return tf.tidy(() => {
      const memoryState = state ?? this.memoryState;
      const inputTensor = unwrapTensor(input);

      // NEW: Update token flow history
      let updatedFlowHistory = memoryState.tokenFlowHistory;
      let updatedFlowWeights = memoryState.flowWeights;

      if (this.config.enableTokenFlow && memoryState.tokenFlowHistory) {
        const flowUpdate = this.updateTokenFlow(
          memoryState.tokenFlowHistory,
          memoryState.flowWeights,
          inputTensor
        );
        updatedFlowHistory = flowUpdate.history;
        updatedFlowWeights = flowUpdate.weights;
      }

      // Existing forward pass logic...
      const inputVector = inputTensor.reshape([1, inputTensor.shape[0]]);
      const encodedInput = this.encoder.predict(inputVector) as tf.Tensor2D;

      // Compute attention
      const attention = this.computeMemoryAttention(encodedInput);

      // NEW: Weight surprise by token flow
      let surprise = this.computeSurprise(encodedInput, attention);
      if (this.config.enableTokenFlow && updatedFlowWeights) {
        surprise = this.weightSurpriseByTokenFlow(
          surprise,
          updatedFlowWeights
        );
      }

      // Continue with existing logic...
      const predicted = this.decoder.predict(attention.values);

      return {
        predicted: wrapTensor(predicted),
        memoryUpdate: {
          newState: {
            ...memoryState,
            shortTerm: wrapTensor(attention.values),
            surpriseHistory: wrapTensor(surprise.accumulated),
            tokenFlowHistory: updatedFlowHistory,
            flowWeights: updatedFlowWeights
          },
          attention,
          surprise
        }
      };
    });
  });
}
```

#### Step 2.2: Implement updateTokenFlow()

**Add to:** `src/model.ts` (private method section)

```typescript
/**
 * Updates token flow history with a new input.
 * Implements a sliding window of recent tokens.
 */
private updateTokenFlow(
  currentHistory: ITensor | undefined,
  currentWeights: ITensor | undefined,
  newToken: tf.Tensor
): { history: ITensor; weights: ITensor } {
  return tf.tidy(() => {
    const history = currentHistory
      ? unwrapTensor(currentHistory)
      : tf.zeros([this.config.tokenFlowWindow, newToken.shape[0]]);
    const weights = currentWeights ? 
unwrapTensor(currentWeights)
      : tf.zeros([this.config.tokenFlowWindow]);

    // Roll history (shift all tokens by one position)
    const historyArray = history.arraySync() as number[][];
    historyArray.shift();                               // Remove oldest
    historyArray.push(Array.from(newToken.dataSync())); // Add newest
    const newHistory = tf.tensor2d(historyArray);

    // Compute new weights based on recency and similarity
    const newWeights = this.computeTokenFlowWeights(newHistory, newToken);

    return {
      history: wrapTensor(newHistory),
      weights: wrapTensor(newWeights)
    };
  });
}
```

#### Step 2.3: Implement computeTokenFlowWeights()

**Add to:** `src/model.ts` (private method section)

```typescript
/**
 * Computes weights for token flow contribution.
 * Recent tokens get higher weight, with decay toward older tokens.
 */
private computeTokenFlowWeights(
  flowHistory: tf.Tensor,
  currentToken: tf.Tensor
): tf.Tensor {
  return tf.tidy(() => {
    const windowSize = this.config.tokenFlowWindow;

    // Recency weights: linear ramp giving the NEWEST token the highest weight.
    // New tokens are pushed to the END of the history, so the ramp must
    // ascend: [1/windowSize, ..., 1.0] (e.g. [0.1, 0.2, ..., 1.0] for windowSize = 10).
    const recencyWeights = tf.range(1, windowSize + 1, 1)
      .div(windowSize);

    // Similarity weights: dot-product similarity with the current token
    const currentExpanded = currentToken.reshape([1, currentToken.shape[0]]);
    const similarities = tf.matMul(flowHistory, currentExpanded, false, true)
      .squeeze();
    const normalizedSim = tf.sigmoid(similarities); // Squash to (0, 1)

    // Combined weights: 50% recency, 50% similarity
    const combinedWeights = tf.add(
      tf.mul(recencyWeights, 0.5),
      tf.mul(normalizedSim, 0.5)
    );

    // Normalize to sum to 1
    const sumWeights = tf.sum(combinedWeights);
    const normalizedWeights = tf.div(combinedWeights, sumWeights);

    return normalizedWeights as tf.Tensor1D;
  });
}
```

#### Step 2.4: Implement weightSurpriseByTokenFlow()

**Add to:** `src/model.ts` (private method section)

```typescript
/**
 * Weights the surprise metric by token flow contribution.
 * Combines momentary surprise with flow-based surprise.
 */
private weightSurpriseByTokenFlow(
  surprise: ISurpriseMetrics,
  flowWeights: ITensor
): ISurpriseMetrics {
  return tf.tidy(() => {
    const immediateValue = unwrapTensor(surprise.immediate);
    const weights = unwrapTensor(flowWeights);

    // Flow-weighted surprise: weight recent surprise by flow
    const flowContribution = tf.sum(tf.mul(weights, immediateValue));

    // Combined surprise: 50% momentary, 50% flow
    const totalSurprise = tf.add(
      tf.mul(immediateValue, 0.5),
      tf.mul(flowContribution, 0.5)
    );

    return {
      immediate: wrapTensor(immediateValue),
      accumulated: wrapTensor(totalSurprise),
      totalSurprise: wrapTensor(totalSurprise)
    };
  });
}
```

#### Step 2.5: Add Token Flow Metrics Tool

**File to Modify:** `src/index.ts`

**Add new MCP tool:**

```typescript
// Add after the get_memory_state tool
this.server.tool(
  'get_token_flow_metrics',
  "Get token flow analysis and statistics",
  {},
  async () => {
    await this.ensureInitialized();
    try {
      if (!this.memoryState.tokenFlowHistory || !this.memoryState.flowWeights) {
        return {
          content: [{
            type: "text",
            text: "Token flow tracking not enabled. Initialize with enableTokenFlow: true"
          }]
        };
      }

      const history = unwrapTensor(this.memoryState.tokenFlowHistory);
      const weights = unwrapTensor(this.memoryState.flowWeights);

      const metrics = {
        windowSize: history.shape[0],
        averageWeight: tf.mean(weights).dataSync()[0],
        maxWeight: tf.max(weights).dataSync()[0],
        minWeight: tf.min(weights).dataSync()[0],
        flowStrength: tf.sum(weights).dataSync()[0],
        historySize: history.shape[0]
      };

      return {
        content: [{
          type: "text",
          text: `Token Flow Metrics:\n${JSON.stringify(metrics, null, 2)}`
        }]
      };
    } catch (error) {
      const message = error instanceof Error ? error.message : 'Unknown error';
      return {
        content: [{
          type: "text",
          text: `Failed to get token flow metrics: ${message}`
        }]
      };
    }
  }
);
```

---

## 3. Hierarchical Memory Activation (Priority 2)

### Research Paper Reference

**File:** `research_paper_source.md` lines 376-392

**Key Concepts:**

- Working memory → Short-term → Long-term promotion
- Access-based promotion rules
- Time-based demotion

### Audit Reference

**File:** `mcp-titan-system-audit.plan.md` lines 85-89

### Current Status

**File:** `src/model.ts` lines 254-276, 894-929
**Status:** ⚠️ Defined but NOT activated

```typescript
// Already defined in src/model.ts:
private promotionRules: IMemoryPromotionRules = {
  workingToShortTerm: { accessThreshold: 3, timeThreshold: 30000 },
  shortTermToLongTerm: { accessThreshold: 5, timeThreshold: 300000 },
  // ... fully configured
};
```

### Implementation Steps

#### Step 3.1: Activate Hierarchical Memory in Forward Pass

**File to Modify:** `src/model.ts`
**Method:** `forward()` - add after memory update

```typescript
// In forward() method, after computing the memory update:
if (this.config.useHierarchicalMemory || this.config.enableHierarchicalMemory) {
  const promoted = this.applyMemoryPromotion(memoryUpdate.newState);
  memoryUpdate.newState = promoted;

  // Track promotion statistics
  this.updatePromotionStats();
}
```

#### Step 3.2: Implement applyMemoryPromotion()

**Add to:** `src/model.ts` (private method section)

```typescript
/**
 * Applies hierarchical memory promotion/demotion rules:
 * Working → Short-term → Long-term based on access patterns.
 */
private applyMemoryPromotion(state: IMemoryState): IMemoryState {
  return tf.tidy(() => {
    const currentTime = Date.now();
    const timestamps = unwrapTensor(state.timestamps).arraySync() as number[];
    const accessCounts = unwrapTensor(state.accessCounts).arraySync() as number[];

    // Identify memories to promote from working to short-term
    const toPromote = timestamps.map((ts, idx) => {
      const age = currentTime - ts;
      const accesses = accessCounts[idx];
      const rules = this.promotionRules.workingToShortTerm;
      return accesses >= rules.accessThreshold && age >= rules.timeThreshold;
    });

    // Identify memories to promote from short-term to long-term
    const toLongTerm = timestamps.map((ts, idx) => {
      const age = currentTime - ts;
      const accesses = accessCounts[idx];
      const rules = this.promotionRules.shortTermToLongTerm;
      return accesses >= rules.accessThreshold &&
        age >= rules.timeThreshold &&
        toPromote[idx]; // Must already qualify for short-term
    });

    // Apply promotions
    let newState = { ...state };

    // Move qualifying memories to long-term
    if (toLongTerm.some(v => v)) {
      newState = this.promoteToLongTerm(newState, toLongTerm);
      this.memoryStats.promotions.total += toLongTerm.filter(v => v).length;
      this.memoryStats.promotions.recent += toLongTerm.filter(v => v).length;
    }

    // Apply age-based demotion
    newState = this.applyAgeDemotion(newState, currentTime);

    return newState;
  });
}
```

#### Step 3.3: Implement promoteToLongTerm()

**Add to:** `src/model.ts`

```typescript
/**
 * Promotes memories from short-term to long-term storage.
 */
private promoteToLongTerm(
  state: IMemoryState,
  promoteFlags: boolean[]
): IMemoryState {
  return tf.tidy(() => {
    const shortTerm = unwrapTensor(state.shortTerm).arraySync() as number[][];
    const longTerm = unwrapTensor(state.longTerm).arraySync() as number[][];

    // Extract memories to promote
    const toPromote = shortTerm.filter((_, idx) => promoteFlags[idx]);

    if (toPromote.length === 0) {
      return state;
    }

    // Add to long-term (with capacity management)
    const maxLongTerm = Math.floor(this.config.memorySlots / 2);
    const updatedLongTerm = [...toPromote, ...longTerm].slice(0, maxLongTerm);

    // Remove promoted memories from short-term
    const updatedShortTerm = shortTerm.filter((_, idx) => !promoteFlags[idx]);

    return {
      ...state,
      shortTerm: wrapTensor(tf.tensor2d(updatedShortTerm)),
      longTerm: wrapTensor(tf.tensor2d(updatedLongTerm))
    };
  });
}
```

#### Step 3.4: Implement applyAgeDemotion()

**Add to:** `src/model.ts`

```typescript
/**
 * Demotes or removes old, low-access memories.
 */
private applyAgeDemotion(
  state: IMemoryState,
  currentTime: number
): 
IMemoryState {
  return tf.tidy(() => {
    const timestamps = unwrapTensor(state.timestamps).arraySync() as number[];
    const accessCounts = unwrapTensor(state.accessCounts).arraySync() as number[];
    const demotionRules = this.promotionRules.demotionRules;

    // Calculate memory scores (higher = keep)
    const scores = timestamps.map((ts, idx) => {
      const age = currentTime - ts;
      const ageDecay = Math.pow(demotionRules.ageDecayRate, age / 1000); // Per second
      const accessBonus = accessCounts[idx] * (1 - demotionRules.lowAccessPenalty);
      return ageDecay * accessBonus;
    });

    // Keep memories above the forgetting threshold
    const keepFlags = scores.map(score => score > demotionRules.forgettingThreshold);

    // Apply filtering
    const shortTerm = unwrapTensor(state.shortTerm).arraySync() as number[][];
    const filteredShortTerm = shortTerm.filter((_, idx) => keepFlags[idx]);
    const filteredTimestamps = timestamps.filter((_, idx) => keepFlags[idx]);
    const filteredAccessCounts = accessCounts.filter((_, idx) => keepFlags[idx]);

    const demotedCount = keepFlags.filter(v => !v).length;
    if (demotedCount > 0) {
      this.memoryStats.demotions.total += demotedCount;
      this.memoryStats.demotions.recent += demotedCount;
    }

    return {
      ...state,
      shortTerm: wrapTensor(tf.tensor2d(
        filteredShortTerm.length > 0 ? filteredShortTerm : [[0]] // Prevent empty tensor
      )),
      timestamps: wrapTensor(tf.tensor1d(filteredTimestamps)),
      accessCounts: wrapTensor(tf.tensor1d(filteredAccessCounts))
    };
  });
}
```

#### Step 3.5: Add Hierarchical Memory Metrics Tool

**File to Modify:** `src/index.ts`

```typescript
this.server.tool(
  'get_hierarchical_metrics',
  "Get hierarchical memory promotion/demotion statistics",
  {},
  async () => {
    await this.ensureInitialized();
    try {
      const stats = (this.model as any).memoryStats;
      const config = this.model.getConfig();

      if (!config.useHierarchicalMemory && !config.enableHierarchicalMemory) {
        return {
          content: [{
            type: "text",
            text: "Hierarchical memory not enabled. Initialize with enableHierarchicalMemory: true"
          }]
        };
      }

      const metrics = {
        promotions: stats.promotions,
        demotions: stats.demotions,
        lastUpdate: new Date(stats.lastStatsUpdate).toISOString(),
        shortTermSize: unwrapTensor(this.memoryState.shortTerm).shape[0],
        longTermSize: unwrapTensor(this.memoryState.longTerm).shape[0]
      };

      return {
        content: [{
          type: "text",
          text: `Hierarchical Memory Metrics:\n${JSON.stringify(metrics, null, 2)}`
        }]
      };
    } catch (error) {
      const message = error instanceof Error ? error.message : 'Unknown error';
      return {
        content: [{
          type: "text",
          text: `Failed to get hierarchical metrics: ${message}`
        }]
      };
    }
  }
);
```

---

## 4. Health Check Endpoint (Priority 1)

### Audit Reference

**File:** `mcp-titan-system-audit.plan.md` lines 195-198

### Implementation Steps

#### Step 4.1: Add Health Check Endpoint

**File to Modify:** `src/index.ts`

**Add after other tool registrations:**

```typescript
this.server.tool(
  'health_check',
  "Get system health status and diagnostics",
  {
    detailed: z.boolean().optional().describe("Include detailed diagnostics")
  },
  async (params) => {
    const detailed = params.detailed ?? false;
    try {
      const health = await this.performHealthCheck(detailed ? 'detailed' : 'quick');
      return {
        content: [{
          type: "text",
          text: JSON.stringify(health, null, 2)
        }]
      };
    } catch (error) {
      const message = error instanceof Error ? 
error.message : 'Unknown error';
      return {
        content: [{
          type: "text",
          text: `Health check failed: ${message}`
        }]
      };
    }
  }
);
```

#### Step 4.2: Implement performHealthCheck() Method

**Add to:** `src/index.ts` (private method section)

```typescript
private async performHealthCheck(level: 'quick' | 'detailed' = 'quick'): Promise<any> {
  const startTime = Date.now();
  const health: any = {
    status: 'healthy',
    timestamp: new Date().toISOString(),
    uptime: process.uptime(),
    version: '3.0.0'
  };

  try {
    // Check model initialization
    health.modelInitialized = this.isInitialized;
    if (!this.isInitialized) {
      health.status = 'degraded';
      health.warnings = ['Model not initialized'];
    }

    // Check TensorFlow.js memory
    const tfMemory = tf.memory();
    health.tensorflow = {
      numTensors: tfMemory.numTensors,
      numBytes: tfMemory.numBytes,
      numBytesInGPU: tfMemory.numBytesInGPU || 0
    };

    if (tfMemory.numTensors > 1000) {
      health.status = 'degraded';
      health.warnings = health.warnings || [];
      health.warnings.push('High tensor count - possible memory leak');
    }

    // Check Node.js memory
    const processMemory = process.memoryUsage();
    health.process = {
      heapUsed: Math.round(processMemory.heapUsed / 1024 / 1024) + ' MB',
      heapTotal: Math.round(processMemory.heapTotal / 1024 / 1024) + ' MB',
      external: Math.round(processMemory.external / 1024 / 1024) + ' MB',
      rss: Math.round(processMemory.rss / 1024 / 1024) + ' MB'
    };

    if (processMemory.heapUsed / processMemory.heapTotal > 0.9) {
      health.status = 'unhealthy';
      health.errors = health.errors || [];
      health.errors.push('Heap memory usage > 90%');
    }

    // Check memory state
    if (this.isInitialized) {
      const memStats = this.getMemoryStats();
      health.memory = {
        capacity: `${(memStats.capacity * 100).toFixed(1)}%`,
        surpriseScore: memStats.surpriseScore.toFixed(4),
        shortTermMean: memStats.shortTermMean.toFixed(4),
        longTermMean: memStats.longTermMean.toFixed(4)
      };
    }

    if (level === 'detailed') {
      // Add detailed diagnostics
      health.config = this.model?.getConfig();
      health.features = { 
        momentum: this.model?.getConfig().enableMomentum,
        tokenFlow: this.model?.getConfig().enableTokenFlow,
        forgettingGate: this.model?.getConfig().enableForgettingGate,
        hierarchical: this.model?.getConfig().enableHierarchicalMemory
      };

      // Test operations
      try {
        const testInput = tf.randomNormal([128]);
        const testResult = this.model?.forward(wrapTensor(testInput), this.memoryState);
        testInput.dispose();
        health.operations = { forward: 'ok' };
      } catch (error) {
        health.operations = { forward: 'failed' };
        health.status = 'unhealthy';
      }
    }

    // Calculate response time
    health.responseTimeMs = Date.now() - startTime;
  } catch (error) {
    health.status = 'unhealthy';
    health.errors = health.errors || [];
    health.errors.push((error as Error).message);
  }

  return health;
}
```

#### Step 4.3: Add HTTP Health Endpoint (if HTTP server exists)

**File:** Search for HTTP server setup in `src/index.ts` or `src/server.ts`

**Add route:**

```typescript
// If using Express or similar:
app.get('/health', async (req, res) => {
  const detailed = req.query.detailed === 'true';
  const health = await server.performHealthCheck(detailed ? 'detailed' : 'quick');
  const statusCode = health.status === 'healthy' ? 200
    : health.status === 'degraded' ? 200
    : 503;
  res.status(statusCode).json(health);
});

// Kubernetes-style endpoints
app.get('/healthz', async (req, res) => {
  const health = await server.performHealthCheck('quick');
  res.status(health.status === 'healthy' ? 200 : 503).send(health.status);
});

app.get('/readyz', async (req, res) => {
  const ready = server.isInitialized && server.model !== null;
  res.status(ready ? 200 : 503).send(ready ? 'ready' : 'not ready');
});
```

---

## 5. Structured Logging System (Priority 1)

### Audit Reference

**File:** `mcp-titan-system-audit.plan.md` lines 200-204

### Implementation Steps

#### Step 5.1: Create Logging Infrastructure

**File to Create:** `src/logging.ts`

```typescript
import * as fs from 'fs/promises';
import * as path from 'path';

export enum LogLevel {
  DEBUG = 0,
  INFO = 1,
  WARN = 2,
  ERROR = 3
}

export interface LogEntry {
  timestamp: string;
  level: string;
  operation: string;
  message: string;
  metadata?: Record<string, any>;
  error?: {
    name: string;
    message: string;
    stack?: string;
  };
}

export class StructuredLogger {
  private static instance: StructuredLogger;
  private logBuffer: LogEntry[] = [];
  private flushInterval?: NodeJS.Timeout;
  private logLevel: LogLevel = LogLevel.INFO;
  private logDir: string;
  private maxFileSize = 10 * 1024 * 1024; // 10 MB
  private maxFiles = 5;

  private constructor(logDir: string) {
    this.logDir = logDir;
    this.startFlushInterval();
  }

  public static getInstance(logDir?: string): StructuredLogger {
    if (!StructuredLogger.instance) {
      StructuredLogger.instance = new StructuredLogger(
        logDir || path.join(process.cwd(), '.titan_memory', 'logs')
      );
    }
    return StructuredLogger.instance;
  }

  public setLogLevel(level: LogLevel): void {
    this.logLevel = level;
  }

  public debug(operation: string, message: string, metadata?: Record<string, any>): void {
    if (this.logLevel <= LogLevel.DEBUG) {
      this.log('DEBUG', operation, message, metadata);
    }
  }

  public info(operation: string, message: string, metadata?: Record<string, any>): void {
    if (this.logLevel <= LogLevel.INFO) {
      this.log('INFO', operation, message, metadata);
    }
  }

  public warn(operation: string, message: string, metadata?: Record<string, any>): void {
    if (this.logLevel <= LogLevel.WARN) {
      this.log('WARN', operation, message, metadata);
    }
  }

  public error(operation: string, message: string, error?: Error, metadata?: Record<string, any>): void {
    if (this.logLevel <= LogLevel.ERROR) {
      const errorData = error ? 
{ name: error.name, message: error.message, stack: error.stack }
        : undefined;
      this.log('ERROR', operation, message, metadata, errorData);
    }
  }

  private log(
    level: string,
    operation: string,
    message: string,
    metadata?: Record<string, any>,
    error?: { name: string; message: string; stack?: string }
  ): void {
    const entry: LogEntry = {
      timestamp: new Date().toISOString(),
      level,
      operation,
      message,
      metadata,
      error
    };

    // Console output for immediate visibility
    const consoleMsg = `[${entry.timestamp}] ${level} [${operation}]: ${message}`;
    switch (level) {
      case 'ERROR':
        console.error(consoleMsg, metadata, error);
        break;
      case 'WARN':
        console.warn(consoleMsg, metadata);
        break;
      case 'DEBUG':
        console.debug(consoleMsg, metadata);
        break;
      default:
        console.log(consoleMsg, metadata);
    }

    // Buffer for file writing
    this.logBuffer.push(entry);

    // Flush if the buffer is large
    if (this.logBuffer.length >= 100) {
      this.flush().catch(err => console.error('Failed to flush logs:', err));
    }
  }

  private startFlushInterval(): void {
    // Flush logs every 10 seconds
    this.flushInterval = setInterval(() => {
      this.flush().catch(err => console.error('Failed to flush logs:', err));
    }, 10000);
  }

  public async flush(): Promise<void> {
    if (this.logBuffer.length === 0) {
      return;
    }

    try {
      await fs.mkdir(this.logDir, { recursive: true });

      const today = new Date().toISOString().split('T')[0];
      const logFile = path.join(this.logDir, `titan-${today}.log`);

      // Check file size and rotate if needed
      await this.rotateLogsIfNeeded(logFile);

      // Write buffered logs
      const logLines = this.logBuffer.map(entry => JSON.stringify(entry)).join('\n') + '\n';
      await fs.appendFile(logFile, logLines, 'utf-8');

      // Clear the buffer
      this.logBuffer = [];
    } catch (error) {
      console.error('Failed to write logs:', error);
    }
  }

  private async rotateLogsIfNeeded(logFile: string): Promise<void> {
    try {
      const stats = await fs.stat(logFile);
      if (stats.size >= this.maxFileSize) {
        // Rotate: file.log -> file.1.log -> file.2.log -> ...
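        // For example, with maxFiles = 5 the cascade below renames
        // titan-2025-10-07.4.log -> titan-2025-10-07.5.log, then .3 -> .4,
        // .2 -> .3, .1 -> .2, and finally the active titan-2025-10-07.log
        // becomes titan-2025-10-07.1.log (the dates shown are illustrative).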
        for (let i = this.maxFiles - 1; i > 0; i--) {
          const oldFile = logFile.replace('.log', `.${i}.log`);
          const newFile = logFile.replace('.log', `.${i + 1}.log`);
          try {
            await fs.rename(oldFile, newFile);
          } catch {
            // File doesn't exist, skip
          }
        }

        // Rotate current to .1
        await fs.rename(logFile, logFile.replace('.log', '.1.log'));
      }
    } catch (error) {
      if ((error as any).code !== 'ENOENT') {
        console.error('Failed to rotate logs:', error);
      }
    }
  }

  public async dispose(): Promise<void> {
    if (this.flushInterval) {
      clearInterval(this.flushInterval);
    }
    await this.flush();
  }
}
```

#### Step 5.2: Integrate Logging into Server

**File to Modify:** `src/index.ts`

**Add at top:**

```typescript
import { StructuredLogger, LogLevel } from './logging.js';
```

**In constructor:**

```typescript
constructor(options: { memoryPath?: string } = {}) {
  this.server = new McpServer({...});
  this.vectorProcessor = VectorProcessor.getInstance();
  this.memoryPath = options.memoryPath ?? path.join(process.cwd(), '.titan_memory');
  this.modelDir = path.join(this.memoryPath, 'model');
  this.memoryState = this.initializeEmptyState();

  // Initialize structured logging
  this.logger = StructuredLogger.getInstance(path.join(this.memoryPath, 'logs'));
  this.logger.setLogLevel(process.env.LOG_LEVEL === 'DEBUG' ? LogLevel.DEBUG : LogLevel.INFO);
  this.logger.info('server', 'TitanMemoryServer initialized', {
    memoryPath: this.memoryPath,
    version: '3.0.0'
  });

  this.registerTools();
}
```

**In shutdown:**

```typescript
private async shutdown(): Promise<void> {
  try {
    this.logger.info('server', 'Shutting down TitanMemoryServer');

    // ... existing shutdown logic ...
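    // Note: logger.dispose() below both clears the periodic flush timer and
    // performs a final flush, so buffered entries are not lost on exit.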
    // Flush logs before exit
    await this.logger.dispose();

    process.exit(0);
  } catch (error) {
    this.logger.error('server', 'Shutdown failed', error as Error);
    process.exit(1);
  }
}
```

#### Step 5.3: Replace Console.log with Structured Logging

**Throughout `src/index.ts` and `src/model.ts`, replace:**

```typescript
// Old:
console.log(`Memory initialized with ${memorySlots} slots`);
console.error('Error during training:', error);

// New:
this.logger.info('memory', `Memory initialized with ${memorySlots} slots`, {
  memorySlots,
  embeddingSize
});
this.logger.error('training', 'Training failed', error, { step: this.stepCount });
```

---

## 6. Performance Optimization (Priority 2)

### Audit Reference

**File:** `mcp-titan-system-audit.plan.md` lines 256-270

### Implementation Tasks

#### Step 6.1: Eliminate Redundant Forward Passes

**Problem:** `train_step` calls `forward()` which may call `forward()` again

**File:** `src/model.ts`

**Solution:** Cache forward pass results in training

```typescript
// In trainStep(), reuse forward pass results:
public trainStep(x_t: ITensor, x_next: ITensor, state: IMemoryState): {
  loss: ITensor;
  gradients: IModelGradients;
  memoryUpdate: IMemoryUpdateResult;
} {
  return this.withErrorHandling('trainStep', () => {
    return tf.tidy(() => {
      // Do forward pass ONCE
      const forwardResult = this.forward(x_t, state);

      // Use cached results for loss computation
      const predicted = unwrapTensor(forwardResult.predicted);
      const nextInput = unwrapTensor(x_next);
      const loss = tf.mean(tf.squaredDifference(predicted, nextInput));

      // Don't call forward() again - reuse forwardResult
      // Continue with gradient computation using cached results...

      return {
        loss: wrapTensor(loss),
        gradients: {...},
        memoryUpdate: forwardResult.memoryUpdate // Reuse!
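        // Note on memory management: tf.tidy() automatically keeps tensors
        // returned from its callback and disposes every other intermediate
        // tensor. If other tensors inside forwardResult must outlive this call
        // (an assumption about the surrounding model code), retain them
        // explicitly with tf.keep() before tidy() exits.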
      };
    });
  });
}
```

#### Step 6.2: Implement LRU Cache for get_memory_state

**File to Create:** `src/cache.ts`

```typescript
export class LRUCache<K, V> {
  private cache: Map<K, { value: V; timestamp: number }>;
  private maxSize: number;
  private ttl: number; // Time to live in milliseconds

  constructor(maxSize: number = 100, ttl: number = 60000) {
    this.cache = new Map();
    this.maxSize = maxSize;
    this.ttl = ttl;
  }

  get(key: K): V | undefined {
    const entry = this.cache.get(key);
    if (!entry) {
      return undefined;
    }

    // Check if expired
    if (Date.now() - entry.timestamp > this.ttl) {
      this.cache.delete(key);
      return undefined;
    }

    // Move to end (most recently used)
    this.cache.delete(key);
    this.cache.set(key, entry);

    return entry.value;
  }

  set(key: K, value: V): void {
    // Delete if exists (to update position)
    this.cache.delete(key);

    // Add to end
    this.cache.set(key, { value, timestamp: Date.now() });

    // Evict oldest if over capacity
    if (this.cache.size > this.maxSize) {
      const firstKey = this.cache.keys().next().value;
      if (firstKey !== undefined) {
        this.cache.delete(firstKey);
      }
    }
  }

  clear(): void {
    this.cache.clear();
  }

  invalidate(key: K): void {
    this.cache.delete(key);
  }
}
```

**Integrate into `src/index.ts`:**

```typescript
import { LRUCache } from './cache.js';

export class TitanMemoryServer {
  private memoryStateCache: LRUCache<string, any>;

  constructor(options: { memoryPath?: string } = {}) {
    // ... existing initialization ...

    // Initialize cache
    this.memoryStateCache = new LRUCache(50, 30000); // 50 entries, 30s TTL
  }

  // In get_memory_state tool:
  this.server.tool(
    'get_memory_state',
    "Get current memory state statistics and information",
    {
      useCache: z.boolean().optional().describe("Use cached result if available")
    },
    async (params) => {
      await this.ensureInitialized();

      const useCache = params.useCache ?? true;
      const cacheKey = 'memory_state';

      // Check cache first
      if (useCache) {
        const cached = this.memoryStateCache.get(cacheKey);
        if (cached) {
          return {
            content: [{ type: "text", text: cached + "\n(cached)" }]
          };
        }
      }

      try {
        const stats = this.getMemoryStats();
        const health = await this.performHealthCheck('quick');

        const result = `Memory State:\n- Short-term mean: ${stats.shortTermMean.toFixed(4)}\n...`;

        // Cache result
        this.memoryStateCache.set(cacheKey, result);

        return {
          content: [{ type: "text", text: result }]
        };
      } catch (error) {
        // ... error handling ...
      }
    }
  );

  // Invalidate cache on memory updates
  private invalidateMemoryCache(): void {
    this.memoryStateCache.clear();
  }

  // Call invalidateMemoryCache() after train_step, forward_pass, etc.
}
```

#### Step 6.3: Use In-Place Tensor Operations

**File:** `src/model.ts`

**Search for:** `.clone()` operations and replace where safe

```typescript
// Old (creates copy):
const updated = currentTensor.add(delta).clone();

// New (in-place where possible):
const updated = currentTensor.add(delta); // Don't clone unless needed

// Or use tf.keep() for tensors that need to persist:
const kept = tf.keep(currentTensor.add(delta));
```

**Specific locations to optimize:**

- Memory update operations in `forward()`
- Gradient computations in `trainStep()`
- Attention calculations in `computeMemoryAttention()`

---

## 7. Workflow Components Cleanup (Priority 2)

### Audit Reference

**File:** `mcp-titan-system-audit.plan.md` lines 274-284

### Investigation Required

#### Step 7.1: Analyze Workflow Files

**Files to Review:**

1. `src/workflows/WorkflowOrchestrator.ts`
2. `src/workflows/GitHubWorkflowManager.ts`
3. `src/workflows/LintingManager.ts`
4. `src/workflows/FeedbackProcessor.ts`
5. `src/workflows/WorkflowUtils.ts`

**For Each File:**

1. Check if imported anywhere: `grep -r "from.*workflows" src/`
2. Check if used in tools: Search `src/index.ts` for references
3. Document purpose if unclear
4. Decide: Keep, Document, or Remove

#### Step 7.2: Create Documentation or Remove

**Option A: If Workflows Are Intended Features**

Create `docs/workflows.md`:

```markdown
# Workflow Components

## WorkflowOrchestrator
- **Purpose**: Coordinates multi-step memory operations
- **Status**: Experimental, not yet integrated
- **Usage**: (document how to enable/use)

## GitHubWorkflowManager
- **Purpose**: GitHub integration for memory persistence
- **Status**: Experimental
- **Usage**: (document)

## LintingManager
- **Purpose**: Code quality checks for memory patterns
- **Status**: Placeholder
- **Usage**: (document)

## FeedbackProcessor
- **Purpose**: User feedback loop for memory quality
- **Status**: Not implemented
- **Usage**: (document)
```

**Option B: If Workflows Are Obsolete**

Remove files and update project structure documentation.

---

## 8. Response Caching (Priority 3)

### Audit Reference

**File:** `mcp-titan-system-audit.plan.md` lines 245-248

### Implementation (builds on Step 6.2)

#### Step 8.1: Expand Caching to More Tools

**Tools to Cache:**

- `get_memory_state` (already done in Step 6.2)
- `get_surprise_metrics`
- `analyze_memory`
- `get_token_flow_metrics`
- `get_hierarchical_metrics`

**Pattern for each tool:**

```typescript
this.server.tool(
  'cacheable_tool',
  "Description",
  { useCache: z.boolean().optional() },
  async (params) => {
    const useCache = params.useCache ?? true;
    const cacheKey = `tool_name_${JSON.stringify(params)}`;

    if (useCache) {
      const cached = this.cache.get(cacheKey);
      if (cached) return cached;
    }

    const result = await computeExpensiveOperation();
    this.cache.set(cacheKey, result);
    return result;
  }
);
```

#### Step 8.2: Add Response Compression

**File:** `src/index.ts`

No install step is required: `zlib` ships with the Node.js standard library.

```typescript
import * as zlib from 'zlib';
import { promisify } from 'util';

const gzip = promisify(zlib.gzip);
const gunzip = promisify(zlib.gunzip);

// Add compression utility
private async compressResponse(data: any): Promise<Buffer> {
  const json = JSON.stringify(data);
  return await gzip(json);
}

private async decompressResponse(buffer: Buffer): Promise<any> {
  const json = await gunzip(buffer);
  return JSON.parse(json.toString());
}

// Use in large response tools
this.server.tool(
  'get_full_memory_dump',
  "Get complete memory dump (large response)",
  { compress: z.boolean().optional().describe("Compress response") },
  async (params) => {
    const memoryDump = await this.getFullMemoryDump();

    if (params.compress) {
      const compressed = await this.compressResponse(memoryDump);
      return {
        content: [{
          type: "text",
          text: `Compressed data (${compressed.length} bytes). Use decompress tool to extract.`,
          data: compressed.toString('base64')
        }]
      };
    }

    return {
      content: [{ type: "text", text: JSON.stringify(memoryDump, null, 2) }]
    };
  }
);
```

---

## 9. Advanced Security Features (Priority 3)

### Audit Reference

**File:** `mcp-titan-system-audit.plan.md` lines 218-232

### Implementation Steps

#### Step 9.1: Checkpoint Encryption

**File:** `src/index.ts`

**Add encryption utilities:**

```typescript
import * as crypto from 'crypto';

// Caution: with the random fallback below, checkpoints encrypted while
// TITAN_ENCRYPTION_KEY is unset cannot be decrypted after a restart,
// because a fresh key is generated for each process.
private readonly ENCRYPTION_KEY = process.env.TITAN_ENCRYPTION_KEY ||
  crypto.randomBytes(32).toString('hex');
private readonly ENCRYPTION_IV_LENGTH = 16;

private encryptData(data: string): { encrypted: string; iv: string } {
  const iv = crypto.randomBytes(this.ENCRYPTION_IV_LENGTH);
  const cipher = crypto.createCipheriv(
    'aes-256-cbc',
    Buffer.from(this.ENCRYPTION_KEY, 'hex'),
    iv
  );

  let encrypted = cipher.update(data, 'utf8', 'hex');
  encrypted += cipher.final('hex');

  return { encrypted, iv: iv.toString('hex') };
}

private decryptData(encrypted: string, iv: string): string {
  const decipher = crypto.createDecipheriv(
    'aes-256-cbc',
    Buffer.from(this.ENCRYPTION_KEY, 'hex'),
    Buffer.from(iv, 'hex')
  );

  let decrypted = decipher.update(encrypted, 'hex', 'utf8');
  decrypted += decipher.final('utf8');

  return decrypted;
}

// Update save_checkpoint tool:
this.server.tool(
  'save_checkpoint',
  "Save current memory state to a checkpoint file",
  {
    path: z.string().describe("Path to save the checkpoint"),
    encrypt: z.boolean().optional().describe("Encrypt checkpoint data")
  },
  async (params) => {
    await this.ensureInitialized();

    try {
      const validatedPath = this.validateFilePath(params.path);

      const checkpointData = {
        // ... existing checkpoint data ...
      };

      let dataToWrite = JSON.stringify(checkpointData, null, 2);
      let metadata: any = {};

      if (params.encrypt) {
        const { encrypted, iv } = this.encryptData(dataToWrite);
        metadata = { encrypted: true, iv };
        dataToWrite = encrypted;
      }

      const finalData = { metadata, data: dataToWrite };
      await fs.writeFile(validatedPath, JSON.stringify(finalData, null, 2));

      return {
        content: [{
          type: "text",
          text: `Checkpoint saved to ${validatedPath}${params.encrypt ? ' (encrypted)' : ''}`
        }]
      };
    } catch (error) {
      // ... error handling ...
    }
  }
);
```

#### Step 9.2: API Authentication

**File:** `src/index.ts`

**Add authentication middleware:**

```typescript
private readonly API_KEYS = new Set(
  (process.env.TITAN_API_KEYS || '').split(',').filter(k => k.length > 0)
);

private validateApiKey(key?: string): boolean {
  if (this.API_KEYS.size === 0) {
    // No keys configured = no auth required (development mode)
    return true;
  }
  return key ? this.API_KEYS.has(key) : false;
}

// Wrap tool handler with auth:
private registerToolWithAuth(
  name: string,
  description: string,
  schema: any,
  handler: Function
): void {
  this.server.tool(name, description, {
    ...schema,
    apiKey: z.string().optional().describe("API authentication key")
  }, async (params) => {
    // Check authentication
    if (!this.validateApiKey(params.apiKey)) {
      return {
        content: [{
          type: "error",
          text: "Authentication required. Provide valid API key."
        }]
      };
    }

    // Call original handler
    return await handler(params);
  });
}
```

#### Step 9.3: Rate Limiting

**File:** `src/index.ts`

**Add rate limiting:**

```typescript
private rateLimiter = new Map<string, { count: number; resetTime: number }>();
private readonly RATE_LIMIT = parseInt(process.env.TITAN_RATE_LIMIT || '100', 10); // requests per minute
private readonly RATE_WINDOW = 60000; // 1 minute

private checkRateLimit(identifier: string): { allowed: boolean; remaining: number } {
  const now = Date.now();
  const record = this.rateLimiter.get(identifier);

  if (!record || now > record.resetTime) {
    // New window
    this.rateLimiter.set(identifier, { count: 1, resetTime: now + this.RATE_WINDOW });
    return { allowed: true, remaining: this.RATE_LIMIT - 1 };
  }

  if (record.count >= this.RATE_LIMIT) {
    return { allowed: false, remaining: 0 };
  }

  record.count++;
  return { allowed: true, remaining: this.RATE_LIMIT - record.count };
}

// Add to tool wrapper:
private async handleToolCall(toolName: string, params: any): Promise<any> {
  const identifier = params.apiKey ||
    'anonymous';
  const rateCheck = this.checkRateLimit(identifier);

  if (!rateCheck.allowed) {
    return {
      content: [{
        type: "error",
        text: "Rate limit exceeded. Try again later."
      }]
    };
  }

  // Continue with tool execution...
}
```

---

## Implementation Tracking

Use `IMPLEMENTATION_PACKAGE.md` for the unified checklist and navigation matrix. `SYSTEM_AUDIT.md` contains the authoritative status tracker, while `IMPLEMENTATION_PROGRESS.md` should be updated after each milestone.

For testing, pair the per-section strategies in this guide with:

1. **Unit coverage** under `test/` (add new files as needed).
2. **Integration harness** (planned) that exercises MCP stdio workflows end-to-end.
3. **Manual MCP validation** using Cursor/Claude until automated coverage lands.

---

## Reference Quick Links

- **Research Paper**: `research_paper_source.md`
- **System Audit**: `SYSTEM_AUDIT.md`
- **Implementation Package**: `IMPLEMENTATION_PACKAGE.md`
- **Current Progress Log**: `IMPLEMENTATION_PROGRESS.md`

---

**End of Detailed Implementation Guide**

This guide provides the context necessary to implement the remaining Titans research features and production-hardening tasks. Each section captures:

- Research paper references with line numbers
- Current code locations
- Implementation steps
- Testing ideas and integration considerations

Follow the priority ordering defined in `IMPLEMENTATION_PACKAGE.md`, and update both the audit and progress tracker as you complete work.
