folder-mcp

Overview Schema Related Servers Score Discussions

Logging-Enhancement-PRD.md•12.9 KiB

# Logging Enhancement PRD **folder-mcp Observability & Monitoring Improvement** ## Executive Summary Following comprehensive analysis using the TMOAT (The Mother Of All Tests) methodology and deep code review, this PRD outlines critical improvements to the folder-mcp logging system. The current logging architecture generates significant noise while missing essential operational context, making debugging and system monitoring unnecessarily difficult. **Primary Issues Identified:** 1. **Log Spam**: Repetitive, uninformative messages that overwhelm logs 2. **Missing Context**: Vague messages that don't provide actionable information 3. **Poor Signal-to-Noise Ratio**: Critical information buried in routine operations 4. **Inconsistent Categorization**: Wrong log levels for routine vs exceptional events ## Problem Analysis ### Critical Issue #1: Memory Monitoring Spam **Current Behavior:** ``` 2025-08-19T18:04:41.938Z WARN [ORCHESTRATOR] High memory usage detected | {"heapUsedMB":27,"heapUtilizationPercent":91,"recommendation":"Consider reducing concurrent operations or restarting daemon"} 2025-08-19T18:04:51.939Z WARN [ORCHESTRATOR] High memory usage detected | {"heapUsedMB":28,"heapUtilizationPercent":91,"recommendation":"Consider reducing concurrent operations or restarting daemon"} ``` **Problems:** - Triggers on **27-28MB** usage, which is extremely low for a Node.js daemon - **Every 10 seconds** without any context about what's normal - **91% heap utilization** might be normal Node.js behavior with small heap - Users can't distinguish between normal memory pressure and actual memory leaks - Provides generic recommendation without system context **Root Cause:** Hardcoded thresholds (85% utilization, 400MB absolute) without considering: - System total memory - Normal Node.js heap behavior - Application baseline memory usage - Memory growth trends vs absolute values ### Critical Issue #2: Uninformative Model Validation **Current Behavior:** ``` 2025-08-19T14:55:13.562Z INFO [folder-mcp] Handling model list request 2025-08-19T14:55:13.567Z INFO [folder-mcp] Handling model list request 2025-08-19T14:55:14.561Z INFO [folder-mcp] Handling model list request ``` **Problems:** - Shows **WHAT** (model list access) but not **WHY** (validation context) - No information about **WHO** requested it (client ID, validation trigger) - No **OUTCOME** (validation passed/failed, models returned) - Multiple identical messages with no differentiation - INFO level for internal validation operations **Root Cause Analysis:** Location: `src/daemon/websocket/handlers/model-handlers.ts:36` ```typescript this.logger.info('Handling model list request'); ``` This logs the entry point but provides no context about: - Request source (validation vs user request) - Request parameters - Processing outcome - Performance metrics ### Additional Logging Issues **3. Protocol Errors as Warnings** - WebSocket protocol errors logged as WARN instead of DEBUG/INFO - Missing request IDs make protocol debugging difficult **4. Missing Operational Context** - No startup performance metrics - Folder indexing progress without context - Missing transaction tracing for complex operations **5. Inconsistent Log Levels** - Routine operations at INFO that should be DEBUG - System warnings for normal conditions - Missing ERROR logs for actual failures ## Solution Architecture ### 1. Intelligent Memory Monitoring **Replace Current:** Fixed threshold warnings **With:** Adaptive, context-aware monitoring ```typescript interface MemoryProfile { baseline: number; // MB - established after 5 minutes trend: 'stable' | 'growing' | 'oscillating'; growthRate: number; // MB/hour lastGC: Date; systemMemory: number; // Total system memory } interface MemoryAlert { level: 'info' | 'warn' | 'critical'; trigger: 'absolute' | 'growth' | 'utilization' | 'leak'; context: string; recommendation: string; } ``` **New Memory Logging Strategy:** - **Baseline establishment**: Silent monitoring for 5 minutes to establish normal usage - **Growth-based alerts**: Warn on sustained growth (>10MB/hour), not snapshots - **System-aware thresholds**: Scale warnings based on available system memory - **Smart intervals**: - Normal conditions: Log every 5 minutes at DEBUG level - Growth detected: Every 30 seconds at INFO level - Critical conditions: Every 10 seconds at WARN level ### 2. Context-Rich Request Logging **Replace Current:** Generic operation logs **With:** Request lifecycle tracing ```typescript interface RequestContext { requestId: string; clientId: string; operation: string; trigger: 'user' | 'validation' | 'system'; startTime: Date; parameters?: Record<string, any>; } interface OperationOutcome { requestId: string; success: boolean; duration: number; itemsProcessed?: number; errorCode?: string; cacheHit?: boolean; } ``` **New Request Logging Pattern:** ```typescript // Instead of: "Handling model list request" logger.info('Model validation request', { requestId: 'req_1234567890', clientId: 'client-abc123', trigger: 'folder_add_validation', requestedModel: 'gpu:all-MiniLM-L6-v2' }); // Include outcome: logger.info('Model validation completed', { requestId: 'req_1234567890', success: true, duration: 15, modelAvailable: true, cacheHit: false }); ``` ### 3. Log Level Rationalization **New Level Guidelines:** | Level | Purpose | Examples | |-------|---------|----------| | **DEBUG** | Development/troubleshooting | Protocol handshakes, internal state transitions, cache hits | | **INFO** | Normal operations that users care about | Folder added, indexing completed, client connections | | **WARN** | Concerning but recoverable conditions | Model not cached, temporary file access issues, performance degradation | | **ERROR** | Operation failures requiring attention | Database connection failed, invalid configuration, indexing failures | | **FATAL** | System-level failures requiring restart | Unrecoverable errors, critical resource exhaustion | ### 4. Enhanced Operational Visibility **Add Missing Context:** **Folder Lifecycle Logging:** ```typescript logger.info('Folder indexing started', { path: '/path/to/folder', estimatedFiles: 1247, model: 'gpu:all-MiniLM-L6-v2', indexingId: 'idx_789012345' }); logger.info('Folder indexing progress', { indexingId: 'idx_789012345', progress: 65, filesProcessed: 810, filesRemaining: 437, estimatedCompletion: '2025-08-19T18:15:30Z' }); logger.info('Folder indexing completed', { indexingId: 'idx_789012345', totalFiles: 1247, duration: 145, averageProcessingTime: 0.116, databaseSize: '15.2MB' }); ``` **Performance Telemetry:** ```typescript logger.info('System performance snapshot', { activeConnections: 3, activeFolders: 5, totalDocuments: 15420, memoryBaseline: 32, memoryTrend: 'stable', avgQueryTime: 0.045 }); ``` ## Implementation Strategy ### Phase 1: Critical Fixes (Week 1) **Priority 1: Memory Monitoring Intelligence** - Replace hardcoded thresholds with adaptive monitoring - Implement baseline establishment logic - Add system memory context to warnings - Reduce logging frequency for normal conditions **Priority 2: Request Context Enhancement** - Add request ID generation to all WebSocket messages - Enhance model handler logging with full context - Include operation outcomes in logs ### Phase 2: Operational Improvements (Week 2) **Priority 3: Log Level Corrections** - Audit all existing log statements for appropriate levels - Move routine operations from INFO to DEBUG - Ensure actual errors are logged at ERROR level **Priority 4: Enhanced Visibility** - Add folder lifecycle progress logging - Implement performance telemetry snapshots - Add startup/shutdown performance metrics ### Phase 3: Advanced Features (Week 3) **Priority 5: Structured Logging Standards** - Standardize log message formats across components - Implement consistent metadata structures - Add correlation ID tracking for complex operations **Priority 6: Log Management Features** - Add log rotation for file transports - Implement log level runtime configuration - Add performance impact monitoring for logging itself ## Success Metrics ### Quantitative Goals - **Reduce log volume by 70%** while maintaining operational visibility - **Eliminate false positive warnings** (target: <1% of current memory warnings) - **Increase actionable log ratio** from ~15% to >80% - **Reduce time to diagnosis** for common issues by 60% ### Qualitative Improvements - **Clear signal-to-noise**: Users can quickly identify actual issues - **Actionable information**: Each log entry provides context for next steps - **Consistent formatting**: Predictable log patterns across all components - **Progressive detail**: More information available at higher log levels ### Monitoring & Validation **Log Quality Metrics:** - False positive rate for warnings and errors - Time-to-resolution for issues requiring log analysis - Developer satisfaction with log clarity and usefulness **Performance Impact:** - Logging overhead measurement (<5% of total CPU) - Memory usage of logging infrastructure (<2% of heap) - Storage efficiency of log files ## Technical Implementation Details ### Memory Monitoring Algorithm ```typescript class IntelligentMemoryMonitor { private baseline: number | null = null; private samples: MemorySample[] = []; private establishmentStart = Date.now(); checkMemory(): MemoryAlert | null { const current = this.getCurrentMemory(); this.samples.push(current); // Baseline establishment period (5 minutes) if (!this.baseline && Date.now() - this.establishmentStart < 5 * 60 * 1000) { return null; // Silent monitoring } // Establish baseline from stable period if (!this.baseline) { this.baseline = this.calculateBaseline(); } // Growth analysis const growthRate = this.calculateGrowthRate(); const deviation = current.heapUsed - this.baseline; // Alert logic if (growthRate > 10 && deviation > 50) { return { level: 'warn', trigger: 'growth', context: `Memory growing at ${growthRate.toFixed(1)}MB/hr, ${deviation.toFixed(0)}MB above baseline`, recommendation: 'Monitor for memory leak patterns' }; } return null; // No alert needed } } ``` ### Enhanced Request Logging ```typescript class RequestLogger { private activeRequests = new Map<string, RequestContext>(); startRequest(operation: string, clientId: string, params?: any): string { const requestId = `req_${Date.now()}_${Math.random().toString(36).substr(2, 9)}`; const context: RequestContext = { requestId, clientId, operation, trigger: this.inferTrigger(operation, params), startTime: new Date(), parameters: params }; this.activeRequests.set(requestId, context); logger.info(`${operation} started`, { requestId, clientId, trigger: context.trigger, ...params }); return requestId; } completeRequest(requestId: string, outcome: Partial<OperationOutcome>): void { const context = this.activeRequests.get(requestId); if (!context) return; const duration = Date.now() - context.startTime.getTime(); logger.info(`${context.operation} completed`, { requestId, success: outcome.success ?? true, duration, ...outcome }); this.activeRequests.delete(requestId); } } ``` ## Migration Plan ### Backward Compatibility - Maintain existing log format for external tools during transition - Add feature flag for new vs legacy logging modes - Provide migration guide for log parsing scripts ### Rollout Strategy 1. **Development Environment**: Full implementation and validation 2. **Staging Environment**: Performance impact testing 3. **Production**: Gradual rollout with monitoring 4. **Feedback Loop**: Adjust thresholds based on real-world usage ### Risk Mitigation - **Performance Impact**: Benchmark logging overhead before rollout - **Log Volume**: Monitor disk usage during transition - **Debugging Capability**: Ensure new logs provide better debugging than old system --- ## Conclusion The current logging system generates more noise than signal, making system monitoring and debugging unnecessarily difficult. This PRD provides a comprehensive solution that: 1. **Eliminates log spam** through intelligent monitoring and appropriate log levels 2. **Adds essential context** to make every log entry actionable 3. **Improves operational visibility** with structured, informative messages 4. **Maintains performance** while dramatically improving usefulness Implementation of these changes will transform folder-mcp logs from a source of confusion into a powerful operational tool that enables rapid debugging, clear system understanding, and proactive monitoring. **Expected Outcome**: Logs that tell the complete story of system behavior in a clear, concise manner that enables rapid problem resolution and confident system operation.

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/okets/folder-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

Logging-Enhancement-PRD.md•12.9 KiB