Gmail MCP Server

EMAIL_CLEANUP_SYSTEM_ARCHITECTURE.md•33.5 KiB

# Gmail MCP Server - Email Cleanup System Architecture

## Executive Summary

This document outlines the comprehensive architecture for an automated email cleanup system designed to integrate with the existing Gmail MCP Server. The system provides intelligent, multi-factor email cleanup with continuous background processing, adaptive learning, and event-driven automation.

## Current System Analysis

### Existing Architecture Strengths

- **Sophisticated Categorization Engine**: Multi-analyzer system with [`ImportanceAnalyzer`](../src/categorization/analyzers/ImportanceAnalyzer.ts), [`DateSizeAnalyzer`](../src/categorization/analyzers/DateSizeAnalyzer.ts), and [`LabelClassifier`](../src/categorization/analyzers/LabelClassifier.ts)
- **Robust Database Schema**: SQLite with comprehensive email metadata and analyzer results
- **Job Queue System**: Async processing with [`JobQueue`](../src/database/JobQueue.ts) and [`JobStatusStore`](../src/database/JobStatusStore.ts)
- **Existing Managers**: [`DeleteManager`](../src/delete/DeleteManager.ts) and [`ArchiveManager`](../src/archive/ArchiveManager.ts) for email operations
- **Modular Design**: Factory pattern for analyzers, configurable scoring

### Architecture Gaps for Cleanup System

- No access pattern tracking for frequently referenced emails
- No automated cleanup policy engine
- No continuous background cleanup processing
- No database optimization post-cleanup
- No adaptive learning and threshold adjustment

---

## Overall System Architecture

```mermaid
graph TB
    subgraph "Enhanced Email Cleanup System"
        subgraph "Phase 1: Foundation"
            A[AccessPatternTracker] --> B[EmailAccessLogger]
            A --> C[SearchActivityTracker]
            D[CleanupPolicyEngine] --> E[StalenessScorer]
            D --> F[CleanupConfigManager]
        end

        subgraph "Phase 2: Core Automation"
            G[AutomationOrchestrator] --> H[ContinuousCleanupEngine]
            G --> I[ScheduledCleanupRunner]
            G --> J[EventDrivenTriggers]
        end

        subgraph "Phase 3: Intelligence"
            K[AdaptiveLearningEngine] --> L[PolicyOptimizer]
            K --> M[ThresholdAdjuster]
            K --> N[PatternAnalyzer]
        end

        subgraph "Phase 4: Advanced Features"
            O[IntelligentScheduler] --> P[LoadBasedScheduling]
            O --> Q[PerformanceAwareScheduling]
            R[DatabaseOptimizer] --> S[AutoVacuum]
            R --> T[IndexOptimization]
        end

        subgraph "Existing System Integration"
            E --> U[ImportanceAnalyzer]
            E --> V[DateSizeAnalyzer]
            E --> W[LabelClassifier]
            H --> X[DeleteManager]
            H --> Y[ArchiveManager]
            G --> Z[JobQueue]
            G --> AA[JobStatusStore]
            C --> BB[SearchEngine]
            B --> CC[DatabaseManager]
        end
    end
```

---

# Implementation Plan

## Phase 1: Foundation Infrastructure (Week 1-2)

### 1.1 Access Pattern Tracking System

**Objective**: Track email access patterns and search activity to identify frequently referenced emails for preservation.

#### Database Schema Extensions

```sql
-- Email access tracking
CREATE TABLE email_access_log (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  email_id TEXT NOT NULL,
  access_type TEXT NOT NULL CHECK(access_type IN ('search_result', 'direct_view', 'thread_view')),
  timestamp INTEGER NOT NULL,
  search_query TEXT,
  user_context TEXT,
  FOREIGN KEY (email_id) REFERENCES email_index(id)
);

-- Search activity tracking
CREATE TABLE search_activity (
  search_id TEXT PRIMARY KEY,
  query TEXT NOT NULL,
  email_results TEXT,        -- JSON array of email IDs
  result_interactions TEXT,  -- JSON array of clicked email IDs
  timestamp INTEGER NOT NULL,
  result_count INTEGER
);

-- Access pattern summary (optimized for queries)
CREATE TABLE email_access_summary (
  email_id TEXT PRIMARY KEY,
  total_accesses INTEGER DEFAULT 0,
  last_accessed INTEGER,
  search_appearances INTEGER DEFAULT 0,
  search_interactions INTEGER DEFAULT 0,
  access_score REAL DEFAULT 0,
  updated_at INTEGER DEFAULT (strftime('%s', 'now')),
  FOREIGN KEY (email_id) REFERENCES email_index(id)
);

-- Indexes for performance
CREATE INDEX idx_access_log_email_id ON email_access_log(email_id);
CREATE INDEX idx_access_log_timestamp ON email_access_log(timestamp);
CREATE INDEX idx_search_activity_timestamp ON search_activity(timestamp);
CREATE INDEX idx_access_summary_score ON email_access_summary(access_score);
```

#### Core Components

**AccessPatternTracker Interface**

```typescript
interface EmailAccessEvent {
  email_id: string;
  access_type: 'search_result' | 'direct_view' | 'thread_view';
  timestamp: Date;
  search_query?: string;
  user_context?: string;
}

interface SearchActivityRecord {
  search_id: string;
  query: string;
  email_results: string[];
  timestamp: Date;
  result_interactions: string[];
}

interface EmailAccessSummary {
  email_id: string;
  total_accesses: number;
  last_accessed: Date;
  search_appearances: number;
  search_interactions: number;
  access_score: number;
}

class AccessPatternTracker {
  async logEmailAccess(event: EmailAccessEvent): Promise<void>;
  async logSearchActivity(record: SearchActivityRecord): Promise<void>;
  async updateAccessSummary(email_id: string): Promise<void>;
  async getAccessSummary(email_id: string): Promise<EmailAccessSummary | null>;
  async calculateAccessScore(email_id: string): Promise<number>;
}
```

### 1.2 Cleanup Policy Engine

**Objective**: Create configurable policies for determining email staleness and cleanup actions.

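
As a rough illustration of what "staleness" could mean in practice, the sketch below blends the per-email factors named in this section's `StalenessScore` interface into one 0-1 value. The weights, the cut-off values, and the helper names (`ASSUMED_WEIGHTS`, `combineStaleness`, `recommend`) are assumptions made for this example; the source does not specify how the factors are combined.

```typescript
// Factor names follow the StalenessScore interface; all values are 0-1,
// where higher always means "more stale".
interface StalenessFactors {
  age_score: number;
  importance_score: number;
  size_penalty: number;
  spam_score: number;
  access_score: number;
}

// ASSUMPTION: illustrative weights that sum to 1. Not taken from the source.
const ASSUMED_WEIGHTS: StalenessFactors = {
  age_score: 0.3,
  importance_score: 0.3,
  size_penalty: 0.1,
  spam_score: 0.15,
  access_score: 0.15,
};

function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

// Weighted sum of the clamped factors, itself clamped to 0-1.
function combineStaleness(factors: StalenessFactors): number {
  let total = 0;
  for (const key of Object.keys(ASSUMED_WEIGHTS) as (keyof StalenessFactors)[]) {
    total += ASSUMED_WEIGHTS[key] * clamp01(factors[key]);
  }
  return clamp01(total);
}

// ASSUMPTION: hypothetical cut-offs for turning a score into a recommendation.
function recommend(total: number): 'keep' | 'archive' | 'delete' {
  if (total >= 0.8) return 'delete';
  if (total >= 0.5) return 'archive';
  return 'keep';
}
```

A weighted sum keeps each factor's contribution easy to inspect, which matters if the thresholds are later tuned automatically.
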
#### Policy Configuration Schema

```typescript
interface CleanupPolicy {
  id: string;
  name: string;
  enabled: boolean;
  priority: number;

  // Staleness criteria
  criteria: {
    age_days_min: number;
    importance_level_max: 'high' | 'medium' | 'low';
    size_threshold_min?: number;     // bytes
    spam_score_min?: number;         // 0-1
    promotional_score_min?: number;  // 0-1
    access_score_max?: number;       // 0-1
    no_access_days?: number;
  };

  // Actions to take
  action: {
    type: 'archive' | 'delete';
    method?: 'gmail' | 'export';
    export_format?: 'mbox' | 'json';
  };

  // Safety settings
  safety: {
    max_emails_per_run: number;
    require_confirmation: boolean;
    dry_run_first: boolean;
    preserve_important: boolean;
  };

  // Scheduling
  schedule?: {
    frequency: 'continuous' | 'daily' | 'weekly' | 'monthly';
    time?: string; // HH:MM format
    enabled: boolean;
  };
}

interface StalenessScore {
  email_id: string;
  total_score: number; // 0-1, higher = more stale
  factors: {
    age_score: number;         // 0-1 (higher = older)
    importance_score: number;  // 0-1 (higher = less important)
    size_penalty: number;      // 0-1 (higher = larger)
    spam_score: number;        // 0-1 (higher = more spam-like)
    access_score: number;      // 0-1 (higher = less accessed)
  };
  recommendation: 'keep' | 'archive' | 'delete';
  confidence: number; // 0-1
}
```

#### Core Components

```typescript
class CleanupPolicyEngine {
  async createPolicy(policy: CleanupPolicy): Promise<string>;
  async updatePolicy(policyId: string, updates: Partial<CleanupPolicy>): Promise<void>;
  async deletePolicy(policyId: string): Promise<void>;
  async getActivePolicies(): Promise<CleanupPolicy[]>;
  async validatePolicy(policy: CleanupPolicy): Promise<{ valid: boolean; errors: string[] }>;
}

class StalenessScorer {
  async calculateStaleness(email: EmailIndex, accessSummary: EmailAccessSummary): Promise<StalenessScore>;
  private calculateAgeScore(date: Date): number;
  private calculateImportanceScore(email: EmailIndex): number;
  private calculateSizeScore(size: number): number;
  private calculateSpamScore(email: EmailIndex): number;
  private calculateAccessScore(accessSummary: EmailAccessSummary): number;
}
```

#### Database Schema for Policies

```sql
-- Cleanup policies storage
CREATE TABLE cleanup_policies (
  id TEXT PRIMARY KEY,
  name TEXT NOT NULL,
  enabled INTEGER DEFAULT 1,
  priority INTEGER DEFAULT 50,
  criteria TEXT NOT NULL,  -- JSON
  action TEXT NOT NULL,    -- JSON
  safety TEXT NOT NULL,    -- JSON
  schedule TEXT,           -- JSON, nullable
  created_at INTEGER DEFAULT (strftime('%s', 'now')),
  updated_at INTEGER DEFAULT (strftime('%s', 'now'))
);

-- Policy execution history
CREATE TABLE policy_execution_history (
  execution_id TEXT PRIMARY KEY,
  policy_id TEXT NOT NULL,
  started_at INTEGER NOT NULL,
  completed_at INTEGER,
  emails_processed INTEGER DEFAULT 0,
  emails_cleaned INTEGER DEFAULT 0,
  errors_encountered INTEGER DEFAULT 0,
  success BOOLEAN DEFAULT 0,
  FOREIGN KEY (policy_id) REFERENCES cleanup_policies(id)
);
```

---

## Phase 2: Core Automation Engine (Week 3-4)

### 2.1 Automation Orchestrator

**Objective**: Central coordinator for all automated cleanup operations with intelligent scheduling and execution.

#### Core Architecture

```typescript
interface AutomationConfig {
  continuous_cleanup: {
    enabled: boolean;
    target_emails_per_minute: number;
    max_concurrent_operations: number;
    pause_during_peak_hours: boolean;
    peak_hours: { start: string; end: string }; // HH:MM format
  };

  event_triggers: {
    storage_threshold: {
      enabled: boolean;
      warning_threshold_percent: number;   // 80%
      critical_threshold_percent: number;  // 95%
      emergency_policies: string[];        // policy IDs
    };
    performance_threshold: {
      enabled: boolean;
      query_time_threshold_ms: number;
      cache_hit_rate_threshold: number;
    };
    email_volume_threshold: {
      enabled: boolean;
      daily_email_threshold: number;
      immediate_cleanup_policies: string[];
    };
  };
}

class AutomationOrchestrator {
  private continuousCleanupEngine: ContinuousCleanupEngine;
  private scheduledCleanupRunner: ScheduledCleanupRunner;
  private eventDrivenTriggers: EventDrivenTriggers;

  async initialize(config: AutomationConfig): Promise<void>;
  async startAutomation(): Promise<void>;
  async stopAutomation(): Promise<void>;
  async pauseAutomation(durationMs: number): Promise<void>;
  async getAutomationStatus(): Promise<AutomationStatus>;
}
```

### 2.2 Continuous Cleanup Engine

**Objective**: Background service that continuously processes emails for cleanup in small batches.

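
The `pause_during_peak_hours` and `peak_hours` settings in `AutomationConfig` imply a time-window check on `HH:MM` strings. The source does not show how that check is implemented, so the following is a minimal sketch under stated assumptions: local time is used, the end of the window is exclusive, a window whose start equals its end is treated as empty, and `toMinutes` is a hypothetical helper.

```typescript
// Shape taken from AutomationConfig['continuous_cleanup']['peak_hours'].
interface PeakHours {
  start: string; // HH:MM
  end: string;   // HH:MM
}

// Hypothetical helper: convert "HH:MM" to minutes since midnight.
function toMinutes(hhmm: string): number {
  const match = /^(\d{2}):(\d{2})$/.exec(hhmm);
  if (!match) throw new Error(`Expected HH:MM, got "${hhmm}"`);
  const hours = Number(match[1]);
  const minutes = Number(match[2]);
  if (hours > 23 || minutes > 59) throw new Error(`Out-of-range time "${hhmm}"`);
  return hours * 60 + minutes;
}

// Returns true when `now` (local time) falls inside the peak window.
function isDuringPeakHours(now: Date, peak: PeakHours): boolean {
  const current = now.getHours() * 60 + now.getMinutes();
  const start = toMinutes(peak.start);
  const end = toMinutes(peak.end);
  if (start === end) return false; // ASSUMPTION: equal bounds mean "no peak window"
  return start < end
    ? current >= start && current < end   // same-day window, e.g. 09:00-17:00
    : current >= start || current < end;  // window wrapping midnight, e.g. 22:00-06:00
}
```

Handling the wrap-around case explicitly avoids silently disabling a window such as 22:00 to 06:00.
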
```typescript
class ContinuousCleanupEngine {
  private isRunning: boolean = false;
  private rateLimiter: RateLimiter;
  private performanceMonitor: PerformanceMonitor;

  async startContinuousCleanup(): Promise<void> {
    this.isRunning = true;

    while (this.isRunning) {
      try {
        // Check system health and adjust operation rate
        const systemHealth = await this.performanceMonitor.getSystemHealth();
        this.rateLimiter.adjustRate(systemHealth);

        // Skip if during peak hours
        if (this.isDuringPeakHours()) {
          await this.sleep(60000); // Wait 1 minute
          continue;
        }

        // Get next batch of emails to process
        const emailBatch = await this.getNextCleanupBatch();
        if (emailBatch.length > 0) {
          await this.processCleanupBatch(emailBatch);
        }

        // Adaptive sleep based on system load
        const sleepDuration = this.calculateSleepDuration(systemHealth);
        await this.sleep(sleepDuration);
      } catch (error) {
        logger.error('Continuous cleanup error:', error);
        await this.sleep(5000); // Back off on errors
      }
    }
  }

  private async getNextCleanupBatch(): Promise<EmailIndex[]> {
    // Get emails ordered by staleness score
    const query = {
      limit: this.config.batch_size,
      orderBy: 'staleness_score',
      orderDirection: 'DESC',
      exclude_recently_accessed: true
    };
    return this.databaseManager.searchEmails(query);
  }

  private async processCleanupBatch(emails: EmailIndex[]): Promise<void> {
    const activePolicies = await this.policyEngine.getActivePolicies();

    for (const email of emails) {
      const accessSummary = await this.accessTracker.getAccessSummary(email.id);
      const stalenessScore = await this.stalenessScorer.calculateStaleness(email, accessSummary);

      // Find applicable policy
      const applicablePolicy = this.findApplicablePolicy(stalenessScore, activePolicies);
      if (applicablePolicy && this.shouldExecuteCleanup(stalenessScore, applicablePolicy)) {
        await this.executeCleanup(email, applicablePolicy);
      }
    }
  }
}
```

### 2.3 Event-Driven Triggers

**Objective**: Respond automatically to system events like storage pressure or performance degradation.

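
Storage pressure is expressed in `AutomationConfig` as a warning and a critical percentage (80% and 95% in the config comments). A small pure function makes that decision easy to test in isolation. This is a sketch only: `StorageLevel` and `classifyStorageUsage` are names invented for the example, and treating a reading exactly at a threshold as exceeding it is an assumption.

```typescript
// Hypothetical result type for this sketch.
type StorageLevel = 'normal' | 'warning' | 'critical';

// Field names follow AutomationConfig['event_triggers']['storage_threshold'].
interface StorageThresholds {
  warning_threshold_percent: number;
  critical_threshold_percent: number;
}

// Maps a storage usage percentage to a level, checking the stricter
// threshold first so a critical reading is never reported as a warning.
function classifyStorageUsage(usagePercent: number, thresholds: StorageThresholds): StorageLevel {
  if (thresholds.warning_threshold_percent > thresholds.critical_threshold_percent) {
    throw new Error('warning threshold must not exceed critical threshold');
  }
  if (usagePercent >= thresholds.critical_threshold_percent) return 'critical';
  if (usagePercent >= thresholds.warning_threshold_percent) return 'warning';
  return 'normal';
}
```

Keeping the comparison separate from the side effects (emergency cleanup, rate changes) means the thresholds can be unit-tested without a running monitor.
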
```typescript
class EventDrivenTriggers {
  private storageMonitor: StorageMonitor;
  private performanceMonitor: PerformanceMonitor;
  private eventHandlers: Map<string, EventHandler>;

  async setupTriggers(config: AutomationConfig['event_triggers']): Promise<void> {
    // Storage threshold monitoring
    if (config.storage_threshold.enabled) {
      this.storageMonitor.onThresholdExceeded(async (usage) => {
        await this.handleStorageThreshold(usage, config.storage_threshold);
      });
    }

    // Performance threshold monitoring
    if (config.performance_threshold.enabled) {
      this.performanceMonitor.onPerformanceDegradation(async (metrics) => {
        await this.handlePerformanceThreshold(metrics, config.performance_threshold);
      });
    }

    // Email volume monitoring
    if (config.email_volume_threshold.enabled) {
      this.emailVolumeMonitor.onVolumeThreshold(async (count) => {
        await this.handleEmailVolumeThreshold(count, config.email_volume_threshold);
      });
    }
  }

  private async handleStorageThreshold(
    usage: number,
    config: AutomationConfig['event_triggers']['storage_threshold']
  ): Promise<void> {
    if (usage >= config.critical_threshold_percent) {
      logger.warn(`Critical storage usage: ${usage}% - executing emergency cleanup`);
      await this.executeEmergencyCleanup(config.emergency_policies);
    } else if (usage >= config.warning_threshold_percent) {
      logger.info(`Storage warning: ${usage}% - accelerating cleanup`);
      await this.accelerateCleanup(2.0); // Double the cleanup rate
    }
  }
}
```

### 2.4 Enhanced Job Management

**Objective**: Extend existing job system with cleanup-specific job types and management.

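
Cleanup jobs in this section carry batch counters in `CleanupJob['progress_details']`. When progress arrives as a partial update, one option is to overlay it on the stored counters rather than replace them. The sketch below assumes that approach; `ProgressDetails`, `mergeProgress`, and `planTotalBatches` are hypothetical names and are not part of the existing job system.

```typescript
// Mirrors the counters declared in CleanupJob['progress_details'].
interface ProgressDetails {
  emails_analyzed: number;
  emails_cleaned: number;
  storage_freed: number;
  errors_encountered: number;
  current_batch: number;
  total_batches: number;
}

// Hypothetical helper: overlay a partial update on the stored counters,
// ignoring keys that are present but undefined.
function mergeProgress(current: ProgressDetails, update: Partial<ProgressDetails>): ProgressDetails {
  const merged: ProgressDetails = { ...current };
  for (const key of Object.keys(update) as (keyof ProgressDetails)[]) {
    const value = update[key];
    if (value !== undefined) merged[key] = value;
  }
  return merged;
}

// Hypothetical helper: number of batches needed for target_emails at batch_size.
function planTotalBatches(targetEmails: number, batchSize: number): number {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new Error('batch_size must be a positive integer');
  }
  return Math.ceil(Math.max(0, targetEmails) / batchSize);
}
```
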
```typescript
interface CleanupJob extends Job {
  job_type: 'continuous_cleanup' | 'scheduled_cleanup' | 'event_cleanup' | 'emergency_cleanup';
  cleanup_metadata: {
    policy_id?: string;
    triggered_by: 'schedule' | 'storage_threshold' | 'performance' | 'user_request' | 'continuous';
    priority: 'low' | 'normal' | 'high' | 'emergency';
    batch_size: number;
    target_emails: number;
  };
  progress_details: {
    emails_analyzed: number;
    emails_cleaned: number;
    storage_freed: number;
    errors_encountered: number;
    current_batch: number;
    total_batches: number;
  };
}

class CleanupJobManager extends JobStatusStore {
  async createCleanupJob(
    type: CleanupJob['job_type'],
    metadata: CleanupJob['cleanup_metadata']
  ): Promise<string> {
    const jobId = await this.createJob(type, {
      cleanup_metadata: metadata,
      progress_details: {
        emails_analyzed: 0,
        emails_cleaned: 0,
        storage_freed: 0,
        errors_encountered: 0,
        current_batch: 0,
        total_batches: 0
      }
    });
    return jobId;
  }

  async updateCleanupProgress(
    jobId: string,
    progress: Partial<CleanupJob['progress_details']>
  ): Promise<void> {
    await this.updateJobStatus(jobId, JobStatus.IN_PROGRESS, {
      progress_details: progress
    });
  }
}
```

#### Database Schema Extensions

```sql
-- Enhanced job tracking for cleanup operations
ALTER TABLE job_statuses ADD COLUMN cleanup_metadata TEXT;  -- JSON
ALTER TABLE job_statuses ADD COLUMN progress_details TEXT;  -- JSON

-- System monitoring metrics
CREATE TABLE system_metrics (
  metric_id INTEGER PRIMARY KEY AUTOINCREMENT,
  timestamp INTEGER NOT NULL,
  storage_usage_percent REAL,
  storage_used_bytes INTEGER,
  storage_total_bytes INTEGER,
  average_query_time_ms REAL,
  cache_hit_rate REAL,
  active_connections INTEGER,
  cleanup_rate_per_minute REAL,
  system_load_average REAL
);

CREATE INDEX idx_metrics_timestamp ON system_metrics(timestamp);
```

---

## Phase 3: Adaptive Intelligence (Week 5-6)

### 3.1 Adaptive Learning Engine

**Objective**: Learn from cleanup history and user behavior to optimize policies automatically.

```typescript
interface CleanupPatterns {
  most_effective_age_thresholds: { [category: string]: number };
  storage_impact_by_category: { [category: string]: number };
  performance_improvements: { [policy_id: string]: number };
  user_complaint_patterns: { [policy_id: string]: number };
  optimal_batch_sizes: { [time_of_day: string]: number };
  seasonal_patterns: { [month: string]: CleanupEffectiveness };
}

interface CleanupEffectiveness {
  policy_id: string;
  emails_processed: number;
  storage_freed: number;
  performance_gain: number;
  user_satisfaction: number;    // 0-1
  false_positive_rate: number;  // 0-1
}

class AdaptiveLearningEngine {
  async analyzeCleanupHistory(days: number = 30): Promise<CleanupPatterns> {
    const history = await this.getCleanupHistory(days);

    return {
      most_effective_age_thresholds: this.analyzeAgeEffectiveness(history),
      storage_impact_by_category: this.analyzeStorageImpact(history),
      performance_improvements: this.analyzePerformanceGains(history),
      user_complaint_patterns: this.analyzeUserComplaints(history),
      optimal_batch_sizes: this.analyzeOptimalBatchSizes(history),
      seasonal_patterns: this.analyzeSeasonalPatterns(history)
    };
  }

  async optimizePolicies(): Promise<PolicyOptimization[]> {
    const patterns = await this.analyzeCleanupHistory();
    const optimizations: PolicyOptimization[] = [];
    const activePolicies = await this.policyEngine.getActivePolicies();

    for (const policy of activePolicies) {
      const optimization = await this.calculatePolicyOptimization(policy, patterns);
      if (optimization.confidence > 0.8) {
        optimizations.push(optimization);
        await this.applyPolicyOptimization(policy.id, optimization);
      }
    }

    return optimizations;
  }

  private async calculatePolicyOptimization(
    policy: CleanupPolicy,
    patterns: CleanupPatterns
  ): Promise<PolicyOptimization> {
    // Analyze policy effectiveness
    const effectiveness = await this.calculatePolicyEffectiveness(policy.id);

    // Calculate optimal thresholds based on patterns
    const optimalAgeThreshold = patterns.most_effective_age_thresholds[policy.criteria.importance_level_max];
    const optimalBatchSize = this.calculateOptimalBatchSize(patterns.optimal_batch_sizes);

    return {
      policy_id: policy.id,
      suggested_changes: {
        age_days_min: optimalAgeThreshold,
        batch_size: optimalBatchSize,
        frequency_adjustment: this.calculateFrequencyAdjustment(effectiveness)
      },
      confidence: this.calculateOptimizationConfidence(effectiveness, patterns),
      expected_improvement: this.calculateExpectedImprovement(policy, patterns)
    };
  }
}
```

### 3.2 Dynamic Threshold Adjustment

**Objective**: Automatically adjust cleanup thresholds based on system performance and user behavior.

```typescript
interface AdaptiveThreshold {
  threshold_type: string;
  current_value: number;
  original_value: number;
  adjustment_history: ThresholdAdjustment[];
  last_optimized: Date;
  optimization_confidence: number;
}

interface ThresholdAdjustment {
  timestamp: Date;
  old_value: number;
  new_value: number;
  reason: string;
  performance_impact: number;
}

class DynamicThresholdAdjuster {
  async adjustThresholds(): Promise<void> {
    const systemMetrics = await this.getRecentSystemMetrics();
    const cleanupEffectiveness = await this.getCleanupEffectiveness();

    // Adjust age thresholds based on storage pressure
    if (systemMetrics.storage_usage_percent > 85) {
      await this.adjustAgeThresholds('decrease', 0.8); // More aggressive
    } else if (systemMetrics.storage_usage_percent < 70) {
      await this.adjustAgeThresholds('increase', 1.2); // Less aggressive
    }

    // Adjust importance thresholds based on user complaints
    const complaintRate = await this.getRecentComplaintRate();
    if (complaintRate > 0.05) { // 5% complaint rate
      await this.adjustImportanceThresholds('more_conservative');
    }

    // Adjust batch sizes based on performance
    if (systemMetrics.average_query_time_ms > 1000) {
      await this.adjustBatchSizes('decrease');
    }
  }

  private async adjustAgeThresholds(direction: 'increase' | 'decrease', factor: number): Promise<void> {
    const thresholds = await this.getAdaptiveThresholds('age_thresholds');

    for (const threshold of thresholds) {
      // The factor already encodes the direction (0.8 lowers, 1.2 raises),
      // so it is applied by multiplication in both cases. Dividing by 0.8
      // for 'decrease' would raise the threshold instead of lowering it.
      const newValue = threshold.current_value * factor;

      await this.updateThreshold(
        threshold.threshold_type,
        newValue,
        `Auto-adjusted (${direction}) due to storage pressure`
      );
    }
  }
}
```

### 3.3 Pattern Analysis and Prediction

**Objective**: Identify patterns in email behavior and predict optimal cleanup times.

```typescript
class PatternAnalyzer {
  async identifyUserBehaviorPatterns(): Promise<UserBehaviorPatterns> {
    const accessData = await this.getAccessPatternData(90); // 90 days

    return {
      peak_activity_hours: this.identifyPeakHours(accessData),
      email_reading_patterns: this.analyzeReadingPatterns(accessData),
      search_behavior: this.analyzeSearchBehavior(accessData),
      cleanup_reaction_patterns: this.analyzeCleanupReactions(accessData)
    };
  }

  async predictOptimalCleanupTimes(): Promise<OptimalCleanupSchedule> {
    const patterns = await this.identifyUserBehaviorPatterns();
    const systemLoad = await this.getSystemLoadPatterns();

    return {
      daily_optimal_windows: this.calculateOptimalWindows(patterns, systemLoad),
      weekly_patterns: this.identifyWeeklyPatterns(patterns),
      seasonal_adjustments: this.calculateSeasonalAdjustments(patterns)
    };
  }

  async identifyEmailImportancePatterns(): Promise<ImportancePatterns> {
    const emails = await this.getEmailsWithAccessData();

    return {
      domain_importance_mapping: this.analyzeDomainImportance(emails),
      keyword_importance_signals: this.analyzeKeywordImportance(emails),
      thread_participation_importance: this.analyzeThreadImportance(emails),
      time_based_importance_decay: this.analyzeImportanceDecay(emails)
    };
  }
}
```

#### Database Schema for Learning

```sql
-- Adaptive thresholds that evolve over time
CREATE TABLE adaptive_thresholds (
  threshold_type TEXT PRIMARY KEY,
  current_value REAL NOT NULL,
  original_value REAL NOT NULL,
  adjustment_history TEXT,  -- JSON array of adjustments
  last_optimized INTEGER,
  optimization_confidence REAL,
  updated_at INTEGER DEFAULT (strftime('%s', 'now'))
);

-- Pattern analysis results
CREATE TABLE user_behavior_patterns (
  pattern_id TEXT PRIMARY KEY,
  pattern_type TEXT NOT NULL,  -- 'peak_hours', 'reading_patterns', etc.
  pattern_data TEXT NOT NULL,  -- JSON
  confidence_score REAL,
  last_updated INTEGER DEFAULT (strftime('%s', 'now')),
  next_analysis INTEGER
);

-- Policy optimization history
CREATE TABLE policy_optimization_history (
  optimization_id TEXT PRIMARY KEY,
  policy_id TEXT NOT NULL,
  optimization_timestamp INTEGER NOT NULL,
  changes_made TEXT NOT NULL,  -- JSON
  predicted_improvement REAL,
  actual_improvement REAL,
  confidence_score REAL,
  FOREIGN KEY (policy_id) REFERENCES cleanup_policies(id)
);
```

---

## Phase 4: Advanced Automation Features (Week 7-8)

### 4.1 Intelligent Load-Based Scheduling

**Objective**: Schedule cleanup operations based on system load and user activity patterns.

```typescript
interface LoadBasedScheduling {
  load_thresholds: {
    cpu_threshold: number;
    memory_threshold: number;
    io_threshold: number;
    user_activity_threshold: number;
  };
  scheduling_rules: {
    pause_during_high_load: boolean;
    scale_operations_with_load: boolean;
    prefer_low_activity_periods: boolean;
    adaptive_batch_sizing: boolean;
  };
}

class IntelligentScheduler {
  async scheduleOptimalCleanup(): Promise<ScheduleOptimization> {
    const systemLoad = await this.getCurrentSystemLoad();
    const userActivity = await this.getCurrentUserActivity();
    const predictedOptimalTimes = await this.patternAnalyzer.predictOptimalCleanupTimes();

    return {
      immediate_operations: this.calculateImmediateOperations(systemLoad, userActivity),
      scheduled_operations: this.optimizeScheduledOperations(predictedOptimalTimes),
      adaptive_adjustments: this.calculateAdaptiveAdjustments(systemLoad)
    };
  }

  async adjustScheduleBasedOnLoad(): Promise<void> {
    const currentLoad = await this.getCurrentSystemLoad();

    if (currentLoad.cpu_usage > 80 || currentLoad.memory_usage > 85) {
      // Pause or reduce cleanup operations
      await this.orchestrator.pauseAutomation(300000); // 5 minutes
      logger.info('Paused cleanup due to high system load');
    } else if (currentLoad.cpu_usage < 30 && currentLoad.memory_usage < 50) {
      // Accelerate cleanup operations
      await this.orchestrator.accelerateCleanup(1.5);
      logger.info('Accelerated cleanup due to low system load');
    }
  }
}
```

### 4.2 Performance-Aware Database Optimization

**Objective**: Optimize database performance automatically after cleanup operations.

```typescript
class DatabaseOptimizer {
  async optimizePostCleanup(cleanupResults: CleanupResults): Promise<OptimizationResults> {
    const results: OptimizationResults = {
      vacuum_performed: false,
      indexes_rebuilt: false,
      storage_freed: 0,
      performance_improvement: 0,
      optimization_time_ms: 0
    };

    const startTime = Date.now();

    try {
      // 1. Analyze if optimization is needed
      const optimizationNeeded = await this.assessOptimizationNeed(cleanupResults);

      if (optimizationNeeded.vacuum_needed) {
        await this.performVacuum();
        results.vacuum_performed = true;
      }

      if (optimizationNeeded.reindex_needed) {
        await this.rebuildIndexes();
        results.indexes_rebuilt = true;
      }

      // 2. Update access summaries in batch
      await this.updateAccessSummariesBatch();

      // 3. Analyze statistics
      await this.updateTableStatistics();

      // 4. Calculate optimization impact
      results.storage_freed = await this.calculateStorageFreed();
      results.performance_improvement = await this.measurePerformanceImprovement();
    } catch (error) {
      logger.error('Database optimization failed:', error);
      throw error;
    } finally {
      results.optimization_time_ms = Date.now() - startTime;
    }

    return results;
  }

  private async assessOptimizationNeed(cleanupResults: CleanupResults): Promise<OptimizationAssessment> {
    const deletedCount = cleanupResults.emails_deleted;
    const totalEmails = await this.getTotalEmailCount();
    // Guard against an empty index to avoid dividing by zero
    const deletionPercentage = totalEmails > 0 ? deletedCount / totalEmails : 0;

    return {
      vacuum_needed: deletionPercentage > 0.1,        // 10% of emails deleted
      reindex_needed: deletedCount > 10000,           // More than 10k emails deleted
      stats_update_needed: deletionPercentage > 0.05  // 5% of emails deleted
    };
  }

  private async performVacuum(): Promise<void> {
    logger.info('Starting database VACUUM operation');
    await this.databaseManager.execute('VACUUM');
    logger.info('Database VACUUM completed');
  }

  private async rebuildIndexes(): Promise<void> {
    const indexes = [
      'idx_email_category',
      'idx_email_date',
      'idx_email_importance_score',
      'idx_access_summary_score'
    ];

    for (const index of indexes) {
      await this.databaseManager.execute(`REINDEX ${index}`);
    }
  }
}
```

### 4.3 Advanced Monitoring and Alerting

**Objective**: Comprehensive monitoring of cleanup operations with intelligent alerting.

```typescript
interface MonitoringConfig {
  alerts: {
    cleanup_failure_threshold: number;
    performance_degradation_threshold: number;
    storage_critical_threshold: number;
    false_positive_rate_threshold: number;
  };
  reporting: {
    daily_summary: boolean;
    weekly_analysis: boolean;
    monthly_optimization_report: boolean;
    real_time_metrics: boolean;
  };
}

class CleanupMonitor {
  // Stored by startMonitoring so checkSystemHealth can read the alert thresholds
  private config!: MonitoringConfig;

  async startMonitoring(config: MonitoringConfig): Promise<void> {
    this.config = config;

    // Real-time performance monitoring
    setInterval(async () => {
      await this.checkSystemHealth();
    }, 30000); // Every 30 seconds

    // Daily summary reports
    if (config.reporting.daily_summary) {
      this.scheduleDaily(async () => {
        await this.generateDailySummary();
      });
    }

    // Weekly analysis
    if (config.reporting.weekly_analysis) {
      this.scheduleWeekly(async () => {
        await this.generateWeeklyAnalysis();
      });
    }
  }

  async checkSystemHealth(): Promise<void> {
    const metrics = await this.getCurrentMetrics();

    // Check for performance degradation
    if (metrics.average_query_time_ms > this.config.alerts.performance_degradation_threshold) {
      await this.sendAlert('performance_degradation', {
        current_time: metrics.average_query_time_ms,
        threshold: this.config.alerts.performance_degradation_threshold
      });
    }

    // Check storage levels
    if (metrics.storage_usage_percent > this.config.alerts.storage_critical_threshold) {
      await this.sendAlert('storage_critical', {
        current_usage: metrics.storage_usage_percent,
        threshold: this.config.alerts.storage_critical_threshold
      });
    }

    // Check cleanup effectiveness
    const falsePositiveRate = await this.calculateRecentFalsePositiveRate();
    if (falsePositiveRate > this.config.alerts.false_positive_rate_threshold) {
      await this.sendAlert('high_false_positive_rate', {
        rate: falsePositiveRate,
        threshold: this.config.alerts.false_positive_rate_threshold
      });
    }
  }

  async generateDailySummary(): Promise<DailySummaryReport> {
    const yesterday = new Date(Date.now() - 24 * 60 * 60 * 1000);

    return {
      date: yesterday,
      emails_processed: await this.getEmailsProcessedCount(yesterday),
      emails_cleaned: await this.getEmailsCleanedCount(yesterday),
      storage_freed: await this.getStorageFreed(yesterday),
      performance_impact: await this.getPerformanceImpact(yesterday),
      policy_effectiveness: await this.getPolicyEffectiveness(yesterday),
      user_complaints: await this.getUserComplaints(yesterday),
      system_health_score: await this.calculateSystemHealthScore(yesterday)
    };
  }
}
```

---

## Integration with Existing System

### Integration Points

1. **Database Manager Integration**
   - Extend [`DatabaseManager`](../src/database/DatabaseManager.ts) with access tracking methods
   - Add cleanup-specific queries and optimizations
   - Integrate with existing email index schema

2. **Job System Integration**
   - Extend [`JobQueue`](../src/database/JobQueue.ts) and [`JobStatusStore`](../src/database/JobStatusStore.ts)
   - Add cleanup-specific job types and metadata
   - Integrate with existing job processing

3. **Analyzer Integration**
   - Utilize existing [`ImportanceAnalyzer`](../src/categorization/analyzers/ImportanceAnalyzer.ts), [`DateSizeAnalyzer`](../src/categorization/analyzers/DateSizeAnalyzer.ts), and [`LabelClassifier`](../src/categorization/analyzers/LabelClassifier.ts)
   - Extend scoring algorithms for staleness calculation
   - Reuse categorization results

4. **Delete/Archive Manager Integration**
   - Enhance [`DeleteManager`](../src/delete/DeleteManager.ts) with automated cleanup capabilities
   - Integrate with [`ArchiveManager`](../src/archive/ArchiveManager.ts) for intelligent archiving
   - Add safety mechanisms and rollback capabilities

### Configuration Schema

```typescript
interface EmailCleanupSystemConfig {
  automation: AutomationConfig;
  policies: CleanupPolicy[];
  monitoring: MonitoringConfig;
  optimization: {
    database_optimization: boolean;
    performance_monitoring: boolean;
    adaptive_learning: boolean;
  };
  safety: {
    max_emails_per_day: number;
    confirmation_thresholds: {
      high_importance: number;
      bulk_operations: number;
    };
    rollback_capability: boolean;
    user_notification: boolean;
  };
}
```

---

## Deployment and Testing Strategy

### Phase 1 Testing

- Unit tests for access pattern tracking
- Integration tests with existing search system
- Policy engine validation tests

### Phase 2 Testing

- Continuous cleanup engine load testing
- Event trigger simulation tests
- Job system integration tests

### Phase 3 Testing

- Learning algorithm validation
- Pattern analysis accuracy tests
- Threshold adjustment effectiveness tests

### Phase 4 Testing

- Full system integration tests
- Performance impact assessment
- User acceptance testing with dry-run mode

### Production Deployment

1. Deploy with dry-run mode enabled
2. Monitor for 1 week with real data
3. Gradually enable automated cleanup
4. Full automation after validation

---

## Success Metrics

- **Storage Optimization**: 30-50% reduction in storage usage
- **Performance Improvement**: 25% faster query times
- **Automation Effectiveness**: 95% of cleanups require no user intervention
- **User Satisfaction**: <2% false positive rate for important emails
- **System Reliability**: 99.9% uptime for cleanup services

This architecture provides a comprehensive, intelligent email cleanup system that learns and adapts while maintaining safety and performance.
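
The dry-run-first rollout described under Production Deployment depends on the `safety` flags of `CleanupPolicy`. One way those flags might gate execution is sketched below. `RunState`, `RunMode`, `decideRunMode`, and `capBatch` are hypothetical names introduced for this example, and the source does not say how dry-run completion or user confirmation is actually tracked.

```typescript
// Mirrors CleanupPolicy['safety'].
interface SafetySettings {
  max_emails_per_run: number;
  require_confirmation: boolean;
  dry_run_first: boolean;
  preserve_important: boolean;
}

// ASSUMPTION: hypothetical per-policy state, not defined in the source.
interface RunState {
  dryRunCompleted: boolean;
  userConfirmed: boolean;
}

type RunMode = 'dry_run' | 'await_confirmation' | 'execute';

// Decide what a policy run may do, checking the most conservative gate first.
function decideRunMode(safety: SafetySettings, state: RunState): RunMode {
  if (safety.dry_run_first && !state.dryRunCompleted) return 'dry_run';
  if (safety.require_confirmation && !state.userConfirmed) return 'await_confirmation';
  return 'execute';
}

// Enforce max_emails_per_run on a candidate list without mutating it.
function capBatch<T>(candidates: T[], safety: SafetySettings): T[] {
  return candidates.slice(0, Math.max(0, safety.max_emails_per_run));
}
```
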
