Skip to main content
Glama
wiki-documentation-plan.md12.6 kB
# Wiki Documentation Plan - Memory Quality System Era (v8.45+) ## Overview This document outlines the comprehensive wiki documentation needed to cover the major improvements introduced between v8.45.0 and v8.48.0, representing the **Memory Quality System Era** - a transformative upgrade to MCP Memory Service. ## Timeline Summary - **v8.45.0** (Dec 5, 2025): Memory Quality System Launch - **v8.45.1-v8.45.3** (Dec 5-6): Quality System stabilization and infrastructure - **v8.46.0-v8.46.3** (Dec 6): Quality + Hooks integration - **v8.47.0-v8.47.1** (Dec 6-7): Association-based quality boost and ONNX improvements - **v8.48.0** (Dec 7): CSV-based metadata compression **Total**: 11 releases in 3 days representing ~40 major features/fixes ## Proposed Wiki Structure ### 1. **Memory Quality System Guide** (New Page) **URL**: `wiki/Memory-Quality-System-Guide` **Sections**: - Overview and Philosophy (Memento-inspired design) - Architecture - Multi-tier provider system (Local SLM → Groq → Gemini → Implicit) - Local-first design with ONNX (ms-marco-MiniLM-L-6-v2) - Zero-cost, privacy-preserving evaluation - Features - Automatic quality scoring (0.0-1.0) - Quality-based forgetting and retention tiers - Quality-weighted decay - Quality-boosted search (opt-in) - Association-based quality boost (v8.47.0) - Configuration - Environment variables (10 config options) - Provider selection (local/groq/gemini/auto) - Device selection (CPU/CUDA/MPS/DirectML/ROCm) - Quality boost settings - Platform Support - Windows (CUDA, DirectML) - macOS (MPS) - Linux (CUDA, ROCm) - Performance Benchmarks - Latency: 50-100ms CPU, 10-20ms GPU - Accuracy metrics - Cost comparison ($0 local vs cloud APIs) - Usage Examples - MCP tools (rate_memory, get_memory_quality, analyze_quality_distribution, retrieve_with_quality_boost) - HTTP API endpoints (4 new REST endpoints) - Dashboard UI (quality badges, analytics view) - Troubleshooting - ONNX model loading issues - Quality score persistence (v8.46.3 fix) - Windows-specific considerations ### 2. **Quality + Hooks Integration Guide** (New Page) **URL**: `wiki/Quality-Hooks-Integration` **Sections**: - Integration Overview (3-phase approach from v8.46.0) - Phase 1: Hooks read backendQuality from metadata (20% scoring weight) - Phase 2: Session-end hook triggers async quality evaluation - Phase 3: Quality-boosted search integration - Hook Scoring Weights - timeDecay: 20% - tagRelevance: 30% - contentRelevance: 10% - contentQuality: 20% - backendQuality: 20% - Quality Evaluation Endpoint - POST `/api/quality/memories/{hash}/evaluate` - Performance: ~355ms with ONNX ranker - Non-blocking with 10s timeout - Implementation Details - calculateBackendQuality() in memory-scorer.js - triggerQualityEvaluation() in session-end.js - queryMemories() qualityBoost option - Cross-Platform Considerations - Windows session-start hook crash fix (v8.46.2) - Windows installer encoding fix (v8.46.1) ### 3. **Metadata Compression System** (New Page) **URL**: `wiki/Metadata-Compression-System` **Sections**: - Problem Statement - Cloudflare D1 10KB metadata limit - Quality/consolidation metadata size explosion - Sync failure impact (400 Bad Request errors) - Solution: CSV-Based Compression (Phase 1) - Architecture (metadata_codec.py) - CSV encoding/decoding implementation - Provider code mapping (onnx_local → ox, etc.) - 78% size reduction (732B → 159B typical) - Metadata Validation - Pre-sync size checks (<9.5KB threshold) - Prevents API failures before they occur - Implementation in hybrid.py (lines 547-559) - Transparent Operation - Automatic compression on write - Automatic decompression on read - Zero user-facing impact - Backward compatibility guaranteed - Quality Metadata Optimizations - ai_scores history limited (10 → 3 most recent) - quality_components removed from sync (debug-only) - Cloudflare-specific field suppression - Performance Impact - <1ms overhead per operation - 100% sync success rate (0 failures) - 3,750 ONNX-scored memories verified - Future Roadmap (3-phase plan) - Phase 1: CSV compression (COMPLETE) - Phase 2: Binary encoding with struct/msgpack (85-90% target) - Phase 3: Reference-based deduplication - Verification - verify_compression.sh script - Round-trip accuracy testing - Compression ratio measurement ### 4. **ONNX Quality Evaluation Deep Dive** (New Page) **URL**: `wiki/ONNX-Quality-Evaluation` **Sections**: - ONNX Ranker Overview - Model: ms-marco-MiniLM-L-6-v2 cross-encoder - Size: 23MB - Performance: 7-16ms per memory (CPU) - Critical Understanding: Cross-Encoder Design - Scores query-memory relevance, NOT absolute quality - Requires meaningful query-memory pairs - Self-matching queries produce artificially high scores - ONNX Self-Match Bug (v8.47.1) - Problem: Using memory content as its own query - Result: Artificially inflated scores (~1.0 for all) - Fix: Generate queries from tags/metadata (what memory is *about*) - Realistic distribution: avg 0.468 (42.9% high, 3.2% medium, 53.9% low) - Association Pollution (v8.47.1) - Problem: System-generated associations/clusters scored - Fix: Filter type='association' and type='compressed_cluster' - Impact: 948 system-generated memories excluded - Model Export and Loading (v8.45.3) - Dynamic export from transformers to ONNX on first use - Offline mode support (local_files_only=True) - Air-gapped environment compatibility - Export location: ~/.cache/mcp_memory/onnx_models/ - Bulk Evaluation - scripts/quality/bulk_evaluate_onnx.py - Association filtering - Sync monitoring with queue size reporting - Progress reporting and sync completion waiting - Reset and Recovery - scripts/quality/reset_onnx_scores.py - Reset all scores to implicit defaults (0.5) - Pauses sync during reset - Use case: Recover from bad evaluation ### 5. **Hybrid Backend Enhancements** (Update Existing Page) **URL**: `wiki/Hybrid-Backend-Guide` (update) **New Sections to Add**: - Sync Queue Overflow Handling (v8.47.1) - Problem: Queue capacity (1,000) overwhelmed by bulk operations (4,478 updates) - Result: 278 Cloudflare sync failures (27.8% rate) - Fix: Queue size increased to 2,000 (`MCP_HYBRID_QUEUE_SIZE`) - Fix: Batch size increased to 100 (`MCP_HYBRID_BATCH_SIZE`) - 5-second timeout with fallback to immediate sync on queue full - wait_for_sync_completion() method for monitoring - Result: 0% sync failure rate - Metadata Normalization for Cloudflare (v8.46.3) - _normalize_metadata_for_cloudflare() helper function - Separates top-level keys from custom metadata fields - Wraps custom fields in 'metadata' key - Idempotent operation - Quality Score Persistence (v8.46.3) - Fixed scores not persisting to Cloudflare - Metadata structure expectations (wrapped vs top-level) - Quality API metadata handling - SyncOperation preserve_timestamps flag - Sync Pause/Resume Enhancements (v8.47.1) - _sync_paused flag prevents enqueuing during pause - Fixed race condition where operations enqueued while paused - Ensures operations not lost during consolidation/bulk updates - CSV Metadata Compression Integration (v8.48.0) - compress_metadata_for_sync() in sync pipeline - decompress_metadata_from_sync() on retrieval - Size validation before Cloudflare API calls - 4 memory construction points in cloudflare.py ### 6. **Consolidation System Updates** (Update Existing Page) **URL**: `wiki/Memory-Consolidation-System-Guide` (update) **New Sections to Add**: - Association-Based Quality Boost (v8.47.0) - Well-connected memories (≥5 connections) get 20% quality boost - Network effect: frequently referenced = likely more valuable - Configuration: - MCP_CONSOLIDATION_QUALITY_BOOST_ENABLED (default: true) - MCP_CONSOLIDATION_MIN_CONNECTIONS_FOR_BOOST (default: 5, range: 1-100) - MCP_CONSOLIDATION_QUALITY_BOOST_FACTOR (default: 1.2, range: 1.0-2.0) - Quality scores capped at 1.0 - Full metadata audit trail: - quality_boost_applied (boolean) - quality_boost_date (ISO timestamp) - quality_boost_reason ("association_connections") - quality_boost_connection_count - original_quality_before_boost - Impact: ~4% relevance increase, potential retention tier promotion - 5 comprehensive test cases (100% pass rate) - Batch Update Optimization (v8.47.1) - Problem: Sequential update_memory() calls during consolidation - Fix: Collect updates and use single update_memories_batch() transaction - Impact: 50-100x speedup for relevance score updates - Location: consolidation/consolidator.py - Quality-Weighted Decay - High-quality memories decay 3x slower than low-quality - Integration with exponential decay calculation - Quality boost applied before quality multiplier calculation ### 7. **Dashboard UI Enhancements** (New Page) **URL**: `wiki/Dashboard-UI-Guide` **Sections**: - Quality System UI Features (v8.45.0) - Quality badges on memory cards (color-coded: green/yellow/red/gray) - Analytics view with distribution charts - Bar chart for quality distribution (high/medium/low counts) - Pie chart for provider breakdown (local/groq/gemini/implicit) - Top/bottom performers lists - Settings panel for quality configuration - i18n support (English + Chinese) - Dark Mode Improvements (v8.45.2) - Form controls and select elements dark mode fix - Global .form-control and .form-select overrides - Quality tab chart contrast improvements - Chart.js dark mode support (dynamic color configuration) - Quality distribution and provider chart updates - .view-btn hover states - Performance Benchmarks - 25ms page load - <100ms search operations - <2s for all UI operations - Accessibility - Mobile responsive (768px, 1024px breakpoints) - WCAG compliance considerations ## Implementation Timeline ### Week 1: Core Documentation - [ ] Create "Memory Quality System Guide" (Priority: CRITICAL) - [ ] Create "ONNX Quality Evaluation Deep Dive" (Priority: HIGH) - [ ] Create "Metadata Compression System" (Priority: HIGH) ### Week 2: Integration Documentation - [ ] Create "Quality + Hooks Integration Guide" (Priority: MEDIUM) - [ ] Update "Hybrid Backend Guide" (Priority: MEDIUM) - [ ] Update "Memory Consolidation System Guide" (Priority: MEDIUM) ### Week 3: UI and Polish - [ ] Create "Dashboard UI Guide" (Priority: LOW) - [ ] Add cross-references between all pages - [ ] Create visual diagrams (architecture, flow charts) - [ ] Add screenshots from dashboard ## Cross-Reference Strategy Each wiki page should link to related pages: - Memory Quality System Guide → ONNX Deep Dive, Quality+Hooks Integration - ONNX Deep Dive → Memory Quality System Guide, Troubleshooting - Metadata Compression → Hybrid Backend Guide - Quality+Hooks Integration → Memory Quality System Guide, Hooks Quick Reference - Hybrid Backend Guide → Metadata Compression, Consolidation Guide - Consolidation Guide → Memory Quality System Guide, Association Quality Boost ## Visual Assets Needed 1. **Architecture Diagrams**: - Multi-tier quality provider system flowchart - Metadata compression pipeline (encode → validate → sync → decode) - Quality-boosted search reranking algorithm - Association-based quality boost decision tree 2. **Screenshots**: - Dashboard quality badges (light + dark mode) - Analytics view (distribution + provider charts) - Quality settings panel - Top/bottom performers lists 3. **Comparison Tables**: - Provider comparison (local vs Groq vs Gemini) - Performance benchmarks (CPU vs GPU, different platforms) - Compression ratio comparison (Phase 1/2/3) - Quality retention tiers (high/medium/low) ## Success Metrics - **Documentation Coverage**: 100% of v8.45+ features documented - **Cross-References**: Every page linked to ≥2 related pages - **Visual Assets**: ≥1 diagram or screenshot per page - **User Feedback**: <5% of support requests related to documented features - **Community Engagement**: Wiki pages referenced in ≥10 GitHub issues/discussions ## Notes - All existing documentation files referenced in CLAUDE.md should be preserved - Wiki pages should be created in GitHub wiki, not in docs/ folder - Each wiki page should have "Last Updated" timestamp - Include version compatibility notes (e.g., "Available since v8.45.0") - Add "See Also" sections for related topics

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/doobidoo/mcp-memory-service'

If you have feedback or need assistance with the MCP directory API, please join our Discord server