Skip to main content
Glama
spec.md15.5 kB
# Feature Specification: Multi-Agent Memory Consolidation **Feature Branch**: `003-multi-agent-consolidation` **Created**: 2025-11-24 **Status**: Clarified **Input**: User description: "Multi-agent memory consolidation using beads for coordination. Five specialized agents: Decay Analyzer (identifies memories approaching forget threshold), Cluster Detector (finds similar memories for potential merge), Semantic Merge (combines related memories intelligently), LTM Promoter (moves high-value memories to long-term storage), and Relationship Discovery (finds implicit connections between memories). Beads serves as message queue and audit log." ## Clarifications ### Session 2025-11-24 - Q: How should the consolidation agents be implemented? → A: Python classes (single-process, called via CLI, simple to test) - Q: What triggers agent execution? → A: Hybrid (scheduled scans + event triggers for urgent items with score < 0.10) - Q: How should new agents relate to existing MCP tools? → A: Extend (agents call existing tool logic internally, add orchestration layer) - Q: How should beads issues encode memory context? → A: Notes field (memory IDs as JSON in `notes`, human-readable title, labels for agent type filtering) - Q: When should system auto-process vs. wait for human review? → A: Confidence threshold (auto if ≥0.9, log-only if 0.7-0.9, wait for human if <0.7) ## User Scenarios & Testing *(mandatory)* ### User Story 1 - Automatic Memory Decay Triage (Priority: P1) As a CortexGraph user, I want memories approaching the forget threshold to be automatically identified and triaged so that important information isn't lost due to decay while unimportant memories are allowed to expire naturally. **Why this priority**: This is the foundational use case - without decay analysis, all other consolidation operations lack the context needed to make intelligent decisions. The Decay Analyzer identifies which memories need attention, feeding the entire consolidation pipeline. **Independent Test**: Can be fully tested by creating memories with varying decay scores and verifying the Decay Analyzer correctly identifies those in the "danger zone" (0.15-0.35 score) and near-forget threshold (< 0.10). Delivers value by preventing accidental data loss. **Acceptance Scenarios**: 1. **Given** a memory with decay score 0.08 (below forget threshold 0.10), **When** the Decay Analyzer runs, **Then** a beads issue is created flagging this memory for review with urgency=high 2. **Given** a memory with decay score 0.25 (in danger zone), **When** the Decay Analyzer runs, **Then** a beads issue is created suggesting reinforcement or consolidation with urgency=medium 3. **Given** a memory with decay score 0.70 (healthy), **When** the Decay Analyzer runs, **Then** no action is taken for this memory 4. **Given** multiple memories approaching threshold simultaneously, **When** the Decay Analyzer runs, **Then** issues are prioritized by semantic importance (entity count, tag relevance) not just raw score --- ### User Story 2 - Intelligent Memory Clustering (Priority: P1) As a CortexGraph user, I want similar memories to be automatically detected and grouped so that redundant information can be consolidated into single, comprehensive memories. **Why this priority**: Clustering is the prerequisite for merging. Without accurate similarity detection, the system cannot identify consolidation opportunities. This directly reduces memory bloat and improves search relevance. **Independent Test**: Can be tested by inserting memories with known semantic overlap (e.g., multiple preferences about PostgreSQL) and verifying clusters are correctly formed. Delivers value by identifying merge candidates. **Acceptance Scenarios**: 1. **Given** three memories all discussing "PostgreSQL preferences", **When** the Cluster Detector runs, **Then** a cluster is created linking all three with cohesion score > 0.7 2. **Given** memories with different topics (PostgreSQL, cooking recipes, travel), **When** the Cluster Detector runs, **Then** these are NOT grouped together (cohesion < 0.4) 3. **Given** a high-cohesion cluster is detected, **When** cluster cohesion > 0.75, **Then** a beads issue is created for the Semantic Merge agent 4. **Given** a medium-cohesion cluster (0.4-0.75), **When** detected, **Then** a beads issue is created suggesting linking (not merging) --- ### User Story 3 - Semantic Memory Merging (Priority: P2) As a CortexGraph user, I want clustered memories to be intelligently merged into comprehensive single memories that preserve all unique information while eliminating redundancy. **Why this priority**: Depends on clustering (User Story 2). Merging is the core value proposition - reducing clutter while preserving information. Lower priority than detection because detection is prerequisite. **Independent Test**: Can be tested by providing a known cluster and verifying the merged output contains all unique entities, tags, and content segments from originals. Delivers value by creating cleaner, more comprehensive memories. **Acceptance Scenarios**: 1. **Given** a cluster of 3 memories about database preferences, **When** Semantic Merge processes it, **Then** a single merged memory is created containing all unique facts 2. **Given** memories with overlapping content ("I prefer PostgreSQL" + "PostgreSQL is my choice"), **When** merged, **Then** duplicate information appears only once 3. **Given** memories with distinct entities, **When** merged, **Then** all entities are preserved (union of entity sets) 4. **Given** a successful merge, **When** complete, **Then** `consolidated_from` relations are created linking to originals, and original memories are archived (not deleted) 5. **Given** merge creates new memory, **When** complete, **Then** beads issue is closed with audit trail documenting the merge --- ### User Story 4 - Long-Term Memory Promotion (Priority: P2) As a CortexGraph user, I want high-value memories to be automatically promoted to long-term storage (Obsidian vault) so that important knowledge becomes permanent and searchable outside CortexGraph. **Why this priority**: Promotion preserves the most valuable memories permanently. Depends on having accurate value assessment (influenced by decay analysis and usage patterns). Medium priority because LTM already exists - this automates an existing manual process. **Independent Test**: Can be tested by creating memories that meet promotion criteria (score > 0.65 OR use_count >= 5 within 14 days) and verifying markdown files are created in the vault. Delivers value by automating knowledge preservation. **Acceptance Scenarios**: 1. **Given** a memory with decay score 0.75 and use_count 3, **When** LTM Promoter runs, **Then** a markdown file is created in the Obsidian vault with YAML frontmatter 2. **Given** a memory with use_count 6 within 14 days, **When** LTM Promoter runs, **Then** memory is promoted regardless of decay score 3. **Given** a promoted memory, **When** promotion completes, **Then** the STM memory is marked as `status=promoted` and retains a reference to the LTM file 4. **Given** a memory already promoted, **When** LTM Promoter runs, **Then** no duplicate file is created 5. **Given** promotion succeeds, **When** complete, **Then** beads issue documents vault path and promotion criteria met --- ### User Story 5 - Relationship Discovery (Priority: P3) As a CortexGraph user, I want implicit connections between memories to be automatically discovered so that my knowledge graph becomes richer and more interconnected over time. **Why this priority**: Relationship discovery enhances the knowledge graph but doesn't prevent data loss or reduce redundancy. It's an enhancement over the core consolidation functionality. Lower priority because the graph already works - this makes it better. **Independent Test**: Can be tested by inserting memories with implicit relationships (e.g., "I use FastAPI" and "My backend uses Python") and verifying `related` relations are created. Delivers value by enriching the knowledge graph. **Acceptance Scenarios**: 1. **Given** memories "I prefer PostgreSQL for databases" and "My API connects to a PostgreSQL instance", **When** Relationship Discovery runs, **Then** a `related` relation is created between them 2. **Given** memories with shared entities but different contexts, **When** analyzed, **Then** relation strength reflects contextual similarity (not just entity match) 3. **Given** a discovered relationship, **When** relation is created, **Then** beads issue documents the reasoning (shared entities, semantic similarity score) 4. **Given** memories with no meaningful connection, **When** analyzed, **Then** no spurious relations are created --- ### Edge Cases - What happens when a memory is in multiple clusters simultaneously? - Memory can only be merged into ONE cluster; select highest cohesion cluster - Lower-cohesion clusters receive `related` links instead of merges - How does system handle merge conflicts (same entity, different values)? - Preserve both values with timestamp context: "Preferred PostgreSQL (2024-01), later switched to MySQL (2024-06)" - Flag conflicts in beads issue for human review if confidence < 0.7 - What happens when beads issue queue grows faster than processing? - Implement priority queue with urgency-based ordering - Issues older than 7 days auto-escalate priority - Rate limiting prevents agent storms - How does system handle agent failures mid-operation? - Each agent operation is atomic (all-or-nothing) - Failed operations leave beads issue in `blocked` status with error details - Retry logic with exponential backoff (max 3 retries) - What happens during concurrent agent execution? - Beads provides locking via `in_progress` status - Agents claim issues before processing - Optimistic concurrency: detect conflicts, abort and retry ## Requirements *(mandatory)* ### Functional Requirements - **FR-001**: System MUST provide a Decay Analyzer agent that identifies memories with scores below configurable threshold (default: 0.15) - **FR-002**: System MUST provide a Cluster Detector agent that groups memories with semantic similarity above configurable threshold (default: 0.7) - **FR-003**: System MUST provide a Semantic Merge agent that combines clustered memories while preserving all unique content - **FR-004**: System MUST provide an LTM Promoter agent that writes high-value memories to Obsidian vault as markdown - **FR-005**: System MUST provide a Relationship Discovery agent that identifies implicit connections between memories - **FR-006**: System MUST use beads issues as the coordination mechanism between agents (message queue pattern) - **FR-007**: System MUST create audit trail in beads for all consolidation operations (what was changed, why, by which agent) - **FR-008**: Merged memories MUST preserve `consolidated_from` relations linking to original memories - **FR-009**: Original memories MUST be archived (not deleted) after successful merge, allowing recovery - **FR-010**: System MUST support dry-run mode for all agents (preview without changes) - **FR-011**: Each agent MUST be independently executable (can run solo or as part of pipeline) - **FR-012**: System MUST provide CLI commands to trigger each agent manually: `cortexgraph-consolidate [agent-name]` - **FR-013**: System MUST support hybrid triggering: (a) scheduled scans at configurable intervals (default: hourly), and (b) event-driven triggers for urgent items (memories with score < 0.10) - **FR-014**: System MUST rate-limit agent operations to prevent overwhelming the system (configurable, default: 100 operations/minute) - **FR-015**: Agents MUST extend existing MCP tool logic (`cluster_memories`, `consolidate_memories`, `promote_memory`) rather than replacing it. Existing tools remain available for manual invocation; agents add orchestration and beads coordination. - **FR-016**: Agents MUST apply confidence-based processing thresholds: (a) confidence ≥ 0.9 → auto-process immediately, (b) confidence 0.7-0.9 → process with detailed logging, (c) confidence < 0.7 → create beads issue and wait for human review ### Key Entities - **ConsolidationAgent**: Python base class for all five specialized agents, providing common interfaces for dry-run, audit logging, and beads integration. Agents run as single-process CLI commands, not separate daemons or LLM-powered SDK agents. - **ConsolidationTask**: Beads issue representing work for an agent. Schema: human-readable `title`, memory IDs as JSON in `notes` field (e.g., `{"memory_ids": ["abc-123"], "scores": [0.08]}`), agent type via labels (`consolidation:decay`, `consolidation:cluster`, `consolidation:merge`, `consolidation:promote`, `consolidation:relations`) - **ClusterResult**: Output from Cluster Detector containing memory IDs, cohesion score, and recommended action (merge | link | ignore) - **MergeResult**: Output from Semantic Merge containing new memory ID, original IDs, content diff, and relation IDs created - **PromotionResult**: Output from LTM Promoter containing vault path, memory ID, and promotion criteria matched ## Success Criteria *(mandatory)* ### Measurable Outcomes - **SC-001**: Decay Analyzer identifies 95%+ of memories within 0.05 of forget threshold within 24 hours of entering danger zone - **SC-002**: Cluster Detector achieves precision > 0.85 (85%+ of suggested clusters are actually related memories) - **SC-003**: Semantic Merge preserves 100% of unique entities and facts from source memories (no data loss) - **SC-004**: LTM Promoter successfully creates valid markdown files for 100% of eligible memories (no silent failures) - **SC-005**: Relationship Discovery creates meaningful relations with precision > 0.80 (80%+ of relations represent genuine semantic connections) - **SC-006**: All agent operations complete within 5 seconds per memory processed (performance SLA) - **SC-007**: System maintains full audit trail allowing reconstruction of any consolidation decision - **SC-008**: Zero data loss - original memories remain recoverable for 30 days after consolidation ### Implementation Validation (Added 2025-11-25) | Criterion | Validation Method | Result | |-----------|-------------------|--------| | SC-001 | Contract tests verify scan() returns all memories with score < 0.35 (danger zone); unit tests verify urgency classification at threshold boundaries | ✅ PASS | | SC-002 | Contract tests verify ClusterResult cohesion scores; unit tests verify action thresholds (merge ≥0.75, link 0.4-0.75, ignore <0.4) | ✅ PASS | | SC-003 | Unit tests verify entity/tag union preservation; integration tests verify no content loss during merge | ✅ PASS | | SC-004 | Integration tests verify markdown file creation with valid frontmatter; contract tests verify PromotionResult contains vault_path | ✅ PASS | | SC-005 | Contract tests verify RelationResult includes reasoning; unit tests verify min_confidence threshold prevents spurious relations | ✅ PASS | | SC-006 | Performance validation: all agents complete in <10ms per memory (well under 5s SLA) | ✅ PASS | | SC-007 | Beads integration creates issues with full context in notes field; consolidated_from relations preserve provenance | ✅ PASS | | SC-008 | SemanticMerge archives originals (status=archived) rather than deleting; integration tests verify recovery | ✅ PASS |

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/prefrontalsys/mnemex'

If you have feedback or need assistance with the MCP directory API, please join our Discord server