Markdown RAG Documentation


16-memory-management-plan.md•15.4 KiB

# Implementation Plan: Memory Management System (Spec 16)

## Executive Summary

This plan implements a dual-lane Memory Management System enabling AI assistants to maintain project-specific Memory Banks separate from the main documentation corpus. The architecture replicates the existing Vector/Keyword/Graph indexing stack with memory-specific extensions: Ghost Nodes for cross-corpus linking, Typed Edges with anchor context, and configurable Recency Boost. Nine MCP tools provide CRUD, search, and maintenance operations. Implementation spans 5 phases (~1,800 LOC) with each phase delivering testable increments.

---

## 1. Implementation Phases

### Phase 1: Configuration & Storage Foundation

**Goal**: Extend configuration and establish memory storage paths.

**LOC Estimate**: ~200

| Step | Description |
|------|-------------|
| 1.1 | Add `MemoryConfig` dataclass to [src/config.py](../../src/config.py) |
| 1.2 | Add `[memory]` section parsing in `load_config()` |
| 1.3 | Create `resolve_memory_path()` function for project vs user storage |
| 1.4 | Add `memory` field to `Config` dataclass |

**Acceptance Criteria**:
- [ ] `config.memory.enabled` returns bool
- [ ] `config.memory.storage_strategy` is `"project"` or `"user"`
- [ ] `resolve_memory_path()` returns correct path for both strategies
- [ ] Config loads from TOML without errors

---

### Phase 2: Memory Index Manager

**Goal**: Create a dedicated `MemoryIndexManager` that mirrors `IndexManager` patterns.

**LOC Estimate**: ~350

| Step | Description |
|------|-------------|
| 2.1 | Create `src/memory/` package directory |
| 2.2 | Create `src/memory/manager.py` with `MemoryIndexManager` class |
| 2.3 | Implement `_compute_memory_id()` with `memory:` prefix |
| 2.4 | Implement `index_memory()`, `remove_memory()`, `persist()`, `load()` |
| 2.5 | Create `src/memory/models.py` with `MemoryDocument` model |
| 2.6 | Implement YAML frontmatter parsing (`type`, `status`, `tags`, `created_at`) |

**Acceptance Criteria**:
- [ ] `MemoryIndexManager` indexes memory files with correct ID prefix
- [ ] Frontmatter metadata extracted into chunk metadata
- [ ] Indices persist/load to `memories/indices/` subdirectory
- [ ] Memory documents use `HeaderChunker`

---

### Phase 3: Ghost Nodes & Typed Edges

**Goal**: Extend `GraphStore` for cross-corpus linking with context.

**LOC Estimate**: ~400

| Step | Description |
|------|-------------|
| 3.1 | Add `edge_context: str` parameter to `GraphStore.add_edge()` |
| 3.2 | Implement ghost node creation in `MemoryIndexManager` for `[[target]]` links |
| 3.3 | Extract ~100 chars anchor context around each link |
| 3.4 | Infer edge type from context keywords (`mentions`, `refactors`, `plans`, `related_to`) |
| 3.5 | Add `get_edges_to()` method for reverse lookup (ghost → memory) |
| 3.6 | Update `GraphStore.persist()/load()` to handle edge attributes |

**Acceptance Criteria**:
- [ ] Links like `[[src/server.py]]` create `ghost:src/server.py` node
- [ ] Edges store `edge_type` and `edge_context` attributes
- [ ] `get_edges_to("ghost:X")` returns all memories linking to X
- [ ] Edge attributes survive persist/load cycle

---

### Phase 4: Memory Search Orchestrator with Recency Boost

**Goal**: Implement memory-specific hybrid search with chronological boosting.

**LOC Estimate**: ~400

| Step | Description |
|------|-------------|
| 4.1 | Create `src/memory/search.py` with `MemorySearchOrchestrator` |
| 4.2 | Implement `search_memories()` with hybrid search (reuse fusion logic) |
| 4.3 | Add memory-specific recency boost tiers from config |
| 4.4 | Implement `search_linked_memories()` using ghost node traversal |
| 4.5 | Add tag/type filtering to search pipeline |
| 4.6 | Return anchor context in linked memory results |

**Acceptance Criteria**:
- [ ] `search_memories()` returns results with recency boost applied
- [ ] `search_linked_memories("query", "src/server.py")` returns memories linking to target
- [ ] Results include `anchor_context` field explaining link
- [ ] Tag/type filters narrow results correctly

---

### Phase 5: MCP Tools & Context Integration

**Goal**: Register all 9 tools and integrate with `ApplicationContext`.

**LOC Estimate**: ~450

| Step | Description |
|------|-------------|
| 5.1 | Update `ApplicationContext` to hold `memory_manager: MemoryIndexManager \| None` |
| 5.2 | Initialize memory manager in `ApplicationContext.create()` when enabled |
| 5.3 | Create `src/memory/tools.py` with CRUD tool implementations |
| 5.4 | Implement `create_memory()`, `append_memory()`, `read_memory()`, `update_memory()` |
| 5.5 | Implement `delete_memory()` (move to `.trash/`) |
| 5.6 | Implement `search_memories()`, `search_linked_memories()` tool handlers |
| 5.7 | Implement `get_memory_stats()`, `merge_memories()` |
| 5.8 | Register all tools in `MCPServer.list_tools()` and `MCPServer.call_tool()` |

**Acceptance Criteria**:
- [ ] All 9 tools appear in MCP tool list
- [ ] CRUD operations modify files correctly
- [ ] Delete moves to `.trash/` instead of hard delete
- [ ] `merge_memories()` consolidates files and deletes sources
- [ ] `get_memory_stats()` returns accurate counts

---

## 2. File Manifest

### New Files

| File | Purpose |
|------|---------|
| `src/memory/__init__.py` | Package marker (empty) |
| `src/memory/manager.py` | `MemoryIndexManager` class |
| `src/memory/models.py` | `MemoryDocument`, `MemoryFrontmatter` models |
| `src/memory/search.py` | `MemorySearchOrchestrator` with recency boost |
| `src/memory/tools.py` | CRUD and search tool implementations |
| `src/memory/link_parser.py` | Extract `[[links]]` and anchor context |
| `tests/unit/test_memory_manager.py` | Unit tests for manager |
| `tests/unit/test_memory_link_parser.py` | Unit tests for link extraction |
| `tests/unit/test_memory_search.py` | Unit tests for search orchestrator |
| `tests/integration/test_memory_tools.py` | Integration tests for MCP tools |
| `tests/integration/test_memory_ghost_nodes.py` | Integration tests for graph linking |

### Modified Files

| File | Changes |
|------|---------|
| [src/config.py](../../src/config.py) | Add `MemoryConfig` dataclass, `[memory]` loading |
| [src/context.py](../../src/context.py) | Add `memory_manager` field, initialization |
| [src/mcp_server.py](../../src/mcp_server.py) | Register 9 memory tools |
| [src/indices/graph.py](../../src/indices/graph.py) | Add `edge_context` to `add_edge()`, `get_edges_to()` |
| [src/search/fusion.py](../../src/search/fusion.py) | Extract recency boost to reusable function |

---

## 3. Dependency Graph

```
Phase 1 (Config)
    │
    ▼
Phase 2 (Manager) ◄──────┐
    │                    │
    ▼                    │
Phase 3 (Graph) ─────────┘
    │
    ▼
Phase 4 (Search)
    │
    ▼
Phase 5 (Tools)
```

**Notes**:
- Phase 2 depends on Phase 1 for config access
- Phase 3 modifies shared `GraphStore` (used by Phase 2)
- Phase 4 uses components from Phase 2 and Phase 3
- Phase 5 integrates everything

---

## 4. Function Signatures

### Config (Phase 1)

```python
# src/config.py
@dataclass
class MemoryConfig:
    enabled: bool
    storage_strategy: str  # "project" | "user"
    recency_boost_days: int
    recency_boost_factor: float

def resolve_memory_path(config: Config, project_name: str | None) -> Path: ...
```

### Manager (Phase 2)

```python
# src/memory/manager.py
class MemoryIndexManager:
    def __init__(
        self,
        config: Config,
        vector: VectorIndex,
        keyword: KeywordIndex,
        graph: GraphStore,
    ): ...

    def index_memory(self, file_path: str): ...
    def remove_memory(self, memory_id: str): ...
    def persist(self): ...
    def load(self): ...
    def get_memory_count(self) -> int: ...
```

### Models (Phase 2)

```python
# src/memory/models.py
@dataclass
class MemoryFrontmatter:
    type: str  # "plan" | "journal" | "fact" | "observation" | "reflection"
    status: str  # "active" | "archived"
    tags: list[str]
    created_at: datetime

@dataclass
class MemoryDocument:
    id: str
    content: str
    frontmatter: MemoryFrontmatter
    links: list[str]
    file_path: str
```

### Graph Extensions (Phase 3)

```python
# src/indices/graph.py (existing, modified)
class GraphStore:
    def add_edge(
        self,
        source: str,
        target: str,
        edge_type: str,
        edge_context: str = "",
    ) -> None: ...

    def get_edges_to(self, target: str) -> list[dict[str, str]]: ...
```

### Link Parser (Phase 3)

```python
# src/memory/link_parser.py
@dataclass
class ExtractedLink:
    target: str
    edge_type: str
    anchor_context: str
    position: int

def extract_links(content: str, context_chars: int = 100) -> list[ExtractedLink]: ...
def infer_edge_type(context: str) -> str: ...
```

### Search (Phase 4)

```python
# src/memory/search.py
class MemorySearchOrchestrator:
    def __init__(
        self,
        vector: VectorIndex,
        keyword: KeywordIndex,
        graph: GraphStore,
        config: Config,
        manager: MemoryIndexManager,
    ): ...

    async def search_memories(
        self,
        query: str,
        limit: int = 5,
        filter_type: str | None = None,
    ) -> list[MemorySearchResult]: ...

    async def search_linked_memories(
        self,
        query: str,
        target_document: str,
        limit: int = 5,
    ) -> list[LinkedMemoryResult]: ...
```

### Tools (Phase 5)

```python
# src/memory/tools.py
async def create_memory(
    ctx: ApplicationContext,
    filename: str,
    content: str,
    tags: list[str],
    memory_type: str = "journal",
) -> dict[str, str]: ...

async def append_memory(
    ctx: ApplicationContext,
    filename: str,
    content: str,
) -> dict[str, str]: ...

async def read_memory(
    ctx: ApplicationContext,
    filename: str,
) -> dict[str, str]: ...

async def update_memory(
    ctx: ApplicationContext,
    filename: str,
    content: str,
) -> dict[str, str]: ...

async def delete_memory(
    ctx: ApplicationContext,
    filename: str,
) -> dict[str, str]: ...

async def search_memories(
    ctx: ApplicationContext,
    query: str,
    limit: int = 5,
    filter_tags: list[str] | None = None,
    filter_type: str | None = None,
) -> list[dict]: ...

async def search_linked_memories(
    ctx: ApplicationContext,
    query: str,
    target_document: str,
) -> list[dict]: ...

async def get_memory_stats(ctx: ApplicationContext) -> dict: ...

async def merge_memories(
    ctx: ApplicationContext,
    source_files: list[str],
    target_file: str,
    summary_content: str,
) -> dict[str, str]: ...
```

---

## 5. Risk Register

| Risk | Severity | Probability | Mitigation |
|------|----------|-------------|------------|
| Ghost node pollution in main graph | High | Medium | Use separate `MemoryGraph` instance, not shared `GraphStore` |
| Edge attribute migration breaks existing graphs | High | Low | Version graph JSON schema; add migration in `load()` |
| Recency boost config conflicts with main search | Medium | Low | Use `MemoryConfig.recency_*` fields, not `SearchConfig` |
| Memory path resolution incorrect for user strategy | Medium | Medium | Unit test both strategies with mocked `$XDG_DATA_HOME` |
| YAML frontmatter parsing edge cases | Medium | Medium | Use `pyyaml` safe_load, strict schema validation |
| Large memory files slow indexing | Low | Low | Warn if >50KB; suggest splitting |

---

## 6. Testing Strategy

### Phase 1 Tests

```
tests/unit/test_config.py
├── test_memory_config_defaults
├── test_memory_config_from_toml
├── test_resolve_memory_path_project_strategy
└── test_resolve_memory_path_user_strategy
```

### Phase 2 Tests

```
tests/unit/test_memory_manager.py
├── test_index_memory_creates_chunks
├── test_memory_id_prefix
├── test_frontmatter_extraction
├── test_persist_load_cycle
└── test_remove_memory

tests/unit/test_memory_models.py
├── test_memory_frontmatter_validation
└── test_memory_document_links_parsed
```

### Phase 3 Tests

```
tests/unit/test_memory_link_parser.py
├── test_extract_links_basic
├── test_extract_links_multiple
├── test_anchor_context_boundaries
├── test_infer_edge_type_refactor
├── test_infer_edge_type_default
└── test_no_links_returns_empty

tests/integration/test_memory_ghost_nodes.py
├── test_ghost_node_created_for_link
├── test_edge_has_context_attribute
├── test_get_edges_to_returns_sources
└── test_edge_attributes_persist
```

### Phase 4 Tests

```
tests/unit/test_memory_search.py
├── test_search_memories_basic
├── test_search_memories_with_recency_boost
├── test_search_memories_filter_tags
├── test_search_memories_filter_type
└── test_recency_boost_configurable

tests/integration/test_memory_search_linked.py
├── test_search_linked_memories_finds_linkers
├── test_search_linked_memories_includes_anchor
└── test_search_linked_memories_empty_when_no_links
```

### Phase 5 Tests

```
tests/integration/test_memory_tools.py
├── test_create_memory_success
├── test_create_memory_exists_fails
├── test_append_memory_adds_content
├── test_read_memory_returns_content
├── test_update_memory_replaces_content
├── test_delete_memory_moves_to_trash
├── test_get_memory_stats_counts
└── test_merge_memories_consolidates

tests/e2e/test_memory_mcp.py
├── test_mcp_list_tools_includes_memory
├── test_mcp_create_and_search_memory
└── test_mcp_linked_memory_search
```

---

## 7. Implementation Notes

### Reuse Patterns from Existing Code

1. **IndexManager pattern**: `MemoryIndexManager` mirrors [src/indexing/manager.py](../../src/indexing/manager.py) structure
2. **SearchOrchestrator pattern**: `MemorySearchOrchestrator` reuses fusion from [src/search/fusion.py](../../src/search/fusion.py)
3. **Config loading**: Follow `load_config()` pattern in [src/config.py](../../src/config.py)
4. **MCP registration**: Follow `list_tools()`/`call_tool()` pattern in [src/mcp_server.py](../../src/mcp_server.py)
5. **Test fixtures**: Use `tmp_path`, `shared_embedding_model` from [tests/conftest.py](../../tests/conftest.py)

### Storage Strategy Details

**Project Strategy (`storage_strategy = "project"`)**:
- Path: `{project_root}/.memories/`
- Indices: `{project_root}/.memories/indices/`
- Trash: `{project_root}/.memories/.trash/`

**User Strategy (`storage_strategy = "user"`)**:
- Path: `$XDG_DATA_HOME/mcp-markdown-ragdocs/{project_name}/memories/`
- Indices: `$XDG_DATA_HOME/mcp-markdown-ragdocs/{project_name}/memories/indices/`
- Trash: `$XDG_DATA_HOME/mcp-markdown-ragdocs/{project_name}/memories/.trash/`

### Edge Type Inference Rules

| Context Contains | Edge Type |
|-----------------|-----------|
| "refactor", "rewrite", "restructure" | `refactors` |
| "plan", "todo", "will", "should" | `plans` |
| "bug", "fix", "issue", "error" | `debugs` |
| "note", "remember", "mention" | `mentions` |
| (default) | `related_to` |

---

## 8. Open Questions

1. **Q**: Should memory search results appear in main `query_documents` when relevant?
   **Recommendation**: No. Keep corpora strictly separate. Add explicit `include_memories: bool` param in future version.

2. **Q**: How to handle orphaned ghost nodes when linked document is deleted from main corpus?
   **Recommendation**: Ghost nodes are lightweight references. Keep them; they indicate historical context.

3. **Q**: Should `merge_memories` preserve frontmatter from sources?
   **Recommendation**: No. The `summary_content` param should include new frontmatter. Document this in tool description.
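
The `[[target]]` link extraction and the edge type inference rules from the Implementation Notes could be sketched as follows. This is a minimal illustration under stated assumptions, not the planned implementation: the rule order (first matching row of the table wins), the word-prefix keyword matching, the even split of `context_chars` on either side of the link, and the `ghost_node_id()` helper are all choices made here for the example. Only the `ExtractedLink` fields, the two function signatures, the keyword table, and the `ghost:` prefix come from the plan.

```python
import re
from dataclasses import dataclass

# Keyword rules in the order of the plan's table. Assumption: the first
# matching row wins when a context contains keywords from several rows.
EDGE_TYPE_RULES = [
    (("refactor", "rewrite", "restructure"), "refactors"),
    (("plan", "todo", "will", "should"), "plans"),
    (("bug", "fix", "issue", "error"), "debugs"),
    (("note", "remember", "mention"), "mentions"),
]

# Matches [[target]] where target contains no square brackets.
LINK_PATTERN = re.compile(r"\[\[([^\[\]]+)\]\]")


@dataclass
class ExtractedLink:
    target: str
    edge_type: str
    anchor_context: str
    position: int


def infer_edge_type(context: str) -> str:
    """Return the edge type for a context string, or "related_to" by default."""
    lowered = context.lower()
    for keywords, edge_type in EDGE_TYPE_RULES:
        # Assumption: keywords match at the start of a word, so "fixed"
        # matches "fix" but "explanation" does not match "plan".
        pattern = r"\b(?:" + "|".join(re.escape(k) for k in keywords) + ")"
        if re.search(pattern, lowered):
            return edge_type
    return "related_to"


def extract_links(content: str, context_chars: int = 100) -> list[ExtractedLink]:
    """Find every [[target]] link and capture the text surrounding it."""
    half = context_chars // 2  # assumption: split the window evenly
    links = []
    for match in LINK_PATTERN.finditer(content):
        start = max(0, match.start() - half)
        end = min(len(content), match.end() + half)
        context = content[start:end].strip()
        links.append(
            ExtractedLink(
                target=match.group(1).strip(),
                edge_type=infer_edge_type(context),
                anchor_context=context,
                position=match.start(),
            )
        )
    return links


def ghost_node_id(target: str) -> str:
    """Hypothetical helper: build the ghost node ID for a link target."""
    return f"ghost:{target}"
```

Note that the anchor context here includes the link text itself, so a target such as `[[docs/bugfix.md]]` would influence the inferred type; an implementation may prefer to mask the target before matching.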

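
The two storage strategies listed under Storage Strategy Details can be illustrated with a small path resolver. This is a sketch only: the plan's `resolve_memory_path()` takes a `Config` object, whereas this standalone version takes the strategy and project root directly so it can run without the rest of the codebase. The fallback to `~/.local/share` when `$XDG_DATA_HOME` is unset follows the XDG Base Directory convention and is an assumption about how the plan intends to handle that case, as are the `ValueError` branches.

```python
import os
from pathlib import Path
from typing import Optional

APP_DIR_NAME = "mcp-markdown-ragdocs"


def resolve_memory_path(
    storage_strategy: str,
    project_root: Path,
    project_name: Optional[str] = None,
) -> Path:
    """Return the memory directory for the "project" or "user" strategy."""
    if storage_strategy == "project":
        # Memories live inside the project itself.
        return project_root / ".memories"

    if storage_strategy == "user":
        if not project_name:
            # Assumption: the user strategy needs a name to namespace by.
            raise ValueError("project_name is required for the user strategy")
        xdg_data_home = os.environ.get("XDG_DATA_HOME")
        # Assumption: fall back to the XDG default when the variable is unset.
        base = Path(xdg_data_home) if xdg_data_home else Path.home() / ".local" / "share"
        return base / APP_DIR_NAME / project_name / "memories"

    raise ValueError(f"unknown storage_strategy: {storage_strategy!r}")


def indices_path(memory_path: Path) -> Path:
    """Indices sit in an `indices/` subdirectory under either strategy."""
    return memory_path / "indices"


def trash_path(memory_path: Path) -> Path:
    """Soft-deleted memories go to `.trash/` under either strategy."""
    return memory_path / ".trash"
```

Deriving the indices and trash locations from the resolved memory path keeps the two strategies from diverging, since both layouts in the plan use the same subdirectory names.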