# Tasks: Natural Language Memory Activation

**Feature**: 002-natural-language-activation
**Input**: Design documents from `/specs/002-natural-language-activation/`
**Prerequisites**: plan.md ✅, spec.md ✅, research.md ✅, data-model.md ✅, contracts/ ✅

**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.

**Tests**: Unit and integration tests are included as this is a core feature requiring ≥90% coverage per constitution.

## Format: `[ID] [P?] [Story] Description`

- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (US1, US2, US3, US4)
- Include exact file paths in descriptions

---

## Phase 1: Setup (Shared Infrastructure)

**Purpose**: Project initialization and module structure

- [X] T001 Create `src/cortexgraph/activation/` module directory structure
- [X] T002 Create `src/cortexgraph/activation/__init__.py` with public API exports
- [X] T003 [P] Create `tests/unit/activation/` directory for unit tests
- [X] T004 [P] Create `tests/integration/activation/` directory for integration tests
- [X] T005 [P] Create `tests/contract/` directory for API contract tests
- [X] T006 Create default configuration file template at `src/cortexgraph/activation/activation.yaml.example`

---

## Phase 2: Foundational (Blocking Prerequisites)

**Purpose**: Core models and infrastructure that ALL user stories depend on

**⚠️ CRITICAL**: No user story work can begin until this phase is complete

- [X] T007 Implement `ActivationSignal` Pydantic model in `src/cortexgraph/activation/models.py`
- [X] T008 [P] Implement `MessageAnalysis` Pydantic model in `src/cortexgraph/activation/models.py`
- [X] T009 [P] Implement `RecallAnalysis` Pydantic model in `src/cortexgraph/activation/models.py`
- [X] T010 [P] Implement `ConfidenceThreshold` config model in `src/cortexgraph/activation/config.py`
- [X] T011 [P] Implement `PatternLibrary` config model in `src/cortexgraph/activation/config.py`
- [X] T012 Create configuration loader with YAML parsing in `src/cortexgraph/activation/config.py`
- [X] T013 Implement entity extraction using hybrid spaCy+regex in `src/cortexgraph/activation/entity_extraction.py`
- [X] T014 Implement pattern matching engine with regex compilation in `src/cortexgraph/activation/patterns.py`
- [X] T015 Implement confidence scoring with weighted sigmoid formula in `src/cortexgraph/activation/detectors.py` (see the scoring sketch after this phase)
- [X] T016 Add activation configuration settings to `src/cortexgraph/config.py`
- [X] T017 [P] Write unit tests for Pydantic model validation in `tests/unit/activation/test_models.py`
- [X] T018 [P] Write unit tests for pattern matching logic in `tests/unit/activation/test_patterns.py`
- [X] T019 [P] Write unit tests for entity extraction in `tests/unit/activation/test_entity_extraction.py`
- [X] T020 [P] Write unit tests for confidence scoring in `tests/unit/activation/test_detectors.py`

**Checkpoint**: ✅ Foundation ready - user story implementation can now begin in parallel
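
T015 names the weighted-sigmoid approach but not its parameters. A minimal sketch of what such a scorer could look like, assuming `ActivationSignal` carries a name, a configured weight, and the matched text; the field names and the `steepness`/`midpoint` defaults are illustrative assumptions, not the shipped values:

```python
# Hedged sketch of T015's weighted-sigmoid confidence scoring.
# Field names and the steepness/midpoint defaults are assumptions.
import math

from pydantic import BaseModel, Field


class ActivationSignal(BaseModel):
    """One detected signal, e.g. an explicit trigger or a preference phrase."""

    name: str                      # e.g. "explicit_trigger", "preference_statement"
    weight: float = Field(ge=0.0)  # configured per pattern in activation.yaml
    matched_text: str              # the phrase that fired the pattern


def confidence(signals: list[ActivationSignal],
               steepness: float = 2.0, midpoint: float = 1.0) -> float:
    """Squash the weighted sum of signal weights into (0, 1) with a sigmoid."""
    total = sum(s.weight for s in signals)
    return 1.0 / (1.0 + math.exp(-steepness * (total - midpoint)))
```

A sigmoid keeps the score bounded in (0, 1) no matter how many patterns fire, which is what makes the fixed 0.4/0.7 thresholds below meaningful.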

---

## Phase 3: User Story 1 - Automatic Memory Capture (Priority: P1) 🎯 MVP

**Goal**: Users share important information and the system automatically detects memory-worthy content without explicit commands

**Independent Test**: Share a preference like "I prefer PostgreSQL for databases" and verify it is detected with high confidence (>0.7) and suggested parameters (entities, tags, strength)

### Tests for User Story 1

> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**
> **CHECKPOINT (2025-11-24): All 5 test tasks complete - 68 tests pass, 1 xfail**

- [X] T021 [P] [US1] Contract test for `analyze_message` tool validating JSON schema in `tests/contract/test_analyze_message_api.py` (20 tests)
- [X] T022 [P] [US1] Unit test for save detection with explicit triggers in `tests/unit/activation/test_save_detection.py` (5 tests)
- [X] T023 [P] [US1] Unit test for save detection with implicit signals in `tests/unit/activation/test_save_detection.py` (7 tests)
- [X] T024 [P] [US1] Unit test for exclusion patterns (small talk filtering) in `tests/unit/activation/test_save_detection.py` (6 tests + 7 edge/uncertainty tests)
- [X] T025 [US1] Integration test for analyze_message tool with real pattern matching in `tests/integration/activation/test_analyze_message_tool.py` (19 tests)

**KEY FINDING**: Unit tests show that `detect_save_intent` in the activation module correctly handles "I prefer" via the preference_statement signal. The MCP tool needs updating to use this module (T027-T028).

### Implementation for User Story 1

- [ ] T026 [US1] Implement save detection logic in `src/cortexgraph/activation/detectors.py` (detect_save_intent function)
- [ ] T027 [US1] Implement message analysis logic in `src/cortexgraph/activation/detectors.py` (analyze_message function)
- [ ] T028 [US1] Create `analyze_message` MCP tool in `src/cortexgraph/tools/analyze_message.py`
- [ ] T029 [US1] Register `analyze_message` tool with FastMCP server in `src/cortexgraph/server.py`
- [ ] T030 [US1] Add validation and error handling for analyze_message tool
- [ ] T031 [US1] Add logging for save detection decisions with confidence scores
- [ ] T032 [US1] Create default save trigger patterns in `activation.yaml.example` (see the config sketch at the end of this phase)

**Checkpoint**: At this point, the analyze_message tool should detect memory-worthy content and provide suggested parameters

**Acceptance Verification**:

1. "I prefer PostgreSQL for databases" → should_save=True, confidence ≥0.7, entities=["postgresql"]
2. "Remember this: I use VSCode" → should_save=True, confidence ≥0.9, explicit trigger detected
3. "Nice weather today" → should_save=False, confidence <0.4, excluded as small talk
4. "I decided to use React" → should_save=True, confidence ≥0.7, decision marker detected
5. "My API endpoint is api.example.com" → should_save=True, entities=["api.example.com"]
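
To make T032 concrete, a hypothetical shape for the save-trigger portion of `activation.yaml.example`. Every key name, pattern, and weight below is an assumption; only the 0.4/0.7 threshold values come from this plan's clarification range:

```yaml
# Hypothetical activation.yaml.example layout (T032) - key names are assumptions.
save_triggers:
  explicit:
    - pattern: '\bremember (this|that)\b'
      weight: 1.0
  implicit:
    - pattern: '\bi (prefer|decided to use|always use)\b'
      weight: 0.6
exclusions:
  - pattern: '\b(nice|terrible) weather\b'   # small-talk filter
thresholds:
  act: 0.7       # at or above: auto-save
  clarify: 0.4   # 0.4-0.7: surface reasoning, let the LLM decide
```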

---

## Phase 4: User Story 2 - Contextual Memory Recall (Priority: P1)

**Goal**: Users ask questions referencing past conversations and the system automatically detects recall intent without explicit "search memory" commands

**Independent Test**: First save a preference using US1 or manually, then ask "what did I say about authentication?" and verify the system detects recall intent (should_search=True, confidence ≥0.7) with an extracted query

### Tests for User Story 2

- [ ] T033 [P] [US2] Contract test for `analyze_for_recall` tool validating JSON schema in `tests/contract/test_analyze_for_recall_api.py`
- [ ] T034 [P] [US2] Unit test for recall detection with explicit triggers in `tests/unit/activation/test_recall_detection.py`
- [ ] T035 [P] [US2] Unit test for recall detection with implicit signals in `tests/unit/activation/test_recall_detection.py`
- [ ] T036 [P] [US2] Unit test for query extraction from natural language in `tests/unit/activation/test_recall_detection.py`
- [ ] T037 [P] [US2] Unit test for exclusion patterns (general knowledge questions) in `tests/unit/activation/test_recall_detection.py`
- [ ] T038 [US2] Integration test for analyze_for_recall tool with real pattern matching in `tests/integration/activation/test_analyze_for_recall_tool.py`

### Implementation for User Story 2

- [ ] T039 [US2] Implement recall detection logic in `src/cortexgraph/activation/detectors.py` (detect_recall_intent function)
- [ ] T040 [US2] Implement query extraction from user messages in `src/cortexgraph/activation/detectors.py`
- [ ] T041 [US2] Implement semantic query expansion with tech term mappings in `src/cortexgraph/activation/patterns.py` (see the expansion sketch after this list)
- [ ] T042 [US2] Implement recall analysis logic in `src/cortexgraph/activation/detectors.py` (analyze_for_recall function)
- [ ] T043 [US2] Create `analyze_for_recall` MCP tool in `src/cortexgraph/tools/analyze_for_recall.py`
- [ ] T044 [US2] Register `analyze_for_recall` tool with FastMCP server in `src/cortexgraph/server.py`
- [ ] T045 [US2] Add validation and error handling for analyze_for_recall tool
- [ ] T046 [US2] Add logging for recall detection decisions with confidence scores
- [ ] T047 [US2] Create default recall trigger patterns in `activation.yaml.example`
- [ ] T048 [US2] Add 50+ tech term mappings (JWT→"json web token", etc.) to `activation.yaml.example`
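
As referenced in T041, a minimal sketch of semantic query expansion. The mapping entries here are a tiny illustrative subset standing in for the 50+ tech term mappings T048 adds to `activation.yaml.example`:

```python
# Sketch of T041's semantic query expansion. The mapping below is a small
# illustrative subset; the real table is loaded from activation.yaml (T048).
TECH_TERM_MAPPINGS: dict[str, list[str]] = {
    "jwt": ["json web token"],
    "postgres": ["postgresql"],
    "k8s": ["kubernetes"],
}


def expand_query(query: str) -> str:
    """Append known synonyms so a recall search also matches stored phrasing."""
    words = query.lower().split()
    expansions = [syn for word in words for syn in TECH_TERM_MAPPINGS.get(word, [])]
    return " ".join(words + expansions)


# expand_query("jwt auth") -> "jwt auth json web token"
```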

**Checkpoint**: At this point, the analyze_for_recall tool should detect recall intent and extract search queries

**Acceptance Verification**:

1. "What did I say about auth methods?" → should_search=True, confidence ≥0.7, query="auth methods"
2. "Remind me of my database choice" → should_search=True, confidence ≥0.7, query="database choice"
3. "What's my API endpoint again?" → should_search=True, confidence ≥0.7, possessive marker detected
4. "What did I say about GraphQL?" → should_search=True (even if no memories exist - graceful empty result)
5. "What's the best authentication method?" → should_search=False, confidence <0.4, general knowledge pattern

---

## Phase 5: User Story 3 - Automatic Memory Reinforcement (Priority: P2)

**Goal**: When users revisit previously discussed topics, the system automatically reinforces those memories through integration with the existing touch_memory tool

**Independent Test**: Save a memory manually, then reference it in conversation, and verify that the system detects the reference and can trigger reinforcement (this is integration with the existing touch_memory tool - the detection is the new part)

### Tests for User Story 3

- [ ] T049 [P] [US3] Unit test for topic/entity detection in messages in `tests/unit/activation/test_reinforcement_detection.py`
- [ ] T050 [P] [US3] Unit test for cross-domain usage detection via tag similarity in `tests/unit/activation/test_reinforcement_detection.py`
- [ ] T051 [US3] Integration test for reinforcement workflow (detect → recommend touch_memory) in `tests/integration/activation/test_reinforcement_workflow.py`

### Implementation for User Story 3

- [ ] T052 [US3] Implement topic extraction from messages in `src/cortexgraph/activation/detectors.py`
- [ ] T053 [US3] Implement cross-domain usage detection with tag Jaccard similarity in `src/cortexgraph/activation/detectors.py` (see the similarity sketch after this list)
- [ ] T054 [US3] Add reinforcement logic to detect when recalled memories should be touched in `src/cortexgraph/activation/detectors.py`
- [ ] T055 [US3] Add reinforcement detection to analyze_message output (new field: should_reinforce with memory_ids)
- [ ] T056 [US3] Add logging for reinforcement detection with cross-domain signals
- [ ] T057 [US3] Update integration tests to verify touch_memory is called after recall in `tests/integration/activation/test_reinforcement_workflow.py`
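
A sketch of the tag Jaccard similarity behind T053. The 0.5 cross-domain cut-off is an assumption for illustration, not a value taken from the spec:

```python
# Sketch of T053's cross-domain detection via tag Jaccard similarity.
# The 0.5 cut-off is an assumed value, not taken from the spec.
def jaccard(tags_a: set[str], tags_b: set[str]) -> float:
    """|A ∩ B| / |A ∪ B|: 0.0 for disjoint tag sets, 1.0 for identical ones."""
    if not tags_a or not tags_b:
        return 0.0
    return len(tags_a & tags_b) / len(tags_a | tags_b)


def is_cross_domain(memory_tags: set[str], message_tags: set[str]) -> bool:
    """Some overlap with the memory's tags, but mostly a new context."""
    overlap = jaccard(memory_tags, message_tags)
    return 0.0 < overlap < 0.5
```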
"Based on my React choice, I'll use Next.js" → React memory reinforcement + new Next.js memory --- ## Phase 6: User Story 4 - Decision Support for Ambiguous Cases (Priority: P3) **Goal**: For borderline cases where confidence is in clarification range (0.4-0.7), provide reasoning and support LLM decision-making **Independent Test**: Submit an ambiguous statement like "I might use Redis for caching" and verify confidence falls in clarification range with clear reasoning ### Tests for User Story 4 - [ ] T058 [P] [US4] Unit test for uncertainty marker detection in `tests/unit/activation/test_ambiguous_detection.py` - [ ] T059 [P] [US4] Unit test for confidence threshold decision logic in `tests/unit/activation/test_confidence_thresholds.py` - [ ] T060 [P] [US4] Unit test for reasoning string generation in `tests/unit/activation/test_reasoning.py` - [ ] T061 [US4] Integration test for ambiguous cases with clarification flow in `tests/integration/activation/test_ambiguous_workflow.py` ### Implementation for User Story 4 - [ ] T062 [US4] Implement uncertainty marker detection in `src/cortexgraph/activation/detectors.py` - [ ] T063 [US4] Implement conditional language detection ("if", "maybe", "considering") in `src/cortexgraph/activation/detectors.py` - [ ] T064 [US4] Enhance reasoning string generation with signal breakdown in `src/cortexgraph/activation/detectors.py` - [ ] T065 [US4] Implement phrase_signals dictionary population for transparency in `src/cortexgraph/activation/detectors.py` - [ ] T066 [US4] Add strict_mode parameter support to analyze_message and analyze_for_recall tools - [ ] T067 [US4] Add context_tags parameter support to analyze_message for context-aware analysis - [ ] T068 [US4] Add available_tags parameter support to analyze_for_recall for better tag suggestions - [ ] T069 [US4] Update logging to include phrase_signals for debugging - [ ] T070 [US4] Create uncertainty marker patterns in `activation.yaml.example` **Checkpoint**: At this point, ambiguous cases should provide clear reasoning and fall in appropriate confidence ranges **Acceptance Verification**: 1. "I might use Redis for caching" → confidence 0.4-0.6, uncertainty markers detected, reasoning explains low confidence 2. "What's the best authentication method?" → should_search=False or confidence 0.4-0.6, ambiguous (memory vs general knowledge) 3. Strict mode with "I prefer PostgreSQL" → should_save=False (no explicit trigger in strict mode) 4. Context-aware: "That's my choice" with context_tags=["database"] → better confidence through context 5. 

**Checkpoint**: At this point, ambiguous cases should provide clear reasoning and fall in appropriate confidence ranges

**Acceptance Verification**:

1. "I might use Redis for caching" → confidence 0.4-0.6, uncertainty markers detected, reasoning explains the low confidence
2. "What's the best authentication method?" → should_search=False or confidence 0.4-0.6, ambiguous (memory vs general knowledge)
3. Strict mode with "I prefer PostgreSQL" → should_save=False (no explicit trigger in strict mode)
4. Context-aware: "That's my choice" with context_tags=["database"] → better confidence through context
5. Feedback integration: "Don't remember that" → explicit negative command handled

---

## Phase 7: CLI Tools & Documentation

**Purpose**: Developer tools and user-facing documentation

- [ ] T071 [P] Create `cortexgraph-activation` CLI entry point in `src/cortexgraph/cli/activation.py`
- [ ] T072 [P] Implement `analyze` subcommand for testing pattern detection in `src/cortexgraph/cli/activation.py`
- [ ] T073 [P] Implement `test` subcommand for pattern validation in `src/cortexgraph/cli/activation.py`
- [ ] T074 [P] Add CLI command registration to `pyproject.toml` console_scripts
- [ ] T075 [P] Update `README.md` with natural language activation overview and links to quickstart
- [ ] T076 [P] Update `docs/api.md` with analyze_message and analyze_for_recall tool documentation
- [ ] T077 [P] Create `docs/activation.md` with detailed activation system documentation
- [ ] T078 [P] Validate that all examples in `quickstart.md` work with the implementation

---

## Phase 8: Polish & Cross-Cutting Concerns

**Purpose**: Quality improvements and final validation

- [ ] T079 [P] Add hot-reload support using the watchdog library for `activation.yaml` in `src/cortexgraph/activation/config.py` (see the reload sketch after this list)
- [ ] T080 [P] Performance benchmarking: verify analyze_message <50ms p95 in `tests/performance/test_activation_performance.py`
- [ ] T081 [P] Performance benchmarking: verify analyze_for_recall <50ms p95 in `tests/performance/test_activation_performance.py`
- [ ] T082 [P] Performance benchmarking: verify combined workflow <300ms p95 in `tests/performance/test_activation_performance.py`
- [ ] T083 [P] Security review: ensure no sensitive data leaks in reasoning strings
- [ ] T084 [P] Security review: validate pattern injection prevention
- [ ] T085 Code cleanup: remove any debug logging, finalize error messages
- [ ] T086 Run mypy type checking with zero errors on the activation module
- [ ] T087 Run ruff linting with zero errors on the activation module
- [ ] T088 Run pytest with ≥90% coverage target on the activation module
- [ ] T089 Update `CHANGELOG.md` with the v0.7.0 natural language activation feature
- [ ] T090 Run all quickstart.md examples as final validation
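
For T079, a minimal hot-reload sketch using watchdog's standard Observer/FileSystemEventHandler pair. `reload_config` is a hypothetical callback that would re-parse the YAML and recompile the patterns:

```python
# Minimal hot-reload sketch for activation.yaml (T079). reload_config is a
# hypothetical callback; the watchdog Observer API itself is standard.
from pathlib import Path
from typing import Callable

from watchdog.events import FileSystemEvent, FileSystemEventHandler
from watchdog.observers import Observer


class ConfigReloadHandler(FileSystemEventHandler):
    def __init__(self, config_path: Path, reload_config: Callable[[], None]) -> None:
        self._path = config_path
        self._reload = reload_config

    def on_modified(self, event: FileSystemEvent) -> None:
        if Path(str(event.src_path)) == self._path:
            self._reload()  # re-parse YAML, recompile regexes, swap atomically


def watch_config(config_path: Path, reload_config: Callable[[], None]) -> Observer:
    observer = Observer()
    handler = ConfigReloadHandler(config_path, reload_config)
    observer.schedule(handler, str(config_path.parent), recursive=False)
    observer.start()  # runs in a background thread
    return observer
```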

---

## Dependencies & Execution Order

### Phase Dependencies

- **Setup (Phase 1)**: No dependencies - can start immediately
- **Foundational (Phase 2)**: Depends on Setup completion - BLOCKS all user stories
- **User Story 1 (Phase 3)**: Depends on Foundational phase - can start once Phase 2 is complete
- **User Story 2 (Phase 4)**: Depends on Foundational phase - can start in parallel with US1 (different files)
- **User Story 3 (Phase 5)**: Depends on Foundational phase AND US2 (needs recall detection for reinforcement signals)
- **User Story 4 (Phase 6)**: Depends on Foundational phase AND US1/US2 (enhances existing detection logic)
- **CLI Tools (Phase 7)**: Depends on US1 and US2 being complete (needs both tools to work)
- **Polish (Phase 8)**: Depends on all user stories being complete

### User Story Dependencies

```
Foundation (Phase 2)
├── User Story 1 (US1) - Automatic Memory Capture [P1] ✓ Independent
├── User Story 2 (US2) - Contextual Memory Recall [P1] ✓ Independent
│   └── User Story 3 (US3) - Automatic Memory Reinforcement [P2] (needs US2 for recall detection)
└── User Story 4 (US4) - Decision Support for Ambiguous Cases [P3] (enhances US1/US2)
```

**Key insight**: US1 and US2 are fully independent and can be developed in parallel. US3 builds on US2 (needs recall to trigger reinforcement). US4 enhances all previous stories.

### Within Each User Story

- Tests MUST be written and FAIL before implementation
- Models (in Foundational) before detectors
- Detectors before MCP tools
- MCP tools before tool registration
- Core implementation before integration tests

### Parallel Opportunities

**Setup Phase (Phase 1)**: Tasks T003, T004, T005, T006 all [P] - different directories

**Foundational Phase (Phase 2)**:
- T008, T009, T010, T011 all [P] - different models in the same file (can be written simultaneously)
- T017, T018, T019, T020 all [P] - different test files

**User Story 1 (Phase 3)**:
- T021, T022, T023, T024 all [P] - different test files/functions

**User Story 2 (Phase 4)**:
- T033, T034, T035, T036, T037 all [P] - different test files/functions
- Can run ENTIRE US2 in parallel with US1 (different files, no dependencies)

**User Story 4 (Phase 6)**:
- T058, T059, T060 all [P] - different test files

**CLI Phase (Phase 7)**:
- T071, T072, T073, T074, T075, T076, T077, T078 all [P] - different files

**Polish Phase (Phase 8)**:
- T079, T080, T081, T082, T083, T084 all [P] - different concerns

---

## Parallel Example: User Story 1 + User Story 2

```bash
# These two user stories can be developed completely in parallel:

# Developer A: User Story 1 (Automatic Memory Capture)
#   Tasks T021-T032: analyze_message tool + save detection
#   Files:
#     tests/contract/test_analyze_message_api.py
#     tests/unit/activation/test_save_detection.py
#     tests/integration/activation/test_analyze_message_tool.py
#     src/cortexgraph/activation/detectors.py (detect_save_intent, analyze_message)
#     src/cortexgraph/tools/analyze_message.py

# Developer B: User Story 2 (Contextual Memory Recall)
#   Tasks T033-T048: analyze_for_recall tool + recall detection
#   Files:
#     tests/contract/test_analyze_for_recall_api.py
#     tests/unit/activation/test_recall_detection.py
#     tests/integration/activation/test_analyze_for_recall_tool.py
#     src/cortexgraph/activation/detectors.py (detect_recall_intent, analyze_for_recall)
#     src/cortexgraph/activation/patterns.py (query expansion)
#     src/cortexgraph/tools/analyze_for_recall.py

# NO file conflicts - completely parallel development!
```

---

## Implementation Strategy

### MVP First (User Story 1 Only)

1. Complete Phase 1: Setup (6 tasks)
2. Complete Phase 2: Foundational (14 tasks) - CRITICAL foundation
3. Complete Phase 3: User Story 1 (12 tasks)
4. **STOP and VALIDATE**: Test the analyze_message tool independently
   - "I prefer PostgreSQL" → should_save=True, confidence ≥0.7
   - "Nice weather today" → should_save=False
5. Deploy/demo if ready - **the system can now detect memory-worthy content!**

**MVP Delivered**: Users can have natural conversations and the system detects when information should be saved, providing confidence scores and suggested parameters. This is immediately useful even without auto-recall.

### Incremental Delivery

1. **Foundation** (Phase 1+2): 20 tasks → All models, patterns, config ready
2. **MVP** (+ Phase 3): 12 tasks → Automatic memory capture working (analyze_message tool)
3. **Enhanced** (+ Phase 4): 16 tasks → Add automatic recall (analyze_for_recall tool)
4. **Intelligent** (+ Phase 5): 9 tasks → Add automatic reinforcement
5. **Polished** (+ Phase 6): 12 tasks → Handle ambiguous cases with reasoning
6. **Production-Ready** (+ Phase 7+8): 20 tasks → CLI tools, docs, performance validation

**Total**: 89 tasks across 8 phases

### Parallel Team Strategy

With 2-3 developers:

1. **Day 1-2**: Everyone completes Setup + Foundational together (20 tasks; the foundation MUST be complete)
2. **Day 3-5**: Once Foundational is done:
   - Developer A: User Story 1 (12 tasks) - analyze_message tool
   - Developer B: User Story 2 (16 tasks) - analyze_for_recall tool
   - No conflicts! Different files, parallel work
3. **Day 6**: Integration - verify US1 and US2 both work
4. **Day 7-8**: Developer C adds User Story 3 (9 tasks) while A+B do User Story 4 (12 tasks)
5. **Day 9-10**: Everyone works on CLI, docs, polish (20 tasks in parallel)

**Timeline**: ~10 days with 2-3 developers working efficiently

---

## Task Count Summary

| Phase | Task Count | Can Parallelize |
|-------|------------|-----------------|
| Phase 1: Setup | 6 tasks | 4 tasks (67%) |
| Phase 2: Foundational | 14 tasks | 7 tasks (50%) |
| Phase 3: User Story 1 (P1) | 12 tasks | 5 tasks (42%) |
| Phase 4: User Story 2 (P1) | 16 tasks | 7 tasks (44%) |
| Phase 5: User Story 3 (P2) | 9 tasks | 3 tasks (33%) |
| Phase 6: User Story 4 (P3) | 12 tasks | 4 tasks (33%) |
| Phase 7: CLI & Docs | 8 tasks | 8 tasks (100%) |
| Phase 8: Polish | 12 tasks | 6 tasks (50%) |
| **Total** | **89 tasks** | **44 tasks (49%)** |

**MVP Scope** (US1 only): 32 tasks (Setup + Foundational + US1)
**Core Features** (US1 + US2): 48 tasks (add US2)
**Full Feature** (All User Stories): 81 tasks (add US3 + US4 + CLI + Polish)

---

## Notes

- [P] tasks = different files, no dependencies - can run in parallel
- [Story] label maps a task to a specific user story for traceability
- Each user story should be independently completable and testable
- Verify tests fail before implementing
- Commit after each task or logical group of related tasks
- Stop at any checkpoint to validate a story independently
- The Foundation phase is CRITICAL - no shortcuts, this blocks everything
- US1 and US2 have no dependencies on each other - true parallel development
- Pattern matching is deterministic - confidence thresholds may need tuning during testing
- All file paths assume a single project structure (`src/cortexgraph/`, `tests/`)

---

**Generated**: 2025-01-24
**Feature**: 002-natural-language-activation
**Total Tasks**: 89 tasks across 8 phases
**MVP Tasks**: 32 tasks (Setup + Foundation + US1)
**Parallel Opportunities**: 44 tasks (49% can run in parallel)
