Mnemex

Overview Schema Related Servers Score Discussions

mnemex

RCA_EFIT_MAPPING.md•22.1 KiB

# Root Cause Analysis ↔ eFIT Protocol Mapping **Purpose**: Direct mapping between established RCA methodologies and eFIT protocols **Use**: Reference when implementing RCA techniques in AI systems **Date**: 2025-11-17 --- ## Convergent Evolution Summary This document demonstrates convergent evolution between clinical psychology (DBT/CBT) and established engineering RCA practices, validating the computational homology thesis. --- ## Methodology → eFIT Protocol Mappings ### 1. Five Whys → STOPPER "Pull back" **Convergent Principle**: Iterative deepening to root causes **Shared Design**: - Both resist superficial fixes - Both require cognitive discipline to avoid premature closure - Both trace causality backward from symptoms - Both continue until reaching systemic/process-level causes **Process Comparison**: | Five Whys | STOPPER "Pull back" | |-----------|---------------------| | Ask "Why?" 5 times | Step back from immediate problem | | Trace backward through causality | Identify underlying patterns | | Stop at process/system root | Recognize systemic issues | | Evidence-based answers | Observation-based understanding | **Implementation**: ```python def five_whys_pull_back(symptom: str, max_depth: int = 5) -> RootCause: """ Combine Five Whys with STOPPER Pull back """ current = symptom for depth in range(max_depth): # STOPPER: Pull back (don't accept surface answer) why_answer = ask_why(current, require_evidence=True) if is_process_root(why_answer): return RootCause(cause=why_answer, depth=depth+1) current = why_answer return RootCause(cause=current, depth=max_depth, incomplete=True) ``` **Sources**: - Toyota Production System: https://www.orcalean.com/article/how-toyota-is-using-5-whys-method - Buffer guide: https://buffer.com/resources/5-whys-process/ --- ### 2. Fishbone Diagram → STOPPER "Observe" + "Take a step back" **Convergent Principle**: Systematic categorization prevents tunnel vision **Shared Design**: - Both examine multiple domains before narrowing - Both reduce cognitive load through structure - Both prevent jumping to obvious (but incomplete) solutions - Both surface hidden contributing factors **Process Comparison**: | Fishbone | STOPPER "Observe"/"Take a step back" | |----------|--------------------------------------| | 6 Ms categorization | Systematic observation framework | | Visual structure reduces overwhelm | External scaffolding for executive function | | Cross-domain brainstorming | Holistic context gathering | | Prioritize after full mapping | Act after understanding system | **AI System Categories** (adapted from 6 Ms): 1. **Code** (logic, algorithms) 2. **Configuration** (settings, parameters) 3. **Context** (conversation history, memory) 4. **Capacity** (resources, rate limits) 5. **Coordination** (tool calls, async operations) 6. **Cognition** (model reasoning, prompts) **Implementation**: ```python class FishboneDiagram: """ Fishbone RCA with STOPPER integration """ categories = ["Code", "Config", "Context", "Capacity", "Coordination", "Cognition"] def observe_systematically(self, problem: str) -> Dict[str, List[str]]: """ STOPPER Observe: Gather data across all categories """ causes = {} for category in self.categories: # Take a step back: Don't dive into first category causes[category] = self.brainstorm_causes(problem, category) return causes def prioritize_for_investigation(self, causes: Dict) -> List[Cause]: """ STOPPER Pull back: Identify most promising root causes """ scored = [] for category, cause_list in causes.items(): for cause in cause_list: score = self.evidence_score(cause) * self.impact_score(cause) scored.append((score, category, cause)) return sorted(scored, reverse=True)[:5] # Top 5 ``` **Sources**: - ASQ Fishbone guide: https://asq.org/quality-resources/fishbone - GoLeanSixSigma: https://goleansixsigma.com/fishbone-diagram/ --- ### 3. Blameless Postmortems → DBT Non-judgmental Stance + STOPPER "Practice what works" **Convergent Principle**: Systems thinking over individual blame enables learning **Shared Design**: - Both prioritize understanding over punishment - Both assume good intentions despite negative outcomes - Both focus on system/environment factors - Both document what works for future use **Process Comparison**: | Blameless Postmortem | DBT/STOPPER | |---------------------|-------------| | "Everyone had good intentions" | Non-judgmental stance (observe facts) | | Focus on system conditions | Recognize environmental factors | | Document what worked | "Practice what works" | | Psychological safety enables disclosure | Acceptance enables honest self-assessment | **Cultural Implementation**: | Google SRE Practice | DBT/eFIT Equivalent | |--------------------|---------------------| | Use role titles not names | Observe behavior without judgment of person | | "Where we got lucky" section | Recognize factors beyond control (radical acceptance) | | Action items with owners | Commitment to behavioral change | | Monthly postmortem features | Regular review/reflection practice | **Template Mapping**: ```markdown ## Blameless Postmortem (eFIT-Enhanced) **Summary**: [What happened - facts only, no blame] **Impact**: [Observable effects on users/system] **Timeline**: [Events without evaluative language] | Time | Event | Action Taken | |------|-------|--------------| | [HH:MM] | [Fact] | [Response] | **Root Causes**: [System conditions, not people] - Environmental factors: [What made this possible?] - Process gaps: [What procedures were missing?] - System vulnerabilities: [What weakness was exploited?] **STOPPER Analysis**: - **Stop**: When did we recognize the problem? - **Take a step back**: What context was initially missed? - **Observe**: What data did we gather? - **Pull back**: What was the root cause? - **Practice what works**: What actions succeeded? - **Expand**: What should be shared org-wide? - **Restart**: How do we prevent recurrence? **What Went Well** (DBT: Build on strengths) - [Success 1] - [Success 2] **What Went Wrong** (Non-judgmental observation) - [Gap 1] - [Gap 2] **Where We Got Lucky** (Radical acceptance of uncontrollables) - [Near-miss 1] - [Mitigating factor 1] **Action Items** (Commitment to change) | Action | Owner | Type | Due | Status | |--------|-------|------|-----|--------| | [Fix X] | [Role] | Prevent | [Date] | [Status] | ``` **Sources**: - Google SRE Book: https://sre.google/sre-book/postmortem-culture/ - Etsy blameless culture: https://www.etsy.com/codeascraft/blameless-postmortems --- ### 4. Systematic Debugging → STOPPER (Full Protocol) **Convergent Principle**: Structured process reduces cognitive load during problem-solving **Shared Design**: - Both provide external scaffolding for executive function - Both prevent reactive, ineffective responses - Both emphasize understanding before action - Both iteratively refine hypotheses **Step-by-Step Mapping**: | Debugging Step | STOPPER Step | Convergent Principle | |----------------|--------------|----------------------| | 1. Identify symptoms | **Stop** | Recognize problem state | | 2. Reproduce bug | **Observe** | Gather reliable data | | 3. Understand systems | **Take a step back** | See full context before code diving | | 4. Form hypothesis | **Pull back** | Root cause thinking | | 5. Test hypothesis | **Practice what works** | Evidence-based validation | | 6. Fix and verify | **Restart** | Return with solution | | *Overall process* | **Expand** | Cognitive load reduction via structure | **Implementation Integration**: ```python class STOPPERDebugger: """ Systematic debugging with STOPPER protocol """ def debug(self, error: Error) -> Solution: # Step 1: Identify symptoms → STOPPER "Stop" self.stop_reactive_debugging() symptoms = self.identify_symptoms(error) # Step 2: Reproduce → STOPPER "Observe" reproduction = self.create_minimal_repro(symptoms) # Step 3: Understand systems → STOPPER "Take a step back" context = self.gather_context( architecture=True, recent_changes=True, logs=True, skip_code_dive=True # Key principle ) # Step 4: Form hypothesis → STOPPER "Pull back" hypotheses = self.generate_hypotheses( symptoms, reproduction, context ) # Step 5: Test → STOPPER "Practice what works" for hypothesis in hypotheses: if self.test_hypothesis(hypothesis): # Step 6: Fix → STOPPER "Restart" solution = self.implement_fix(hypothesis) self.verify_fix(solution) return solution # No solution found → Expand (seek help) return self.escalate_for_assistance() def stop_reactive_debugging(self): """ STOPPER "Stop": Pause before diving into code """ time.sleep(60) # Literal pause self.log("Pausing reactive debugging urge") self.log("Will understand context before code diving") ``` **Key Principle (Convergent)**: > "The instinct is to dive right into the code, but understanding context first prevents wasted effort." > — Systematic Debugging > "Stop. Take a step back. Observe the full situation before acting." > — STOPPER Protocol Both resist the immediate action urge in favor of understanding. **Sources**: - Nicole Tietz: https://ntietz.com/blog/how-i-debug-2023/ --- ### 5. Cause Elimination → STOPPER "Pull back" (Scientific Method) **Convergent Principle**: Hypothesis testing to isolate root causes **Shared Design**: - Both use systematic elimination - Both require testable hypotheses - Both rely on evidence over intuition - Both iterate until convergence **Process Comparison**: | Cause Elimination | STOPPER "Pull back" | |------------------|---------------------| | List potential causes | Generate hypotheses about root causes | | Test each systematically | Validate against evidence | | Eliminate disproven causes | Narrow to actual root | | Converge on root cause | Identify underlying pattern | **Scientific Method Integration** (from e-fit-research): ```python def cause_elimination_with_stopper(symptom: str) -> RootCause: """ Combine cause elimination with STOPPER Pull back """ # STOPPER "Pull back": Generate hypotheses potential_causes = generate_hypotheses(symptom) # Scientific method: Test each for cause in potential_causes: # Design test test = design_experiment(cause) # Run test result = execute_test(test) # Evaluate if result.disproves(cause): potential_causes.remove(cause) elif result.confirms(cause): return RootCause(cause=cause, evidence=result) # If multiple causes remain, may be interaction effect return RootCause(causes=potential_causes, interaction=True) ``` **Sources**: - GeeksforGeeks: https://www.geeksforgeeks.org/software-engineering-debugging-approaches/ --- ### 6. FMEA → STOPPER "Pull back" + "Practice what works" **Convergent Principle**: Proactive failure prevention through root cause analysis **Shared Design**: - Both analyze causes before effects manifest - Both trace from symptoms to system flaws - Both prevent problems rather than react - Both build on historical failure data **Process Comparison**: | FMEA | STOPPER | |------|---------| | Identify potential failure modes | Pull back: Anticipate failure patterns | | Analyze effects and causes | Pull back: Trace causality | | Risk Priority Number (Severity × Occurrence × Detection) | Prioritize high-impact, high-frequency issues | | Implement preventive actions | Practice what works: Apply known mitigations | | Re-assess after changes | Expand: Update knowledge base | **AI System FMEA Template**: ```markdown ## AI System FMEA (STOPPER-Enhanced) ### Failure Mode: [Specific failure] - **Component**: [System part] - **Function**: [What it should do] - **Failure Effect**: [Impact if it fails] ### STOPPER Root Cause Analysis: - **Pull back**: Why would this fail? - Cause 1: [Root cause with evidence] - Cause 2: [Alternative root cause] ### Risk Assessment: - **Severity** (1-10): [User impact] - **Occurrence** (1-10): [Frequency] - **Detection** (1-10): [How hard to catch] - **RPN**: [S × O × D] ### STOPPER Prevention: - **Practice what works**: [Known mitigation from past] - **Expand**: [Share this pattern org-wide] - **Action Owner**: [Role/team] - **Due Date**: [Deadline] ### Post-Implementation: - **New Severity**: [After fix] - **New Occurrence**: [After fix] - **New Detection**: [After fix] - **New RPN**: [Verify improvement] ``` **Example - AI Loop State**: ```markdown ## FMEA: AI Loop State **Failure Mode**: Model enters infinite loop (repeats same action) **Function**: Model should progress toward task completion **Effect**: User timeout, poor UX, compute waste **STOPPER Root Causes**: - Cause 1: No progress detection mechanism - Cause 2: Insufficient context to realize repetition - Cause 3: Tool output doesn't change state **Risk Scores**: - Severity: 8 (major UX impact) - Occurrence: 6 (happens monthly) - Detection: 7 (hard to catch pre-deployment) - RPN: 8 × 6 × 7 = 336 → HIGH PRIORITY **STOPPER Prevention**: - Practice what works: Implement loop detector (action history comparison) - Expand: Document loop patterns in knowledge base - Action: Add progress metrics to all tool calls - Owner: Safety team - Due: 2 weeks **Post-Implementation**: - New Occurrence: 2 (96% reduction) - New Detection: 3 (early warning system) - New RPN: 8 × 2 × 3 = 48 → LOW PRIORITY ✓ ``` **Sources**: - Quality-One FMEA: https://quality-one.com/fmea/ - ASQ FMEA: https://asq.org/quality-resources/fmea --- ### 7. Program Slicing → STOPPER "Expand" (Cognitive Load Reduction) **Convergent Principle**: Reduce scope to prevent cognitive overwhelm **Shared Design**: - Both limit focus to manageable chunks - Both reduce executive function demands - Both prevent analysis paralysis - Both enable deeper investigation within scope **Process Comparison**: | Program Slicing | STOPPER "Expand" | |----------------|------------------| | Identify relevant code affecting variable | Narrow focus to essential context | | Eliminate irrelevant code paths | Reduce cognitive load | | Analyze smaller, manageable slice | Deep investigation within scope | | Reduces complexity exponentially | Prevents executive function overwhelm | **Implementation**: ```python def slice_for_investigation(variable: str, statement: int) -> CodeSlice: """ Program slicing with STOPPER Expand (reduce cognitive load) """ # STOPPER "Expand": Determine minimal relevant scope relevant_lines = [] # Backward slice: What influences this variable? for line in reversed(range(statement)): if influences(line, variable): relevant_lines.append(line) # STOPPER principle: Ignore irrelevant code # Reduces cognitive load from N lines to M lines (M << N) return CodeSlice( lines=relevant_lines, complexity_reduction=len(all_lines) / len(relevant_lines) ) ``` **ADHD/Executive Function Benefit**: - Full codebase: 10,000 lines (cognitive overwhelm) - Program slice: 150 lines (manageable for ADHD) - Reduction: 98.5% less to track mentally This maps directly to STOPPER "Expand" goal of reducing cognitive load for ADHD developers/AI systems. **Sources**: - GeeksforGeeks: https://www.geeksforgeeks.org/software-engineering-debugging-approaches/ --- ### 8. Retrospectives → STOPPER "Expand" + DBT Dialectics **Convergent Principle**: Team learning reduces individual cognitive burden **Shared Design**: - Both distribute cognitive load across team - Both balance opposing viewpoints (dialectics) - Both build shared mental models - Both improve through iteration **Process Comparison**: | Retrospective | STOPPER/DBT | |--------------|-------------| | Regular team reflection (2 weeks) | Regular DBT skills practice | | "What went well?" + "What needs improvement?" | Dialectical thinking (both/and, not either/or) | | Action items with owners | Commitment to behavioral change | | Distributed learning | "Expand": Team wisdom exceeds individual | **DBT Dialectics Integration**: ```markdown ## Retrospective (DBT-Enhanced) ### Opening: Set the Stage - **Wise mind**: "We'll balance facts with team feelings" - **Non-judgmental stance**: "Observations, not evaluations" ### Dialectical Analysis: **Thesis**: What worked well this sprint? - [Success 1] - [Success 2] **Antithesis**: What needs improvement? - [Challenge 1] - [Challenge 2] **Synthesis**: How do we integrate both? - [Combined approach 1] - [Both/and solution 1] ### STOPPER "Expand" (Team Learning): - What patterns did we miss as individuals? - What did collective intelligence reveal? - How does this change our shared mental model? ### Action Items (Commitment): | Action | Owner | Type | Due | DBT Skill | |--------|-------|------|-----|-----------| | [Change] | [Name] | Process | [Date] | [Relevant DBT skill] | ``` **Cognitive Load Distribution**: - Individual postmortem: One person's limited working memory - Team retrospective: Distributed cognition across 5-7 people - Result: More complete understanding, less individual burden **STOPPER "Expand" Benefit**: Team collaboration expands cognitive capacity beyond individual limits—critical for ADHD/executive function support. **Sources**: - PagerDuty Retrospectives: https://retrospectives.pagerduty.com/ - PagerDuty vs. Postmortems: https://www.pagerduty.com/blog/postmortems-vs-retrospectives/ --- ## Summary Matrix | RCA Methodology | Primary eFIT Protocol | Convergent Principle | Evidence of Convergent Evolution | |----------------|----------------------|----------------------|----------------------------------| | **Five Whys** | STOPPER "Pull back" | Iterative deepening to roots | Both trace causality backward ~5 levels | | **Fishbone** | STOPPER "Observe"/"Take a step back" | Systematic categorization | Both use structured domains (6 Ms / 6 contexts) | | **Blameless Postmortems** | DBT Non-judgmental + STOPPER "Practice what works" | Systems thinking over blame | Both assume good intentions, focus on environment | | **Systematic Debugging** | STOPPER (full protocol) | Structured process reduces cognitive load | Both pause reactive urges, both emphasize context | | **Cause Elimination** | STOPPER "Pull back" | Hypothesis testing | Both use scientific method, both iterate to truth | | **FMEA** | STOPPER "Pull back"/"Practice what works" | Proactive prevention | Both analyze causes before effects, both learn from history | | **Program Slicing** | STOPPER "Expand" | Cognitive load reduction | Both reduce scope to manageable chunks | | **Retrospectives** | STOPPER "Expand" + DBT Dialectics | Team learning | Both distribute cognitive load, both use both/and thinking | --- ## Validation of Computational Homology Thesis **Thesis**: Same problems across different substrates require similar solutions. **Evidence from RCA Research**: 1. **Independent Development, Convergent Design** - Five Whys: Toyota 1950s (manufacturing) - STOPPER: 2024 (AI systems) - Both: Iterative deepening, resistance to superficial fixes, ~5 iterations to root 2. **Cross-Domain Pattern Replication** - Fishbone: 1960s quality control (6 Ms) - STOPPER "Observe": 2024 AI debugging (6 contexts) - Both: Structured categorization prevents tunnel vision 3. **Blameless Culture Convergence** - DBT non-judgmental stance: 1993 (clinical psychology) - Google SRE blameless postmortems: 2000s (software engineering) - Both: Systems focus, good intentions assumption, psychological safety 4. **Cognitive Load Reduction Pattern** - Program slicing: 1980s (computer science) - STOPPER "Expand": 2024 (ADHD support) - Both: Scope limitation, executive function support **Conclusion**: RCA methodologies demonstrate convergent evolution with eFIT protocols, validating that executive function requirements are universal across substrates (humans, organizations, AI systems). --- ## Implementation Priorities for eFIT ### Phase 1: Core RCA (Immediate) 1. **Five Whys + STOPPER "Pull back"** - Simplest, highest value - Directly maps to AI loop analysis 2. **Fishbone + STOPPER "Observe"** - AI-specific categories: Code, Config, Context, Capacity, Coordination, Cognition - Visual aid for team RCA sessions 3. **Blameless Postmortems + DBT** - Template adaptation (see above) - Cultural foundation for learning ### Phase 2: Advanced Integration (Near-term) 4. **Systematic Debugging Framework** - Full STOPPER protocol implementation - Step-by-step guide for AI incidents 5. **FMEA for AI Systems** - Proactive failure mode catalog - Risk prioritization (RPN calculation) 6. **Retrospectives + Dialectics** - Regular team learning cadence - DBT skills practice integration ### Phase 3: Automation (Long-term) 7. **Pattern Detection** - ML-based RCA automation - Historical incident analysis 8. **Knowledge Base Integration** - CortexGraph memory for RCA patterns - Auto-suggest relevant past incidents --- ## References ### Toyota/Manufacturing - Five Whys: https://www.orcalean.com/article/how-toyota-is-using-5-whys-method - Fishbone: https://goleansixsigma.com/fishbone-diagram/ ### Google SRE - Postmortem culture: https://sre.google/sre-book/postmortem-culture/ - Example template: https://sre.google/sre-book/example-postmortem/ ### Etsy Engineering - Blameless postmortems: https://www.etsy.com/codeascraft/blameless-postmortems ### Software Engineering - Systematic debugging: https://ntietz.com/blog/how-i-debug-2023/ - Debugging approaches: https://www.geeksforgeeks.org/software-engineering-debugging-approaches/ ### Quality/Safety - FMEA: https://quality-one.com/fmea/ - ASQ resources: https://asq.org/ ### Incident Management - PagerDuty retrospectives: https://retrospectives.pagerduty.com/ - PagerDuty postmortems: https://www.pagerduty.com/blog/postmortems-vs-retrospectives/ --- **Document status**: Complete **Last updated**: 2025-11-17 **Related documents**: - Full research: `RCA_METHODOLOGIES_RESEARCH.md` - Quick reference: `RCA_QUICK_REFERENCE.md`

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/prefrontalsys/mnemex'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

RCA_EFIT_MAPPING.md•22.1 KiB