# Progressive Disclosure: claude-recall's Context Priming Philosophy
## Core Principle
**Show what exists and its retrieval cost first. Let the agent decide what to fetch based on relevance and need.**
---
## What is Progressive Disclosure?
Progressive disclosure is an information architecture pattern where you reveal complexity gradually rather than all at once. In the context of AI agents, it means:
1. **Layer 1 (Index)**: Show lightweight metadata (titles, dates, types, token counts)
2. **Layer 2 (Details)**: Fetch full content only when needed
3. **Layer 3 (Deep Dive)**: Read original source files if required
This mirrors how humans work: we scan headlines before reading articles, review a table of contents before diving into chapters, and check file names before opening files.
---
## The Problem: Context Pollution
Traditional RAG (Retrieval-Augmented Generation) systems fetch everything upfront:
```
❌ Traditional Approach:
┌─────────────────────────────────────┐
│ Session Start │
│ │
│ [15,000 tokens of past sessions] │
│ [8,000 tokens of observations] │
│ [12,000 tokens of file summaries] │
│ │
│ Total: 35,000 tokens │
│ Relevant: ~2,000 tokens (6%) │
└─────────────────────────────────────┘
```
**Problems:**
- Wastes 94% of attention budget on irrelevant context
- User prompt gets buried under a mountain of history
- Agent must process everything before understanding task
- No way to know what's actually useful until after reading
---
## claude-recall's Solution: Progressive Disclosure
```
✅ Progressive Disclosure Approach:
┌─────────────────────────────────────┐
│ Session Start │
│ │
│ Index of 50 observations: ~800 tokens│
│ ↓ │
│ Agent sees: "🔴 Hook timeout issue" │
│ Agent decides: "Relevant!" │
│ ↓ │
│ Fetch observation #2543: ~155 tokens│
│ │
│ Total: 955 tokens │
│ Relevant: 955 tokens (100%) │
└─────────────────────────────────────┘
```
**Benefits:**
- Agent controls its own context consumption
- Directly relevant to current task
- Can fetch more if needed
- Can skip everything if not relevant
- Clear cost/benefit for each retrieval decision
---
## How It Works in claude-recall
### The Index Format
Every SessionStart hook provides a compact index:
```markdown
### Oct 26, 2025
**General**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2586 | 12:58 AM | 🔵 | Context hook file exists but is empty | ~51 |
| #2587 | ″ | 🔵 | Context hook script file is empty | ~46 |
| #2589 | ″ | 🟡 | Investigated hook debug output docs | ~105 |
**src/hooks/context-hook.ts**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2592 | 1:16 AM | ⚖️ | Web UI strategy redesigned | ~193 |
```
**What the agent sees:**
- **What exists**: Observation titles give semantic meaning
- **When it happened**: Timestamps for temporal context
- **What type**: Icons indicate observation category
- **Retrieval cost**: Token counts for informed decisions
- **Where to get it**: MCP search tools referenced at bottom
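As a sketch, an index row like those above can be produced from a handful of metadata fields. The field names here are assumptions for illustration, not claude-recall's actual schema:

```typescript
// Hypothetical shape of one index entry; field names are illustrative.
interface IndexEntry {
  id: number;     // observation ID, e.g. 2591
  time: string;   // display time, e.g. "1:15 AM"
  icon: string;   // type icon from the legend, e.g. "⚖️"
  title: string;  // compressed semantic title
  tokens: number; // approximate retrieval cost
}

// Render one markdown table row in the index format shown above.
function renderRow(e: IndexEntry): string {
  return `| #${e.id} | ${e.time} | ${e.icon} | ${e.title} | ~${e.tokens} |`;
}

const row = renderRow({
  id: 2591,
  time: "1:15 AM",
  icon: "⚖️",
  title: "Stderr messaging abandoned",
  tokens: 155,
});
// row: "| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |"
```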
### The Legend System
```
🎯 session-request - User's original goal
🔴 gotcha - Critical edge case or pitfall
🟡 problem-solution - Bug fix or workaround
🔵 how-it-works - Technical explanation
🟢 what-changed - Code/architecture change
🟣 discovery - Learning or insight
🟠 why-it-exists - Design rationale
🟤 decision - Architecture decision
⚖️ trade-off - Deliberate compromise
```
**Purpose:**
- Visual scanning (humans and AI both benefit)
- Semantic categorization
- Priority signaling (🔴 gotchas are more critical)
- Pattern recognition across sessions
### Progressive Disclosure Instructions
The index includes usage guidance:
```markdown
💡 **Progressive Disclosure:** This index shows WHAT exists and retrieval COST.
- Use MCP search tools to fetch full observation details on-demand
- Prefer searching observations over re-reading code for past decisions
- Critical types (🔴 gotcha, 🟤 decision, ⚖️ trade-off) often worth fetching immediately
```
**What this does:**
- Teaches the agent the pattern
- Suggests when to fetch (critical types)
- Recommends search over code re-reading (efficiency)
- Makes the system self-documenting
---
## The Philosophy: Context as Currency
### Mental Model: Token Budget as Money
Think of context window as a bank account:
| Approach | Metaphor | Outcome |
|----------|----------|---------|
| **Dump everything** | Spending your entire paycheck on groceries you might need someday | Waste, clutter, can't afford what you actually need |
| **Fetch nothing** | Refusing to spend any money | Starvation, can't accomplish tasks |
| **Progressive disclosure** | Check your pantry, make a shopping list, buy only what you need | Efficiency, room for unexpected needs |
### The Attention Budget
LLMs have finite attention:
- Every token attends to every other token (n² relationships)
- 100,000 token window ≠ 100,000 tokens of useful attention
- Context "rot" happens as window fills
- Later tokens get less attention than earlier ones
**claude-recall's approach:**
- Start with ~1,000 tokens of index
- Agent has 99,000 tokens free for task
- Agent fetches ~200 tokens when needed
- Final budget: ~98,000 tokens for actual work
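These figures can be sanity-checked with simple arithmetic, using the numbers from the diagrams earlier in this document:

```typescript
// Fraction of consumed context that was actually useful.
function efficiency(relevantTokens: number, totalTokens: number): number {
  return relevantTokens / totalTokens;
}

const traditional = efficiency(2_000, 35_000); // ≈ 0.057, i.e. ~6% relevant
const progressive = efficiency(920, 920);      // 1.0, i.e. 100% relevant
```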
### Design for Autonomy
> "As models improve, let them act intelligently"
Progressive disclosure treats the agent as an **intelligent information forager**, not a passive recipient of pre-selected context.
**Traditional RAG:**
```
System → [Decides relevance] → Agent
↑
Hope this helps!
```
**Progressive Disclosure:**
```
System → [Shows index] → Agent → [Decides relevance] → [Fetches details]
↑
You know best!
```
The agent knows:
- The current task context
- What information would help
- How much budget to spend
- When to stop searching
We don't.
---
## Implementation Principles
### 1. Make Costs Visible
Every item in the index shows token count:
```
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
^^^^
Retrieval cost
```
**Why:**
- Agent can make informed ROI decisions
- Small observations (~50 tokens) are "cheap" to fetch
- Large observations (~500 tokens) require stronger justification
- Matches how humans think about effort
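The "~" prefix signals that these counts are estimates. A common rule of thumb is roughly four characters per token for English text; claude-recall's actual counting method isn't specified here, but a sketch of such an estimate might look like:

```typescript
// Rough token estimate using the ≈4-characters-per-token heuristic.
// This is an approximation sketch, not claude-recall's actual tokenizer.
function approxTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

approxTokens("Hook timeout: 60s too short for npm install"); // 43 chars → ~11 tokens
```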
### 2. Use Semantic Compression
Titles compress full observations into ~10 words:
**Bad title:**
```
Observation about a thing
```
**Good title:**
```
🔴 Hook timeout issue: 60s default too short for npm install
```
**What makes a good title:**
- Specific: Identifies exact issue
- Actionable: Clear what to do
- Self-contained: Doesn't require reading observation
- Searchable: Contains key terms (hook, timeout, npm)
- Categorized: Icon indicates type
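The qualities above can be turned into a rough lint check. The heuristic below is purely illustrative (the word-count bounds and vague-word list are assumptions, not part of claude-recall):

```typescript
// Illustrative title-quality check: specific, bounded length, no filler words.
function titleLooksGood(title: string): boolean {
  const words = title.trim().split(/\s+/);
  const vague = /\b(thing|stuff|something|various)\b/i.test(title);
  return words.length >= 4 && words.length <= 14 && !vague;
}

titleLooksGood("Observation about a thing");                           // false: vague
titleLooksGood("Hook timeout: 60s default too short for npm install"); // true
```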
### 3. Group by Context
Observations are grouped by:
- **Date**: Temporal context
- **File path**: Spatial context (work on specific files)
- **Project**: Logical context
```markdown
**src/hooks/context-hook.ts**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2594 | 1:17 AM | 🟠 | Removed stderr section from docs | ~93 |
```
**Benefit:** If agent is working on `src/hooks/context-hook.ts`, related observations are already grouped together.
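A minimal grouping pass could look like this; the `filePath` field and the "General" fallback bucket are assumptions based on the index format shown above:

```typescript
interface Obs {
  id: number;
  title: string;
  filePath?: string; // absent → goes in the "General" group
}

// Group observations by file path, preserving insertion order within groups.
function groupByFile(observations: Obs[]): Map<string, Obs[]> {
  const groups = new Map<string, Obs[]>();
  for (const o of observations) {
    const key = o.filePath ?? "General";
    const bucket = groups.get(key) ?? [];
    bucket.push(o);
    groups.set(key, bucket);
  }
  return groups;
}

const grouped = groupByFile([
  { id: 2586, title: "Context hook file exists but is empty" },
  { id: 2591, title: "Stderr messaging abandoned", filePath: "src/hooks/context-hook.ts" },
]);
// grouped.get("General") → [#2586]
// grouped.get("src/hooks/context-hook.ts") → [#2591]
```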
### 4. Provide Retrieval Tools
The index is useless without retrieval mechanisms:
```markdown
*Use claude-recall MCP search to access records with the given ID*
```
**Available MCP tools:**
- `search` - Search memory index (Layer 1: Get IDs)
- `timeline` - Get chronological context (Layer 2: See narrative arc)
- `get_observations` - Fetch full details (Layer 3: Deep dive)
The 3-layer workflow ensures progressive disclosure: index → context → details.
---
## Real-World Example
### Scenario: Agent asked to fix a bug in hooks
**Without progressive disclosure:**
```
SessionStart injects 25,000 tokens of past context
Agent reads everything
Agent finds 1 relevant observation (buried in middle)
Total tokens consumed: 25,000
Relevant tokens: ~200
Efficiency: 0.8%
```
**With progressive disclosure:**
```
SessionStart shows index: ~800 tokens
Agent sees title: "🔴 Hook timeout issue: 60s too short"
Agent thinks: "This looks relevant to my bug!"
Agent fetches observation #2543: ~155 tokens
Total tokens consumed: 955
Relevant tokens: 955
Efficiency: 100%
```
### The Index Entry
```markdown
| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
```
**What the agent learns WITHOUT fetching:**
- There's a known gotcha (🔴) about hook timeouts
- It's related to npm install taking too long
- Full details are ~155 tokens (cheap)
- Happened at 2:14 PM (recent)
**Decision tree:**
```
Is my task related to hooks? → YES
Is my task related to timeouts? → YES
Is my task related to npm? → YES
155 tokens is cheap → FETCH IT
```
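That decision tree amounts to a simple cost/relevance gate. The thresholds below are illustrative only, not values claude-recall prescribes:

```typescript
// Sketch: fetch when the item is cheap, or when multiple relevance signals match.
function shouldFetch(relevanceSignals: number, costTokens: number): boolean {
  const CHEAP_THRESHOLD = 200; // tokens; illustrative cutoff
  if (relevanceSignals === 0) return false;
  return costTokens <= CHEAP_THRESHOLD || relevanceSignals >= 2;
}

shouldFetch(3, 155); // true: hooks + timeout + npm all match, and 155 tokens is cheap
shouldFetch(0, 50);  // false: cheap but irrelevant
```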
---
## The Three-Layer Workflow
claude-recall implements progressive disclosure through a 3-layer workflow pattern:
### Layer 1: Search (Index)
Start by searching to get a compact index with IDs:
```typescript
search({
query: "hook timeout",
limit: 10
})
```
**Returns:**
```
Found 3 observations matching "hook timeout":
| ID | Date | Type | Title |
|----|------|------|-------|
| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short |
| #2891 | Oct 25 | how-it-works | Hook timeout configuration |
| #2102 | Oct 20 | problem-solution | Fixed timeout in CI |
```
**Cost:** ~50-100 tokens per result
**Value:** Agent can scan and decide which observations are relevant
### Layer 2: Timeline (Context)
Get chronological context around interesting observations:
```typescript
timeline({
anchor: 2543, // Observation ID from search
depth_before: 3,
depth_after: 3
})
```
**Returns:** Chronological view showing what happened before/during/after observation #2543
**Cost:** Variable based on depth
**Value:** Understand narrative arc and context
### Layer 3: Get Observations (Details)
Fetch full details only for relevant observations:
```typescript
get_observations({
ids: [2543, 2102] // Selected from search results
})
```
**Returns:**
```
#2543 🔴 Hook timeout: 60s too short for npm install
─────────────────────────────────────────────────
Date: Oct 26, 2025 2:14 PM
Type: gotcha
Project: claude-recall
Narrative:
Discovered that the default 60-second hook timeout is insufficient
for npm install operations, especially with large dependency trees
or slow network conditions. This causes SessionStart hook to fail
silently, preventing context injection.
Facts:
- Default timeout: 60 seconds
- npm install with cold cache: ~90 seconds
- Configured timeout: 120 seconds in extension/lifecycle/lifecycle.json:25
Files Modified:
- extension/lifecycle/lifecycle.json
Concepts: hooks, timeout, npm, configuration
```
**Cost:** ~155 tokens for full details
**Value:** Complete understanding of the issue
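Chained together, the three layers look roughly like this. The mock tool functions below stand in for the real MCP calls, and the title-based filter is a stand-in for the agent's own relevance judgment:

```typescript
type Hit = { id: number; title: string };

// Mock of the MCP `search` tool (Layer 1): returns a compact index.
function search(args: { query: string; limit: number }): Hit[] {
  return [
    { id: 2543, title: "Hook timeout: 60s too short" },
    { id: 2891, title: "Hook timeout configuration" },
    { id: 2102, title: "Fixed timeout in CI" },
  ].slice(0, args.limit);
}

// Mock of `get_observations` (Layer 3): returns the ids it would fetch.
function getObservations(args: { ids: number[] }): number[] {
  return args.ids;
}

// Layer 1 → triage → Layer 3. (Layer 2, `timeline`, omitted for brevity.)
function recallContext(query: string): number[] {
  const hits = search({ query, limit: 10 });
  const relevant = hits.filter(h => /hook/i.test(h.title)); // agent-side triage
  return getObservations({ ids: relevant.map(h => h.id) });
}

recallContext("hook timeout"); // fetches only #2543 and #2891
```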
---
## Cognitive Load Theory
Progressive disclosure is grounded in **Cognitive Load Theory**:
### Intrinsic Load
The inherent difficulty of the task itself.
**Example:** "Fix authentication bug"
- Must understand auth system
- Must understand the bug
- Must write the fix
This load is unavoidable.
### Extraneous Load
The cognitive burden of poorly presented information.
**Traditional RAG adds extraneous load:**
- Scanning irrelevant observations
- Filtering out noise
- Remembering what to ignore
- Re-contextualizing after each section
**Progressive disclosure minimizes extraneous load:**
- Scan titles (low effort)
- Fetch only relevant (targeted effort)
- Full attention on current task
### Germane Load
The effort of building mental models and schemas.
**Progressive disclosure supports germane load:**
- Consistent structure (legend, grouping)
- Clear categorization (types, icons)
- Semantic compression (good titles)
- Explicit costs (token counts)
---
## Anti-Patterns to Avoid
### ❌ Verbose Titles
**Bad:**
```
| #2543 | 2:14 PM | 🔴 | Investigation into the issue where hooks time out | ~155 |
```
**Good:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
```
### ❌ Hiding Costs
**Bad:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout issue |
```
**Good:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout issue | ~155 |
```
### ❌ No Retrieval Path
**Bad:**
```
Here are 10 observations. [No instructions on how to get full details]
```
**Good:**
```
Here are 10 observations.
*Use MCP search tools to fetch full observation details on-demand*
```
### ❌ Skipping the Index Layer
**Bad:**
```typescript
// Fetching full details immediately
get_observations({
ids: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] // Guessing which are relevant
})
```
**Good:**
```typescript
// Follow the 3-layer workflow
// Layer 1: Search for index
search({
query: "hooks",
limit: 20
})
// Layer 2: Review index, identify 2-3 relevant IDs
// Layer 3: Fetch only relevant observations
get_observations({
ids: [2543, 2891] // Just the most relevant
})
```
---
## Key Design Decisions
### Why Token Counts?
**Decision:** Show approximate token counts (~155, ~203) rather than exact counts.
**Rationale:**
- Communicates scale (50 vs 500) without false precision
- Maps to human intuition (small/medium/large)
- Allows agent to budget attention
- Encourages cost-conscious retrieval
### Why Icons Instead of Text Labels?
**Decision:** Use emoji icons (🔴, 🟡, 🔵) rather than text (GOTCHA, PROBLEM, HOWTO).
**Rationale:**
- Visual scanning (pattern recognition)
- Compact (a single glyph vs a spelled-out label)
- Language-agnostic
- Aesthetically distinct
- Works for both humans and AI
### Why Index-First, Not Smart Pre-Fetch?
**Decision:** Always show index first, even if we "know" what's relevant.
**Rationale:**
- We can't know what's relevant better than the agent
- Pre-fetching assumes we understand the task
- Agent knows current context, we don't
- Respects agent autonomy
- Fails gracefully (can always fetch more)
### Why Group by File Path?
**Decision:** Group observations by file path in addition to date.
**Rationale:**
- Spatial locality: Work on file X likely needs context about file X
- Reduces scanning effort
- Matches how developers think
- Clear semantic boundaries
---
## Measuring Success
Progressive disclosure is working when:
### ✅ Low Waste Ratio
```
Relevant Tokens / Total Context Tokens > 80%
```
Most of the context consumed is actually useful.
### ✅ Selective Fetching
```
Index Shown: 50 observations
Details Fetched: 2-3 observations
```
Agent is being selective, not fetching everything.
### ✅ Fast Task Completion
```
Session with index: 30 seconds to find relevant context
Session without: 90 seconds scanning all context
```
Time-to-relevant-information is faster.
### ✅ Appropriate Depth
```
Simple task: Only index needed
Medium task: 1-2 observations fetched
Complex task: 5-10 observations + code reads
```
Depth scales with task complexity.
---
## Future Enhancements
### Adaptive Index Size
```typescript
// Sketch: vary index size based on the SessionStart source.
function indexSessionCount(source: "startup" | "resume" | "compact"): number {
  switch (source) {
    case "startup": return 10; // last 10 sessions (small index)
    case "resume":  return 1;  // only current session (micro index)
    case "compact": return 20; // last 20 sessions (larger index)
  }
}
```
### Relevance Scoring
```typescript
// Use embeddings to pre-sort index by relevance
search({
query: "authentication bug",
orderBy: "relevance" // Based on semantic similarity (future enhancement)
})
```
### Cost Forecasting
```markdown
💡 **Budget Estimate:**
- Fetching all 🔴 gotchas: ~450 tokens
- Fetching all file-related: ~1,200 tokens
- Fetching everything: ~8,500 tokens
```
### Progressive Detail Levels
```
Layer 1: Index (titles only)
Layer 2: Summaries (2-3 sentences)
Layer 3: Full details (complete observation)
Layer 4: Source files (referenced code)
```
---
## Key Takeaways
1. **Show, don't tell**: Index reveals what exists without forcing consumption
2. **Cost-conscious**: Make retrieval costs visible for informed decisions
3. **Agent autonomy**: Let the agent decide what's relevant
4. **Semantic compression**: Good titles make or break the system
5. **Consistent structure**: Patterns reduce cognitive load
6. **Layer everything**: Index first, details on-demand
7. **Context as currency**: Spend wisely on high-value information
---
## Remember
> "The best interface is one that disappears when not needed, and appears exactly when it is."
Progressive disclosure respects the agent's intelligence and autonomy. We provide the map; the agent chooses the path.
---
## Further Reading
- [Context Engineering for AI Agents](context-engineering) - Foundational principles
- [claude-recall Architecture](architecture/overview) - How it all fits together
- Cognitive Load Theory (Sweller, 1988)
- Information Foraging Theory (Pirolli & Card, 1999)
- Progressive Disclosure (Nielsen Norman Group)
---
*This philosophy emerged from real-world usage of claude-recall across hundreds of coding sessions. The pattern works because it aligns with both human cognition and LLM attention mechanics.*