# Progressive Disclosure: claude-recall's Context Priming Philosophy
## Core Principle
**Show what exists and its retrieval cost first. Let the agent decide what to fetch based on relevance and need.**
---
## What is Progressive Disclosure?
Progressive disclosure is an information architecture pattern where you reveal complexity gradually rather than all at once. In the context of AI agents, it means:
1. **Layer 1 (Index)**: Show lightweight metadata (titles, dates, types, token counts)
2. **Layer 2 (Details)**: Fetch full content only when needed
3. **Layer 3 (Deep Dive)**: Read original source files if required
This mirrors how humans work: we scan headlines before reading articles, review a table of contents before diving into chapters, and check file names before opening files.
---
## The Problem: Context Pollution
Traditional RAG (Retrieval-Augmented Generation) systems fetch everything upfront:
```
❌ Traditional Approach:
┌─────────────────────────────────────┐
│ Session Start │
│ │
│ [15,000 tokens of past sessions] │
│ [8,000 tokens of observations] │
│ [12,000 tokens of file summaries] │
│ │
│ Total: 35,000 tokens │
│ Relevant: ~2,000 tokens (6%) │
└─────────────────────────────────────┘
```
**Problems:**
- Wastes 94% of attention budget on irrelevant context
- User prompt gets buried under a mountain of history
- Agent must process everything before understanding task
- No way to know what's actually useful until after reading
---
## claude-recall's Solution: Progressive Disclosure
```
✅ Progressive Disclosure Approach:
┌─────────────────────────────────────┐
│ Session Start │
│ │
│ Index of 50 observations: ~800 tokens│
│ ↓ │
│ Agent sees: "🔴 Hook timeout issue" │
│ Agent decides: "Relevant!" │
│ ↓ │
│ Fetch observation #2543: ~155 tokens│
│ │
│ Total: 955 tokens │
│ Relevant: 955 tokens (100%) │
└─────────────────────────────────────┘
```
**Benefits:**
- Agent controls its own context consumption
- Directly relevant to current task
- Can fetch more if needed
- Can skip everything if not relevant
- Clear cost/benefit for each retrieval decision
---
## How It Works in claude-recall
### The Index Format
Every SessionStart hook provides a compact index:
```markdown
### Oct 26, 2025
**General**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2586 | 12:58 AM | 🔵 | Context hook file exists but is empty | ~51 |
| #2587 | ″ | 🔵 | Context hook script file is empty | ~46 |
| #2589 | ″ | 🟡 | Investigated hook debug output docs | ~105 |
**src/hooks/context-hook.ts**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2592 | 1:16 AM | ⚖️ | Web UI strategy redesigned | ~193 |
```
**What the agent sees:**
- **What exists**: Observation titles give semantic meaning
- **When it happened**: Timestamps for temporal context
- **What type**: Icons indicate observation category
- **Retrieval cost**: Token counts for informed decisions
- **Where to get it**: MCP search tools referenced at bottom
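As a sketch, an index row like those above can be produced from a handful of metadata fields. The field names here are assumptions for illustration, not claude-recall's actual schema:

```typescript
// Hypothetical shape of one index entry; field names are illustrative.
interface IndexEntry {
  id: number;     // observation ID, e.g. 2591
  time: string;   // display time, e.g. "1:15 AM"
  icon: string;   // type icon from the legend, e.g. "⚖️"
  title: string;  // compressed semantic title
  tokens: number; // approximate retrieval cost
}

// Render one markdown table row in the index format shown above.
function renderRow(e: IndexEntry): string {
  return `| #${e.id} | ${e.time} | ${e.icon} | ${e.title} | ~${e.tokens} |`;
}

const row = renderRow({
  id: 2591,
  time: "1:15 AM",
  icon: "⚖️",
  title: "Stderr messaging abandoned",
  tokens: 155,
});
// row: "| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |"
```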
### The Legend System
```
🎯 session-request - User's original goal
🔴 gotcha - Critical edge case or pitfall
🟡 problem-solution - Bug fix or workaround
🔵 how-it-works - Technical explanation
🟢 what-changed - Code/architecture change
🟣 discovery - Learning or insight
🟠 why-it-exists - Design rationale
🟤 decision - Architecture decision
⚖️ trade-off - Deliberate compromise
```
**Purpose:**
- Visual scanning (humans and AI both benefit)
- Semantic categorization
- Priority signaling (🔴 gotchas are more critical)
- Pattern recognition across sessions
### Progressive Disclosure Instructions
The index includes usage guidance:
```markdown
💡 **Progressive Disclosure:** This index shows WHAT exists and retrieval COST.
- Use MCP search tools to fetch full observation details on-demand
- Prefer searching observations over re-reading code for past decisions
- Critical types (🔴 gotcha, 🟤 decision, ⚖️ trade-off) often worth fetching immediately
```
**What this does:**
- Teaches the agent the pattern
- Suggests when to fetch (critical types)
- Recommends search over code re-reading (efficiency)
- Makes the system self-documenting
---
## The Philosophy: Context as Currency
### Mental Model: Token Budget as Money
Think of context window as a bank account:
| Approach | Metaphor | Outcome |
|----------|----------|---------|
| **Dump everything** | Spending your entire paycheck on groceries you might need someday | Waste, clutter, can't afford what you actually need |
| **Fetch nothing** | Refusing to spend any money | Starvation, can't accomplish tasks |
| **Progressive disclosure** | Check your pantry, make a shopping list, buy only what you need | Efficiency, room for unexpected needs |
### The Attention Budget
LLMs have finite attention:
- Every token attends to every other token (n² relationships)
- 100,000 token window ≠ 100,000 tokens of useful attention
- Context "rot" happens as window fills
- Later tokens get less attention than earlier ones
**claude-recall's approach:**
- Start with ~1,000 tokens of index
- Agent has 99,000 tokens free for task
- Agent fetches ~200 tokens when needed
- Final budget: ~98,000 tokens for actual work
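These figures can be sanity-checked with simple arithmetic, using the numbers from the diagrams earlier in this document:

```typescript
// Fraction of consumed context that was actually useful.
function efficiency(relevantTokens: number, totalTokens: number): number {
  return relevantTokens / totalTokens;
}

const traditional = efficiency(2_000, 35_000); // ≈ 0.057, i.e. ~6% relevant
const progressive = efficiency(920, 920);      // 1.0, i.e. 100% relevant
```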
### Design for Autonomy
> "As models improve, let them act intelligently"
Progressive disclosure treats the agent as an **intelligent information forager**, not a passive recipient of pre-selected context.
**Traditional RAG:**
```
System → [Decides relevance] → Agent
↑
Hope this helps!
```
**Progressive Disclosure:**
```
System → [Shows index] → Agent → [Decides relevance] → [Fetches details]
↑
You know best!
```
The agent knows:
- The current task context
- What information would help
- How much budget to spend
- When to stop searching
We don't.
---
## Implementation Principles
### 1. Make Costs Visible
Every item in the index shows token count:
```
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
^^^^
Retrieval cost
```
**Why:**
- Agent can make informed ROI decisions
- Small observations (~50 tokens) are "cheap" to fetch
- Large observations (~500 tokens) require stronger justification
- Matches how humans think about effort
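The "~" prefix signals that these counts are estimates. A common rule of thumb is roughly four characters per token for English text; claude-recall's actual counting method isn't specified here, but a sketch of such an estimate might look like:

```typescript
// Rough token estimate using the ≈4-characters-per-token heuristic.
// This is an approximation sketch, not claude-recall's actual tokenizer.
function approxTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

approxTokens("Hook timeout: 60s too short for npm install"); // 43 chars → ~11 tokens
```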
### 2. Use Semantic Compression
Titles compress full observations into ~10 words:
**Bad title:**
```
Observation about a thing
```
**Good title:**
```
🔴 Hook timeout issue: 60s default too short for npm install
```
**What makes a good title:**
- Specific: Identifies exact issue
- Actionable: Clear what to do
- Self-contained: Doesn't require reading observation
- Searchable: Contains key terms (hook, timeout, npm)
- Categorized: Icon indicates type
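The qualities above can be turned into a rough lint check. The heuristic below is purely illustrative (the word-count bounds and vague-word list are assumptions, not part of claude-recall):

```typescript
// Illustrative title-quality check: specific, bounded length, no filler words.
function titleLooksGood(title: string): boolean {
  const words = title.trim().split(/\s+/);
  const vague = /\b(thing|stuff|something|various)\b/i.test(title);
  return words.length >= 4 && words.length <= 14 && !vague;
}

titleLooksGood("Observation about a thing");                           // false: vague
titleLooksGood("Hook timeout: 60s default too short for npm install"); // true
```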
### 3. Group by Context
Observations are grouped by:
- **Date**: Temporal context
- **File path**: Spatial context (work on specific files)
- **Project**: Logical context
```markdown
**src/hooks/context-hook.ts**
| ID | Time | T | Title | Tokens |
|----|------|---|-------|--------|
| #2591 | 1:15 AM | ⚖️ | Stderr messaging abandoned | ~155 |
| #2594 | 1:17 AM | 🟠 | Removed stderr section from docs | ~93 |
```
**Benefit:** If agent is working on `src/hooks/context-hook.ts`, related observations are already grouped together.
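A minimal grouping pass could look like this; the `filePath` field and the "General" fallback bucket are assumptions based on the index format shown above:

```typescript
interface Obs {
  id: number;
  title: string;
  filePath?: string; // absent → goes in the "General" group
}

// Group observations by file path, preserving insertion order within groups.
function groupByFile(observations: Obs[]): Map<string, Obs[]> {
  const groups = new Map<string, Obs[]>();
  for (const o of observations) {
    const key = o.filePath ?? "General";
    const bucket = groups.get(key) ?? [];
    bucket.push(o);
    groups.set(key, bucket);
  }
  return groups;
}

const grouped = groupByFile([
  { id: 2586, title: "Context hook file exists but is empty" },
  { id: 2591, title: "Stderr messaging abandoned", filePath: "src/hooks/context-hook.ts" },
]);
// grouped.get("General") → [#2586]
// grouped.get("src/hooks/context-hook.ts") → [#2591]
```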
### 4. Provide Retrieval Tools
The index is useless without retrieval mechanisms:
```markdown
*Use claude-recall MCP search to access records with the given ID*
```
**Available MCP tools:**
- `search` - Search memory index (Layer 1: Get IDs)
- `timeline` - Get chronological context (Layer 2: See narrative arc)
- `get_observations` - Fetch full details (Layer 3: Deep dive)
The 3-layer workflow ensures progressive disclosure: index → context → details.
---
## Real-World Example
### Scenario: Agent asked to fix a bug in hooks
**Without progressive disclosure:**
```
SessionStart injects 25,000 tokens of past context
Agent reads everything
Agent finds 1 relevant observation (buried in middle)
Total tokens consumed: 25,000
Relevant tokens: ~200
Efficiency: 0.8%
```
**With progressive disclosure:**
```
SessionStart shows index: ~800 tokens
Agent sees title: "🔴 Hook timeout issue: 60s too short"
Agent thinks: "This looks relevant to my bug!"
Agent fetches observation #2543: ~155 tokens
Total tokens consumed: 955
Relevant tokens: 955
Efficiency: 100%
```
### The Index Entry
```markdown
| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
```
**What the agent learns WITHOUT fetching:**
- There's a known gotcha (🔴) about hook timeouts
- It's related to npm install taking too long
- Full details are ~155 tokens (cheap)
- Happened at 2:14 PM (recent)
**Decision tree:**
```
Is my task related to hooks? → YES
Is my task related to timeouts? → YES
Is my task related to npm? → YES
155 tokens is cheap → FETCH IT
```
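That decision tree amounts to a simple cost/relevance gate. The thresholds below are illustrative only, not values claude-recall prescribes:

```typescript
// Sketch: fetch when the item is cheap, or when multiple relevance signals match.
function shouldFetch(relevanceSignals: number, costTokens: number): boolean {
  const CHEAP_THRESHOLD = 200; // tokens; illustrative cutoff
  if (relevanceSignals === 0) return false;
  return costTokens <= CHEAP_THRESHOLD || relevanceSignals >= 2;
}

shouldFetch(3, 155); // true: hooks + timeout + npm all match, and 155 tokens is cheap
shouldFetch(0, 50);  // false: cheap but irrelevant
```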
---
## The Three-Layer Workflow
claude-recall implements progressive disclosure through a 3-layer workflow pattern:
### Layer 1: Search (Index)
Start by searching to get a compact index with IDs:
```typescript
search({
query: "hook timeout",
limit: 10
})
```
**Returns:**
```
Found 3 observations matching "hook timeout":
| ID | Date | Type | Title |
|----|------|------|-------|
| #2543 | Oct 26 | gotcha | Hook timeout: 60s too short |
| #2891 | Oct 25 | how-it-works | Hook timeout configuration |
| #2102 | Oct 20 | problem-solution | Fixed timeout in CI |
```
**Cost:** ~50-100 tokens per result
**Value:** Agent can scan and decide which observations are relevant
### Layer 2: Timeline (Context)
Get chronological context around interesting observations:
```typescript
timeline({
anchor: 2543, // Observation ID from search
depth_before: 3,
depth_after: 3
})
```
**Returns:** Chronological view showing what happened before/during/after observation #2543
**Cost:** Variable based on depth
**Value:** Understand narrative arc and context
### Layer 3: Get Observations (Details)
Fetch full details only for relevant observations:
```typescript
get_observations({
ids: [2543, 2102] // Selected from search results
})
```
**Returns:**
```
#2543 🔴 Hook timeout: 60s too short for npm install
─────────────────────────────────────────────────
Date: Oct 26, 2025 2:14 PM
Type: gotcha
Project: claude-recall
Narrative:
Discovered that the default 60-second hook timeout is insufficient
for npm install operations, especially with large dependency trees
or slow network conditions. This causes SessionStart hook to fail
silently, preventing context injection.
Facts:
- Default timeout: 60 seconds
- npm install with cold cache: ~90 seconds
- Configured timeout: 120 seconds in extension/lifecycle/lifecycle.json:25
Files Modified:
- extension/lifecycle/lifecycle.json
Concepts: hooks, timeout, npm, configuration
```
**Cost:** ~155 tokens for full details
**Value:** Complete understanding of the issue
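Chained together, the three layers look roughly like this. The mock tool functions below stand in for the real MCP calls, and the title-based filter is a stand-in for the agent's own relevance judgment:

```typescript
type Hit = { id: number; title: string };

// Mock of the MCP `search` tool (Layer 1): returns a compact index.
function search(args: { query: string; limit: number }): Hit[] {
  return [
    { id: 2543, title: "Hook timeout: 60s too short" },
    { id: 2891, title: "Hook timeout configuration" },
    { id: 2102, title: "Fixed timeout in CI" },
  ].slice(0, args.limit);
}

// Mock of `get_observations` (Layer 3): returns the ids it would fetch.
function getObservations(args: { ids: number[] }): number[] {
  return args.ids;
}

// Layer 1 → triage → Layer 3. (Layer 2, `timeline`, omitted for brevity.)
function recallContext(query: string): number[] {
  const hits = search({ query, limit: 10 });
  const relevant = hits.filter(h => /hook/i.test(h.title)); // agent-side triage
  return getObservations({ ids: relevant.map(h => h.id) });
}

recallContext("hook timeout"); // fetches only #2543 and #2891
```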
---
## Cognitive Load Theory
Progressive disclosure is grounded in **Cognitive Load Theory**:
### Intrinsic Load
The inherent difficulty of the task itself.
**Example:** "Fix authentication bug"
- Must understand auth system
- Must understand the bug
- Must write the fix
This load is unavoidable.
### Extraneous Load
The cognitive burden of poorly presented information.
**Traditional RAG adds extraneous load:**
- Scanning irrelevant observations
- Filtering out noise
- Remembering what to ignore
- Re-contextualizing after each section
**Progressive disclosure minimizes extraneous load:**
- Scan titles (low effort)
- Fetch only relevant (targeted effort)
- Full attention on current task
### Germane Load
The effort of building mental models and schemas.
**Progressive disclosure supports germane load:**
- Consistent structure (legend, grouping)
- Clear categorization (types, icons)
- Semantic compression (good titles)
- Explicit costs (token counts)
---
## Anti-Patterns to Avoid
### ❌ Verbose Titles
**Bad:**
```
| #2543 | 2:14 PM | 🔴 | Investigation into the issue where hooks time out | ~155 |
```
**Good:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout: 60s too short for npm install | ~155 |
```
### ❌ Hiding Costs
**Bad:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout issue |
```
**Good:**
```
| #2543 | 2:14 PM | 🔴 | Hook timeout issue | ~155 |
```
### ❌ No Retrieval Path
**Bad:**
```
Here are 10 observations. [No instructions on how to get full details]
```
**Good:**
```
Here are 10 observations.
*Use MCP search tools to fetch full observation details on-demand*
```
### ❌ Skipping the Index Layer
**Bad:**
```typescript
// Fetching full details immediately
get_observations({
ids: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10] // Guessing which are relevant
})
```
**Good:**
```typescript
// Follow the 3-layer workflow
// Layer 1: Search for index
search({
query: "hooks",
limit: 20
})
// Layer 2: Review index, identify 2-3 relevant IDs
// Layer 3: Fetch only relevant observations
get_observations({
ids: [2543, 2891] // Just the most relevant
})
```
---
## Key Design Decisions
### Why Token Counts?
**Decision:** Show approximate token counts (~155, ~203) rather than exact counts.
**Rationale:**
- Communicates scale (50 vs 500) without false precision
- Maps to human intuition (small/medium/large)
- Allows agent to budget attention
- Encourages cost-conscious retrieval
### Why Icons Instead of Text Labels?
**Decision:** Use emoji icons (🔴, 🟡, 🔵) rather than text (GOTCHA, PROBLEM, HOWTO).
**Rationale:**
- Visual scanning (pattern recognition)
- Compact (a single glyph vs a spelled-out label)
- Language-agnostic
- Aesthetically distinct
- Works for both humans and AI
### Why Index-First, Not Smart Pre-Fetch?
**Decision:** Always show index first, even if we "know" what's relevant.
**Rationale:**
- We can't know what's relevant better than the agent
- Pre-fetching assumes we understand the task
- Agent knows current context, we don't
- Respects agent autonomy
- Fails gracefully (can always fetch more)
### Why Group by File Path?
**Decision:** Group observations by file path in addition to date.
**Rationale:**
- Spatial locality: Work on file X likely needs context about file X
- Reduces scanning effort
- Matches how developers think
- Clear semantic boundaries
---
## Measuring Success
Progressive disclosure is working when:
### ✅ Low Waste Ratio
```
Relevant Tokens / Total Context Tokens > 80%
```
Most of the context consumed is actually useful.
### ✅ Selective Fetching
```
Index Shown: 50 observations
Details Fetched: 2-3 observations
```
Agent is being selective, not fetching everything.
### ✅ Fast Task Completion
```
Session with index: 30 seconds to find relevant context
Session without: 90 seconds scanning all context
```
Time-to-relevant-information is faster.
### ✅ Appropriate Depth
```
Simple task: Only index needed
Medium task: 1-2 observations fetched
Complex task: 5-10 observations + code reads
```
Depth scales with task complexity.
---
## Future Enhancements
### Adaptive Index Size
```typescript
// Sketch: vary index size based on the SessionStart source.
function indexSessionCount(source: "startup" | "resume" | "compact"): number {
  switch (source) {
    case "startup": return 10; // last 10 sessions (small index)
    case "resume":  return 1;  // only current session (micro index)
    case "compact": return 20; // last 20 sessions (larger index)
  }
}
```
### Relevance Scoring
```typescript
// Use embeddings to pre-sort index by relevance
search({
query: "authentication bug",
orderBy: "relevance" // Based on semantic similarity (future enhancement)
})
```
### Cost Forecasting
```markdown
💡 **Budget Estimate:**
- Fetching all 🔴 gotchas: ~450 tokens
- Fetching all file-related: ~1,200 tokens
- Fetching everything: ~8,500 tokens
```
### Progressive Detail Levels
```
Layer 1: Index (titles only)
Layer 2: Summaries (2-3 sentences)
Layer 3: Full details (complete observation)
Layer 4: Source files (referenced code)
```
---
## Key Takeaways
1. **Show, don't tell**: Index reveals what exists without forcing consumption
2. **Cost-conscious**: Make retrieval costs visible for informed decisions
3. **Agent autonomy**: Let the agent decide what's relevant
4. **Semantic compression**: Good titles make or break the system
5. **Consistent structure**: Patterns reduce cognitive load
6. **Layer everything**: Index first, details on-demand
7. **Context as currency**: Spend wisely on high-value information
---
## Remember
> "The best interface is one that disappears when not needed, and appears exactly when it is."
Progressive disclosure respects the agent's intelligence and autonomy. We provide the map; the agent chooses the path.
---
## Further Reading
- [Context Engineering for AI Agents](context-engineering) - Foundational principles
- [claude-recall Architecture](architecture/overview) - How it all fits together
- Cognitive Load Theory (Sweller, 1988)
- Information Foraging Theory (Pirolli & Card, 1999)
- Progressive Disclosure (Nielsen Norman Group)
---
*This philosophy emerged from real-world usage of claude-recall across hundreds of coding sessions. The pattern works because it aligns with both human cognition and LLM attention mechanics.*