# Mental Models - Behavioral Tests
Workflows for verifying the `mental_models` operation of `thoughtbox_gateway`.
**Tool:** `thoughtbox_gateway`
**Operation:** `mental_models` (with sub-operations via `args.operation`)
**Required stage:** Stage 2 (cipher_loaded)
**Sub-operations:** `list_tags`, `list_models`, `get_model`, `get_capability_graph`
## Test 1: Discovery Flow
**Goal:** Verify an agent can discover what's available.
**Steps:**
1. Call `mental_models` with operation `list_tags`
2. Verify response contains 9 tags with descriptions
3. Pick a tag (e.g., "debugging") and call `list_models` with that tag filter
4. Verify only models with that tag are returned
**Expected:** Agent can navigate from tags → filtered models
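
The checks in steps 2 and 4 can be sketched as two small helpers. This is a minimal sketch, not the project's test harness: the field names `name`, `description`, and `tags` are assumptions about the response shape, and the sample models are made up for illustration.

```python
EXPECTED_TAG_COUNT = 9  # from step 2; confirm against the current catalog


def check_tags(tags: list[dict]) -> list[str]:
    """Return problems found in a `list_tags` result (empty list means pass)."""
    problems = []
    if len(tags) != EXPECTED_TAG_COUNT:
        problems.append(f"expected {EXPECTED_TAG_COUNT} tags, got {len(tags)}")
    for tag in tags:
        # Assumed field names: `name` and `description`.
        if not tag.get("description"):
            problems.append(f"tag {tag.get('name')!r} has no description")
    return problems


def models_missing_tag(models: list[dict], tag: str) -> list[str]:
    """Return names of models in a filtered `list_models` result that lack `tag`."""
    return [m.get("name", "<unnamed>") for m in models if tag not in m.get("tags", [])]


# Made-up sample: the second entry would indicate a leaky filter.
sample_models = [
    {"name": "model-a", "tags": ["debugging"]},
    {"name": "model-b", "tags": ["planning"]},
]
leaked = models_missing_tag(sample_models, "debugging")
```

An empty list from either helper means the step passed; anything else names what to report.
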
---
## Test 2: Model Retrieval Flow
**Goal:** Verify an agent can retrieve and use a mental model.
**Steps:**
1. Call `mental_models` with operation `get_model`, model `five-whys`
2. Verify response contains:
- Name and title
- Tags array
- Content with "# Five Whys" heading
- "## When to Use" section
- "## Process" section with numbered steps
3. Verify the content is process scaffolding (HOW to think), not analysis
**Expected:** Full prompt content suitable for guiding reasoning
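
The structural parts of step 2 (heading, "When to Use", numbered "Process" steps) lend themselves to a mechanical check. The sketch below assumes the model content arrives as a single Markdown string; the placeholder text is a stand-in, not the real `five-whys` content. Step 3 still needs a human or agent judgment.

```python
import re


def section_body(lines: list[str], heading: str):
    """Return the lines under `heading` up to the next `#` or `##` heading, or None."""
    if heading not in lines:
        return None
    body = []
    for line in lines[lines.index(heading) + 1:]:
        if re.match(r"#{1,2} ", line):
            break
        body.append(line)
    return body


def check_model_content(content: str, title: str) -> list[str]:
    """Return structural problems in a model's content (empty list means pass)."""
    lines = [line.rstrip() for line in content.splitlines()]
    problems = []
    if f"# {title}" not in lines:
        problems.append(f'missing "# {title}" heading')
    if "## When to Use" not in lines:
        problems.append('missing "## When to Use" section')
    process = section_body(lines, "## Process")
    if process is None:
        problems.append('missing "## Process" section')
    elif not any(re.match(r"\s*\d+\.\s", line) for line in process):
        problems.append('"## Process" section has no numbered steps')
    return problems


# Placeholder content for illustration only.
placeholder = """# Five Whys
## When to Use
(placeholder text)
## Process
1. (placeholder step)
2. (placeholder step)
"""
problems = check_model_content(placeholder, "Five Whys")
```
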
---
## Test 3: Error Handling Flow
**Goal:** Verify graceful error handling.
**Steps:**
1. Call `get_model` without a model name; expect an error that lists the available models
2. Call `get_model` with an invalid model name; expect an error that lists the available models
3. Call `list_models` with an invalid tag; expect an error that lists the available tags
4. Call an unknown operation; expect an error that lists the available operations
**Expected:** All errors include guidance on valid options
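
One way to assert "includes guidance on valid options" is to check that every valid option appears in the error text. This sketch assumes the error is available as a plain string; the sample message is made up, and only the sub-operation names come from this document.

```python
# Sub-operations as listed at the top of this document.
SUB_OPERATIONS = ["list_tags", "list_models", "get_model", "get_capability_graph"]


def options_missing_from_error(error_text: str, valid_options: list[str]) -> list[str]:
    """Return the valid options the error message fails to mention."""
    return [option for option in valid_options if option not in error_text]


# Made-up error text for illustration; the real wording may differ.
sample_error = (
    "Unknown operation 'bogus'. Available operations: "
    "list_tags, list_models, get_model, get_capability_graph"
)
missing = options_missing_from_error(sample_error, SUB_OPERATIONS)
```

The same helper applies to steps 1 to 3 by passing the model names or tag names gathered in Test 1 as `valid_options`.
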
---
## Test 4: Capability Graph Flow
**Goal:** Verify the capability graph output can initialize a knowledge graph.
**Steps:**
1. Call `mental_models` with operation `get_capability_graph`
2. Verify response contains:
- `entities` array with thoughtbox_server, tools, tags, and models
- `relations` array with provides, contains, tagged_with relationships
- `usage` object with step-by-step instructions
3. Optionally, pass the returned data to `memory_create_entities` and `memory_create_relations`
**Expected:** Structured data ready for knowledge graph initialization
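
Before loading the graph anywhere, it helps to confirm it is internally consistent. The sketch below checks the three top-level keys from step 2, the three relation types, and that no relation points at a missing entity. The per-item keys `name`, `from`, `to`, and `relationType` are assumptions about the payload shape, and the sample graph is made up.

```python
EXPECTED_RELATION_TYPES = {"provides", "contains", "tagged_with"}  # from step 2


def check_capability_graph(graph: dict) -> list[str]:
    """Return problems in a `get_capability_graph` result (empty list means pass)."""
    problems = []
    for key, kind in (("entities", list), ("relations", list), ("usage", dict)):
        if not isinstance(graph.get(key), kind):
            problems.append(f"`{key}` is missing or not a {kind.__name__}")
    if problems:
        return problems

    # Assumed keys: entities carry `name`; relations carry `from`, `to`, `relationType`.
    names = {entity.get("name") for entity in graph["entities"]}
    for relation in graph["relations"]:
        for end in ("from", "to"):
            if relation.get(end) not in names:
                problems.append(f"relation {end!r} refers to unknown entity {relation.get(end)!r}")

    seen_types = {relation.get("relationType") for relation in graph["relations"]}
    for missing_type in sorted(EXPECTED_RELATION_TYPES - seen_types):
        problems.append(f"no relation of type {missing_type!r}")
    return problems


# Made-up graph for illustration.
sample_graph = {
    "entities": [{"name": "server"}, {"name": "tool"}, {"name": "tag"}, {"name": "model"}],
    "relations": [
        {"from": "server", "to": "tool", "relationType": "provides"},
        {"from": "tool", "to": "model", "relationType": "contains"},
        {"from": "model", "to": "tag", "relationType": "tagged_with"},
    ],
    "usage": {"steps": []},
}
graph_problems = check_capability_graph(sample_graph)
```
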
---
## Test 5: Tag Coverage Flow
**Goal:** Verify tag taxonomy covers use cases.
**Steps:**
1. Call `list_tags` to see all categories
2. For each tag, call `list_models` with that tag
3. Verify each tag has at least one model
4. Verify model descriptions match tag intent
**Expected:** Complete coverage - no orphan tags or miscategorized models
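
Step 3 reduces to finding tags whose `list_models` result is empty. A minimal sketch, assuming the per-tag results have already been collected into a dictionary (the tag and model names below are made up); step 4 remains a manual review.

```python
def orphan_tags(tag_names: list[str], models_by_tag: dict) -> list[str]:
    """Return tags that have no models, preserving the order of `tag_names`."""
    return [tag for tag in tag_names if not models_by_tag.get(tag)]


# Made-up results: one `list_models` call per tag, keyed by tag name.
collected = {
    "tag-a": ["model-1", "model-2"],
    "tag-b": [],
}
orphans = orphan_tags(["tag-a", "tag-b", "tag-c"], collected)
```

A tag counts as an orphan both when its result is empty and when no result was collected for it.
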
---
## Test 6: Content Quality Flow
**Goal:** Verify mental model content follows the "infrastructure not intelligence" principle.
**Steps:**
1. Retrieve several models (e.g., `rubber-duck`, `pre-mortem`, `inversion`)
2. For each, verify content:
- Has clear process steps (numbered or bulleted)
- Explains WHEN to use
- Provides examples of APPLICATION
- Lists anti-patterns or common mistakes
- Does NOT perform reasoning or draw conclusions
**Expected:** Process scaffolds, not analysis
---
## Running These Tests
Execute by calling `thoughtbox_gateway` with `operation: "mental_models"` and sub-operation in `args.operation`. Requires Stage 2 (init + cipher).
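
As a rough illustration of the call shape described above, the helper below builds the arguments for one gateway call. Only `operation` and `args.operation` come from this document; the helper name and any extra argument keys (such as `model` or `tag`) are assumptions, and sending the call depends on your MCP client.

```python
from typing import Any


def build_mental_models_call(sub_operation: str, **extra: Any) -> dict:
    """Build the arguments for one `thoughtbox_gateway` call (hypothetical helper)."""
    return {
        "operation": "mental_models",
        # Extra keys are passed through as-is; their names are assumptions.
        "args": {"operation": sub_operation, **extra},
    }


list_tags_call = build_mental_models_call("list_tags")
get_model_call = build_mental_models_call("get_model", model="five-whys")
```
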
**Verification checklist:** When running Test 1, verify the actual model and tag counts against the catalog. Test 1 expects 9 tags; confirm this matches the current implementation. If the counts differ, update this test file with the correct values.
Report any failures with:
- Operation called
- Arguments provided
- Expected vs actual response
- Specific assertion that failed
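
To keep reports uniform, the four items above can be captured in a small record. This is only a suggested shape; the class and field names are hypothetical, and the sample values are made up.

```python
import json
from dataclasses import dataclass


@dataclass
class FailureReport:
    """One failed check, mirroring the report items listed above."""
    operation: str
    arguments: dict
    expected: str
    actual: str
    failed_assertion: str

    def render(self) -> str:
        """Format the report as plain text, one item per line."""
        return "\n".join([
            f"Operation called: {self.operation}",
            f"Arguments provided: {json.dumps(self.arguments, sort_keys=True)}",
            f"Expected: {self.expected}",
            f"Actual: {self.actual}",
            f"Failed assertion: {self.failed_assertion}",
        ])


# Made-up example values.
report = FailureReport(
    operation="list_tags",
    arguments={"operation": "list_tags"},
    expected="9 tags",
    actual="8 tags",
    failed_assertion="tag count matches catalog",
)
text = report.render()
```
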