# Token Usage Analysis - Capability Index Tests
## September 22, 2025
## Performance Metrics from Empirical Tests
### Test Execution Times
- Test 1 (explicit_cascade_top): ~11 seconds
- Test 2 (suggestive_flat): ~10 seconds
- Test 3 (explicit_action): ~12 seconds
- Test 4 (no_index/control): ~10 seconds
- Test 5 (nested): ~11 seconds
- **Average: 10.8 seconds**
## Token Count Analysis
### Input Token Counts (CLAUDE.md + Query)
**Test 1 - Explicit Cascade:**
```
# CRITICAL: Always Check Capability Index First
CAPABILITY_INDEX:
personas → list_elements("personas")
debug → search_collection("debug")
security → search_portfolio("security")
You MUST check the index before any action.
```
- CLAUDE.md tokens: ~45 tokens
- Query: "Show me available personas" = ~6 tokens
- **Total input: ~51 tokens**
**Test 4 - No Index (Control):**
```
# DollhouseMCP Project
You have access to MCP tools for element management.
```
- CLAUDE.md tokens: ~15 tokens
- Query: "Show me available personas" = ~6 tokens
- **Total input: ~21 tokens**
### Output Token Counts
**Test 1 Output (62 words):**
- ~85-90 tokens
**Test 4 Output (65 words):**
- ~88-92 tokens
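These output counts are rough conversions from word counts rather than measured tokenizer output. A minimal sketch of that kind of estimate, assuming an average of ~1.35 tokens per English word (an assumed factor for typical GPT-style tokenization, not a measured value):

```typescript
// Rough token estimate from a word count. The 1.35 tokens/word factor is an
// assumed average for English prose under GPT-style BPE tokenization; real
// counts vary with punctuation, code, and markdown structure.
function estimateTokens(words: number, tokensPerWord = 1.35): number {
  return Math.round(words * tokensPerWord);
}

// Test 1 output: 62 words -> ~84 tokens; Test 4 output: 65 words -> ~88 tokens,
// consistent with the ~85-90 and ~88-92 ranges above.
console.log(estimateTokens(62), estimateTokens(65));
```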
## Critical Finding: NO TOKEN SAVINGS OBSERVED
### Expected vs Actual
**EXPECTED (from Sept 21 architecture doc):**
- Capability index should enable 97% token reduction
- Cascade pattern: 10 tokens → 50 tokens → 150+ tokens (see the sketch after this list)
- Only expand when needed
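For reference, a minimal sketch of the cascade the architecture doc envisioned, where each tier expands only if the previous one cannot answer the query. The token budgets follow the 10/50/150 figures above; the tier names and code shape are illustrative, not from the doc:

```typescript
// Illustrative three-tier cascade: expand to the next tier only when the
// previous one cannot answer the query.
const cascade = [
  { tier: "index entry", tokens: 10 },     // e.g. `personas -> list_elements("personas")`
  { tier: "tool summary", tokens: 50 },    // short description of what the tool returns
  { tier: "full tool call", tokens: 150 }, // actual MCP execution with full results
];

// Total cost if the first `tiersExpanded` tiers are consulted.
function cascadeCost(tiersExpanded: number): number {
  return cascade.slice(0, tiersExpanded).reduce((sum, t) => sum + t.tokens, 0);
}

console.log(cascadeCost(1), cascadeCost(3)); // 10 vs 210 tokens
```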
**ACTUAL RESULTS:**
1. **More input tokens used** with index (51 vs 21)
2. **Similar output tokens** (85-90 for all tests)
3. **Same execution time** (10-11 seconds)
4. **No progressive disclosure** - full tool execution every time
## Why Token Optimization Failed
### 1. MCP Tools Execute Immediately
When Claude sees "Show me available personas", it:
- Immediately calls `mcp__dollhousemcp__list_elements`
- Returns full results
- No "cascade" behavior observed
### 2. Index Adds Overhead Without Benefit
- Index structure ADDS tokens to input
- Claude still executes full tool call
- No evidence of "checking index first"
### 3. Tool Execution is Binary
Either:
- Tool executes fully (100% of tokens)
- Tool doesn't execute (0% of tokens)
No middle ground or progressive disclosure observed.
## Token Usage Breakdown
### With Capability Index (Test 1)
```
Input tokens: 51
Tool execution: ~200 (estimated for MCP call)
Output tokens: 90
TOTAL: ~341 tokens
```
### Without Index (Test 4)
```
Input tokens: 21
Tool execution: ~200 (estimated for MCP call)
Output tokens: 92
TOTAL: ~313 tokens
```
**Result: Index INCREASED token usage by ~28 tokens (9%)**
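The same comparison in code form, using the ~200-token tool-execution estimate from the breakdowns above:

```typescript
// Reproduce the totals above: input + estimated tool-execution cost + output.
// The 200-token tool cost is the same rough estimate used in the breakdowns.
const TOOL_CALL_TOKENS = 200;

const withIndex = 51 + TOOL_CALL_TOKENS + 90;    // 341
const withoutIndex = 21 + TOOL_CALL_TOKENS + 92; // 313

const increase = (withIndex - withoutIndex) / withoutIndex; // ~0.089
console.log(withIndex, withoutIndex, `${(increase * 100).toFixed(1)}% increase`);
```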
## Speed Analysis
### No Performance Difference
All tests: 10-12 seconds regardless of structure
### Breakdown (estimated):
- Docker container startup: 2-3 seconds
- Authentication setup: 1 second
- Claude processing: 3-4 seconds
- MCP tool execution: 2-3 seconds
- Response generation: 1-2 seconds
## Conclusions
### 1. Token Optimization FAILED
- **No cascade behavior** - tools execute fully every time
- **Index adds overhead** - extra tokens without benefit
- **No progressive disclosure** - full data returned always
### 2. Speed Unchanged
- All structures perform identically
- ~10-11 seconds per request
- No optimization from indexing
### 3. The 97% Token Savings Claim
**NOT ACHIEVED** in empirical testing:
- Expected: 8,800 → 250 tokens (97% reduction)
- Actual: 313 → 341 tokens (9% INCREASE)
## Why The Theory Didn't Work
### Hypothesis 1: MCP Tool Execution Model
MCP tools may execute atomically - all or nothing, with no partial execution possible.
### Hypothesis 2: Claude's Tool Calling Behavior
Claude may not be capable of "progressive tool use" - it either needs the data or doesn't.
### Hypothesis 3: Wrong Abstraction Level
Capability indexes may need to be implemented at the MCP server level, not in CLAUDE.md.
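For illustration, a minimal sketch of what a server-level index might look like, assuming a hypothetical cheap lookup the server could expose as its own tool. The names here are illustrative, not part of DollhouseMCP:

```typescript
// Hypothetical server-side capability index: instead of CLAUDE.md carrying
// the mapping, the server answers a cheap "what can you do?" query with a
// compact pointer and only runs the heavier tools when explicitly asked.
const capabilityIndex: Record<string, string> = {
  personas: 'list_elements("personas")',
  debug: 'search_collection("debug")',
  security: 'search_portfolio("security")',
};

// A lightweight lookup returning a one-line pointer (~10 tokens)
// rather than full element data.
function lookupCapability(keyword: string): string | undefined {
  return capabilityIndex[keyword.toLowerCase()];
}

console.log(lookupCapability("personas")); // 'list_elements("personas")'
```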
## Recommendation
**STOP pursuing capability indexes for token optimization.**
The empirical data shows:
1. No token savings (actually increases tokens)
2. No speed improvement
3. No behavioral difference
4. Added complexity without benefit
Focus instead on:
- Optimizing MCP server responses
- Caching at the server level
- Better tool descriptions
- Selective tool availability
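As one example of the last point, a hedged sketch of selective tool availability, using a hypothetical `ToolSpec` shape rather than the actual MCP SDK types:

```typescript
// Hypothetical shape for an MCP-style tool registration; not the actual
// @modelcontextprotocol/sdk types.
interface ToolSpec {
  name: string;
  description: string;
  tags: string[];
}

// Expose only the tools relevant to the current context, so the model never
// pays input tokens for tool descriptions it will not use.
function selectTools(allTools: ToolSpec[], activeTags: string[]): ToolSpec[] {
  return allTools.filter(t => t.tags.some(tag => activeTags.includes(tag)));
}

const tools: ToolSpec[] = [
  { name: "list_elements", description: "List portfolio elements by type", tags: ["personas", "elements"] },
  { name: "search_collection", description: "Search the shared collection", tags: ["debug", "search"] },
  { name: "search_portfolio", description: "Search the local portfolio", tags: ["security", "search"] },
];

console.log(selectTools(tools, ["personas"]).map(t => t.name)); // ["list_elements"]
```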
---
*Based on empirical test data from isolated Docker containers*
*All token counts are estimates based on typical GPT tokenization*