DollhouseMCP

by DollhouseMCP
PLAN_CONTENT_TRUNCATION_AND_COLLECTION.md (6.75 kB)
# Investigation Plan: Content Truncation and Collection Submission Issues

**Date**: August 27, 2025
**Issues**: #784 (Content Truncation), #785 (Collection Error Codes)
**Branch**: `feature/content-truncation-investigation`

## Problem Summary

### 1. Content Truncation (CRITICAL - Data Loss)

- Markdown files are being truncated mid-sentence
- Example: ARIA-7 persona cuts off at "you often wonder a"
- Location in pipeline is unknown
- Affects all element types

### 2. Collection Submission Failures

- Generic "auth error" despite valid token
- OAuth helper not detected
- Works for portfolio but not collection
- No clear diagnostic information

## Investigation Approach

### Phase 1: Add Diagnostic Logging

We need to trace content length at every stage:

```typescript
// Key logging points:
// 1. Element creation (create_persona, etc.)
// 2. File save operations (PersonaLoader.save())
// 3. Serialization (Element.serialize())
// 4. GitHub API calls (base64 encoding)
// 5. Response handling (decoded content)
```

### Phase 2: Identify Truncation Point

Potential causes to investigate:

- **String operations**: substring(), slice(), substr()
- **Buffer limits**: Node.js buffer sizes
- **GitHub API**: 1MB file limit (about 750KB of raw content once base64-encoded)
- **Security validators**: MAX_CONTENT_LENGTH
- **YAML parser**: Document size limits
- **Character encoding**: UTF-8 boundary issues

### Phase 3: Create Reproduction Test

```typescript
// Test with progressively larger content:
// - 1KB   - Should work
// - 10KB  - Should work
// - 100KB - Should work
// - 500KB - Should work
// - 750KB - Should work (GitHub limit)
// - 1MB   - May fail (exceeds GitHub API)
```

## Implementation Plan

### Content Truncation Fix

1. **Add logging** to trace content size through pipeline
2. **Run test** with known large content
3. **Identify** exact truncation point
4. **Fix** the root cause
5. **Test** with various content sizes
6. **Document** any legitimate limits

### Collection Error Codes

```typescript
// String enum: member name is the error code, value is the default message.
enum CollectionErrorCode {
  // Step 1: Authentication
  COLL_AUTH_001 = "Token validation failed",
  COLL_AUTH_002 = "Missing public_repo scope",
  COLL_AUTH_003 = "OAuth helper not running",

  // Step 2: Portfolio Upload
  COLL_PORT_001 = "Portfolio upload failed",

  // Step 3: Collection Submission
  COLL_API_001 = "Rate limit exceeded",
  COLL_API_002 = "Issue creation failed",

  // Step 4: Configuration
  COLL_CFG_001 = "Auto-submit disabled"
}
```

## Files to Modify

### For Truncation Investigation:

- `src/index.ts` - Add size logging
- `src/persona/PersonaLoader.ts` - File operations
- `src/portfolio/PortfolioRepoManager.ts` - GitHub API
- `src/tools/portfolio/submitToPortfolioTool.ts` - Submission

### For Error Codes:

- `src/config/error-codes.ts` (NEW) - Define codes
- `src/tools/portfolio/submitToPortfolioTool.ts` - Implement
- `src/auth/GitHubAuthManager.ts` - Enhanced errors

## Testing Strategy

### Content Integrity Tests:

```typescript
// Placeholders only: test.todo() registers a named test without a body.
describe('Content Integrity', () => {
  test.todo('saves 100KB content without truncation');
  test.todo('saves 500KB content without truncation');
  test.todo('preserves Unicode characters');
  test.todo('handles multi-line content');
});
```

### Collection Error Tests:

```typescript
// Placeholders only: test.todo() registers a named test without a body.
describe('Collection Error Codes', () => {
  test.todo('returns COLL_AUTH_001 for invalid token');
  test.todo('returns COLL_AUTH_003 for missing OAuth helper');
  test.todo('returns COLL_API_001 for rate limit');
});
```

## Success Criteria

- [ ] Truncation point identified
- [ ] Content preserved up to 750KB
- [ ] Error codes at each step
- [ ] Clear remediation messages
- [ ] QA tests passing
- [ ] Documentation updated

## Investigation Results (August 27, 2025)

### Key Findings

1. **Truncation Location Identified**:
   - ARIA-7 in GitHub portfolio is truncated at exactly 1770 bytes
   - Content ends mid-sentence: "you often wonder a"
   - Truncation occurred during upload from Claude Desktop to GitHub

2. **Local Operations Working**:
   - Tested up to 500KB locally - NO truncation
   - PersonaElement serialization preserves full content
   - Local save/load operations work correctly

3. **Production v1.6.9 Not Affected**:
   - Created test personas in production - no truncation
   - Issue specific to GitHub upload process

4. **Pattern**:
   - ARIA-7: 1770 bytes (truncated)
   - J.A.R.V.I.S: 1791 bytes (complete)
   - Not a universal size limit (a larger file uploaded complete)

### Root Cause Hypothesis

The truncation appears to happen at one of the following points:

1. Serialization before GitHub upload
2. Base64 encoding for GitHub API
3. A character limit in the upload process
4. Possibly in PersonaElement.serialize() or formatElementContent()

### Next Investigation Steps

1. Check for any substring operations around 1700-1800 characters
2. Test GitHub upload with known content sizes
3. Add logging to track content through upload pipeline
4. Check whether the serialize() method imposes a character limit

## Notes

- Focus on truncation first (data loss) - **FOUND: GitHub upload issue**
- Error codes second (diagnostics)
- Consider compression for large content
- Document any hard limits clearly
- The 1770-byte truncation is suspiciously specific

---

## Setup Instructions for Testing Production MCP Server

To set up the production MCP server in a separate Claude Code session:

### 1. Create Clean Environment

```bash
# In the new Claude Code session
mkdir ~/test-mcp-production
cd ~/test-mcp-production
```

### 2. Install Production Version

```bash
# Install from NPM (once published)
npm install -g @dollhousemcp/mcp-server

# Or install specific version
npm install -g @dollhousemcp/mcp-server@1.6.8
```

### 3. Configure for Testing

```bash
# Create test config directory
mkdir -p ~/.dollhouse-test
export DOLLHOUSE_PORTFOLIO_DIR=~/.dollhouse-test/portfolio

# Set up Claude Desktop config (provide path to test config)
# Unquoted EOF so the shell expands $HOME; "~" is not expanded inside a JSON string
cat > ~/test-claude-config.json << EOF
{
  "mcpServers": {
    "dollhousemcp-test": {
      "command": "npx",
      "args": ["@dollhousemcp/mcp-server"],
      "env": {
        "DOLLHOUSE_PORTFOLIO_DIR": "$HOME/.dollhouse-test/portfolio"
      }
    }
  }
}
EOF
```

### 4. Test Scenarios

```typescript
// Test these specific scenarios:
// 1. Create large persona (>100KB)
// 2. Save to portfolio
// 3. Check if truncated
// 4. Try collection submission
// 5. Note error messages
```

### 5. Compare with Development

- Note differences in behavior
- Check if truncation exists in production
- Compare error messages
- Test OAuth flow differences

### 6. Report Findings

Document in this session:

- Does production have the same truncation?
- What error messages appear?
- OAuth helper behavior differences
- Any other observations

This will help determine if issues are:

- Recent regressions (dev only)
- Long-standing bugs (prod + dev)
- Configuration issues
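
The Phase 1 logging idea (record content size at every pipeline stage, then look for the first stage where it shrinks) could be sketched roughly as follows. This is an illustration only, not the project's code: `ContentTrace`, `utf8Bytes`, `toBase64`, `fromBase64`, `repeatText`, and the stage names are all hypothetical, and the `slice(0, 1770)` call merely simulates the kind of truncation reported for ARIA-7.

```typescript
// Hypothetical sketch: trace UTF-8 byte counts across pipeline stages.
const encoder = new TextEncoder();
const decoder = new TextDecoder("utf-8", { fatal: true });

// Size of a string in UTF-8 bytes (not UTF-16 code units).
function utf8Bytes(text: string): number {
  return encoder.encode(text).length;
}

// Base64-encode the UTF-8 bytes of a string.
function toBase64(text: string): string {
  const bytes = encoder.encode(text);
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Decode base64 back to a string; throws on invalid UTF-8 (fatal decoder).
function fromBase64(encoded: string): string {
  const binary = atob(encoded);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return decoder.decode(bytes);
}

interface StageRecord {
  stage: string;
  bytes: number;
}

class ContentTrace {
  private records: StageRecord[] = [];

  // Record the content size at a stage and pass the content through unchanged.
  record(stage: string, content: string): string {
    this.records.push({ stage, bytes: utf8Bytes(content) });
    return content;
  }

  // First stage whose byte count is smaller than the previous stage's, if any.
  firstShrink(): StageRecord | undefined {
    for (let i = 1; i < this.records.length; i++) {
      if (this.records[i].bytes < this.records[i - 1].bytes) {
        return this.records[i];
      }
    }
    return undefined;
  }
}

// Build test content by repeating a unit that includes a multi-byte character.
function repeatText(unit: string, count: number): string {
  let out = "";
  for (let i = 0; i < count; i++) {
    out += unit;
  }
  return out;
}

const trace = new ContentTrace();
const original = trace.record("create", repeatText("ARIA-7 ✓ ", 300));
const decoded = trace.record("base64-round-trip", fromBase64(toBase64(original)));
trace.record("simulated-upload", decoded.slice(0, 1770)); // simulated truncation
console.log(trace.firstShrink());
```

Recording sizes in UTF-8 bytes rather than string length matters here because the reported limits (1770 bytes, 750KB) are byte counts, and a multi-byte character makes the two diverge.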
