Terminally MCP

test-results-analysis.md•7.74 KiB

# Test Results Analysis - Terminally MCP Server

## Test Execution Summary

- **Total Tests**: 37
- **Passed**: 26 (70%)
- **Failed**: 11 (30%)
- **Duration**: 82.5 seconds

## Critical Failures Identified

### 1. Output Parsing Issues (HIGH PRIORITY)

#### Issue: Commands with no output

**Test**: `should handle commands that produce no output`

- **Expected**: Match `/no output|^$/`
- **Actual**: Returns shell prompt and marker text
- **Root Cause**: The server doesn't handle commands that produce no stdout; it returns the raw tmux pane content, including prompts and markers.

#### Issue: Exit code extraction

**Test**: `should preserve command exit codes`

- **Expected**: Clean exit code (e.g., "0")
- **Actual**: Returns entire command output including prompts
- **Root Cause**: The output parsing logic doesn't isolate the exit code value from the surrounding pane content.

#### Issue: Working directory output

- **Tests**:
  - `should maintain working directory per tab`
  - `should handle shell built-ins correctly`
- **Expected**: Clean path (e.g., "/tmp")
- **Actual**: Includes command echo and prompt (e.g., "➜ /tmp pwd\n/tmp")
- **Root Cause**: The marker-based extraction includes the command echo, not just the output.

### 2. Command Execution Problems (HIGH PRIORITY)

#### Issue: Long command handling

**Test**: `should handle very long commands`

- **Expected**: Output > 4900 characters
- **Actual**: NaN (parsing failure)
- **Root Cause**: Very long commands appear to break either the tmux send-keys mechanism or the output parsing.

#### Issue: Timeout handling

**Test**: `should respect custom timeout values`

- **Expected**: Duration > 2000ms for a 2-second sleep
- **Actual**: 860ms
- **Root Cause**: The timeout mechanism doesn't wait for command completion; it returns prematurely.

### 3. Tab Management Issues (MEDIUM PRIORITY)

#### Issue: Tab names with spaces

**Test**: `should handle tab names with special characters`

- **Expected**: "tab-with-spaces in name"
- **Actual**: "tab-with-spaces"
- **Root Cause**: Tmux window names don't properly handle spaces; they're being truncated at the first space.

### 4. Error Handling Gaps (HIGH PRIORITY)

#### Issue: Operations on closed tabs

**Test**: `should handle operations on recently closed tab`

- **Expected**: Should throw an error
- **Actual**: Returns error message as successful response
- **Root Cause**: Error handling returns error text instead of throwing/rejecting.

#### Issue: Missing required parameters

**Test**: `should handle missing required parameters`

- **Expected**: Should throw an error
- **Actual**: Returns error message as successful response
- **Root Cause**: Parameter validation doesn't reject invalid requests.

### 5. Performance Issues (MEDIUM PRIORITY)

#### Issue: Timeouts on rapid operations

- **Tests**:
  - `should handle reading from tab with massive output` (10s timeout)
  - `should handle rapid sequential commands in same tab` (10s timeout)
- **Root Cause**: The marker-based synchronization mechanism can't handle rapid sequential operations efficiently.

## Root Cause Analysis

### Primary Issues:

1. **Marker-Based Output Capture**: The current implementation using UUID markers has several flaws:
   - Includes shell prompts and command echoes
   - Doesn't properly isolate command output
   - Can time out when markers aren't found quickly
2. **Error Handling**: Errors are being caught and returned as successful responses with error text, rather than propagating as MCP errors.
3. **Tmux Command Construction**: Issues with:
   - Window name handling (spaces truncated)
   - Long command handling (buffer limits)
   - Send-keys escaping
4. **Synchronization**: The sleep-based waiting mechanism is unreliable:
   - Fixed sleeps are either too short (missing output) or too long (performance issues)
   - No proper command completion detection

## Recommended Fixes

### Immediate (Critical):

1. **Fix Output Parsing**:

   ```typescript
   // In tmuxManager.ts executeCommand method
   // Better marker detection and output extraction
   const cleanOutput = (rawOutput: string, startMarker: string, endMarker: string): string => {
     const lines = rawOutput.split('\n');
     let inCommand = false;
     const output: string[] = [];

     for (const line of lines) {
       if (line.includes(startMarker)) {
         inCommand = true;
         continue;
       }
       if (line.includes(endMarker)) {
         break;
       }
       // Skip prompt lines. Note: this also drops genuine output lines
       // that happen to start with ➜, $ or #.
       if (inCommand && !line.match(/^[➜$#]/)) {
         output.push(line);
       }
     }

     return output.join('\n').trim() || '(no output)';
   };
   ```

2. **Fix Error Propagation**:

   ```typescript
   // In handlers.ts
   import { McpError, ErrorCode } from '@modelcontextprotocol/sdk/types.js';

   // Method of the handler class
   async handle(args: { window_id: string, command: string }): Promise<{ output: string }> {
     try {
       const output = await this.tmuxManager.executeCommand(args.window_id, args.command);
       return { output };
     } catch (error) {
       // Don't return error as success - throw it
       // `error` is `unknown` under strict TypeScript, so narrow it before reading `.message`
       const message = error instanceof Error ? error.message : String(error);
       throw new McpError(
         ErrorCode.InternalError,
         `Failed to execute command: ${message}`
       );
     }
   }
   ```

3. **Fix Tab Name Handling**:

   ```typescript
   // In tmuxManager.ts createTab method
   const windowName = name ? name.replace(/\s+/g, '_') : `tab-${Date.now()}`;
   // Store original name mapping if needed, so callers still see the
   // name they supplied (the failing test expects the spaces preserved)
   ```

### Short-term (Week 1):

1. **Implement Proper Command Completion Detection**:
   - Use tmux's `capture-pane -p -S -` with pattern matching
   - Implement exponential backoff for checking completion
   - Add command-specific timeout strategies
2. **Add Input Validation**:
   - Validate all required parameters before execution
   - Add length limits for commands and tab names
   - Sanitize special characters properly
3. **Improve Synchronization**:
   - Replace fixed sleeps with polling mechanisms
   - Implement proper async/await patterns
   - Add retry logic for transient failures

### Long-term (Week 2+):

1. **Replace Marker System**:
   - Consider using tmux's pipe-pane for real-time output capture
   - Implement a more robust output isolation mechanism
   - Add support for streaming output
2. **Add Connection Pooling**:
   - Reuse tmux sessions efficiently
   - Implement connection health checks
   - Add automatic reconnection logic
3. **Performance Optimization**:
   - Batch operations where possible
   - Implement caching for read operations
   - Add connection pooling for concurrent operations

## Test Suite Improvements

### Tests That Worked Well:

- Unicode handling ✓
- Concurrent tab operations ✓
- Environment isolation ✓
- Background process handling ✓
- Shell compatibility (mostly) ✓

### Additional Tests Needed:

1. **Stress Testing**: More concurrent operations with higher load
2. **Network Simulation**: Test with delays and packet loss
3. **Resource Limits**: Test with system resource constraints
4. **Security Testing**: Command injection with more sophisticated attacks
5. **Recovery Testing**: Server restart and reconnection scenarios

## Conclusion

The test suite identified critical issues that would cause failures in production. The 30% failure rate (11 of 37 tests) points to significant problems with the current implementation, particularly around:

1. **Output parsing and marker detection**
2. **Error handling and propagation**
3. **Command synchronization and timing**
4. **Special character handling**

These issues must be addressed before the server can be considered production-ready. The test suite itself is valuable and should be maintained as part of the CI/CD pipeline to prevent regression.

## Priority Action Items

1. **CRITICAL**: Fix output parsing to properly extract command output
2. **CRITICAL**: Fix error propagation to properly reject on failures
3. **HIGH**: Implement proper command completion detection
4. **HIGH**: Fix timeout mechanism to respect specified durations
5. **MEDIUM**: Handle special characters in tab names
6. **MEDIUM**: Optimize rapid sequential command execution
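
The short-term recommendation to replace fixed sleeps with polling and exponential backoff could look roughly like the sketch below. It is not taken from the Terminally MCP codebase: `nextDelay`, `waitForMarker`, the `capture` callback, and the default delay values are all hypothetical. `capture` is assumed to wrap something like `tmux capture-pane -p -S -` and return the pane text.

```typescript
// Hypothetical helpers illustrating polling with exponential backoff.
// None of these names come from the Terminally MCP source.

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

/** Doubles the delay, capped at maxMs. */
function nextDelay(currentMs: number, maxMs: number): number {
  return Math.min(currentMs * 2, maxMs);
}

/**
 * Polls `capture` until its output contains `endMarker`, or rejects once
 * `timeoutMs` has elapsed. `capture` is assumed to return the pane text.
 */
async function waitForMarker(
  capture: () => Promise<string>,
  endMarker: string,
  timeoutMs: number,
  initialDelayMs = 25, // assumed starting delay
  maxDelayMs = 500,    // assumed cap
): Promise<string> {
  const deadline = Date.now() + timeoutMs;
  let delay = initialDelayMs;

  for (;;) {
    const pane = await capture();
    if (pane.includes(endMarker)) {
      return pane;
    }
    const remaining = deadline - Date.now();
    if (remaining <= 0) {
      throw new Error(`Timed out after ${timeoutMs}ms waiting for command completion`);
    }
    // Never sleep past the deadline.
    await sleep(Math.min(delay, remaining));
    delay = nextDelay(delay, maxDelayMs);
  }
}
```

Short commands return after the first few quick polls, while long-running ones settle at the capped interval instead of busy-waiting. One caveat: if the marker text also appears in the echoed command line, a plain `includes` check can return before the command has finished, so a real implementation would probably need to match the marker only where the command prints it.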

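The "Add Input Validation" recommendation can be illustrated with a small guard that runs before any tmux command is built. This is a sketch under stated assumptions: `validateExecuteArgs`, `validateTabName`, and both length limits are hypothetical, and the report does not specify actual limits. The command limit is deliberately set above the 4900-character case exercised by the long-command test.

```typescript
// Hypothetical validation helpers; names and limits are assumptions,
// not values from the Terminally MCP source.
const MAX_COMMAND_LENGTH = 16384;
const MAX_TAB_NAME_LENGTH = 64;

interface ExecuteArgs {
  window_id: string;
  command: string;
}

/** Throws if required parameters are missing, empty, or too long. */
function validateExecuteArgs(args: unknown): ExecuteArgs {
  if (typeof args !== 'object' || args === null) {
    throw new Error('Invalid params: arguments must be an object');
  }
  const { window_id, command } = args as Record<string, unknown>;
  if (typeof window_id !== 'string' || window_id.length === 0) {
    throw new Error('Invalid params: missing required parameter window_id');
  }
  if (typeof command !== 'string' || command.length === 0) {
    throw new Error('Invalid params: missing required parameter command');
  }
  if (command.length > MAX_COMMAND_LENGTH) {
    throw new Error(`Invalid params: command exceeds ${MAX_COMMAND_LENGTH} characters`);
  }
  return { window_id, command };
}

/** Throws if a tab name is empty or longer than the assumed limit. */
function validateTabName(name: string): string {
  if (name.length === 0 || name.length > MAX_TAB_NAME_LENGTH) {
    throw new Error(`Invalid params: tab name must be 1-${MAX_TAB_NAME_LENGTH} characters`);
  }
  return name;
}
```

Throwing here, rather than returning the message as a normal result, is what the `should handle missing required parameters` test expects. In the real handler the plain `Error` would presumably be replaced by an MCP error such as `McpError` with `ErrorCode.InvalidParams` from the TypeScript SDK.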