# Smoke Test Run Summary
## Test Execution Results
**Date**: October 10, 2025
**Test Command**: `npx vitest run tests/e2e/mcp-server-smoke-javascript.test.ts tests/e2e/mcp-server-smoke-python.test.ts`
## Overall Results
- **Total Tests**: 9
- **Passed**: 5 (55.6%)
- **Failed**: 4 (44.4%)
- **Duration**: 26.53s
## JavaScript Adapter Results
### ✅ Passed Tests (3/4)
1. **Multiple breakpoints handling** - Successfully set multiple breakpoints
2. **Expression evaluation** - Correctly evaluated the "1 + 2" expression (a standalone reproduction is sketched below)
3. **Source context retrieval** - Successfully retrieved source context
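The passing expression-evaluation case can be reproduced outside vitest with a minimal MCP client. This is a sketch, not the repo's test code: the launch command (`node dist/index.js`), the tool name (`evaluate_expression`), and the argument shape are assumptions, and the debug-session setup the real test performs is elided; `listTools` is included so the actual tool names can be checked first.

```typescript
// Standalone sketch: connect to the server over stdio and reproduce the
// expression-evaluation call. Launch command, tool name, and arguments are
// assumptions, not taken from the test source.
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function main(): Promise<void> {
  const transport = new StdioClientTransport({
    command: "node",
    args: ["dist/index.js"], // assumed entry point for the built server
  });
  const client = new Client({ name: "smoke-check", version: "0.0.1" }, { capabilities: {} });
  await client.connect(transport);

  // Discover the actual tool names rather than guessing them.
  const { tools } = await client.listTools();
  console.log(tools.map((t) => t.name));

  // Assumed tool name and argument shape; the real smoke test creates a
  // debug session first, which is elided here.
  const result = await client.callTool({
    name: "evaluate_expression",
    arguments: { expression: "1 + 2" },
  });
  console.log(JSON.stringify(result.content, null, 2));

  await client.close();
}

main().catch((error) => {
  console.error(error);
  process.exit(1);
});
```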
### ❌ Failed Tests (1/4)
1. **Core debugging flow** - `startResponse.status` is undefined
- **Issue**: The `start_debugging` response doesn't include a `status` field
- **Impact**: Cannot verify debug session started correctly
## Python Adapter Results
### ✅ Passed Tests (2/5)
1. **Expression evaluation** - Correctly evaluated expressions and rejected statements
2. **Source context retrieval** - Successfully retrieved source context
### ❌ Failed Tests (3/5)
1. **Core debugging flow** - Breakpoint verification returns false
- **Expected**: Python breakpoints verify immediately
- **Actual**: `bpResponse.verified` is false (in DAP terms, breakpoints set before the debuggee launches are often reported as unverified until the adapter binds them, which may explain this)
2. **Multiple breakpoints** - Same verification issue
- **Expected**: Both breakpoints verified as true
- **Actual**: Both return false
3. **Step into operation** - `stepIntoResult.status` is undefined
- **Issue**: The `step_into` response doesn't include a `status` field
- **Impact**: Cannot verify the step operation completed (a diagnostic sketch for these failures follows this list)
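One way to pin down what the Python adapter actually returns is to log the full parsed response before asserting on individual fields. A minimal sketch, reusing the repo's `parseSdkToolResult` helper with an assumed signature; the tool name and argument shape are also assumptions:

```typescript
// Diagnostic sketch for the failing Python assertions, meant to sit inside a
// vitest test body where `client` is a connected MCP SDK Client and
// parseSdkToolResult comes from the repo's test utilities (signature assumed).
// Only the asserted fields (`verified`, `status`) come from the failures above.
const rawBp = await client.callTool({
  name: "set_breakpoint",                              // assumed tool name
  arguments: { file: "examples/script.py", line: 4 },  // assumed argument shape
});
const bpResponse = parseSdkToolResult(rawBp);

// Dump the whole response so missing or unexpected fields show up in CI logs.
console.log("set_breakpoint response:", JSON.stringify(bpResponse, null, 2));

expect(bpResponse.verified).toBe(true); // currently fails: the adapter reports false
```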
## Key Findings
### 🎯 Tests Are Working Correctly
The smoke tests successfully identified discrepancies between documented behavior and actual implementation. This proves the tests are valuable for catching regressions and behavior changes.
### 📊 Actual vs Expected Behaviors
| Feature | Expected (from test report) | Actual (from test run) |
|---------|----------------------------|------------------------|
| Python breakpoint verification | Immediate (true) | Returns false |
| JavaScript start_debugging status | Has status field | No status field |
| Python step_into status | Has status field | No status field |
### 🔍 Response Structure Issues
Several MCP tool responses are missing expected fields:
- `start_debugging` missing `status` field (JavaScript)
- `step_into` missing `status` field (Python)
- Breakpoint responses have a `verified` field but return unexpected values
## Recommendations
### Immediate Actions
1. **Update tests to match actual behavior** - Adjust expectations to match current implementation
2. **Or fix adapter responses** - Add the missing fields and correct the verification logic
3. **Document actual response structures** - Update API documentation with real response formats (a sketch of the observed shapes follows this list)
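For the documentation route, the shapes observed in this run could be captured as shared types next to the tests. This sketch records only what the failures above establish; every other field is unknown and left as a placeholder:

```typescript
// Observed (not idealized) response shapes, inferred solely from this test run.

/** set_breakpoint: has `verified`, which came back false for Python. */
interface SetBreakpointResponse {
  verified: boolean;
  // ...other fields TBD from actual adapter output
}

/** start_debugging: no `status` field was present in the JavaScript run. */
interface StartDebuggingResponse {
  // status?: string;  // documented in the test report, absent in practice
  // ...other fields TBD from actual adapter output
}

/** step_into: no `status` field was present in the Python run. */
interface StepIntoResponse {
  // status?: string;  // documented in the test report, absent in practice
  // ...other fields TBD from actual adapter output
}
```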
### Test Improvements
1. **Add response logging** - Log full responses for debugging
2. **Make assertions more flexible** - Check for success indicators beyond specific fields
3. **Add retry logic** - Some operations may need time to stabilize (a sketch of items 2 and 3 follows this list)
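Items 2 and 3 could look roughly like this. The `verified` field is taken from the failures above; the other success indicators (`success`, `sessionId`) are assumptions about what the responses might contain, not confirmed fields:

```typescript
// Sketch of a small retry helper plus a flexible success check.

/** Re-run an async check a few times, for operations that settle asynchronously. */
async function eventually<T>(
  fn: () => Promise<T>,
  predicate: (value: T) => boolean,
  { attempts = 5, delayMs = 200 } = {}
): Promise<T> {
  let last = await fn();
  for (let i = 1; i < attempts && !predicate(last); i++) {
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    last = await fn();
  }
  return last;
}

/** Accept any plausible success indicator instead of insisting on `status`. */
function looksSuccessful(response: Record<string, unknown>): boolean {
  return (
    response.status === "running" ||          // idealized shape from the test report
    response.success === true ||              // assumed alternative indicator
    typeof response.sessionId === "string"    // assumed alternative indicator
  );
}
```

Breakpoint verification could then be asserted with something like `await eventually(() => setAndParseBreakpoint(), (r) => r.verified === true)`, where `setAndParseBreakpoint` is a hypothetical wrapper around the tool call.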
## Positive Outcomes
✅ **Test infrastructure works** - MCP client connection and tool calling successful
✅ **Utilities function correctly** - `parseSdkToolResult` and `callToolSafely` work as designed (the parsing pattern is sketched below)
✅ **Some core features work** - Expression evaluation and source context retrieval function properly
✅ **Tests catch real issues** - Successfully identified missing/incorrect response fields
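For context, this is the pattern a result-parsing helper like `parseSdkToolResult` typically implements: MCP tool results carry their payload in a `content` array, and the JSON body lives in the first text block. A minimal sketch, not the repository's actual implementation:

```typescript
import type { CallToolResult } from "@modelcontextprotocol/sdk/types.js";

// Sketch only: extract the first text block from an MCP tool result and parse
// it as JSON. The real helper in this repo may handle more content types.
function parseToolResult<T = Record<string, unknown>>(result: CallToolResult): T {
  const textBlock = (result.content ?? []).find(
    (block): block is { type: "text"; text: string } => block.type === "text"
  );
  if (!textBlock) {
    throw new Error("Tool result did not contain a text payload");
  }
  return JSON.parse(textBlock.text) as T;
}
```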
## Conclusion
The smoke tests are functioning as intended - they're catching discrepancies between expected and actual behavior. The failures are not test bugs but rather:
1. **Documentation vs Reality**: The test report documented idealized behavior
2. **Missing Response Fields**: Some adapter responses lack expected fields
3. **Behavior Differences**: Python breakpoints don't verify as expected
These tests provide valuable regression detection and should be:
1. Updated to match current behavior (pragmatic approach)
2. Used to drive adapter improvements (idealistic approach)
3. Run regularly to catch future changes
The fact that 5 out of 9 tests passed shows that core functionality is working, while the failures highlight areas needing attention or documentation updates.