# Performance Tuning Guide
This guide explains how to configure execution limits and optimize Debug-MCP performance for different use cases.
## Execution Limits
All limits are defined as constants in `src/mcp_debug_tool/utils.py`:
```python
# Timeout configuration
DEFAULT_TIMEOUT_SECONDS = 20 # Max time per breakpoint operation
# Output capture limits
MAX_OUTPUT_BYTES = 10 * 1024 * 1024 # 10MB max output
# Variable representation limits
MAX_DEPTH = 2 # Max nesting levels in containers
MAX_CONTAINER_ITEMS = 50 # Max items shown in lists/dicts
MAX_REPR_LENGTH = 256 # Max string length before truncation
```
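These constants are hard-coded in `utils.py`. If you want to tune them per environment without editing the source, one approach is to read overrides from environment variables; the variable names below are hypothetical, not part of the current Debug-MCP configuration surface:

```python
import os

# Hypothetical environment overrides, falling back to the shipped defaults.
# The DEBUG_MCP_* names are illustrative only.
DEFAULT_TIMEOUT_SECONDS = int(os.environ.get("DEBUG_MCP_TIMEOUT_SECONDS", "20"))
MAX_OUTPUT_BYTES = int(os.environ.get("DEBUG_MCP_MAX_OUTPUT_BYTES", str(10 * 1024 * 1024)))
MAX_DEPTH = int(os.environ.get("DEBUG_MCP_MAX_DEPTH", "2"))
```

This keeps one code path while letting CI, containers, and local development each set their own limits.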
## Tuning for Different Scenarios
### Fast Unit Tests (< 1s timeout)
For quick feedback during development:
```python
# In utils.py or environment variable
DEFAULT_TIMEOUT_SECONDS = 1
```
**Use when**:
- Running unit tests
- Debugging small scripts
- Fast feedback loops
**Trade-offs**:
- May timeout on legitimate slow code
- Not suitable for I/O or network operations
### Standard Development (20s timeout - Default)
Balanced configuration for most debugging scenarios:
```python
DEFAULT_TIMEOUT_SECONDS = 20
MAX_OUTPUT_BYTES = 10 * 1024 * 1024 # 10MB
```
**Use when**:
- General debugging
- Scripts with moderate I/O
- Most development workflows
**Trade-offs**:
- Good balance between safety and usability
- Handles most real-world scenarios
### Long-Running Scripts (60s+ timeout)
For CPU-intensive or I/O-heavy code:
```python
DEFAULT_TIMEOUT_SECONDS = 60
MAX_OUTPUT_BYTES = 50 * 1024 * 1024 # 50MB
```
**Use when**:
- Debugging data processing pipelines
- Scripts with network requests
- Database operations
- File processing
**Trade-offs**:
- Slower failure detection
- Higher resource usage
- May mask infinite loops
### Memory-Constrained Environments
Reduce limits for containerized or resource-limited environments:
```python
DEFAULT_TIMEOUT_SECONDS = 10
MAX_OUTPUT_BYTES = 1 * 1024 * 1024 # 1MB
MAX_CONTAINER_ITEMS = 20
MAX_REPR_LENGTH = 128
```
**Use when**:
- Running in containers
- Limited RAM availability
- Cloud functions/serverless
- Embedded systems
**Trade-offs**:
- Less detailed variable inspection
- Faster timeout on slow code
- Lower memory footprint
## Variable Representation Tuning
### Deep Object Inspection
For complex data structures requiring more detail:
```python
MAX_DEPTH = 3 # Show 3 levels instead of 2
MAX_CONTAINER_ITEMS = 100 # Show 100 items instead of 50
MAX_REPR_LENGTH = 512 # Show 512 chars instead of 256
```
**Warning**: Increases serialization time and memory usage.
### Minimal Inspection (Fast)
For quick type checking without full details:
```python
MAX_DEPTH = 1 # Only show top level
MAX_CONTAINER_ITEMS = 10 # Show 10 items max
MAX_REPR_LENGTH = 64 # Show 64 chars max
```
**Use when**:
- Only need to verify variable exists
- Large datasets
- Performance-critical debugging
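To make concrete how these three limits interact, here is a minimal sketch of a depth- and length-limited repr. It is illustrative only; the actual serializer in Debug-MCP may differ (for example, this sketch renders tuples with list brackets):

```python
MAX_DEPTH = 1           # Only show top level
MAX_CONTAINER_ITEMS = 10
MAX_REPR_LENGTH = 64

def limited_repr(value, depth=0):
    """Render a value, truncating by nesting depth, item count, and length."""
    if isinstance(value, (list, tuple, dict)):
        if depth >= MAX_DEPTH:
            # Past the depth budget: summarize instead of recursing.
            return f"<{type(value).__name__} len={len(value)}>"
        if isinstance(value, dict):
            parts = [f"{limited_repr(k, depth + 1)}: {limited_repr(v, depth + 1)}"
                     for k, v in list(value.items())[:MAX_CONTAINER_ITEMS]]
        else:
            parts = [limited_repr(v, depth + 1) for v in list(value)[:MAX_CONTAINER_ITEMS]]
        if len(value) > MAX_CONTAINER_ITEMS:
            parts.append("...")  # Signal elided items
        open_, close = ("{", "}") if isinstance(value, dict) else ("[", "]")
        return open_ + ", ".join(parts) + close
    text = repr(value)
    if len(text) > MAX_REPR_LENGTH:
        text = text[:MAX_REPR_LENGTH] + "..."  # Truncate long scalars
    return text
```

With `MAX_DEPTH = 1`, `limited_repr([1, [2, 3]])` yields `"[1, <list len=2>]"`: nested containers collapse to a type-and-length summary rather than being serialized.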
## Session Management
### Concurrent Session Limits
The `SessionManager` can handle multiple concurrent debug sessions. For high-concurrency scenarios:
```python
# In sessions.py
MAX_CONCURRENT_SESSIONS = 1000 # Soft limit
# Consider setting process limits
import resource
resource.setrlimit(resource.RLIMIT_NPROC, (500, 500))
```
**Memory estimate**: ~10-50MB per active session (varies by script).
### Session Cleanup
Automatic cleanup prevents resource leaks:
```python
# In SessionManager.__init__
self._cleanup_interval = 60 # Seconds between cleanup
self._session_ttl = 3600 # Max session age in seconds
```
**Recommendation**: Set TTL based on expected debugging duration.
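As a sketch of what the TTL sweep involves (the real `SessionManager` internals may differ; the attribute and method names here are assumptions):

```python
import time

class SessionManager:
    """Hypothetical sketch of TTL-based session cleanup."""

    def __init__(self, session_ttl=3600):
        self._sessions = {}           # session_id -> (session, last_used)
        self._session_ttl = session_ttl

    def touch(self, session_id, session):
        """Record activity so the session's TTL clock restarts."""
        self._sessions[session_id] = (session, time.monotonic())

    def cleanup(self):
        """Drop sessions idle longer than the TTL; return removed ids."""
        now = time.monotonic()
        expired = [sid for sid, (_, last) in self._sessions.items()
                   if now - last > self._session_ttl]
        for sid in expired:
            del self._sessions[sid]
        return expired
```

`time.monotonic()` is used rather than `time.time()` so that system clock adjustments cannot prematurely expire (or immortalize) sessions.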
## Output Capture Optimization
### Disable Output Capture (Fastest)
For scripts that print heavily when you do not need the output:
```python
# In runner.py
capture_output = False # Don't capture stdout/stderr
```
**Warning**: You won't see print statements or error messages.
### Streaming Output (Future Enhancement)
For v2, consider streaming output instead of buffering:
```python
# Planned for v2
def stream_output(self, callback):
    """Stream output lines as they arrive."""
    pass
```
## Timeout Strategy
### Per-Operation vs Total Session
**Current (v1)**: Each breakpoint operation has independent timeout.
```python
# Each operation gets its own 20s timeout
run_to_breakpoint()    # 20s timeout
continue_execution()   # 20s timeout (illustrative name; `continue` is a reserved word in Python)
```
**Future (v2)**: Total session timeout option.
```python
# Session-wide timeout
session_timeout = 300 # 5 minutes total
```
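A session-wide budget could be tracked with a monotonic-clock deadline, where each operation spends from a shared pool. A hypothetical sketch, not the shipped API:

```python
import time

class SessionDeadline:
    """Hypothetical session-wide budget: every operation draws from one pool."""

    def __init__(self, total_seconds):
        self._deadline = time.monotonic() + total_seconds

    def remaining(self):
        """Seconds left in the session budget (never negative)."""
        return max(0.0, self._deadline - time.monotonic())

    def per_operation_timeout(self, cap=20.0):
        """Use the smaller of the per-operation cap and the remaining budget."""
        return min(cap, self.remaining())
```

This preserves v1's per-operation cap while guaranteeing the whole session cannot exceed the total budget.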
## Profiling Debug Sessions
### Measure Execution Time
Use the `timings` field in state response:
```python
response = manager.get_state(session_id)
print(f"Last run: {response.timings.lastRunMs}ms")
print(f"Total CPU: {response.timings.totalCpuTimeMs}ms")
```
### Identify Bottlenecks
Common bottlenecks:
1. **Large variable serialization**: Reduce `MAX_DEPTH` and `MAX_CONTAINER_ITEMS`
2. **Slow script startup**: Check import times
3. **IPC overhead**: Consider batch operations in v2
## Resource Monitoring
### Memory Usage
Monitor process memory:
```python
import psutil
import os
process = psutil.Process(os.getpid())
print(f"Memory: {process.memory_info().rss / 1024 / 1024:.2f} MB")
```
### CPU Usage
Track CPU time per session:
```python
import time
start_cpu = time.process_time()
# ... debug operation ...
elapsed = time.process_time() - start_cpu
```
## Platform-Specific Considerations
### Linux
```python
# Adjust process limits
import resource
# Max processes
resource.setrlimit(resource.RLIMIT_NPROC, (500, 500))
# Max memory
resource.setrlimit(resource.RLIMIT_AS, (512 * 1024 * 1024, 512 * 1024 * 1024))
```
### macOS
```python
# macOS has stricter fork limits
MAX_CONCURRENT_SESSIONS = 100 # Lower limit on macOS
```
### Windows
```python
# Windows uses spawn instead of fork
# Startup time is slower; consider longer timeout
DEFAULT_TIMEOUT_SECONDS = 30 # On Windows
```
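These per-platform values could be selected once at import time; a sketch (the values mirror the suggestions above, and the constants are the ones from `utils.py` and `sessions.py`):

```python
import sys

# Platform-aware defaults (illustrative). Windows spawns processes rather than
# forking, so startup is slower and gets a larger timeout; macOS has stricter
# fork limits, so it gets a lower session cap.
if sys.platform == "win32":
    DEFAULT_TIMEOUT_SECONDS = 30
    MAX_CONCURRENT_SESSIONS = 1000
elif sys.platform == "darwin":
    DEFAULT_TIMEOUT_SECONDS = 20
    MAX_CONCURRENT_SESSIONS = 100
else:  # Linux and other POSIX
    DEFAULT_TIMEOUT_SECONDS = 20
    MAX_CONCURRENT_SESSIONS = 1000
```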
## Configuration File (Future)
In v2, consider a `.debug-mcp.toml` config file:
```toml
[limits]
timeout_seconds = 20
max_output_mb = 10
max_sessions = 1000
[repr]
max_depth = 2
max_items = 50
max_length = 256
[performance]
enable_output_capture = true
enable_profiling = false
```
## Benchmarking
### Performance Baselines
Typical performance on modern hardware:
| Operation | Duration |
|-----------|----------|
| Session startup | 50-200ms |
| Run to breakpoint (simple script) | 10-50ms |
| Run to breakpoint (complex imports) | 200-500ms |
| Continue execution | 5-20ms |
| Variable serialization (10 locals) | 1-5ms |
| Variable serialization (50 locals) | 10-30ms |
### Test Your Configuration
Use the included benchmark script:
```bash
# Create benchmark (future addition)
uv run python tests/benchmark.py --timeout 20 --sessions 10
```
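Until that script exists, a minimal timing harness (hypothetical, not shipped with Debug-MCP) can sanity-check an operation against the baselines above:

```python
import time

def benchmark(fn, iterations=10):
    """Time `fn` over N iterations; return (fastest_ms, mean_ms)."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return min(samples), sum(samples) / len(samples)

# Example: time any callable, e.g. a session-startup wrapper.
fastest, mean = benchmark(lambda: sum(range(10_000)))
```

Reporting the fastest run alongside the mean helps separate steady-state cost from warm-up noise such as first-run imports.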
## Recommendations
### Development
```python
DEFAULT_TIMEOUT_SECONDS = 20
MAX_OUTPUT_BYTES = 10 * 1024 * 1024
MAX_DEPTH = 2
```
### CI/CD
```python
DEFAULT_TIMEOUT_SECONDS = 10 # Fail fast
MAX_OUTPUT_BYTES = 1 * 1024 * 1024
MAX_DEPTH = 2
```
### Production/LLM Integration
```python
DEFAULT_TIMEOUT_SECONDS = 30 # More tolerance
MAX_OUTPUT_BYTES = 20 * 1024 * 1024
MAX_DEPTH = 2
MAX_CONTAINER_ITEMS = 100 # More detail
```
## Monitoring Best Practices
1. **Track timeout frequency**: High rate indicates need for higher limits
2. **Monitor session count**: Ensure cleanup works correctly
3. **Watch memory growth**: Indicates leak or excessive sessions
4. **Profile slow operations**: Optimize bottlenecks
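The first two practices reduce to simple counters; a hypothetical sketch (Debug-MCP does not currently expose metrics like these):

```python
from collections import Counter

class DebugMetrics:
    """Hypothetical counters for the monitoring practices above."""

    def __init__(self):
        self.counts = Counter()

    def record(self, event):
        self.counts[event] += 1

    def timeout_rate(self):
        """Fraction of operations that timed out; high values suggest raising limits."""
        ops = self.counts["operations"]
        return self.counts["timeouts"] / ops if ops else 0.0
```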
## Future Optimizations (v2+)
- [ ] Streaming output capture
- [ ] Incremental variable serialization
- [ ] Session pooling/reuse
- [ ] Async/concurrent breakpoint evaluation
- [ ] Compression for large outputs
- [ ] Lazy loading for deep objects