# EuConquisto Composer MCP - Recovery Procedures
**Version**: 1.0.0
**Date**: July 5, 2025
**Purpose**: Emergency recovery and rollback procedures
## Quick Reference
### Emergency Commands
```bash
# Full system rollback to v4.0.3
./scripts/rollback-to-v4.0.3.sh
# JWT token recovery only
./scripts/recover-jwt-token.sh
# System health check
./scripts/system-health-check.sh
# Create backup before changes
./scripts/backup-system.sh
```
### Emergency Contact
- **Migration Manager**: Claude Code
- **Documentation**: `/docs/euconquisto-migration/`
- **Backup Location**: `/backups/migration-safety/`
---
## Recovery Scenarios
### Scenario 1: Modular Infrastructure Failure
**Symptoms**: New infrastructure modules not working, errors in module loading
**Recovery Time**: < 5 minutes
#### Steps:
1. **Immediate Action**: Stop all processes to prevent corruption
2. **Execute Rollback**:
```bash
./scripts/rollback-to-v4.0.3.sh
```
3. **Verify System**: Run health check after rollback
4. **Test Functionality**: Create test composition to verify
#### Success Criteria:
- ✅ v4.0.3 system operational
- ✅ JWT server running on port 8080
- ✅ Authentication working
- ✅ Composition creation successful
---
### Scenario 2: JWT Token Corruption
**Symptoms**: Authentication failures, "JWT token not found" errors
**Recovery Time**: < 2 minutes
#### Steps:
1. **Diagnose Issue**: Check token file and length
```bash
ls -la /Users/ricardokawasaki/Desktop/euconquisto-composer-mcp-poc/archive/authentication/
wc -c /Users/ricardokawasaki/Desktop/euconquisto-composer-mcp-poc/archive/authentication/correct-jwt-new.txt
```
2. **Execute Recovery**:
```bash
./scripts/recover-jwt-token.sh
```
3. **Verify Recovery**: Check token length is 3276 characters
4. **Test Authentication**: Verify JWT server responds
#### Success Criteria:
- ✅ JWT token file restored
- ✅ Token length = 3276 characters
- ✅ JWT server responds to requests
- ✅ Authentication flow working
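
The length check in the steps above can be scripted so that a mismatch yields a non-zero exit status. The following is a minimal sketch: `check_token_length` is a hypothetical helper and not one of the project's scripts, and the default of 3276 is taken from the criteria above. Note that `wc -c` counts bytes, so a trailing newline in the file would add one to the result.

```bash
# Hypothetical helper: compare a token file's byte count with the expected length.
# Usage: check_token_length <file> [expected_bytes]
check_token_length() {
  local file="$1" expected="${2:-3276}" actual
  if [ ! -f "$file" ]; then
    echo "MISSING: $file" >&2
    return 1
  fi
  # Read from stdin so wc prints only the number; strip the padding some platforms add.
  actual=$(wc -c < "$file" | tr -d '[:space:]')
  if [ "$actual" -eq "$expected" ]; then
    echo "OK: $actual bytes"
  else
    echo "MISMATCH: $actual bytes (expected $expected)" >&2
    return 1
  fi
}

# Example, run from the project root:
#   check_token_length archive/authentication/correct-jwt-new.txt
```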
---
### Scenario 3: Configuration Corruption
**Symptoms**: npm errors, dependency issues, build failures
**Recovery Time**: < 10 minutes
#### Steps:
1. **Check Configuration**:
```bash
./scripts/system-health-check.sh
```
2. **Restore Configuration**: Use full rollback to restore package.json
```bash
./scripts/rollback-to-v4.0.3.sh
```
3. **Rebuild Dependencies**:
```bash
npm install
```
4. **Verify Build**: Check TypeScript compilation if needed
#### Success Criteria:
- ✅ package.json restored
- ✅ Dependencies installed successfully
- ✅ No npm errors
- ✅ System builds without errors
---
### Scenario 4: Complete System Failure
**Symptoms**: Several components fail at once, or the system does not start at all
**Recovery Time**: < 15 minutes
#### Steps:
1. **Full System Restore**:
```bash
./scripts/rollback-to-v4.0.3.sh --clean-modular
```
2. **Rebuild Environment**:
```bash
npm install
```
3. **Start Services**:
```bash
# The rollback script starts the JWT server automatically.
# Confirm that it is listening:
lsof -i :8080
```
4. **Complete Health Check**:
```bash
./scripts/system-health-check.sh
```
5. **End-to-End Test**: Create a test composition
#### Success Criteria:
- ✅ All system components restored
- ✅ All health checks pass
- ✅ End-to-end workflow functional
- ✅ System ready for normal operation
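
The steps above depend on each other, so it may help to run them through a wrapper that stops at the first failure and reports which step broke. This is a sketch only: `run_steps` is a hypothetical helper, and the command list in the example simply repeats the commands from the steps above.

```bash
# Hypothetical helper: run each command in order and stop at the first failure.
# Usage: run_steps "<command 1>" "<command 2>" ...
run_steps() {
  local step=1 cmd
  for cmd in "$@"; do
    echo "[step $step] $cmd"
    if ! bash -c "$cmd"; then
      echo "Step $step failed: $cmd" >&2
      return 1
    fi
    step=$((step + 1))
  done
  echo "All $# steps completed"
}

# Example, run from the project root:
#   run_steps "./scripts/rollback-to-v4.0.3.sh --clean-modular" \
#             "npm install" \
#             "lsof -i :8080" \
#             "./scripts/system-health-check.sh"
```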
---
## Backup Management
### Creating Backups
```bash
# Create comprehensive backup
./scripts/backup-system.sh
# Verify backup creation
ls -la backups/migration-safety/
```
### Backup Contents
- **Tier 1 Essential**: v4.0.3 system, JWT token, server files
- **Tier 2 Configuration**: Documentation, config files
- **Tier 3 Historical**: Previous versions, development tools
### Backup Verification
Each backup includes checksums for integrity verification:
```bash
# Display the recorded checksums for manual comparison
cd backups/migration-safety/backup_TIMESTAMP/
cat checksums/*.sha256
```
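
Printing the checksum files shows the recorded hashes but does not compare them with the files on disk. The sketch below does that comparison, under one loudly stated assumption: each `*.sha256` file uses the standard `<hash>  <path>` format written by `sha256sum` or `shasum -a 256`, with paths relative to the backup directory. The format actually written by `backup-system.sh` is not documented here, so adapt accordingly. `sha256_tool` and `verify_backup` are hypothetical names.

```bash
# Pick whichever SHA-256 tool is installed (sha256sum on most Linux systems, shasum on macOS).
sha256_tool() {
  if command -v sha256sum >/dev/null 2>&1; then
    echo "sha256sum"
  else
    echo "shasum -a 256"
  fi
}

# Hypothetical helper: verify every manifest under <backup_dir>/checksums/.
# ASSUMPTION: manifests use the "<hash>  <path>" format, paths relative to <backup_dir>.
verify_backup() {
  local backup_dir="$1" tool manifest status=0
  tool=$(sha256_tool)
  for manifest in "$backup_dir"/checksums/*.sha256; do
    if [ ! -e "$manifest" ]; then
      echo "No checksum manifests found in $backup_dir/checksums" >&2
      return 1
    fi
    # $tool is intentionally unquoted so "shasum -a 256" splits into separate words.
    (cd "$backup_dir" && $tool -c "checksums/$(basename "$manifest")") || status=1
  done
  return "$status"
}

# Example:
#   verify_backup backups/migration-safety/backup_TIMESTAMP
```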
---
## Service Management
### JWT Redirect Server
```bash
# Check if running
lsof -i :8080
# Start manually
node tools/servers/jwt-redirect-server-v1.0.2.js &
# Stop if needed
kill $(lsof -ti :8080)
```
### Process Management
```bash
# Stop all MCP processes
pkill -f "mcp.*composer"
# Stop JWT server
kill $(lsof -ti :8080)
# Check running processes
ps aux | grep -E "(mcp|jwt|composer)" | grep -v grep
```
---
## Validation Procedures
### System Health Check
```bash
./scripts/system-health-check.sh
```
#### Health Check Categories:
- **Node.js Environment**: Version, npm availability
- **Playwright**: Browser automation capability
- **Project Files**: package.json, node_modules
- **v4.0.3 System**: Main file, syntax validation
- **JWT Token**: File existence, length validation
- **JWT Server**: File, syntax, running status
- **Modular Infrastructure**: New modules (if implemented)
- **External Services**: EuConquisto platform, API connectivity
- **Backup System**: Scripts, existing backups
### Manual Validation
```bash
# Check JWT token
wc -c archive/authentication/correct-jwt-new.txt
# Expected output: 3276 followed by the file path
# Check main file syntax
node -c dist/browser-automation-api-direct-save-v4.0.3.js
# Test JWT server
curl -s http://localhost:8080
# Check Node.js version
node --version
# Should be v18 or higher
```
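
The Node.js version requirement can also be checked programmatically instead of by eye. This is a minimal sketch: `node_version_ok` is a hypothetical helper that parses a version string in the `vMAJOR.MINOR.PATCH` form printed by `node --version`, with the minimum of 18 taken from the comment above.

```bash
# Hypothetical helper: succeed if a version string such as "v18.19.0"
# has a major version greater than or equal to the minimum (default 18).
# Usage: node_version_ok <version_string> [minimum_major]
node_version_ok() {
  local version="${1#v}" minimum="${2:-18}"
  local major="${version%%.*}"
  # A non-numeric or empty major version makes the test fail rather than pass.
  [ "$major" -ge "$minimum" ] 2>/dev/null
}

# Example:
#   node_version_ok "$(node --version)" && echo "Node.js version OK"
```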
---
## Troubleshooting
### Common Issues
#### "Permission denied" errors
```bash
chmod +x scripts/*.sh
```
#### "Command not found" errors
```bash
# Check PATH includes node and npm
which node
which npm
# Install missing dependencies
npm install
```
#### Port 8080 in use
```bash
# Find what's using the port
lsof -i :8080
# Kill the process if safe
kill $(lsof -ti :8080)
```
#### JWT token issues
```bash
# Check token exists and has correct length
ls -la archive/authentication/correct-jwt-new.txt
wc -c archive/authentication/correct-jwt-new.txt
# Recover from backup if needed
./scripts/recover-jwt-token.sh
```
---
## Emergency Response Plan
### Critical Failure Response (< 30 minutes)
#### Phase 1: Immediate (0-5 minutes)
1. **STOP**: Halt all processes to prevent data corruption
2. **ASSESS**: Determine scope of failure
3. **COMMUNICATE**: Document the issue
#### Phase 2: Recovery (5-20 minutes)
1. **CHOOSE SCENARIO**: Select appropriate recovery procedure
2. **EXECUTE**: Run recovery scripts
3. **VERIFY**: Validate recovery with health checks
#### Phase 3: Validation (20-30 minutes)
1. **TEST**: Run end-to-end functionality tests
2. **MONITOR**: Watch for additional issues
3. **DOCUMENT**: Record incident and lessons learned
### Escalation Procedures
1. **Level 1**: Script-based recovery (automated)
2. **Level 2**: Manual recovery using documentation
3. **Level 3**: Full system rebuild from backups
---
## Backup Schedule
### Automated Backups
- **Before Phase 2**: Mandatory backup before content generation migration
- **Daily**: During active development
- **Pre-deployment**: Before any production changes
### Manual Backups
Create backup before:
- Major code changes
- Configuration updates
- Dependency upgrades
- Migration phase transitions
---
## Recovery Testing
### Monthly Recovery Drills
1. Create test backup
2. Simulate failure scenario
3. Execute recovery procedure
4. Validate system functionality
5. Document any issues
### Recovery Time Objectives (RTO)
- **JWT Token Recovery**: < 2 minutes
- **Configuration Recovery**: < 10 minutes
- **Full System Recovery**: < 15 minutes
- **Complete Rebuild**: < 30 minutes
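
During a recovery drill, these objectives can be measured by timing each recovery command. The sketch below is one possible approach, not part of the project's scripts: `run_with_rto` is a hypothetical helper that runs a command, measures elapsed wall-clock seconds, and reports whether the limit was met.

```bash
# Hypothetical helper: run a recovery command and compare its duration with a time objective.
# Usage: run_with_rto <max_seconds> <command> [args...]
# Returns 0 if the command succeeded within the limit, 1 if it failed, 2 if it was too slow.
run_with_rto() {
  local limit="$1"
  shift
  local start end elapsed
  start=$(date +%s)
  if ! "$@"; then
    echo "FAILED: $*" >&2
    return 1
  fi
  end=$(date +%s)
  elapsed=$((end - start))
  if [ "$elapsed" -le "$limit" ]; then
    echo "Within RTO: ${elapsed}s of ${limit}s"
  else
    echo "Exceeded RTO: ${elapsed}s of ${limit}s" >&2
    return 2
  fi
}

# Example, using the 2-minute JWT token objective:
#   run_with_rto 120 ./scripts/recover-jwt-token.sh
```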
---
## Contact Information
### Support Resources
- **Migration Documentation**: `/docs/euconquisto-migration/`
- **System Health Check**: `./scripts/system-health-check.sh`
- **Backup Location**: `/backups/migration-safety/`
### Emergency Procedures
1. **First**: Run appropriate recovery script
2. **Second**: Check system health
3. **Third**: Test functionality
4. **Finally**: Document the incident
---
**Remember**: When in doubt, run the health check first to understand the current system state before attempting any recovery procedures.