REDOS_IMPLEMENTATION_LEARNINGS.md•5.1 kB
# ReDoS Implementation - Key Learnings & Technical Details
## Critical Learning: JavaScript Regex Execution is Synchronous
The most important lesson from the PR review: **You cannot timeout synchronous regex execution in JavaScript**. The original implementation tried to use Promises and setTimeout, but this fundamentally doesn't work because:
1. `pattern.test(content)` executes synchronously
2. JavaScript is single-threaded
3. Once regex execution starts, nothing else can interrupt it
4. If a ReDoS pattern hangs for 30 seconds, the entire thread is blocked
## The Correct Approach: Pre-Validation Based on Complexity
### Pattern Complexity Analysis
```typescript
// Instead of trying to timeout regex execution, we:
// 1. Analyze the pattern for dangerous constructs
// 2. Assign a complexity level (low/medium/high)
// 3. Limit content length based on complexity
const COMPLEXITY_LIMITS = {
low: 100000, // 100KB for simple patterns
medium: 10000, // 10KB for moderate patterns
high: 1000 // 1KB for complex/dangerous patterns
};
```
### Dangerous Patterns We Detect
1. **Nested Quantifiers**
- Pattern: `(a+)+`, `(a*)*`, `(a{1,5})+`
- Risk: Exponential backtracking
- Example: "aaaaaaa!" against `(a+)+$` = catastrophic
2. **Quantified Alternation**
- Pattern: `(a|b)+`
- Risk: Each position tries multiple paths
3. **Overlapping Alternation**
- Pattern: `(a|a)*`
- Risk: Redundant paths cause exponential growth
4. **Catastrophic Backtracking**
- Pattern: `(.+)+$`, `(.*)*x`
- Risk: Nested greedy quantifiers
5. **Unbounded Lookahead**
- Pattern: `(?=.*[A-Z])(?=.*[a-z])`
- Risk: Multiple passes over input
## Implementation Details
### RegexValidator API
```typescript
// Basic validation with protection
RegexValidator.validate(content, pattern, options)
// Options:
{
maxLength?: number, // Override calculated limit
rejectDangerousPatterns?: boolean, // Default: true
logEvents?: boolean // Security monitoring
}
```
### Integration Pattern
```typescript
// Before (vulnerable):
if (!pattern.test(input)) { }
// After (protected):
if (!RegexValidator.validate(input, pattern, { maxLength: 1000 })) { }
```
## Test Insights
### What Failed Initially
- Tests expecting ReDoS patterns to timeout (they can't!)
- Tests expecting consistent complexity for all "safe" patterns
- Import paths using wrong structure
### Key Test Patterns
```typescript
// This CANNOT be tested with timeouts:
const evilPattern = /(a+)+$/;
const evilContent = 'a'.repeat(30) + '!';
// Would hang for exponential time
// Instead, we test that it's rejected:
expect(() => {
RegexValidator.validate(evilContent, evilPattern);
}).toThrow('Pattern rejected due to ReDoS risk');
```
## Gotchas & Edge Cases
1. **Global Regex State**
- Problem: Global regexes maintain `lastIndex`
- Solution: Create fresh copies, don't modify originals
2. **Import Path Confusion**
- SecurityError is in `./errors.js`, not `../errors/SecurityError.js`
- Always use relative imports from security directory
3. **Quantifier Counting**
- Email pattern has 3 `+` quantifiers = medium complexity
- Phone pattern `[0-9]{3}` has 3 `{3}` quantifiers = medium
4. **Test Location Matters**
- Tests in `__tests__/security/` work
- Tests in `__tests__/security/tests/` had import issues
## Performance Considerations
1. **Pattern Analysis Overhead**
- Happens once per unique pattern
- Could be cached in production
2. **Content Length Checks**
- Very fast (just `.length` check)
- Prevents most ReDoS before regex runs
3. **Safe Pattern Execution**
- Still tracks execution time for monitoring
- Logs warnings for slow patterns (>50ms)
## Security Monitoring Integration
```typescript
// Dangerous pattern usage logged as:
{
type: 'UPDATE_SECURITY_VIOLATION',
severity: 'HIGH',
source: 'RegexValidator',
details: 'Dangerous regex pattern rejected',
additionalData: { pattern, risks }
}
```
## Remaining Work
1. **Pattern Detection Improvements**
- Add: `(a*)*`, `(a+)*`, `(a{n,m})+` detection
- Add: More sophisticated overlap detection
- Add: Lookahead/lookbehind analysis
2. **Standardization**
- Create timeout/limit constants
- Document recommended limits per use case
- Add configuration system
3. **Test Fixes**
- 48 failing tests need message updates
- Import persona tests expect different errors
- Security validation messages changed
## Quick Reference
### Do's:
- ✅ Pre-validate content length based on pattern complexity
- ✅ Analyze patterns for known dangerous constructs
- ✅ Create copies of regex objects
- ✅ Log security events for monitoring
### Don'ts:
- ❌ Try to timeout synchronous regex execution
- ❌ Modify original regex objects (lastIndex)
- ❌ Trust that "simple looking" patterns are safe
- ❌ Allow unbounded content with complex patterns
This approach provides real, measurable protection against ReDoS attacks by preventing dangerous combinations of patterns and content lengths from ever executing.