# Testing Speed Strategy: From 34s to Sub-Second Feedback
## Executive Summary
**Current State**: 34 seconds for full test suite (223 tests after filters)
**Target**: Sub-second feedback loop for rapid TDD
**Approach**: Three-tier testing architecture with intelligent test selection
## Analysis of Current Test Suite
### Test Distribution
```
Total tests collected: 273
Default run (after markers): 223 tests
- Excluded: expensive, integration, accuracy, marker (50 tests)
Test breakdown:
- Infrastructure/Unit tests: ~45 tests (< 0.05s each)
- Medium tests (I/O): ~130 tests (0.1-1.0s each)
- Slow tests: ~50 tests (> 1.0s each, currently filtered)
```
### Timing Analysis
```
Full filtered suite: 34s
Fast unit tests only: 1.06s (45 tests)
Ultra-minimal smoke: 0.04s (3 tests)
Slowest tests:
- test_extract_all_evidence: 1.19s
- test_extract_snippets_from_markdown: 0.94s
- test_discover_documents_botany_farm: 0.92s
- test_calculate_relevance_score: 0.86s
- test_map_single_requirement: 0.83s
```
### Current Overhead
```
Pytest startup: ~2.7s (imports, fixture setup, collection)
Test execution: ~1.3s (wall-clock execution phase, 45 fast tests)
Total for fast tests: ~4.0s (wall clock)
Pure test time: 1.06s (as reported by pytest)
Overhead: 2.94s (73% of runtime!)
```
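The startup/execution split can be verified directly. A rough sketch is below (the script name is arbitrary; timings will vary per machine):
```python
# measure_overhead.py (sketch) -- rough check of pytest's fixed startup cost
import subprocess
import time

start = time.perf_counter()
# --collect-only imports the test modules and builds the test tree without running tests
subprocess.run(["pytest", "--collect-only", "-q", "tests/"], capture_output=True)
print(f"Collection + import overhead: {time.perf_counter() - start:.2f}s")
```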
## Three-Tier Testing Strategy
### Tier 1: Smoke Tests (< 1 second target)
**Purpose**: Instant validation that nothing is catastrophically broken
**When**: After every code change, before commit, in watch mode
```bash
# Run smoke tests by marker (see the smoke suite below)
pytest tests/ -m smoke -v
# Or specific core tests
pytest tests/test_infrastructure.py::TestSettings \
tests/test_citation_verification.py::TestCitationVerification::test_exact_match \
tests/test_locking.py::TestLockingMechanism::test_basic_lock_acquisition_and_release
```
**Smoke test suite (11 tests, 0.07s; excerpt below)**:
```python
# tests/test_smoke.py (NEW)
"""Ultra-fast smoke tests for critical functionality."""
import pytest

from registry_review_mcp.config.settings import Settings
from registry_review_mcp.utils.state import StateManager
from registry_review_mcp.utils.cache import Cache


@pytest.mark.smoke
class TestCoreFunctionality:
    """Critical path tests that must always pass."""

    def test_settings_load(self, test_settings):
        """Settings initialize correctly."""
        assert test_settings.log_level == "INFO"
        assert test_settings.data_dir.exists()

    def test_state_manager_basic_operations(self, test_settings):
        """State manager can write and read."""
        manager = StateManager("smoke-test")
        manager.write_json("test.json", {"status": "ok"})
        data = manager.read_json("test.json")
        assert data["status"] == "ok"

    def test_cache_basic_operations(self, test_settings):
        """Cache can store and retrieve."""
        cache = Cache("smoke")
        cache.set("key", "value")
        assert cache.get("key") == "value"

    def test_citation_exact_match(self):
        """Citation verification works."""
        from registry_review_mcp.utils.citation_verification import verify_citation

        result = verify_citation("test text", "test text", 0.95)
        assert result["verified"] is True
```
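The tests above assume a `test_settings` fixture defined in `tests/conftest.py`. A minimal sketch of what it might look like follows; the environment variable names are assumptions and should be matched to the real `Settings` model:
```python
# tests/conftest.py (sketch; env var names are assumptions, not from the codebase)
import pytest

from registry_review_mcp.config.settings import Settings


@pytest.fixture
def test_settings(tmp_path, monkeypatch):
    """Isolated Settings pointed at a throwaway data directory."""
    data_dir = tmp_path / "data"
    data_dir.mkdir()
    monkeypatch.setenv("DATA_DIR", str(data_dir))  # assumed env var name
    monkeypatch.setenv("LOG_LEVEL", "INFO")        # assumed env var name
    return Settings()
```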
**Run smoke tests**:
```bash
pytest -m smoke -v
# Target: < 1s (0.04s test time + 0.96s overhead)
```
### Tier 2: Fast Tests (< 5 seconds target)
**Purpose**: Core unit tests without I/O or API calls
**When**: Continuous during development, before pushing
```bash
# Run fast tests
pytest tests/ -m "not slow and not integration and not expensive" -v
# Or by test files
pytest tests/test_infrastructure.py \
tests/test_citation_verification.py \
tests/test_locking.py \
tests/test_validation.py \
tests/test_report_generation.py
```
**Fast test suite (64 tests, 1.06s)**:
- All infrastructure tests (23 tests)
- Citation verification (18 tests)
- Locking mechanism (4 tests)
- Validation logic (10 tests)
- Report generation (9 tests)
**Characteristics**:
- No file I/O beyond test fixtures
- No API calls
- No markdown processing
- Pure Python logic testing
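One way to keep a test in this tier is to pass content in directly instead of reading it from disk. A toy illustration follows; both the helper and the test are hypothetical, not taken from the codebase:
```python
# Toy illustration: inject content instead of touching the filesystem
def extract_headings(markdown_text: str) -> list[str]:
    """Hypothetical helper that returns markdown heading titles."""
    return [
        line.lstrip("#").strip()
        for line in markdown_text.splitlines()
        if line.startswith("#")
    ]


def test_extract_headings_pure_logic():
    # No temp files, no fixtures: runs in microseconds
    assert extract_headings("# Title\n\nbody\n## Section") == ["Title", "Section"]
```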
### Tier 3: Full Test Suite (< 40 seconds target)
**Purpose**: Comprehensive validation including I/O and integration
**When**: Before merging, in CI/CD
```bash
# Run full suite (current default)
pytest tests/ -v
# Run with coverage
pytest tests/ -v --cov=src/registry_review_mcp --cov-report=term-missing
```
**Full test suite (223 tests, 34s)**:
- All fast tests
- Document processing tests
- Evidence extraction tests
- Upload tools tests
- LLM extraction tests (with fixtures)
- Integration workflow tests
## Implementation Roadmap
### Phase 1: Smoke Test Suite (30 minutes)
**Create smoke test module**:
```bash
# 1. Create smoke test file
touch tests/test_smoke.py
# 2. Mark critical tests
# Edit test files to add @pytest.mark.smoke to essential tests
# 3. Update pytest.ini
# Add smoke marker configuration
# 4. Verify smoke tests run fast
pytest -m smoke -v
```
**Marker definition** (add to `/home/ygg/Workspace/RegenAI/regen-registry-review-mcp/pytest.ini`):
```ini
markers =
    smoke: critical path tests (target < 1s)
    slow: marks tests as slow (deselect with '-m "not slow"')
    # ... existing markers
```
**Expected outcome**:
- 10-15 smoke tests
- < 1s runtime (including pytest overhead)
- Covers: settings, state, cache, session, citation verification
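Promoting an existing test into the smoke tier is then a one-line change, sketched here against one of the tests named earlier (the body is elided):
```python
# tests/test_locking.py -- only the @pytest.mark.smoke line is new
import pytest


class TestLockingMechanism:
    @pytest.mark.smoke
    def test_basic_lock_acquisition_and_release(self):
        """Existing test body stays exactly as it was."""
        ...
```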
### Phase 2: Test Organization (1-2 hours)
**Restructure tests directory**:
```
tests/
├── smoke/ # Ultra-fast critical path
│ ├── test_core_infrastructure.py
│ └── test_critical_validation.py
├── unit/ # Fast, isolated tests
│ ├── test_infrastructure.py
│ ├── test_citation_verification.py
│ ├── test_locking.py
│ └── test_validation.py
├── integration/ # Slower, with I/O
│ ├── test_document_processing.py
│ ├── test_evidence_extraction.py
│ └── test_upload_tools.py
└── e2e/ # Full workflows
├── test_integration_full_workflow.py
└── test_botany_farm_accuracy.py
```
**Benefits**:
- Clear test organization
- Easy to run specific tiers
- Self-documenting structure
**Migration strategy**:
```bash
# 1. Create new directories
mkdir -p tests/{smoke,unit,integration,e2e}
# 2. Move tests (preserving git history)
git mv tests/test_infrastructure.py tests/unit/
git mv tests/test_citation_verification.py tests/unit/
# ... etc
# 3. Update conftest.py paths
# 4. Update CI/CD scripts
```
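Once tests live in tier directories, markers can be applied by location instead of by hand. A sketch for the top-level `tests/conftest.py` (assumes pytest 7+ for `item.path`):
```python
# tests/conftest.py (sketch) -- auto-tag tests with the marker of their tier directory
import pytest

TIER_DIRS = {"smoke", "unit", "integration"}


def pytest_collection_modifyitems(config, items):
    """Apply e.g. @pytest.mark.unit to every test under tests/unit/."""
    for item in items:
        tier = item.path.parent.name  # tests/unit/test_x.py -> "unit"
        if tier in TIER_DIRS:
            item.add_marker(getattr(pytest.mark, tier))
```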
### Phase 3: Pytest Optimization Plugins (30 minutes)
**Install useful plugins**:
```bash
# Already installed:
# - pytest-xdist (parallel execution)
# - pytest-asyncio (async test support)
# - pytest-cov (coverage)
# Add these:
pip install pytest-picked pytest-testmon pytest-watch pytest-sugar
```
**Plugin usage**:
#### 1. pytest-picked (run tests for changed files)
```bash
# Run only tests from files with unstaged changes (default mode)
pytest --picked
# Run the whole suite, with tests from changed files first
pytest --picked=first
# Compare against the branch point instead of unstaged changes
pytest --picked --mode=branch
# Extremely fast inner loop: only tests touched by your edits run
```
#### 2. pytest-testmon (intelligent test selection)
```bash
# First run: analyze dependencies
pytest --testmon
# Subsequent runs: only affected tests
pytest --testmon
# Tracks which code each test exercises and reruns only the affected tests
```
#### 3. pytest-watch (continuous testing)
```bash
# Watch mode: auto-run tests on file changes
ptw -- -m smoke
# Or with testmon
ptw -- --testmon
```
#### 4. pytest-sugar (better output)
```bash
# Enhanced progress and failure reporting
pytest tests/ -v
# Better visual feedback during test runs
```
**Add to pyproject.toml**:
```toml
[dependency-groups]
dev = [
"pytest>=8.0.0",
"pytest-asyncio>=0.23.0",
"pytest-xdist>=3.8.0",
"pytest-picked>=0.5.0",
"pytest-testmon>=2.1.0",
"pytest-watch>=4.2.0",
"pytest-sugar>=1.0.0",
"black>=24.0.0",
"ruff>=0.1.0",
]
```
### Phase 4: Parallel Execution (15 minutes)
**pytest-xdist is already installed!**
```bash
# Run tests in parallel (auto-detect cores)
pytest tests/ -n auto
# Specify worker count
pytest tests/ -n 4
# Parallel with specific markers
pytest tests/ -m "not slow" -n auto
```
**Expected speedup**:
```
Current: 34s (single core)
With 4 cores: ~10-15s (2.3-3.4x speedup)
With 8 cores: ~8-12s (2.8-4.2x speedup)
```
**Note**: Parallel execution works best with isolated tests. Some tests may need adjustment if they share state.
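A common fix is to namespace shared resources by worker, using the `worker_id` fixture that pytest-xdist provides (`worker_id` is `"master"` when xdist is not active). A minimal sketch:
```python
# conftest.py (sketch) -- per-worker scratch directory so parallel tests never collide
import pytest


@pytest.fixture(scope="session")
def worker_scratch_dir(tmp_path_factory, worker_id):
    """One temp dir per xdist worker (a single dir when running serially)."""
    return tmp_path_factory.mktemp(f"scratch_{worker_id}")
```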
## Development Workflows
### Workflow 1: Rapid TDD (Sub-second feedback)
```bash
# Terminal 1: Watch smoke tests
ptw -- -m smoke -v
# Terminal 2: Edit code
# Save file -> tests auto-run in < 1s
```
### Workflow 2: Feature Development (< 5s feedback)
```bash
# Run fast tests continuously
ptw -- -m "not slow" --testmon
# Or manually
pytest --testmon -v
# Only runs tests affected by your changes
```
### Workflow 3: Pre-commit Validation (< 10s)
```bash
# Run all affected tests in parallel
pytest --picked -n auto -v
# Or fast tests in parallel
pytest -m "not slow" -n auto -v
```
### Workflow 4: Full Validation (< 40s)
```bash
# Before push or merge
pytest tests/ -n auto -v
# With coverage
pytest tests/ -n auto --cov=src/registry_review_mcp --cov-report=term-missing
```
## Measuring Success
### Current Baseline
```
Smoke: Not defined (estimated 3-4s)
Fast: 1.06s (45 tests, single core)
Full: 34s (223 tests, single core)
```
### Target Metrics
```
Smoke: < 1s (10-15 tests)
Fast: < 5s (60-70 tests, parallel)
Full: < 15s (220+ tests, parallel with 4+ cores)
```
### Tracking Progress
```bash
# Benchmark current state
pytest tests/ -m smoke --durations=0 -v > benchmark_smoke.txt
pytest tests/ -m "not slow" --durations=0 -v > benchmark_fast.txt
pytest tests/ --durations=0 -v > benchmark_full.txt
# After optimizations, compare
pytest tests/ -m smoke --durations=0 -v | diff - benchmark_smoke.txt
```
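For tracking durations over time rather than diffing raw output, a small conftest hook can append each run's total to a CSV. A sketch (the file name is arbitrary):
```python
# conftest.py (sketch) -- append overall suite duration to a CSV for trend tracking
import csv
import time
from pathlib import Path


def pytest_sessionstart(session):
    session._suite_start = time.perf_counter()


def pytest_sessionfinish(session, exitstatus):
    elapsed = time.perf_counter() - getattr(session, "_suite_start", time.perf_counter())
    log = Path("test_durations.csv")
    is_new = not log.exists()
    with log.open("a", newline="") as fh:
        writer = csv.writer(fh)
        if is_new:
            writer.writerow(["timestamp", "seconds", "exit_status"])
        writer.writerow([int(time.time()), f"{elapsed:.2f}", exitstatus])
```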
## Configuration Files
### Updated pytest.ini
```ini
[pytest]
# Test discovery patterns
python_files = test_*.py
python_classes = Test*
python_functions = test_*

# Markers (updated)
markers =
    smoke: critical path tests (target < 1s)
    unit: fast unit tests (target < 5s)
    slow: marks tests as slow (deselect with '-m "not slow"')
    marker: marks tests that use marker PDF extraction (very slow, requires 8GB+ RAM)
    integration: marks tests as integration tests requiring full system
    expensive: marks tests with high API costs (LLM calls)
    accuracy: marks tests for ground truth accuracy validation (real API, real docs)

# Output options (--maxfail=1 stops after the first failure for faster feedback)
addopts =
    -v
    --tb=short
    --strict-markers
    --color=yes
    -m "not expensive and not integration and not accuracy and not marker"
    --maxfail=1

# Test paths
testpaths = tests

# Asyncio configuration
asyncio_mode = auto
asyncio_default_fixture_loop_scope = function

# Parallel execution: pytest-xdist has no ini section of its own.
# Pass the distribution mode on the command line (or add it to addopts):
#   pytest -n auto --dist loadscope
```
### Makefile shortcuts
```makefile
# Makefile (create this)
.PHONY: smoke fast full fast-full watch changed coverage
# Smoke tests (< 1s)
smoke:
pytest -m smoke -v
# Fast tests (< 5s)
fast:
pytest -m "not slow and not integration and not expensive" -v
# Full tests (< 40s)
full:
pytest tests/ -v
# Parallel full tests (< 15s)
fast-full:
pytest tests/ -n auto -v
# Watch mode (continuous)
watch:
ptw -- -m smoke -v
# Changed files only
changed:
pytest --picked -n auto -v
# Coverage
coverage:
pytest tests/ -n auto --cov=src/registry_review_mcp --cov-report=html
@echo "Coverage report: htmlcov/index.html"
```
### Usage
```bash
# Quick validation
make smoke # < 1s
# Development
make watch # Continuous testing
# Pre-commit
make changed # Only changed tests
# Full validation
make fast-full # Parallel execution
```
## Expected Results
### Before Optimization
```
TDD cycle: 34s (too slow for rapid iteration)
Developer waits: 34s × 10 runs/hour = 5.6 minutes/hour wasted
```
### After Optimization
```
TDD cycle: 0.5s (smoke) or 3s (fast with testmon)
Developer waits: 0.5s × 10 runs/hour = 5s/hour
Time saved: ~5.5 minutes/hour = 44 minutes per 8-hour day
```
### ROI Calculation
```
Implementation time: 3 hours
Time saved per developer: 44 minutes/day
Break-even: ~4 days of development
Annual savings per developer: ~180 hours/year
```
## Next Steps
1. **Immediate (30 min)**: Create smoke test suite
```bash
# Create tests/test_smoke.py with 10-15 critical tests
# Update pytest.ini with smoke marker
# Verify: pytest -m smoke -v (target < 1s)
```
2. **Short-term (1-2 hours)**: Install and configure plugins
```bash
pip install pytest-picked pytest-testmon pytest-watch pytest-sugar
# Test each plugin
# Update README with usage examples
```
3. **Medium-term (2-3 hours)**: Reorganize test structure
```bash
# Create smoke/unit/integration/e2e directories
# Move tests to appropriate tiers
# Update CI/CD pipelines
```
4. **Long-term (ongoing)**: Optimize test execution
```bash
# Profile slow tests
# Reduce I/O in fast tests
# Add more smoke tests for new features
# Monitor test duration trends
```
## Conclusion
The path to sub-second testing is clear:
1. **Smoke tests**: Critical path validation in < 1s
2. **Intelligent selection**: pytest-testmon and pytest-picked
3. **Parallel execution**: pytest-xdist with -n auto
4. **Watch mode**: Continuous testing with ptw
**Projected results**:
- Smoke: 0.5s (from 34s) - 68x faster
- Fast: 3s (from 34s) - 11x faster
- Full: 15s (from 34s) - 2.3x faster
The combination of smart test organization, intelligent test selection, and parallel execution transforms the test suite from a development bottleneck into a seamless part of the coding flow.