# Testing Strategy & Documentation
Comprehensive testing documentation for tdz-c64-knowledge v2.21.0 and later.
---
## Overview
The testing strategy consists of 4 levels:
1. **Unit Tests** - Individual component testing
2. **Performance Regression Tests** - Validate optimizations don't regress
3. **End-to-End Integration Tests** - Complete workflow testing
4. **Load Tests** - Scalability and performance validation
---
## Test Suites
### 1. Unit Tests
**File**: `test_anomaly_detector.py`
**Purpose**: Test individual components of anomaly detection system
**Coverage**: 15 tests covering core functionality
**Run Command**:
```bash
.venv/Scripts/python.exe -m pytest test_anomaly_detector.py -v
```
**Test Categories**:
- Initialization (normal & custom parameters)
- Single check recording
- Batch check recording
- Baseline calculation & retrieval
- Anomaly score calculation (with/without baseline)
- Severity classification
- Smart noise filtering
- History queries
- Edge cases (empty batches, failed checks)
- Performance validation
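A minimal sketch of one such test, in the style of the templates at the end of this document. The `record_check` and `calculate_anomaly_score` names are assumptions for illustration, not the confirmed API:

```python
def test_score_without_baseline(self):
    """A fresh document with too few checks should score as neutral."""
    detector = AnomalyDetector(self.kb)  # self.kb is the test KnowledgeBase

    # Hypothetical API: record a single successful check for a new document
    detector.record_check(doc_id="doc-1", status="unchanged")

    # With fewer checks than the baseline minimum, no anomaly should register
    score = detector.calculate_anomaly_score(doc_id="doc-1")
    assert not score, f"Expected neutral score without baseline, got {score}"
    print("[PASS] Scoring without baseline is neutral")
```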
**Performance Metrics**:
- Test suite runtime: ~9 seconds
- All tests passing: 15/15
- Coverage: Core functionality, batch operations, edge cases
**Example Output**:
```
15 passed, 3 warnings in 8.82s
```
---
### 2. Performance Regression Tests
**File**: `test_performance_regression.py`
**Purpose**: Ensure performance optimizations remain effective
**Coverage**: 7 tests validating performance baselines
**Run Command**:
```bash
.venv/Scripts/python.exe test_performance_regression.py
```
**Performance Baselines**:
- Batch recording (100 checks): < 0.1s
- Baseline updates (100 docs): < 0.5s
- Anomaly scoring (100 scores): < 0.2s
- Baseline retrieval (50 queries): < 0.1s
- Bulk baseline retrieval (100+): < 0.1s
- History query (100+ entries): < 0.05s
- Full monitoring cycle (50 docs): < 1.0s
**Actual Performance** (as of v2.21.0):
- Batch recording: **6787 checks/sec** (~6.8x above the 1000/sec baseline)
- Baseline updates: **9450 updates/sec** (~47x above the 200/sec baseline)
- Anomaly scoring: **17265 scores/sec** (~35x above the 500/sec baseline)
- Full cycle: **2777 docs/sec** (~56x above the 50/sec baseline)
**Failure Conditions**:
- Any test taking longer than its baseline threshold fails the suite
- Run in CI/CD to catch performance regressions before merge
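Each check reduces to timing an operation and asserting against its baseline. A minimal sketch of the pattern, with a runnable stand-in for the real detector call (the actual suite times detector operations instead):

```python
import time

def assert_baseline(label: str, elapsed: float, baseline_s: float, count: int):
    """Fail loudly when a measured run exceeds its baseline threshold."""
    throughput = count / elapsed if elapsed > 0 else float("inf")
    assert elapsed < baseline_s, (
        f"[FAIL] {label}: {elapsed:.3f}s exceeds baseline of {baseline_s}s"
    )
    print(f"[PASS] {label}: {elapsed:.3f}s ({throughput:.0f}/sec)")

# Batch recording of 100 checks must finish in under 0.1s
start = time.perf_counter()
_ = [i * i for i in range(100)]  # stand-in for 100 batched check recordings
assert_baseline("Batch recording", time.perf_counter() - start, 0.1, 100)
```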
**Example Output**:
```
[PASS] Batch recording: 0.015s (6787 checks/sec)
[PASS] Baseline updates: 0.011s (9450 updates/sec)
[PASS] Anomaly scoring: 0.006s (17265 scores/sec)
[PASS] Full monitoring cycle (50 docs): 0.018s (2777 docs/sec)
```
---
### 3. End-to-End Integration Tests
**File**: `test_e2e_integration.py`
**Purpose**: Test complete workflows from start to finish
**Coverage**: 4 tests covering real-world scenarios
**Run Command**:
```bash
.venv/Scripts/python.exe test_e2e_integration.py
```
**Test Scenarios**:
#### Test 1: New Document Baseline Learning
- Create new monitored document
- Record initial checks (no baseline)
- Build baseline with 5+ checks
- Validate normal vs anomalous scoring
- Verify baseline was learned
**Validates**: Complete workflow from document creation to anomaly detection
#### Test 2: Batch Monitoring Workflow
- Monitor 20 documents simultaneously
- Build baselines over simulated 7 days
- Process mixed results (unchanged/changed/failed)
- Calculate anomaly scores
- Verify all baselines created
**Validates**: Batch processing with multiple documents
#### Test 3: Pattern Filtering Integration
- Test timestamp change filtering
- Test real content change detection
- Add custom ignore pattern
- Verify custom pattern works
**Validates**: Smart noise filtering integration
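Noise filtering of this kind typically amounts to masking ignorable spans before comparing document versions. A self-contained sketch of the idea using only the standard library; the patterns are illustrative, not the detector's actual rule set:

```python
import re

# Illustrative ignore patterns: ISO-style timestamps and "Last updated" footers
IGNORE_PATTERNS = [
    re.compile(r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}(:\d{2})?"),
    re.compile(r"Last updated:.*", re.IGNORECASE),
]

def normalize(text: str) -> str:
    """Mask ignorable spans so they never register as content changes."""
    for pattern in IGNORE_PATTERNS:
        text = pattern.sub("<IGNORED>", text)
    return text

old = "Intro text. Last updated: 2025-12-21 10:00:00"
new = "Intro text. Last updated: 2025-12-22 09:30:00"
assert normalize(old) == normalize(new)  # timestamp-only change is filtered out
```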
#### Test 4: Anomaly History Tracking
- Record varied check history
- Query complete history
- Verify status breakdowns
- Query anomalies by severity
**Validates**: Complete audit trail functionality
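A sketch of how a test might drive this; `get_check_history` and `get_anomalies` are assumed names, used only to illustrate the shape of the assertions:

```python
# Record a mix of outcomes, then verify the audit trail reflects them
for status in ["unchanged", "unchanged", "changed", "failed"]:
    detector.record_check(doc_id="doc-1", status=status)   # assumed API

history = detector.get_check_history(doc_id="doc-1")       # assumed API
assert len(history) >= 4
assert sum(1 for h in history if h["status"] == "failed") == 1

# A severity-scoped query should only return matching anomalies
high = detector.get_anomalies(severity="high")             # assumed API
assert all(a["severity"] == "high" for a in high)
```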
**Example Output**:
```
4 passed, 3 warnings in 8.09s
E2E TEST COVERAGE:
[OK] New document -> baseline learning -> anomaly detection
[OK] Batch monitoring workflow (20 documents)
[OK] Pattern filtering integration
[OK] Complete anomaly history tracking
```
---
### 4. Load Testing
**File**: `load_test.py`
**Purpose**: Validate scalability with large document counts
**Coverage**: Performance testing with 100-1000+ documents
**Run Commands**:
```bash
# Quick test (100 documents)
.venv/Scripts/python.exe load_test.py --docs 100
# Standard test (1000 documents)
.venv/Scripts/python.exe load_test.py --docs 1000
# Custom size
.venv/Scripts/python.exe load_test.py --docs 5000
# Quiet mode (less output)
.venv/Scripts/python.exe load_test.py --docs 1000 --quiet
```
**Test Phases**:
**Phase 1**: Document Creation
- Create N test documents
- Batch insert into database
- Measure throughput
**Phase 2**: Batch Recording
- Record 1000 checks
- Measure throughput
- Target: >1000 checks/sec
**Phase 3**: Baseline Updates
- Update 1000 baselines
- Measure throughput
- Target: >200 updates/sec
**Phase 4**: Score Calculation
- Calculate 1000 anomaly scores
- Measure throughput
- Target: >500 scores/sec
**Phase 5**: Full Monitoring Cycle
- Complete workflow for 1000 docs
- Includes recording + scoring
- Target: >50 docs/sec
**Phase 6**: Query Performance
- Test baseline retrieval
- Test history queries
- Test anomaly queries
**Performance Targets**:
```
Batch recording: >1000 checks/sec
Baseline updates: >200 updates/sec
Score calculation: >500 scores/sec
Full cycle: >50 docs/sec
```
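Every phase follows the same measure-and-report shape, which keeps the harness small. A minimal sketch of the shared helper (names are illustrative, not the script's confirmed internals):

```python
import time
from typing import Callable

def run_phase(label: str, count: int, target_per_sec: float,
              operation: Callable[[], None]) -> bool:
    """Time one load-test phase and compare its throughput to the target."""
    start = time.perf_counter()
    operation()
    elapsed = time.perf_counter() - start
    throughput = count / elapsed if elapsed > 0 else float("inf")
    print(f"[OK] {label} in {elapsed:.3f}s")
    print(f"     Throughput: {throughput:.0f}/sec")
    passed = throughput > target_per_sec
    print(f"[{'PASS' if passed else 'FAIL'}] Target: >{target_per_sec:.0f}/sec")
    return passed

# Phase 2 would wrap the batch-recording call in a lambda, e.g.
# run_phase("Batch recording", 1000, 1000,
#           lambda: detector.record_checks_batch(checks))  # assumed API
```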
**Example Output**:
```
==================================================================
LOAD TEST: 100 Documents
==================================================================
[Phase 1] Creating 100 documents...
[OK] Created 100 documents in 0.00s
[Phase 2] Batch recording 100 checks...
[OK] Recorded 100 checks in 0.015s
Throughput: 6787 checks/sec
[PASS] Target: >1000 checks/sec
[Phase 3] Updating 100 baselines...
[OK] Updated 100 baselines in 0.009s
Throughput: 10811 updates/sec
[PASS] Target: >200 updates/sec
[Phase 4] Calculating 100 anomaly scores...
[OK] Calculated 100 scores in 0.003s
Throughput: 35336 scores/sec
[PASS] Target: >500 scores/sec
[Phase 5] Full monitoring cycle (100 documents)...
[OK] Completed monitoring cycle in 0.017s
Record checks: 0.015s
Calculate scores: 0.002s
Throughput: 5787 docs/sec
[PASS] Target: >50 docs/sec full cycle
[Phase 6] Query performance tests...
[OK] Retrieved 100 baselines in 0.000s
[OK] Retrieved history in 0.000s
[OK] Retrieved anomalies in 0.000s
==================================================================
LOAD TEST SUMMARY
==================================================================
Documents tested: 100
Performance Results:
Batch recording: 6787 checks/sec
Baseline updates: 10811 updates/sec
Score calculation: 35336 scores/sec
Full cycle: 5787 docs/sec
[PASS] All performance targets met
```
---
## Running All Tests
### Quick Validation
```bash
# Run all unit tests
.venv/Scripts/python.exe -m pytest test_anomaly_detector.py -v
# Run E2E tests
.venv/Scripts/python.exe -m pytest test_e2e_integration.py -v
# Quick load test
.venv/Scripts/python.exe load_test.py --docs 100
```
### Full Test Suite
```bash
# Unit tests
.venv/Scripts/python.exe -m pytest test_anomaly_detector.py -v
# Performance regression tests
.venv/Scripts/python.exe test_performance_regression.py
# E2E integration tests
.venv/Scripts/python.exe test_e2e_integration.py
# Load test (1000 documents)
.venv/Scripts/python.exe load_test.py --docs 1000
```
### CI/CD Pipeline
```bash
# Run all tests with coverage
pytest test_anomaly_detector.py test_e2e_integration.py -v --cov=anomaly_detector --cov-report=term
# Run performance regression tests
python test_performance_regression.py
# Run load test (quick)
python load_test.py --docs 100 --quiet
```
---
## Test Environment
### Setup
All tests use isolated temporary databases to avoid affecting production data.
**Environment Variables** (automatically set by tests):
```bash
USE_SEMANTIC_SEARCH=0 # Disable for faster tests
USE_FTS5=1 # Enable FTS5 search
```
### Test Data
- Tests create temporary documents in a directory from `tempfile.mkdtemp()`
- Automatic cleanup after tests
- No impact on production database
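A sketch of this isolation pattern using only the standard library; the `KnowledgeBase(db_path=...)` constructor argument is an assumption for illustration:

```python
import os
import shutil
import tempfile
import unittest

class AnomalyDetectorTestCase(unittest.TestCase):
    def setUp(self):
        # Configure search behavior before anything initializes
        os.environ["USE_SEMANTIC_SEARCH"] = "0"
        os.environ["USE_FTS5"] = "1"
        # Each test gets its own throwaway directory and database
        self.tmp_dir = tempfile.mkdtemp(prefix="tdz_test_")
        self.db_path = os.path.join(self.tmp_dir, "test.db")
        # self.kb = KnowledgeBase(db_path=self.db_path)  # assumed constructor

    def tearDown(self):
        # Remove the temporary database so no state leaks between tests
        shutil.rmtree(self.tmp_dir, ignore_errors=True)
```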
### Dependencies
- pytest
- tempfile (stdlib)
- datetime (stdlib)
- server.py (KnowledgeBase)
- anomaly_detector.py (AnomalyDetector)
- migration_v2_21_0.py (database migration)
---
## Performance Benchmarks
### v2.21.0 Performance (as of 2025-12-22)
**Environment**: Windows 11, Python 3.14
**Unit Tests**:
- 15 tests in 8.82s
- Average: 0.59s per test
**Performance Regression Tests**:
- 7 tests in 0.44s
- All baselines exceeded by roughly 7-56x
**E2E Integration Tests**:
- 4 tests in 8.09s
- Average: 2.02s per test
**Load Test (100 docs)**:
- Total time: ~0.5s
- Batch recording: 6787 checks/sec
- Full cycle: 5787 docs/sec
**Load Test (1000 docs)** (estimated):
- Total time: ~5s
- Batch recording: ~7000 checks/sec
- Full cycle: ~200 docs/sec
---
## Continuous Integration
### GitHub Actions Workflow
```yaml
name: Tests
on: [push, pull_request]
jobs:
  test:
    runs-on: windows-latest
    steps:
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: '3.10'
      - run: pip install -e ".[dev]"
      - run: pytest test_anomaly_detector.py test_e2e_integration.py -v
      - run: python test_performance_regression.py
      - run: python load_test.py --docs 100 --quiet
```
---
## Troubleshooting
### Tests Fail with "Database locked"
**Cause**: Concurrent access to test database
**Solution**: Tests use separate temporary databases, so this should not normally occur
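If it does occur (for example when suites run in parallel against a shared file), raising the SQLite busy timeout on the test connection is a common mitigation:

```python
import sqlite3

# Wait up to 30s for a competing writer instead of failing immediately
conn = sqlite3.connect("test.db", timeout=30.0)
```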
### Performance Tests Fail
**Cause**: System under load or insufficient resources
**Solution**: Close other applications and rerun the tests
### Load Test Timeouts
**Cause**: Testing with very large document counts (>10000)
**Solution**: Reduce document count or increase timeout
### Import Errors
**Cause**: Missing dependencies
**Solution**:
```bash
pip install -e ".[dev]"
```
---
## Adding New Tests
### Unit Test Template
```python
def test_new_feature(self):
    """Test new feature description."""
    detector = AnomalyDetector(self.kb)

    # Setup
    # ...

    # Execute
    result = detector.new_method()

    # Assert
    assert result == expected
    print("[PASS] New feature test")
```
### Performance Test Template
```python
def test_new_performance(self):
    """Test new performance requirement."""
    import time

    # Setup
    # ...

    # Measure with a monotonic clock (perf_counter is preferred for timing)
    start = time.perf_counter()
    # ... operation ...
    elapsed = time.perf_counter() - start

    # Assert baseline
    assert elapsed < 1.0, f"Took {elapsed:.3f}s, baseline is 1.0s"
```
---
## Best Practices
1. **Isolation**: Each test should be independent
2. **Cleanup**: Always cleanup temporary files/databases
3. **Performance**: Set realistic baselines based on hardware
4. **Coverage**: Test happy path, edge cases, and errors
5. **Documentation**: Add docstrings explaining what each test validates
6. **Assertions**: Include helpful failure messages
7. **Output**: Print progress for long-running tests
---
## Version History
- **v2.21.0**: Added complete test suite (unit, performance, E2E, load)
- **v2.20.0**: Basic test_server.py only
---
## Future Improvements
- Add code coverage reporting (target: >80%)
- Add mutation testing
- Add stress tests (extreme loads, resource exhaustion)
- Add security tests (SQL injection, path traversal)
- Add concurrent access tests
- Add database corruption recovery tests
---
**Maintained by**: Claude Code
**Last Updated**: 2025-12-22
**Version**: 2.21.0