# Session-mgmt-mcp Test Infrastructure
This directory contains the async pytest test infrastructure for the session-buddy project.
## Architecture
### Core Files
- **`conftest.py`**: Global test configuration with async fixtures and session management
- **`pytest.ini`**: Pytest configuration with async support, coverage, and comprehensive markers
- **`helpers.py`**: Shared test utilities, data factories, and helper classes
### Test Organization
```
tests/
├── conftest.py # Global fixtures and configuration
├── pytest.ini # Pytest configuration
├── helpers.py # Shared utilities and factories
├── functional/ # Complete workflow tests
├── unit/ # Individual component tests
└── integration/ # Cross-component tests
```
## Key Features
### 1. Modern Async Support
- **pytest-asyncio** integration with proper event loop management
- Session-scoped event loop for consistency across tests
- Automatic async task leak detection (see the sketch after this list)
- Context manager support for async resources
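The task-leak detection mentioned above can be implemented with an autouse fixture along these lines. This is only a sketch of the pattern; the actual `conftest.py` implementation may differ.

```python
import asyncio

import pytest


@pytest.fixture(autouse=True)
async def _detect_task_leaks():
    """Sketch: fail if a test leaves unfinished asyncio tasks behind."""
    current = asyncio.current_task()
    before = {t for t in asyncio.all_tasks() if t is not current}
    yield
    current = asyncio.current_task()
    leaked = {
        t
        for t in asyncio.all_tasks()
        if t is not current and t not in before and not t.done()
    }
    assert not leaked, f"Test leaked {len(leaked)} asyncio task(s)"
```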
### 2. Comprehensive Fixtures
#### Database Fixtures
```python
@pytest.fixture
async def reflection_db(temp_db_path: str) -> AsyncGenerator[ReflectionDatabase, None]:
    """Provide initialized ReflectionDatabase instance."""


@pytest.fixture
async def reflection_db_with_data(
    reflection_db: ReflectionDatabase,
) -> AsyncGenerator[ReflectionDatabase, None]:
    """Provide ReflectionDatabase with test data."""
```
#### Environment Fixtures
```python
@pytest.fixture
def clean_environment() -> Generator[dict[str, Any], None, None]:
    """Provide clean environment with common patches."""


@pytest.fixture
async def temp_claude_dir() -> AsyncGenerator[Path, None]:
    """Provide temporary ~/.claude directory structure."""
```
#### Mocking Fixtures
```python
@pytest.fixture
def mock_onnx_session() -> Mock:
    """Provide mock ONNX session for embedding tests."""


@pytest.fixture
def mock_embeddings_disabled():
    """Fixture to disable embeddings for testing fallback behavior."""
```
### 3. Test Helper Classes
#### TestDataFactory
Generate realistic test data:
```python
# Individual conversation
conversation = TestDataFactory.conversation("Test content", "project-name")

# Bulk conversations
conversations = TestDataFactory.bulk_conversations(50, "bulk-project")

# Reflections with tags
reflection = TestDataFactory.reflection("Important insight", ["tag1", "tag2"])
```
#### DatabaseTestHelper
Database testing utilities:
```python
# Temporary database
async with DatabaseTestHelper.temp_reflection_db() as db:
    ...  # use the database for testing

# Populate with test data
data_ids = await DatabaseTestHelper.populate_test_data(db, 10, 5)

# Performance measurement
perf_data = await DatabaseTestHelper.measure_query_performance(db, search_func, "query")
```
#### MockingHelper
Comprehensive mocking utilities:
```python
# Mock embedding system
mocks = MockingHelper.mock_embedding_system()

# Mock MCP server
async with MockingHelper.mock_mcp_server() as server:
    ...  # use the mocked server

# Environment patching
with MockingHelper.patch_environment(TEST_VAR="value"):
    ...  # test with the patched environment variables
```
#### AssertionHelper
Specialized assertions:
```python
# UUID validation
AssertionHelper.assert_valid_uuid(conversation_id)

# Timestamp validation
AssertionHelper.assert_valid_timestamp(iso_timestamp)

# Embedding validation
AssertionHelper.assert_embedding_shape(embedding_vector)

# Similarity score validation
AssertionHelper.assert_similarity_score(0.85)

# Database record validation
AssertionHelper.assert_database_record(record, ["id", "content", "project"])
```
#### PerformanceHelper
Performance testing utilities:
```python
# Time measurement
async with PerformanceHelper.measure_time() as measurements:
    await expensive_operation()

# Performance thresholds
PerformanceHelper.assert_performance_threshold(
    actual_time,
    1.0,  # 1 second threshold
    "operation description",
)

# Benchmarking
benchmark_results = await PerformanceHelper.benchmark_async_operation(
    operation, iterations=100
)
```
### 4. Test Categories and Markers
#### Test Categories
- **`unit`**: Fast, isolated component tests
- **`integration`**: Cross-component interaction tests
- **`functional`**: Complete workflow tests
- **`performance`**: Performance and benchmark tests
- **`security`**: Security and validation tests
#### Specialized Markers
- **`async_test`**: Requires async event loop
- **`database`**: Requires database operations
- **`embedding`**: Requires embedding/vector operations
- **`mcp`**: Tests MCP server functionality
- **`slow`**: Long-running tests (use `-m "not slow"` to skip)
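Markers can be combined on a single test and then selected or excluded with `-m` expressions. A hypothetical example (the test name and body are illustrative only, not an existing test):

```python
import pytest


@pytest.mark.integration
@pytest.mark.database
@pytest.mark.slow
@pytest.mark.asyncio
async def test_full_reimport_roundtrip(reflection_db):
    """Matched by `-m "integration and database"`; skipped by `-m "not slow"`."""
    ...
```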
### 5. Running Tests
#### Basic Commands
```bash
# All tests with coverage
python -m pytest
# Specific test categories
python -m pytest -m unit # Unit tests only
python -m pytest -m integration # Integration tests only
python -m pytest -m "not slow" # Skip slow tests
# Specific test files
python -m pytest tests/functional/test_simple_validation.py -v
# Async tests only
python -m pytest -k "async" -v
```
#### Advanced Usage
```bash
# Parallel execution
python -m pytest -n auto
# With performance benchmarking
python -m pytest --benchmark-only
# Coverage with branch analysis
python -m pytest --cov-branch --cov-report=html
# Timeout for long-running tests
python -m pytest --timeout=300
```
## Example Usage
### Unit Test Example
```python
class TestExampleUnit:
    def test_data_generation(self):
        """Example unit test with test data factory."""
        conversation = TestDataFactory.conversation("Test content", "test-project")
        AssertionHelper.assert_valid_uuid(conversation["id"])
        assert conversation["content"] == "Test content"
        assert conversation["project"] == "test-project"

    def test_with_mocking(self):
        """Example unit test with mocking."""
        mocks = MockingHelper.mock_embedding_system()
        # Use mocked embedding system
        assert mocks["onnx_session"] is not None
        assert mocks["tokenizer"] is not None
```
### Integration Test Example
```python
class TestExampleIntegration:
    @pytest.mark.integration
    @pytest.mark.asyncio
    async def test_database_workflow(self):
        """Example integration test."""
        async with DatabaseTestHelper.temp_reflection_db() as db:
            # Store conversation
            conv_id = await db.store_conversation(
                "Integration test conversation", {"project": "integration-test"}
            )

            # Validate stored data
            AssertionHelper.assert_valid_uuid(conv_id)

            # Verify retrieval
            results = db.conn.execute(
                "SELECT content FROM conversations WHERE id = ?", [conv_id]
            ).fetchall()
            assert len(results) == 1
            assert results[0][0] == "Integration test conversation"
```
### Performance Test Example
```python
class TestPerformance:
    @pytest.mark.performance
    @pytest.mark.asyncio
    async def test_bulk_operations(self, perf_helper):
        """Example performance test."""
        async with DatabaseTestHelper.temp_reflection_db() as db:
            # Measure performance
            async with perf_helper.measure_time() as measurements:
                for i in range(100):
                    await db.store_conversation(f"Perf test {i}", {"project": "perf"})

            # Assert performance threshold
            perf_helper.assert_performance_threshold(
                measurements["duration"],
                5.0,  # 5 seconds max for 100 operations
                "bulk conversation storage",
            )
```
## Configuration
### pytest.ini Features
- Async mode enabled (`asyncio_mode = auto`)
- Comprehensive test markers defined
- Coverage reporting with branch analysis
- Warning filters for common libraries
- Timeout protection (300s default)
- Test discovery patterns
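Taken together, those settings correspond to a configuration roughly like the sketch below. It is illustrative only; the `pytest.ini` in this directory is authoritative, and exact option values may differ.

```ini
[pytest]
asyncio_mode = auto
testpaths = tests
timeout = 300
addopts = --cov=session_buddy --cov-branch --cov-report=term-missing
markers =
    unit: fast, isolated component tests
    integration: cross-component interaction tests
    functional: complete workflow tests
    performance: performance and benchmark tests
    security: security and validation tests
    async_test: requires an async event loop
    database: requires database operations
    embedding: requires embedding/vector operations
    mcp: tests MCP server functionality
    slow: long-running tests
filterwarnings =
    ignore::DeprecationWarning
```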
### Environment Variables
- `TESTING=1`: Automatically set during test execution
- `LOG_LEVEL=DEBUG`: Verbose logging for debugging
- Custom variables via `clean_environment` fixture
## Best Practices
### 1. Test Organization
- Use appropriate test categories (`unit`, `integration`, `functional`)
- Add descriptive markers for filtering
- Keep tests focused and independent
### 2. Async Testing
- Mark async tests with `@pytest.mark.asyncio` (optional under `asyncio_mode = auto`, but it keeps intent explicit)
- Use proper async fixtures and context managers
- Test both the happy path and error conditions (see the sketch below)
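A minimal error-path sketch, assuming `store_reflection` rejects empty content; the exact exception type and validation rule may differ in the real implementation:

```python
import pytest


@pytest.mark.asyncio
async def test_store_reflection_rejects_empty_content(reflection_db):
    """Error-condition counterpart to the happy-path storage tests."""
    # Assumption: empty content is invalid; adjust to the actual contract.
    with pytest.raises((ValueError, TypeError)):
        await reflection_db.store_reflection(content="", tags=[])
```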
### 3. Database Testing
- Use temporary databases for isolation
- Clean up resources in finally blocks
- Test concurrent access patterns (see the sketch below)
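One way to exercise concurrent access, reusing the helpers shown earlier. This is a sketch: the import path `tests.helpers` is assumed from the `helpers.py` location above, and the test itself is illustrative.

```python
import asyncio

import pytest

from tests.helpers import DatabaseTestHelper  # assumed location of helpers.py


@pytest.mark.database
@pytest.mark.asyncio
async def test_concurrent_conversation_writes():
    """Sketch: many concurrent writers against one temporary database."""
    async with DatabaseTestHelper.temp_reflection_db() as db:
        ids = await asyncio.gather(
            *(
                db.store_conversation(f"Concurrent write {i}", {"project": "concurrency"})
                for i in range(20)
            )
        )
        assert len(ids) == 20
        assert len(set(ids)) == 20  # each write produced a distinct id
```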
### 4. Performance Testing
- Set realistic performance thresholds
- Measure actual operations, not just time
- Test with realistic data volumes
### 5. Mocking Strategy
- Mock external dependencies (ONNX, file system)
- Use helper utilities for consistent mocking
- Test both with mocks and real components when possible
## Validation
The test infrastructure is validated through comprehensive tests in:
- `tests/functional/test_simple_validation.py`: Infrastructure validation
- `tests/unit/test_example_unit.py`: Unit test examples
- `tests/integration/test_example_integration.py`: Integration test examples
All tests pass and demonstrate the infrastructure works correctly for:
✅ Async database operations
✅ Test data generation
✅ Mocking and patching
✅ Assertion helpers
✅ Performance measurement
✅ Environment management
✅ Concurrent testing
## Advanced Testing Patterns
### Property-Based Testing with Hypothesis
```python
from hypothesis import given, strategies as st, settings


@given(
    content=st.text(min_size=1, max_size=10000),
    tags=st.lists(st.text(min_size=1, max_size=50), max_size=10),
)
@settings(max_examples=50)
@pytest.mark.asyncio
async def test_reflection_storage_properties(content, tags, reflection_db):
    """Property-based test ensuring reflection storage is robust"""
    # Property: all valid content should store successfully
    result = await reflection_db.store_reflection(
        content=content, tags=tags, project="property-test"
    )
    assert isinstance(result, str)
    assert len(result) > 0
```
### Performance Regression Detection
```python
@pytest.mark.performance
@pytest.mark.asyncio
async def test_with_performance_tracking():
    """Performance testing with baseline comparison"""
    tracker = AdvancedPerformanceTracker()
    tracker.start()

    # Performance-sensitive operation
    start_time = time.perf_counter()
    await some_expensive_operation()
    operation_time = time.perf_counter() - start_time

    # Record structured performance metric
    tracker.record_metric(
        PerformanceMetric(
            name="expensive_operation_time",
            value=operation_time,
            unit="seconds",
            threshold=1.0,  # 1 second threshold
        )
    )

    metrics = tracker.stop()
    analysis = tracker.analyze_regressions()

    # Assert no performance regressions
    assert analysis["summary"]["overall_status"] == "PASS"
```
### Security Testing Patterns
```python
@pytest.mark.security
@pytest.mark.parametrize(
    "malicious_input",
    [
        "'; DROP TABLE reflections; --",
        "<script>alert('xss')</script>",
        "../../etc/passwd",
        "\x00\x01\x02\x03",  # Binary injection
        "A" * 10000,  # Buffer overflow attempt
    ],
)
@pytest.mark.asyncio
async def test_input_sanitization(malicious_input, reflection_db):
    """Test system handles malicious input safely"""
    try:
        result = await reflection_db.store_reflection(
            content=malicious_input, tags=["security-test"]
        )
        # If it succeeds, verify no SQL injection occurred
        if isinstance(result, str):
            stats = await reflection_db.get_stats()
            assert stats["total_reflections"] >= 1
    except (ValueError, TypeError) as e:
        # Acceptable to reject invalid input
        assert "invalid" in str(e).lower()
```
### Integration Workflow Testing
```python
@pytest.mark.integration
@pytest.mark.asyncio
async def test_complete_session_lifecycle(tmp_path):
    """Test end-to-end session workflow"""
    from session_buddy.server import init, checkpoint, end, status

    working_dir = str(tmp_path)

    # Phase 1: Initialize session
    init_result = await init(working_directory=working_dir)
    assert init_result["success"] is True
    session_id = init_result["session_id"]

    # Phase 2: Verify session is active
    status_result = await status(working_directory=working_dir)
    assert status_result["session"]["session_id"] == session_id

    # Phase 3: Create checkpoint
    checkpoint_result = await checkpoint()
    assert checkpoint_result["success"] is True

    # Phase 4: End session
    end_result = await end()
    assert end_result["success"] is True
```
## Quality Assurance Standards
### Coverage Requirements
- **Minimum**: 85% line coverage (enforced; see the command after this list)
- **Target**: 90% line coverage for critical components
- **Branch coverage**: Monitor but don't enforce (aim for 80%)
- **Exclude patterns**: Test files, debug code, platform-specific code
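The 85% floor can be enforced locally with pytest-cov's fail-under flag. An illustrative command; the coverage target `session_buddy` is assumed to be the package path:

```bash
python -m pytest --cov=session_buddy --cov-branch --cov-fail-under=85 --cov-report=term-missing
```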
### Performance Benchmarks
- **Database operations**: < 100ms average, < 500ms P95
- **MCP tool execution**: < 200ms average, < 1s P95
- **Memory growth**: < 50MB per 1000 operations
- **Concurrent access**: Support 50+ simultaneous operations
### Security Standards
- **All inputs validated**: No raw user input reaches database
- **SQL injection prevention**: Parameterized queries only (illustrated below)
- **Rate limiting**: Prevent abuse through excessive requests
- **Permission boundaries**: Enforce access controls consistently
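To illustrate the parameterized-queries rule with the same connection style used in the integration example above (a sketch; the helper name is hypothetical and the column names are taken from the record examples):

```python
def find_conversations_by_project(db, project: str):
    """Sketch of the parameterized-queries-only rule."""
    # OK: the driver binds `project` as a query parameter
    return db.conn.execute(
        "SELECT id, content FROM conversations WHERE project = ?", [project]
    ).fetchall()

    # NOT OK: string interpolation invites SQL injection
    # return db.conn.execute(
    #     f"SELECT id, content FROM conversations WHERE project = '{project}'"
    # ).fetchall()
```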
## Future Enhancements
Potential additions to consider:
- Mutation testing integration
- Chaos engineering tests
- Custom pytest plugins for MCP testing
- Test parallelization optimization