# Test Suite for Crawl4AI MCP Server
This directory contains comprehensive tests for the Crawl4AI MCP server with database abstraction support.
## Test Suite Status
**Last Updated**: August 3, 2025
### Current Metrics
- **Total Tests**: 216
- **Passing Tests**: 155 (71.8% pass rate)
- **Test Coverage**: 56.71% (Target: 80%)
- **Qdrant Tests**: 100% passing (28/28 tests)
- **Utils Tests**: 100% passing (24/24 tests)
## Test Structure
### Unit Tests
- `test_utils.py` - Tests for utility functions (24 tests, 94% coverage)
- `test_database_factory.py` - Tests for database factory and adapter creation (9 tests, 100% passing)
- `test_crawl4ai_mcp.py` - Tests for MCP tool functions (77 comprehensive tests, 53% coverage)
- `test_mcp_protocol.py` - Protocol compliance tests (16 tests, 100% passing)
### Adapter Tests
- `test_supabase_adapter.py` - Supabase-specific adapter tests with mocks (14 tests, excluded from metrics)
- `test_qdrant_adapter.py` - Qdrant-specific adapter tests with mocks (19 tests, 100% passing, 73% coverage)
- `test_database_interface.py` - Interface contract tests that both adapters must pass (18 tests, 9/9 Qdrant tests passing)
### Integration Tests
- `test_integration.py` - Full integration tests with real Docker containers (currently blocked by Docker port conflicts)
- `test_integration_simple.py` - Integration tests with mocked dependencies (5 tests, 100% passing)
- `test_integration_fixed.py` - Fixed integration tests for Qdrant (5 tests, 40% passing)
- `test_mcp_qdrant_integration.py` - MCP-specific Qdrant integration tests (6 tests, 50% passing)
- `test_searxng_integration.py` - SearXNG search integration tests (9 tests)
## Running Tests
### Prerequisites
**Start Qdrant for testing:**
```bash
# Start Qdrant in Docker (required for integration tests)
# Note: No port mapping - only accessible from other Docker containers
docker run -d --name qdrant-test qdrant/qdrant
# Verify Qdrant is running (from inside the container, since no host port is mapped)
docker exec qdrant-test curl http://localhost:6333/healthz
```
### Quick Start
```bash
# Run all unit tests (excluding integration) - OpenAI is mocked by default
uv run pytest tests/ -v -m "not integration"
# Run with coverage
uv run pytest tests/ -v --cov=src --cov-report=term-missing --cov-report=html
# Run specific test file
uv run pytest tests/test_utils.py -v
# Run only Qdrant tests (requires Qdrant running)
uv run pytest tests/test_qdrant_adapter.py tests/test_database_interface.py -v
```
### Running Tests with Real OpenAI API
By default, all OpenAI API calls are **mocked** to prevent accidental costs.
To run tests with real OpenAI (embeddings cost ~$0.0001 per test):
```bash
# Enable real OpenAI embeddings (cheap: ~$0.02 per 1M tokens)
ALLOW_OPENAI_TESTS=true uv run pytest tests/test_qdrant_integration.py -v
# Enable expensive LLM inference tests ($$$ - use with caution!)
ALLOW_EXPENSIVE_TESTS=true uv run pytest tests/ -v -m expensive
```
**Cost estimates:**
- Embeddings (`text-embedding-3-small`): ~$0.0001 per test
- Full test suite with real embeddings: < $0.001 (less than a penny)
- Expensive LLM tests (GPT-4): $0.01-$0.10 per test
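A minimal sketch of how this gating could live in `conftest.py`, assuming the environment variable above; the patch target and helper name (`utils.create_embeddings_batch`) are illustrative, not the project's confirmed API:
```python
# conftest.py (sketch) - mock embeddings unless the caller opts in to real calls
import os
from unittest.mock import patch

import pytest


@pytest.fixture(autouse=True)
def mock_openai_unless_allowed():
    """Mock the embedding helper unless ALLOW_OPENAI_TESTS=true is set."""
    if os.getenv("ALLOW_OPENAI_TESTS", "").lower() == "true":
        # Opt-in: let tests hit the real OpenAI API
        yield
        return
    # "utils.create_embeddings_batch" is an illustrative target; patch
    # wherever the embedding helper is actually imported in this project.
    with patch("utils.create_embeddings_batch") as mock_embed:
        # 1536-dimensional zero vectors match text-embedding-3-small
        mock_embed.side_effect = lambda texts: [[0.0] * 1536 for _ in texts]
        yield
```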
### Using Test Scripts
```bash
# Run unit tests with coverage reporting
./scripts/run_tests_with_coverage.sh
# Run integration tests with Docker
./scripts/run_integration_tests.sh
```
## Test Coverage
### Current Coverage Status
- **Overall**: 56.71% (Target: 80%)
- **qdrant_adapter.py**: 73% ✅
- **utils.py**: 94% ✅
- **crawl4ai_mcp.py**: 53% ⚠️
- **database/base.py**: 100% ✅
- **database/factory.py**: 35% ⚠️
### Coverage Targets
- Overall: >80%
- Critical paths: >90% (Achieved for Qdrant)
- Error handling: >70%
### Coverage Reports
- Terminal: Shown after test runs
- HTML: Generated in `htmlcov/index.html`
- XML: Generated as `coverage.xml` for CI/CD
### Viewing HTML Coverage
```bash
# After running tests with coverage
open htmlcov/index.html # macOS
xdg-open htmlcov/index.html # Linux
start htmlcov/index.html # Windows
```
## Writing Tests
### Test Guidelines
1. Use descriptive test names that explain what is being tested
2. Follow the Arrange-Act-Assert pattern
3. Mock external dependencies (databases, APIs)
4. Test both success and failure cases
5. Include edge cases (empty inputs, large batches, special characters)
6. Ensure vector dimensions are 1536 for OpenAI embeddings
7. Use `loop.run_in_executor()` (the event loop method from `asyncio`) for sync Qdrant client calls; a sketch follows this list
8. Mock OpenAI API responses with proper embedding dimensions
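For guideline 7, a minimal sketch of wrapping a synchronous Qdrant client call so it does not block the event loop; the `client.search` arguments shown are illustrative:
```python
import asyncio
import functools


async def search_collection(client, collection_name, query_vector, limit=10):
    """Run the synchronous Qdrant client call in a thread pool executor."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(
        None,  # default ThreadPoolExecutor
        functools.partial(
            client.search,
            collection_name=collection_name,
            query_vector=query_vector,
            limit=limit,
        ),
    )
```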
### Example Test Structure
```python
@pytest.mark.asyncio
async def test_feature_success_case(self, mock_dependency):
    """Test that feature works correctly with valid input"""
    # Arrange
    input_data = {"key": "value"}
    mock_dependency.method.return_value = "expected"

    # Act
    result = await function_under_test(input_data)

    # Assert
    assert result == "expected"
    mock_dependency.method.assert_called_once_with(input_data)
```
## Test Fixtures
Common fixtures are defined in `conftest.py`:
- `get_adapter()` - Factory for creating database adapters
- `mock_openai_embeddings()` - Mock OpenAI API responses (1536 dimensions); a sketch follows this list
- `event_loop` - Async event loop for tests
- `mock_qdrant_client` - Mocked Qdrant client with proper async handling
- `mock_supabase_client` - Mocked Supabase client
- Database-specific mocks with stateful behavior
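A rough sketch of what the embeddings fixture could look like; the patch target is illustrative and depends on where the OpenAI client is actually imported:
```python
from unittest.mock import MagicMock, patch

import pytest


@pytest.fixture
def mock_openai_embeddings():
    """Yield a mocked embeddings call whose vectors have 1536 dimensions."""
    # "openai.embeddings.create" is illustrative; patch the name where the
    # code under test actually looks it up.
    with patch("openai.embeddings.create") as mock_create:
        embedding = MagicMock()
        embedding.embedding = [0.1] * 1536  # matches text-embedding-3-small
        response = MagicMock()
        response.data = [embedding]
        mock_create.return_value = response
        yield mock_create
```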
## Continuous Integration
Tests run automatically on:
- Push to main/develop branches
- Pull requests
- Manual workflow dispatch
See `.github/workflows/test-coverage.yml` for CI configuration.
## Recent Improvements (August 2025)
### Major Fixes
1. **Qdrant Adapter**: Fixed all test failures (19/19 passing)
- Vector dimensions updated to 1536
- Missing methods implemented
- Batch processing fixed
- Interface compliance achieved
2. **Test Coverage**: Improved from 13.54% to 56.71%
- Added 77 comprehensive MCP server tests
- Fixed utils test failures
- Achieved 94% coverage for utils module
3. **Integration Tests**: Fixed async/sync issues
- Created `test_integration_fixed.py`
- Resolved event loop conflicts
- Fixed parameter naming issues
## Troubleshooting
### Common Issues
1. **Import errors**: Ensure you're running tests from project root with `uv run`
2. **Async warnings**: Use `@pytest.mark.asyncio` decorator
3. **Mock not working**: Check that mock patch paths match the actual import locations (see the example after this list)
4. **Coverage missing files**: Check `pytest.ini` for exclusions
5. **Vector dimension errors**: Ensure all embeddings use 1536 dimensions
6. **Docker port conflicts**: Stop existing containers or use different ports
7. **Event loop errors**: Use `loop.run_in_executor()` for sync Qdrant calls instead of calling them directly in async code
8. **Parameter mismatches**: Check `filter_metadata` vs `metadata_filter` naming
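For issue 3, the key is to patch the name where it is looked up, not where it is defined; a hedged example with illustrative module and function names:
```python
from unittest.mock import patch

# Suppose crawl4ai_mcp.py does: `from utils import create_embedding`.
# Patching the definition site has no effect on that already-imported name:
with patch("utils.create_embedding"):          # wrong for this case
    ...

# Patch the name in the module that actually uses it instead:
with patch("crawl4ai_mcp.create_embedding"):   # matches the import site
    ...
```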
### Debug Mode
```bash
# Run with verbose output and show print statements
uv run pytest tests/ -v -s
# Run with debugger on failure
uv run pytest tests/ --pdb
```
## Next Steps to Reach 80% Coverage
To close the remaining 23.29 percentage-point gap:
1. **Expand MCP Server Tests** (~15% gain)
- Add tests for error paths in `crawl4ai_mcp.py`
- Cover environment variable configurations
- Test complex crawling scenarios
2. **Fix Integration Test Issues** (~5% gain)
- Resolve Docker port conflicts
- Use existing containers instead of starting new ones
- Fix remaining async event loop issues
3. **Improve Database Factory Coverage** (~3% gain; a test sketch follows this list)
- Add tests for uncovered factory methods
- Test error scenarios and edge cases
- Cover all adapter initialization paths
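A sketch of how environment-driven adapter selection in the factory might be tested with `monkeypatch`; the `VECTOR_DATABASE` variable, `create_database_client` entry point, adapter class names, and the raised exception type are assumptions, not the project's confirmed names:
```python
import pytest


@pytest.mark.parametrize(
    "backend,expected_adapter",
    [
        ("qdrant", "QdrantAdapter"),
        ("supabase", "SupabaseAdapter"),
    ],
)
def test_factory_selects_adapter(monkeypatch, backend, expected_adapter):
    """Factory should return the adapter matching the configured backend."""
    # VECTOR_DATABASE and create_database_client are hypothetical names;
    # substitute the real environment variable and factory entry point.
    from database.factory import create_database_client

    monkeypatch.setenv("VECTOR_DATABASE", backend)
    adapter = create_database_client()
    assert type(adapter).__name__ == expected_adapter


def test_factory_rejects_unknown_backend(monkeypatch):
    """An unsupported backend name should raise a clear error."""
    from database.factory import create_database_client

    monkeypatch.setenv("VECTOR_DATABASE", "not-a-real-db")
    with pytest.raises(ValueError):
        create_database_client()
```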
### Key Areas Needing Tests
- Error handling in MCP tools
- Edge cases for batch operations
- Environment-specific configurations
- Complex crawling scenarios (sitemaps, recursive)
- Database connection error recovery