
MCP Document Indexer

by yairwein
# Test Suite for MCP Document Indexer

This directory contains a comprehensive test suite for the MCP Document Indexer project, including unit tests, integration tests, and MCP-specific functionality tests.

## Test Structure

```
tests/
├── conftest.py              # Test configuration and fixtures
├── unit/                    # Unit tests
│   ├── test_parser.py       # Document parser tests
│   ├── test_llm.py          # LLM and processor tests
│   └── test_indexer.py      # Document indexer tests
├── integration/             # Integration tests
│   └── test_tools.py        # MCP tools integration tests
├── mcp/                     # MCP-specific tests
│   ├── test_mcp_server.py   # MCP server functionality
│   └── test_mcp_protocol.py # MCP protocol compliance
└── test_end_to_end.py       # End-to-end integration tests
```

## Test Categories

### Unit Tests (`tests/unit/`)

- **Parser Tests**: Test document parsing, chunking, and metadata extraction
- **LLM Tests**: Test LLM integration, document processing, and summarization
- **Indexer Tests**: Test vector indexing, search, and database operations

### Integration Tests (`tests/integration/`)

- **Tools Tests**: Test MCP tools with real components integrated together
- Test document lifecycle from parsing through indexing to search

### MCP Tests (`tests/mcp/`)

- **Server Tests**: Test MCP server setup, tool registration, and lifecycle
- **Protocol Tests**: Test MCP protocol compliance, parameter validation, and error handling

### End-to-End Tests (`test_end_to_end.py`)

- Complete workflow tests from document ingestion to search
- Performance and memory usage benchmarks
- Concurrent operation testing
- Database persistence testing

## Running Tests

### Quick Start

```bash
# Run all tests
make test

# Run specific test categories
make test-unit          # Unit tests only
make test-integration   # Integration tests only
make test-mcp           # MCP tests only

# Run fast tests (skip slow external service tests)
make test-fast

# Run with coverage
make test-all
```

### Detailed Test Commands

```bash
# Unit tests for specific components
uv run pytest tests/unit/test_parser.py -v
uv run pytest tests/unit/test_llm.py -v
uv run pytest tests/unit/test_indexer.py -v

# Integration tests
uv run pytest tests/integration/ -v

# MCP functionality tests
uv run pytest tests/mcp/ -v

# End-to-end tests
uv run pytest tests/test_end_to_end.py -v

# Run tests with specific markers
uv run pytest -m "unit" -v          # Only unit tests
uv run pytest -m "integration" -v   # Only integration tests
uv run pytest -m "mcp" -v           # Only MCP tests
uv run pytest -m "slow" -v          # Only slow tests
uv run pytest -m "not slow" -v      # Skip slow tests
```

### Test Markers

Tests are organized with pytest markers:

- `@pytest.mark.unit` - Unit tests (fast, isolated)
- `@pytest.mark.integration` - Integration tests (moderate speed)
- `@pytest.mark.mcp` - MCP-specific functionality tests
- `@pytest.mark.slow` - Tests requiring external services (Ollama, etc.)

## Test Configuration

### Environment Setup

Tests use temporary directories and isolated configurations to avoid interfering with production data:

- Each test gets a fresh temporary directory
- Test databases are created in temp locations
- LLM tests can run with mock responses when Ollama is unavailable

### Fixtures

Key fixtures provided in `conftest.py`:

- `test_config` - Isolated test configuration
- `document_parser` - Document parser instance
- `llm` - LLM instance (with fallback for offline testing)
- `document_processor` - Document processor with LLM
- `document_indexer` - Vector database indexer
- `document_tools` - MCP tools with all components
- `sample_text_file` - Sample text document
- `sample_legal_file` - Sample legal document (NDA)
- `multiple_test_files` - Collection of test documents

### Dependencies

Some tests require external services:

- **Ollama** - For LLM functionality tests (marked as `slow`)
- **Sentence Transformers** - For embedding generation
- **LanceDB** - For vector storage (uses temp databases)

Tests gracefully handle missing dependencies with appropriate fallbacks or skips.

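
The fallback behavior described above could be sketched roughly as follows. This is an illustration only, not the project's actual `conftest.py`: the helper names (`module_available`, `service_reachable`, `choose_llm`), the `MockLLM` class, and the use of `localhost:11434` as the Ollama address are all assumptions made for the example.

```python
import importlib.util
import socket


def module_available(name: str) -> bool:
    """Return True if a top-level module can be imported (hypothetical helper)."""
    return importlib.util.find_spec(name) is not None


def service_reachable(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


class MockLLM:
    """Stand-in used when no real LLM service is running (hypothetical)."""

    def summarize(self, text: str) -> str:
        # Deterministic "summary": the first 80 characters of the input.
        return text[:80]


def choose_llm(real_factory, is_up=None):
    """Build the real LLM if the service is up, otherwise return a MockLLM.

    `is_up` can be passed explicitly so the choice is testable without a network;
    when omitted, the assumed Ollama address localhost:11434 is probed.
    """
    if is_up is None:
        is_up = service_reachable("localhost", 11434)
    return real_factory() if is_up else MockLLM()
```

In a pytest suite, a check like `module_available("lancedb")` could feed `pytest.mark.skipif(...)`, so tests that need an optional package are skipped rather than failing.
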
## MCP-Specific Testing

### Parameter Validation Testing

Tests ensure MCP tools properly validate input parameters:

```python
# Valid parameters
search_input = SearchDocumentsInput(query="test", limit=5)

# Invalid parameters should raise ValidationError
with pytest.raises(ValidationError):
    SearchDocumentsInput(query="", limit=-1)
```

### Protocol Compliance Testing

Tests verify MCP protocol compliance:

- JSON serialization/deserialization
- Consistent response formats
- Error handling and reporting
- Parameter type coercion

### Server Integration Testing

Tests verify MCP server functionality:

- Tool registration and discovery
- Context handling
- Concurrent request handling
- Resource cleanup

## Performance Testing

### Benchmarks

End-to-end tests include performance benchmarks:

- Document indexing speed
- Search operation latency
- Memory usage monitoring
- Concurrent operation handling

### Load Testing

Tests verify system behavior under load:

- Multiple concurrent indexing operations
- Concurrent MCP tool requests
- Large document handling
- Database performance

## Debugging Tests

### Verbose Output

```bash
# Run with verbose output and no capture
uv run pytest -v -s --tb=long

# Debug specific test
uv run pytest tests/unit/test_parser.py::TestDocumentParser::test_parse_text_file -v -s
```

### Test Isolation

```bash
# Run single test method
uv run pytest tests/mcp/test_mcp_server.py::TestMCPServer::test_mcp_server_setup -v

# Run tests matching pattern
uv run pytest -k "legal" -v
```

### Failure Analysis

```bash
# Re-run only failed tests
make test-failed

# Show local variables on failure
uv run pytest --tb=long --showlocals
```

## Contributing Tests

### Adding New Tests

1. **Unit Tests**: Add to appropriate file in `tests/unit/`
2. **Integration Tests**: Add to `tests/integration/`
3. **MCP Tests**: Add to `tests/mcp/`
4. **End-to-End Tests**: Add to `tests/test_end_to_end.py`

### Test Guidelines

1. **Use appropriate markers** (`@pytest.mark.unit`, etc.)
2. **Use descriptive test names** that explain what is being tested
3. **Test both success and failure cases**
4. **Use fixtures** for common setup
5. **Mock external dependencies** in unit tests
6. **Test error handling** and edge cases
7. **Keep tests isolated** and independent

### Test Naming Convention

```python
class TestComponentName:
    def test_specific_functionality(self):
        """Test description of what this test verifies."""
        pass

    def test_error_case_description(self):
        """Test specific error condition handling."""
        pass
```

## Test Data

Test files and content are generated dynamically to avoid committing large test files:

- Text documents with various content types
- Legal documents (NDAs, contracts)
- Technical documentation
- Large documents for performance testing

All test data is created in temporary directories and cleaned up automatically.

## Continuous Integration

The test suite is designed for CI environments:

- Fast test subset for quick feedback
- Comprehensive test suite for full validation
- Proper timeout handling for slow tests
- Clear failure reporting and debugging information

Use `make ci-test` for CI-optimized test runs with failure limits and concise output.
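
As a rough illustration of the dynamically generated test data described in the Test Data section, the sketch below writes sample documents into a temporary directory that is removed automatically. The function names (`make_text_document`, `make_large_document`) and the generated content are hypothetical and chosen for this example. The project's real fixtures may work differently.

```python
import tempfile
from pathlib import Path


def make_text_document(directory: Path, name: str, paragraphs: int = 3) -> Path:
    """Write a small multi-paragraph text file and return its path."""
    body = "\n\n".join(
        f"Paragraph {i + 1}: generated sample content for indexing tests."
        for i in range(paragraphs)
    )
    path = directory / name
    path.write_text(body, encoding="utf-8")
    return path


def make_large_document(directory: Path, name: str, min_bytes: int = 100_000) -> Path:
    """Write a repetitive text file of at least `min_bytes` bytes."""
    line = "Lorem ipsum dolor sit amet, consectetur adipiscing elit.\n"
    repeats = min_bytes // len(line) + 1  # round up so the size floor is met
    path = directory / name
    path.write_text(line * repeats, encoding="utf-8")
    return path


if __name__ == "__main__":
    # Everything under the temporary directory is deleted when the block exits.
    with tempfile.TemporaryDirectory() as tmp:
        small = make_text_document(Path(tmp), "sample.txt")
        large = make_large_document(Path(tmp), "large.txt")
        print(small.name, small.stat().st_size)
        print(large.name, large.stat().st_size)
```

With pytest, the built-in `tmp_path` fixture could play the role of `tempfile.TemporaryDirectory` here, giving each test its own directory.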
