# Testing Guide - MCP Wikipedia Server
Comprehensive testing documentation for the MCP Wikipedia Server project.
## Testing Overview
This project uses a multi-layered testing approach to ensure reliability and functionality:
1. **Unit Tests**: Test individual functions and components
2. **Integration Tests**: Test MCP protocol compliance and Wikipedia API integration
3. **End-to-End Tests**: Test complete workflows with real client interactions
4. **Performance Tests**: Validate response times and resource usage
5. **Manual Tests**: Human verification of complex scenarios
## Quick Start Testing
### Run All Tests
```bash
# Activate environment
source .venv311/bin/activate
# Run the test suite
python -m pytest tests/ -v
# Run with coverage
python -m pytest tests/ --cov=src --cov-report=html
```
### Test Individual Components
```bash
# Test server functionality
python tests/test_server.py
# Test with example client
python example_client.py
# Manual server test
cd src/mcp_server && python mcp_server.py --test
```
## Test Categories
### 1. Unit Tests
Located in `tests/test_server.py`, these tests verify individual tool functionality.
#### Running Unit Tests
```bash
python -m pytest tests/test_server.py -v
```
#### Test Structure
```python
import pytest
import asyncio
from src.mcp_server.mcp_server import WikipediaServer


class TestWikipediaTools:
    @pytest.fixture
    def server(self):
        """Create a server instance for testing."""
        return WikipediaServer()

    @pytest.mark.asyncio
    async def test_fetch_wikipedia_info_success(self, server):
        """Test successful Wikipedia article retrieval."""
        result = await server.fetch_wikipedia_info("Python programming")

        assert result["success"] is True
        assert "data" in result
        assert "summary" in result["data"]
        assert len(result["data"]["summary"]) > 0
        assert "url" in result["data"]

    @pytest.mark.asyncio
    async def test_list_sections_success(self, server):
        """Test section listing functionality."""
        result = await server.list_wikipedia_sections("Machine Learning")

        assert result["success"] is True
        assert "sections" in result["data"]
        assert len(result["data"]["sections"]) > 0
        assert all("title" in section for section in result["data"]["sections"])
```
#### Test Cases Covered
| Test Case | Purpose | Expected Result |
|-----------|---------|-----------------|
| `test_fetch_info_success` | Valid article search | Returns article data |
| `test_fetch_info_not_found` | Invalid article name | Returns error with suggestions |
| `test_fetch_info_disambiguation` | Ambiguous search | Returns disambiguation options |
| `test_list_sections_success` | Valid article sections | Returns section list |
| `test_list_sections_not_found` | Invalid article | Returns appropriate error |
| `test_section_content_success` | Valid section retrieval | Returns section text |
| `test_section_content_not_found` | Invalid section name | Returns error message |
| `test_input_validation` | Invalid parameters | Returns validation errors |
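Negative-path cases follow the same pattern as the success tests above. A minimal sketch of `test_fetch_info_not_found` might look like the following; the exact keys of the error response (such as `error` and `suggestions`) are assumptions based on the response shapes shown in this guide:

```python
import pytest

from src.mcp_server.mcp_server import WikipediaServer


@pytest.mark.asyncio
async def test_fetch_info_not_found():
    """A nonsense title should fail gracefully rather than raise."""
    server = WikipediaServer()
    result = await server.fetch_wikipedia_info("NonExistentArticle12345")

    # Assumed error shape: mirrors the success responses above, with
    # "success" set to False and an "error" message (possibly plus "suggestions").
    assert result["success"] is False
    assert "error" in result or "suggestions" in result
```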
### 2. Integration Tests
These tests verify MCP protocol compliance and Wikipedia API integration.
#### MCP Protocol Tests
```python
import pytest
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


@pytest.mark.asyncio
async def test_mcp_protocol_compliance():
    """Test that server properly implements MCP protocol."""
    server_params = StdioServerParameters(
        command="python",
        args=["src/mcp_server/mcp_server.py"]
    )

    async with stdio_client(server_params) as (read, write):
        async with ClientSession(read, write) as session:
            # Test initialization
            await session.initialize()

            # Test tool listing
            tools = await session.list_tools()
            assert len(tools.tools) == 3

            tool_names = [tool.name for tool in tools.tools]
            assert "fetch_wikipedia_info" in tool_names
            assert "list_wikipedia_sections" in tool_names
            assert "get_section_content" in tool_names
```
#### Wikipedia API Integration Tests
```python
import pytest

from src.mcp_server.mcp_server import WikipediaServer


@pytest.mark.asyncio
async def test_wikipedia_api_integration():
    """Test integration with Wikipedia API."""
    server = WikipediaServer()

    # Test various search scenarios
    test_cases = [
        ("Python programming", True),        # Should find article
        ("Artificial Intelligence", True),   # Should find article
        ("NonExistentArticle12345", False),  # Should not find
        ("Python", True),                    # Disambiguation page
    ]

    for query, should_succeed in test_cases:
        result = await server.fetch_wikipedia_info(query)
        if should_succeed:
            assert result["success"] is True or "suggestions" in result
        else:
            assert result["success"] is False
```
### 3. End-to-End Tests
Complete workflow tests using the example client.
#### Example Client Test
```bash
# Run the example client as a test
python example_client.py --test-mode
# Expected output:
# ✅ Server connection successful
# ✅ Tool listing successful
# ✅ Wikipedia search successful
# ✅ Section listing successful
# ✅ Section content retrieval successful
```
#### Custom E2E Test Script
```python
# tests/test_e2e.py
import asyncio
import subprocess
import time

from mcp_client import WikipediaClient


async def test_complete_workflow():
    """Test complete Wikipedia research workflow."""
    # Start server
    server_process = subprocess.Popen([
        "python", "src/mcp_server/mcp_server.py"
    ])

    try:
        # Wait for server to start
        time.sleep(2)

        # Create client
        client = WikipediaClient()

        # Test workflow: Research Python programming
        topic = "Python (programming language)"

        # Step 1: Get article summary
        summary = await client.search_wikipedia(topic)
        assert summary["success"] is True

        # Step 2: List article sections
        sections = await client.list_sections(topic)
        assert sections["success"] is True
        assert len(sections["data"]["sections"]) > 5

        # Step 3: Get specific section content
        history_content = await client.get_section_content(topic, "History")
        assert history_content["success"] is True
        assert "Guido van Rossum" in history_content["data"]["content"]

        print("✅ Complete workflow test passed")

    finally:
        server_process.terminate()
        server_process.wait()


if __name__ == "__main__":
    asyncio.run(test_complete_workflow())
```
### 4. Performance Tests
Validate response times and resource usage.
#### Response Time Tests
```python
import time
from statistics import mean

from src.mcp_server.mcp_server import WikipediaServer


async def test_response_times():
    """Test that response times are within acceptable limits."""
    server = WikipediaServer()

    # Test multiple requests for each tool
    search_times = []
    for i in range(10):
        start = time.time()
        result = await server.fetch_wikipedia_info("Machine Learning")
        end = time.time()

        if result["success"]:
            search_times.append(end - start)

    avg_search_time = mean(search_times)
    assert avg_search_time < 2.0, f"Average search time {avg_search_time}s exceeds 2s limit"
    print(f"✅ Average search time: {avg_search_time:.2f}s")
```
#### Concurrent Request Tests
```python
import asyncio
import time

from src.mcp_server.mcp_server import WikipediaServer


async def test_concurrent_requests():
    """Test server handling of concurrent requests."""
    server = WikipediaServer()

    # Create multiple concurrent requests
    queries = [
        "Artificial Intelligence",
        "Machine Learning",
        "Python programming",
        "Data Science",
        "Neural Networks"
    ]

    start_time = time.time()

    # Execute all requests concurrently
    tasks = [server.fetch_wikipedia_info(query) for query in queries]
    results = await asyncio.gather(*tasks)

    end_time = time.time()
    total_time = end_time - start_time

    # Verify all requests succeeded
    successful_requests = sum(1 for result in results if result["success"])
    assert successful_requests >= 4, "Too many concurrent requests failed"

    # Verify total time is reasonable (should be much less than sequential)
    assert total_time < 10.0, f"Concurrent requests took {total_time}s (too slow)"

    print(f"✅ {successful_requests}/{len(queries)} concurrent requests succeeded in {total_time:.2f}s")
```
### 5. Error Handling Tests
Verify proper error handling and recovery.
#### Network Error Simulation
```python
from unittest.mock import patch

from src.mcp_server.mcp_server import WikipediaServer


async def test_network_error_handling():
    """Test handling of network errors."""
    server = WikipediaServer()

    # Mock network failure
    with patch('wikipedia.search', side_effect=Exception("Network error")):
        result = await server.fetch_wikipedia_info("Test query")

        assert result["success"] is False
        assert "error" in result
        assert "Network error" in result["error"]
```
#### Invalid Input Tests
```python
from src.mcp_server.mcp_server import WikipediaServer


async def test_invalid_input_handling():
    """Test handling of invalid inputs."""
    server = WikipediaServer()

    # Test empty query
    result = await server.fetch_wikipedia_info("")
    assert result["success"] is False

    # Test extremely long query
    long_query = "x" * 10000
    result = await server.fetch_wikipedia_info(long_query)
    assert result["success"] is False

    # Test special characters
    special_query = "!@#$%^&*()"
    result = await server.fetch_wikipedia_info(special_query)
    # Should handle gracefully (may succeed or fail, but shouldn't crash)
    assert "error" in result or "success" in result
```
### 6. Manual Tests
Human verification of complex scenarios. A smoke-test script covering the first two groups follows the checklist.
#### Manual Test Checklist
**Server Startup**
- [ ] Server starts without errors
- [ ] All dependencies are loaded
- [ ] MCP protocol initialization successful
- [ ] Server responds to test queries
**Tool Functionality**
- [ ] Search finds relevant articles
- [ ] Section listing is complete and accurate
- [ ] Section content is properly formatted
- [ ] Error messages are helpful and actionable
**Edge Cases**
- [ ] Disambiguation pages handled correctly
- [ ] Non-existent articles return appropriate errors
- [ ] Special characters in queries work properly
- [ ] Very long articles load completely
**Performance**
- [ ] Response times under 3 seconds for typical queries
- [ ] Multiple concurrent requests handled properly
- [ ] Memory usage remains stable over time
- [ ] No memory leaks during extended usage
**Integration**
- [ ] Works with example client
- [ ] Compatible with Claude Desktop
- [ ] MCP protocol compliance verified
- [ ] Error messages display properly in clients
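A small shell script can walk through the first two checklist groups in one pass. This is only a convenience sketch built from commands already shown earlier in this guide (`--test` and `--test-mode`), not a script the project ships:

```bash
#!/usr/bin/env bash
# Manual smoke test: run the server self-test, then the example client.
set -e

source .venv311/bin/activate

# Server startup check (same command as in "Test Individual Components")
(cd src/mcp_server && python mcp_server.py --test)

# Tool functionality check via the example client
python example_client.py --test-mode

echo "Smoke test finished - compare the output against the checklist above."
```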
## Test Configuration
### pytest Configuration (`pytest.ini`)
```ini
[pytest]
testpaths = tests
python_files = test_*.py
python_classes = Test*
python_functions = test_*
addopts =
    -v
    --tb=short
    --strict-markers
    --disable-warnings
markers =
    slow: marks tests as slow (deselect with '-m "not slow"')
    integration: marks tests as integration tests
    e2e: marks tests as end-to-end tests
asyncio_mode = auto
```
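The markers declared above are selected or deselected at the command line with pytest's `-m` option, assuming your tests are tagged with them:

```bash
# Run only the integration tests
python -m pytest -m integration

# Skip anything marked slow
python -m pytest -m "not slow"
```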
### Test Dependencies
```bash
# Install test dependencies
pip install pytest pytest-asyncio pytest-cov
# unittest.mock (used to mock the Wikipedia API) ships with the standard library
```
### Environment Variables for Testing
```bash
# Set test environment variables
export TESTING=true
export WIKIPEDIA_TIMEOUT=5
export LOG_LEVEL=DEBUG
```
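If you prefer to set these per test rather than in the shell, a pytest fixture can apply them with `monkeypatch`. How the server consumes the variables is up to its implementation, so treat this as an illustrative sketch only:

```python
import os

import pytest


@pytest.fixture
def test_environment(monkeypatch):
    """Set the testing environment variables for a single test."""
    monkeypatch.setenv("TESTING", "true")
    monkeypatch.setenv("WIKIPEDIA_TIMEOUT", "5")
    monkeypatch.setenv("LOG_LEVEL", "DEBUG")


def test_reads_environment(test_environment):
    # Sketch only: assumes the code under test reads these via os.environ/os.getenv.
    assert os.getenv("WIKIPEDIA_TIMEOUT") == "5"
```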
## Test Coverage
### Coverage Requirements
- **Unit Tests**: 90%+ code coverage
- **Integration Tests**: All MCP protocol endpoints
- **Error Handling**: All error paths tested
- **Performance**: Response time benchmarks
### Running Coverage Reports
```bash
# Generate HTML coverage report
python -m pytest tests/ --cov=src --cov-report=html
# View report
open htmlcov/index.html
# Generate terminal coverage report
python -m pytest tests/ --cov=src --cov-report=term-missing
```
### Coverage Targets
| Component | Target Coverage | Current Status |
|-----------|----------------|----------------|
| `mcp_server.py` | 95% | ✅ Achieved |
| `mcp_client.py` | 90% | ✅ Achieved |
| Error handling | 100% | ✅ Achieved |
| Tool functions | 95% | ✅ Achieved |
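These targets can also be enforced automatically: pytest-cov's `--cov-fail-under` option makes the run fail when overall coverage drops below a chosen threshold, which is a reasonable guard for CI.

```bash
# Fail the suite if overall coverage drops below 90%
python -m pytest tests/ --cov=src --cov-fail-under=90
```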
## Continuous Integration
### GitHub Actions Workflow (`.github/workflows/test.yml`)
```yaml
name: Test Suite

on:
  push:
    branches: [ main, develop ]
  pull_request:
    branches: [ main ]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: [3.11, 3.12]

    steps:
    - uses: actions/checkout@v3

    - name: Set up Python ${{ matrix.python-version }}
      uses: actions/setup-python@v3
      with:
        python-version: ${{ matrix.python-version }}

    - name: Install dependencies
      run: |
        python -m pip install --upgrade pip
        pip install pytest pytest-asyncio pytest-cov
        pip install -r requirements.txt

    - name: Run tests
      run: |
        python -m pytest tests/ -v --cov=src --cov-report=xml

    - name: Upload coverage to Codecov
      uses: codecov/codecov-action@v3
      with:
        file: ./coverage.xml
```
## Debugging Tests
### Common Test Failures
**Import Errors**
```bash
# Fix Python path issues
export PYTHONPATH="${PYTHONPATH}:$(pwd)/src"
python -m pytest tests/
```
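An alternative to exporting `PYTHONPATH` in every shell is a small `tests/conftest.py` that puts the project root on `sys.path`. This is a common pattern rather than something the project necessarily ships, so adjust the inserted path to match how your tests import the server:

```python
# tests/conftest.py -- hypothetical helper, not required by the project
import sys
from pathlib import Path

# Put the repository root on sys.path so `from src.mcp_server.mcp_server import ...`
# resolves without PYTHONPATH tweaks. Use `... / "src"` instead if your tests
# import `mcp_server` directly.
sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
```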
**Async Test Issues**
```python
# Ensure proper async test decoration
@pytest.mark.asyncio
async def test_async_function():
    result = await some_async_function()
    assert result is not None
```
**Wikipedia API Timeouts**
```python
# Increase timeout for slow connections
@pytest.mark.slow
async def test_with_longer_timeout():
    # Test implementation
    pass

# Deselect slow tests from the command line with:
#   python -m pytest -m "not slow"
```
### Debug Mode Testing
```bash
# Run tests in debug mode
python -m pytest tests/ -v -s --tb=long --log-cli-level=DEBUG
```
## Test Metrics and Reporting
### Performance Benchmarks
```python
# tests/benchmarks.py
import time
from statistics import mean, stdev

from src.mcp_server.mcp_server import WikipediaServer


async def benchmark_search_performance():
    """Benchmark search performance across different query types."""
    server = WikipediaServer()

    # Different query complexity levels
    queries = {
        "simple": ["Python", "Java", "C++"],
        "medium": ["Machine Learning", "Data Science", "Web Development"],
        "complex": ["Artificial Intelligence in Healthcare", "Quantum Computing Applications"]
    }

    results = {}
    for complexity, query_list in queries.items():
        times = []
        for query in query_list:
            start = time.time()
            result = await server.fetch_wikipedia_info(query)
            end = time.time()

            if result["success"]:
                times.append(end - start)

        if times:
            results[complexity] = {
                "mean": mean(times),
                "stdev": stdev(times) if len(times) > 1 else 0,
                "min": min(times),
                "max": max(times)
            }

    return results
```
### Test Reports
Generate comprehensive test reports:
```bash
# Generate JUnit XML report
python -m pytest tests/ --junitxml=report.xml
# Generate HTML report
python -m pytest tests/ --html=report.html --self-contained-html
```
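Note that the `--html` option comes from the pytest-html plugin, which is not installed with core pytest:

```bash
pip install pytest-html
```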
---
## Testing Support
### Running Into Issues?
1. **Check test dependencies**: `pip list | grep pytest`
2. **Verify Python version**: `python --version` (should be 3.11+)
3. **Check environment**: `source .venv311/bin/activate`
4. **Clear cache**: `python -m pytest --cache-clear`
5. **Verbose output**: `python -m pytest -v -s`
### Contributing Tests
When contributing new features:
1. **Write tests first** (TDD approach recommended)
2. **Include both positive and negative test cases**
3. **Test edge cases and error conditions**
4. **Ensure tests are deterministic and repeatable**
5. **Add performance tests for new tools**
For detailed testing guidelines, see [DEVELOPMENT.md](DEVELOPMENT.md).
---
*This testing guide ensures the MCP Wikipedia Server maintains high quality and reliability. Regular testing helps catch issues early and maintains user confidence in the system.*