# ThoughtMCP Cognitive Architecture - Testing Guide

This comprehensive guide covers testing strategies, best practices, and workflows for the ThoughtMCP cognitive architecture. Testing is mandatory and follows strict Test-Driven Development (TDD) principles.

## Testing Philosophy

**Tests are guardrails that prevent incorrect implementations.**

### Core Principles

1. **Tests First**: Write failing tests before implementation
2. **Comprehensive Coverage**: 95%+ line coverage, 90%+ branch coverage
3. **Test Behavior**: Focus on what, not how
4. **Keep Tests Simple**: Tests should be easier to understand than the code they verify
5. **Fast Feedback**: Tests should run quickly (<2 minutes for the full suite)
6. **Isolated Tests**: Each test should be independent
7. **Realistic Tests**: No mocks for core functionality

## Test Organization

```
src/__tests__/
├── setup/                  # Global test setup
│   ├── global-setup.ts     # Test initialization
│   ├── global-teardown.ts  # Test cleanup
│   └── test-environment.ts # Test environment config
├── utils/                  # Test utilities
│   ├── test-database.ts    # Database test helpers
│   ├── test-fixtures.ts    # Test data factories
│   ├── mock-embeddings.ts  # Embedding mocks
│   ├── assertions.ts       # Custom assertions
│   └── test-helpers.ts     # General helpers
├── templates/              # Test templates
│   ├── unit-test.template.ts
│   ├── integration-test.template.ts
│   ├── e2e-test.template.ts
│   ├── performance-test.template.ts
│   └── accuracy-test.template.ts
├── unit/                   # Unit tests (mirrors src/ structure)
│   ├── bias/
│   ├── confidence/
│   ├── database/
│   ├── embeddings/
│   ├── emotion/
│   ├── framework/
│   ├── graph/
│   ├── memory/
│   ├── metacognitive/
│   ├── monitoring/
│   ├── reasoning/
│   ├── search/
│   ├── security/
│   ├── server/
│   ├── temporal/
│   └── utils/
├── integration/            # Integration tests
│   ├── memory-lifecycle.test.ts
│   ├── reasoning-workflow.test.ts
│   └── migration-process.test.ts
├── e2e/                    # End-to-end tests
│   └── mcp-tools.test.ts
├── performance/            # Performance tests
│   ├── embedding-generation.perf.test.ts
│   ├── memory-retrieval.perf.test.ts
│   ├── parallel-reasoning.perf.test.ts
│   └── search-operations.perf.test.ts
├── production/             # Production readiness tests
│   ├── production-environment.test.ts
│   └── production-readiness.test.ts
└── validation/             # Accuracy validation tests
    ├── accuracy-validation.test.ts
    └── framework-validation.test.ts
```

## Test Types

### 1. Unit Tests

Test individual components in isolation.

**Location**: `src/__tests__/unit/`

**Characteristics**:

- Fast execution (<100ms per test)
- No external dependencies
- Mock external services

**Example**:

```typescript
import { describe, it, expect, beforeEach } from "vitest";
import { BiasPatternRecognizer } from "../../../bias/bias-pattern-recognizer";

describe("BiasPatternRecognizer", () => {
  let recognizer: BiasPatternRecognizer;

  beforeEach(() => {
    recognizer = new BiasPatternRecognizer();
  });

  it("should detect confirmation bias in reasoning chain", () => {
    const reasoning = {
      steps: [
        { type: "evidence", content: "Supporting evidence only" },
        { type: "conclusion", content: "Confirms initial hypothesis" },
      ],
    };

    const biases = recognizer.detectBiases(reasoning);

    expect(biases).toContainEqual(expect.objectContaining({ type: "confirmation" }));
  });
});
```

### 2. Integration Tests

Test interactions between components.

**Location**: `src/__tests__/integration/`

**Characteristics**:

- Moderate execution time (<1s per test)
- Uses a real database (test instance)
- Tests component interactions

**Example**:

```typescript
import { describe, it, expect, beforeAll, afterAll } from "vitest";
import { MemoryRepository } from "../../memory/memory-repository";
import { EmbeddingEngine } from "../../embeddings/embedding-engine";
import { setupTestDatabase, clearTestData } from "../utils/test-database";

describe("Memory Lifecycle Integration", () => {
  // Infer the connection manager type from the setup helper.
  let db: Awaited<ReturnType<typeof setupTestDatabase>>;
  let repo: MemoryRepository;

  beforeAll(async () => {
    db = await setupTestDatabase();
    repo = new MemoryRepository(db, new EmbeddingEngine());
  });

  afterAll(async () => {
    await clearTestData(db);
    await db.disconnect();
  });

  it("should create, retrieve, and delete memory", async () => {
    const memory = await repo.create({
      content: "Integration test memory",
      userId: "test-user",
    });

    const retrieved = await repo.retrieve(memory.id);
    expect(retrieved?.content).toBe("Integration test memory");

    await repo.delete(memory.id);
    const deleted = await repo.retrieve(memory.id);
    expect(deleted).toBeNull();
  });
});
```

### 3. End-to-End Tests

Test complete workflows from the user's perspective.

**Location**: `src/__tests__/e2e/`

**Characteristics**:

- Slower execution (<5s per test)
- Tests full MCP tool workflows
- Validates end-to-end functionality
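**Example** (sketch): the real suite lives in `src/__tests__/e2e/mcp-tools.test.ts`; the sketch below only illustrates the shape such a test can take. The `createTestMcpClient` helper and the `remember`/`recall` tool names are assumptions for illustration, not the project's confirmed API.

```typescript
import { describe, it, expect, beforeAll, afterAll } from "vitest";
// `createTestMcpClient` is a hypothetical helper name for this sketch; the
// real suite may construct its MCP client differently.
import { createTestMcpClient } from "../utils/test-helpers";

describe("MCP Tools E2E", () => {
  let client: Awaited<ReturnType<typeof createTestMcpClient>>;

  beforeAll(async () => {
    client = await createTestMcpClient();
  });

  afterAll(async () => {
    await client.close();
  });

  it("should store and recall a memory through the tool interface", async () => {
    // Tool names and argument shapes are illustrative.
    const stored = await client.callTool("remember", {
      content: "E2E test memory",
      userId: "e2e-user",
    });
    expect(stored.isError).toBeFalsy();

    const recalled = await client.callTool("recall", {
      query: "E2E test memory",
      userId: "e2e-user",
    });
    expect(recalled.isError).toBeFalsy();
  });
});
```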
### 4. Performance Tests

Validate performance requirements.

**Location**: `src/__tests__/performance/`

**Characteristics**:

- Tests latency targets (p50, p95, p99)
- Tests scalability
- Measures throughput

**Example**:

```typescript
import { describe, it, expect, beforeAll, afterAll } from "vitest";
import { MemoryRepository } from "../../memory/memory-repository";
import { EmbeddingEngine } from "../../embeddings/embedding-engine";
import { setupTestDatabase, clearTestData } from "../utils/test-database";

describe("Memory Retrieval Performance", () => {
  let db: Awaited<ReturnType<typeof setupTestDatabase>>;
  let repo: MemoryRepository;

  beforeAll(async () => {
    db = await setupTestDatabase();
    repo = new MemoryRepository(db, new EmbeddingEngine());
  });

  afterAll(async () => {
    await clearTestData(db);
    await db.disconnect();
  });

  it("should retrieve memories within latency targets", async () => {
    const latencies: number[] = [];

    for (let i = 0; i < 100; i++) {
      const start = performance.now();
      await repo.search({ text: "test query", limit: 10 });
      latencies.push(performance.now() - start);
    }

    latencies.sort((a, b) => a - b);
    const p50 = latencies[Math.floor(latencies.length * 0.5)];
    const p95 = latencies[Math.floor(latencies.length * 0.95)];
    const p99 = latencies[Math.floor(latencies.length * 0.99)];

    expect(p50).toBeLessThan(100); // <100ms
    expect(p95).toBeLessThan(200); // <200ms
    expect(p99).toBeLessThan(500); // <500ms
  });
});
```

### 5. Accuracy Tests

Validate cognitive accuracy requirements.

**Location**: `src/__tests__/validation/`

**Characteristics**:

- Tests calibration error (±10%)
- Tests bias detection rate (>70%)
- Tests emotion detection accuracy (>75%)
- Tests framework selection accuracy (>80%)
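**Example** (sketch): one way to check the calibration target is a binned expected-calibration-error test. The `labeledReasoningCases` fixture and `assessConfidence` function below are hypothetical names, and treating "±10%" as an ECE bound is an assumption; see `src/__tests__/validation/` for the project's actual metric and fixtures.

```typescript
import { describe, it, expect } from "vitest";
// Both imports are hypothetical names for this sketch.
import { labeledReasoningCases } from "../utils/test-fixtures";
import { assessConfidence } from "../../confidence/confidence-assessor";

describe("Confidence Calibration", () => {
  it("should keep expected calibration error within the ±10% target", async () => {
    const bins = 10;
    const sumConfidence = Array.from({ length: bins }, () => 0);
    const sumCorrect = Array.from({ length: bins }, () => 0);
    const counts = Array.from({ length: bins }, () => 0);

    for (const testCase of labeledReasoningCases) {
      const confidence = await assessConfidence(testCase.input); // 0..1
      const bin = Math.min(bins - 1, Math.floor(confidence * bins));
      sumConfidence[bin] += confidence;
      sumCorrect[bin] += testCase.actualOutcome; // 1 = correct, 0 = incorrect
      counts[bin] += 1;
    }

    // Expected calibration error: count-weighted gap between mean confidence
    // and observed accuracy within each bin.
    let ece = 0;
    for (let b = 0; b < bins; b++) {
      if (counts[b] === 0) continue;
      const gap = Math.abs(sumConfidence[b] / counts[b] - sumCorrect[b] / counts[b]);
      ece += (counts[b] / labeledReasoningCases.length) * gap;
    }

    expect(ece).toBeLessThanOrEqual(0.1); // ±10%
  });
});
```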
## Running Tests

```bash
# Run all tests
npm test

# Run specific test file
npm test -- memory-repository.test.ts

# Run tests matching pattern
npm test -- --grep "embedding"

# Run with coverage
npm run test:coverage

# Run in watch mode
npm run test:watch

# Test categories
npm run test:unit         # Unit tests only
npm run test:integration  # Integration tests only
npm run test:e2e          # End-to-end tests only
npm run test:performance  # Performance tests
npm run test:accuracy     # Accuracy tests
```

## Test Utilities

### Test Database (`src/__tests__/utils/test-database.ts`)

```typescript
import { setupTestDatabase, clearTestData, seedTestMemories } from "../utils/test-database";

// Setup test database
const db = await setupTestDatabase();

// Clear all test data
await clearTestData(db);

// Seed test memories
await seedTestMemories(db, 100);
```

### Test Fixtures (`src/__tests__/utils/test-fixtures.ts`)

```typescript
import { createTestMemory, createTestEmbedding } from "../utils/test-fixtures";

// Create test memory with defaults
const memory = createTestMemory({ content: "Custom content" });

// Create test embedding
const embedding = createTestEmbedding(768);
```

### Mock Embeddings (`src/__tests__/utils/mock-embeddings.ts`)

The mock embeddings system provides deterministic embeddings for testing without requiring a running Ollama server. It supports two modes:

1. **Cached mode**: Uses pre-computed embeddings from a real Ollama server
2. **Generated mode**: Falls back to deterministic hash-based generation

```typescript
import { MockOllamaEmbeddingModel } from "../utils/mock-embeddings";

// Use mock for deterministic testing
const model = new MockOllamaEmbeddingModel({
  host: "http://localhost:11434",
  modelName: "nomic-embed-text",
  dimension: 768,
});

const embedding = await model.generate("test text");

// Check cache statistics
const stats = model.getCacheStats();
console.log(`Cache hits: ${stats.hits}, misses: ${stats.misses}`);
```

### Hybrid Embedding Model

For tests that should use real Ollama when available but fall back to mocks:

```typescript
import {
  createHybridEmbeddingModel,
  getUnitTestModel,
  getIntegrationTestModel,
} from "../utils/mock-embeddings";

// Unit tests: Always use mock (fast, deterministic)
const unitModel = getUnitTestModel();

// Integration tests: Use real Ollama if available
const { model, mode, reason } = await getIntegrationTestModel();
console.log(`Using ${mode} mode: ${reason}`);
```

### Capturing Real Ollama Embeddings

To ensure consistency between mock and real embeddings, capture real Ollama embeddings:

```bash
# Start Ollama server
ollama serve

# Pull the embedding model
ollama pull nomic-embed-text

# Capture embeddings for common test strings
npm run test:capture-embeddings
```

This generates `src/__tests__/utils/ollama-embeddings-cache.json` with pre-computed embeddings that the mock will use for matching test strings.
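When no cached embedding matches, the mock falls back to hash-based generation. The exact algorithm lives in `mock-embeddings.ts`; conceptually it resembles the following sketch (a hypothetical standalone function, not the project's actual code), where identical input text always yields the identical vector:

```typescript
import { createHash } from "node:crypto";

// Minimal sketch of deterministic hash-based embedding generation: seed a
// cheap PRNG from a hash of the text so the output is stable across runs.
// Illustrative only; the project's generator may differ.
function hashEmbedding(text: string, dimension = 768): number[] {
  const digest = createHash("sha256").update(text).digest();
  let seed = digest.readUInt32LE(0);

  const vector: number[] = [];
  for (let i = 0; i < dimension; i++) {
    // Mulberry32-style step: deterministic and uniform enough for tests.
    seed = (seed + 0x6d2b79f5) >>> 0;
    let t = seed;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    vector.push((((t ^ (t >>> 14)) >>> 0) / 4294967296) * 2 - 1); // [-1, 1)
  }

  // Normalize to unit length so cosine similarity behaves like real embeddings.
  const norm = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0));
  return vector.map((v) => v / norm);
}

// Same text → same vector on every run.
const a = hashEmbedding("test text");
const b = hashEmbedding("test text");
console.log(a.length, a[0] === b[0]); // 768 true
```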
### Environment Variables

Control embedding behavior with environment variables:

- `USE_MOCK_EMBEDDINGS=true`: Always use mock embeddings
- `USE_REAL_OLLAMA=true`: Always use real Ollama (will fail if unavailable)
- Neither set: Auto-detect Ollama availability

## Coverage Requirements

- **Line Coverage**: 95%+
- **Branch Coverage**: 90%+
- **Function Coverage**: 95%+
- **Statement Coverage**: 95%+

```bash
# Generate coverage report
npm run test:coverage

# View HTML report
open development/reports/coverage/index.html
```
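These targets can be enforced in the Vitest config so that `npm run test:coverage` fails when a threshold slips. A minimal sketch, assuming a recent Vitest (which nests thresholds under `coverage.thresholds`; the project's actual `vitest.config.ts` may differ):

```typescript
// vitest.config.ts (sketch; reporter and directory settings are illustrative)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    coverage: {
      provider: "v8",
      reportsDirectory: "development/reports/coverage",
      // Fail the coverage run when any threshold is not met.
      thresholds: {
        lines: 95,
        branches: 90,
        functions: 95,
        statements: 95,
      },
    },
  },
});
```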
## Best Practices

### 1. Test Naming

```typescript
// Good: Descriptive, behavior-focused
it("should create memory with five-sector embeddings");
it("should throw ValidationError for invalid user ID");

// Bad: Vague, implementation-focused
it("works");
it("test create");
```

### 2. Test Independence

```typescript
// Good: Each test is independent
beforeEach(() => {
  repo = new MemoryRepository();
});

// Bad: Tests depend on each other
let memory: Memory;
it('should create', () => { memory = ... });
it('should retrieve', () => { repo.retrieve(memory.id) });
```

### 3. Realistic Test Data

```typescript
// Good: Realistic data
const memory = {
  content: "Met with Sarah to discuss Q4 roadmap",
  userId: "user-abc123",
  importance: 0.8,
};

// Bad: Minimal data
const memory = { content: "test", userId: "1" };
```

### 4. Clear Assertions

```typescript
// Good: Clear expectations
expect(memory.embeddings.episodic).toHaveLength(768);
expect(memory.strength).toBeGreaterThanOrEqual(0);

// Bad: Vague assertions
expect(memory).toBeDefined();
```

### 5. Test Error Cases

```typescript
describe("MemoryRepository.create", () => {
  it("should create valid memory", async () => {
    // Happy path
  });

  it("should throw ValidationError for empty content", async () => {
    await expect(repo.create({ content: "" })).rejects.toThrow(ValidationError);
  });
});
```

## Mocking External Services

Tests must not depend on external services. Use mocks for isolation:

```typescript
// ❌ BAD: Depends on external Ollama service
const model = new OllamaEmbeddingModel({
  host: process.env.OLLAMA_HOST,
});

// ✅ GOOD: Uses deterministic mock
import { MockOllamaEmbeddingModel } from "../utils/mock-embeddings";

const model = new MockOllamaEmbeddingModel({
  host: "http://localhost:11434",
  modelName: "nomic-embed-text",
  dimension: 768,
});
```

## Test Container Management

ThoughtMCP uses Docker containers for PostgreSQL (with pgvector) and Ollama during testing. The `TestContainerManager` provides automatic container lifecycle management, eliminating the need for manual container setup.

### Automatic Container Management

By default, test containers start automatically when you run tests and stop when tests complete. This is controlled by the `AUTO_START_CONTAINERS` environment variable.

```bash
# Run tests with automatic container management (default)
npm test

# Explicitly enable auto-start
AUTO_START_CONTAINERS=true npm test

# Disable auto-start (use manually started containers)
AUTO_START_CONTAINERS=false npm test
```

### How It Works

The `TestContainerManager` orchestrates container lifecycle using Docker Compose as the single source of truth:

1. **Startup Phase** (in `global-setup.ts`):
   - Checks if `AUTO_START_CONTAINERS` is enabled (default: true)
   - Verifies Docker and Docker Compose availability
   - Checks for existing running containers
   - Allocates dynamic ports if defaults are occupied
   - Starts containers via `docker compose -f docker-compose.test.yml up`
   - Waits for health checks to pass
   - Configures test environment variables

2. **Teardown Phase** (in `global-teardown.ts`):
   - Checks `KEEP_CONTAINERS_RUNNING` setting
   - Stops containers that were started by the manager
   - Releases allocated ports
   - Handles SIGINT/SIGTERM for graceful cleanup

### Docker Compose Configuration

Test containers are defined in `docker-compose.test.yml`:

```yaml
services:
  postgres-test:
    image: pgvector/pgvector:pg16
    ports:
      - "${TEST_DB_PORT:-5433}:5432"
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U thoughtmcp_test"]
      interval: 5s
      timeout: 3s
      retries: 10

  ollama-test:
    image: ollama/ollama:latest
    ports:
      - "${TEST_OLLAMA_PORT:-11435}:11434"
    healthcheck:
      test: ["CMD", "ollama", "list"]
      interval: 5s
      timeout: 10s
      retries: 15
```

All configuration values use environment variable substitution, allowing customization via `.env` files.

### Dynamic Port Allocation

When default ports are occupied (e.g., development containers running), the `TestContainerManager` automatically finds available ports:

- **PostgreSQL**: Default 5433, fallback range 5434-5500
- **Ollama**: Default 11435, fallback range 11436-11500

The allocated ports are passed to Docker Compose via environment variables (`TEST_DB_PORT`, `TEST_OLLAMA_PORT`) and configured in the test environment.
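Port probing of this kind typically tries to bind each candidate port and keeps the first that succeeds. A minimal sketch of the idea (illustrative; `TestContainerManager` has its own implementation):

```typescript
import { createServer } from "node:net";

// Returns true if the port can be bound on localhost right now.
function isPortFree(port: number): Promise<boolean> {
  return new Promise((resolve) => {
    const server = createServer();
    server.once("error", () => resolve(false));
    server.once("listening", () => server.close(() => resolve(true)));
    server.listen(port, "127.0.0.1");
  });
}

// Try the default first, then walk the fallback range (e.g. 5433, then 5434-5500).
async function allocatePort(defaultPort: number, fallbackEnd: number): Promise<number> {
  for (let port = defaultPort; port <= fallbackEnd; port++) {
    if (await isPortFree(port)) return port;
  }
  throw new Error(`No free port in range ${defaultPort}-${fallbackEnd}`);
}

// Usage: allocate a port, then hand it to Docker Compose via the env var.
const dbPort = await allocatePort(5433, 5500);
process.env.TEST_DB_PORT = String(dbPort);
```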
### Container Reuse

If test containers are already running and healthy, the `TestContainerManager` reuses them instead of starting new ones. This speeds up repeated test runs during development.

```bash
# Start containers manually (they'll be reused by tests)
docker compose -f docker-compose.test.yml up -d

# Run tests (containers are reused)
npm test

# Containers remain running after tests
```

### Environment Variables

| Variable                    | Default           | Description                           |
| --------------------------- | ----------------- | ------------------------------------- |
| `AUTO_START_CONTAINERS`     | `true`            | Enable automatic container startup    |
| `KEEP_CONTAINERS_RUNNING`   | `false`           | Keep containers running after tests   |
| `CONTAINER_STARTUP_TIMEOUT` | `60`              | Health check timeout in seconds       |
| `PRESERVE_TEST_DATA`        | `false`           | Preserve container volumes on cleanup |
| `TEST_DB_PORT`              | `5433`            | PostgreSQL external port              |
| `TEST_OLLAMA_PORT`          | `11435`           | Ollama external port                  |
| `TEST_CONTAINER_PREFIX`     | `thoughtmcp-test` | Container name prefix                 |
| `SKIP_DB_SETUP`             | `false`           | Skip database setup (for unit tests)  |

### Manual Container Management

For scenarios where you prefer manual control over containers:

```bash
# 1. Disable auto-start
export AUTO_START_CONTAINERS=false

# 2. Start containers manually
docker compose -f docker-compose.test.yml up -d

# 3. Wait for health checks
docker compose -f docker-compose.test.yml ps

# 4. Run tests
npm test

# 5. Stop containers when done
docker compose -f docker-compose.test.yml down
```

### CI/CD Integration

In CI environments, automatic container management works out of the box:

```yaml
# GitHub Actions example
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
      - run: npm ci
      - run: npm test # Containers start automatically
```

For CI environments with pre-configured services:

```yaml
# Using GitHub Actions services
jobs:
  test:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: pgvector/pgvector:pg16
        ports:
          - 5433:5432
    env:
      AUTO_START_CONTAINERS: false
    steps:
      - run: npm test
```

### Troubleshooting Container Issues

**Docker not available:**

```
❌ Docker is not available for test containers
Options:
  1. Start Docker Desktop or Docker daemon
  2. Set AUTO_START_CONTAINERS=false and start containers manually
```

**Port conflicts:**

```
Using dynamic port 5434 (default 5433 occupied)
```

The manager automatically finds available ports. Check what's using the default port:

```bash
lsof -i :5433
```

**Health check failures:**

```bash
# Check container logs
docker compose -f docker-compose.test.yml logs postgres-test
docker compose -f docker-compose.test.yml logs ollama-test

# Check container status
docker compose -f docker-compose.test.yml ps
```

**Cleanup issues:**

```bash
# Force cleanup of test containers
docker compose -f docker-compose.test.yml down -v

# Remove orphaned containers
docker compose -f docker-compose.test.yml down --remove-orphans
```

## Troubleshooting

### Tests Failing Unexpectedly

```bash
# Clear test cache
rm -rf node_modules/.vitest
npm test

# Reset test database
npm run db:reset
npm run db:setup
```

### Flaky Tests

```typescript
// Problem: Race condition
it("should process async operation", async () => {
  processAsync(); // No await!
  expect(result).toBeDefined(); // Fails randomly
});

// Solution: Proper async handling
it("should process async operation", async () => {
  const result = await processAsync();
  expect(result).toBeDefined();
});
```

### Coverage Not Meeting Targets

```bash
# Generate detailed coverage report
npm run test:coverage

# Add tests for uncovered code
# Remove dead code
```

## Additional Resources

- **[Vitest Documentation](https://vitest.dev/)** - Test framework docs
- **[Test Templates](../src/__tests__/templates/)** - Example templates
- **[Development Guide](./development.md)** - Development workflow

---

**Last Updated**: December 2025
**Version**: 0.5.0
