# Regression Tests

This document provides structured test scenarios for validating critical functionality and preventing regressions. Each test includes setup instructions, expected results, evidence collection, and pass/fail criteria.

## Purpose

Regression tests ensure that:

- Critical bugs don't reappear after being fixed
- Performance optimizations don't degrade over time
- Platform-specific issues are caught before release
- Integration points (MCP, HTTP API, storage backends) work correctly

## Test Categories

1. [Database Locking & Concurrency](#database-locking--concurrency)
2. [Storage Backend Integrity](#storage-backend-integrity)
3. [Dashboard Performance](#dashboard-performance)
4. [Tag Filtering Correctness](#tag-filtering-correctness)
5. [MCP Protocol Compliance](#mcp-protocol-compliance)

---

## Database Locking & Concurrency

### Test 1: Concurrent MCP Server Startup

**Context:** v8.9.0+ fixed "database is locked" errors by setting SQLite pragmas before connection

**Setup:**

1. Close all Claude Desktop instances
2. Ensure SQLite database exists at `~/Library/Application Support/mcp-memory/sqlite_vec.db` (macOS)
3. Verify `.env` contains `MCP_MEMORY_SQLITE_PRAGMAS=busy_timeout=15000,journal_mode=WAL`

**Execution:**

1. Open 3 Claude Desktop instances simultaneously (within 5 seconds)
2. In each instance, trigger memory service initialization:
   ```
   /mcp
   # Wait for MCP servers to connect
   # Try storing a memory in each instance
   ```
3. Monitor logs in `~/Library/Logs/Claude/mcp-server-memory.log`

**Expected Results:**

- ✅ All 3 instances connect successfully
- ✅ Zero "database is locked" errors in logs
- ✅ All instances show healthy status via `/api/health`
- ✅ Memory operations work in all instances

**Evidence Collection:**

```bash
# Check for lock errors
grep -i "database is locked" ~/Library/Logs/Claude/mcp-server-memory.log

# Verify pragma settings
sqlite3 ~/Library/Application\ Support/mcp-memory/sqlite_vec.db "PRAGMA busy_timeout;"
# Expected output: 15000

# Check journal mode
sqlite3 ~/Library/Application\ Support/mcp-memory/sqlite_vec.db "PRAGMA journal_mode;"
# Expected output: wal
```

**Pass Criteria:**

- ✅ Zero lock errors
- ✅ All servers initialize within 10 seconds
- ✅ Concurrent memory operations succeed
- ❌ FAIL if any server shows "database is locked"
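
If you want to check the pragma behaviour from Python rather than the `sqlite3` CLI, the following minimal sketch mirrors the idea behind the v8.9.0 fix: apply the pragmas from `MCP_MEMORY_SQLITE_PRAGMAS` immediately after opening the connection, before any other statement runs. It is illustrative only, not the service's actual implementation, and assumes the standard `sqlite3` module plus the macOS database path from the setup above.

```python
import os
import sqlite3

# Illustrative sketch of the approach described in the Test 1 context,
# not the service's actual code.
PRAGMAS = os.environ.get(
    "MCP_MEMORY_SQLITE_PRAGMAS", "busy_timeout=15000,journal_mode=WAL"
)

def connect_with_pragmas(db_path: str) -> sqlite3.Connection:
    """Open a connection and apply pragmas before any other statement runs."""
    conn = sqlite3.connect(db_path)
    for pragma in PRAGMAS.split(","):
        key, value = pragma.split("=", 1)
        # busy_timeout makes writers wait instead of failing with
        # "database is locked"; journal_mode=WAL lets readers coexist with a writer.
        conn.execute(f"PRAGMA {key}={value};")
    return conn

conn = connect_with_pragmas(
    os.path.expanduser("~/Library/Application Support/mcp-memory/sqlite_vec.db")
)
print(conn.execute("PRAGMA journal_mode;").fetchone()[0])  # expected: wal
conn.close()
```
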

---

### Test 2: Concurrent Memory Operations

**Context:** Test simultaneous read/write operations from multiple clients

**Setup:**

1. Start HTTP server: `uv run memory server --http`
2. Verify server is healthy: `curl http://127.0.0.1:8000/api/health`

**Execution:**

1. Run concurrent memory stores from multiple terminals:
   ```bash
   # Terminal 1
   for i in {1..50}; do
     curl -X POST http://127.0.0.1:8000/api/memories \
       -H "Content-Type: application/json" \
       -d "{\"content\":\"Test memory $i from terminal 1\",\"tags\":[\"test\",\"concurrent\"]}"
   done

   # Terminal 2 (run simultaneously)
   for i in {1..50}; do
     curl -X POST http://127.0.0.1:8000/api/memories \
       -H "Content-Type: application/json" \
       -d "{\"content\":\"Test memory $i from terminal 2\",\"tags\":[\"test\",\"concurrent\"]}"
   done
   ```
2. While stores are running, perform searches:
   ```bash
   # Terminal 3
   for i in {1..20}; do
     curl -s "http://127.0.0.1:8000/api/search" \
       -H "Content-Type: application/json" \
       -d '{"query":"test memory","limit":10}'
   done
   ```

**Expected Results:**

- ✅ All 100 memory stores complete successfully
- ✅ Zero HTTP 500 errors
- ✅ Search operations return results during writes
- ✅ No database lock errors in server logs

**Evidence Collection:**

```bash
# Count successful stores
curl -s "http://127.0.0.1:8000/api/search/by-tag" \
  -H "Content-Type: application/json" \
  -d '{"tags":["concurrent"],"limit":1000}' | jq '.memories | length'
# Expected: 100

# Check server logs for errors
tail -100 ~/Library/Logs/Claude/mcp-server-memory.log | grep -i error
```

**Pass Criteria:**

- ✅ 100 memories stored successfully
- ✅ Zero database lock errors
- ✅ Zero HTTP 500 responses
- ❌ FAIL if any operation times out or errors

---

## Storage Backend Integrity

### Test 3: Hybrid Backend Synchronization

**Context:** Verify hybrid backend syncs SQLite → Cloudflare without data loss

**Setup:**

1. Configure hybrid backend in `.env`:
   ```bash
   MCP_MEMORY_STORAGE_BACKEND=hybrid
   MCP_HYBRID_SYNC_INTERVAL=10  # Frequent sync for testing
   CLOUDFLARE_API_TOKEN=your-token
   CLOUDFLARE_ACCOUNT_ID=your-account
   CLOUDFLARE_D1_DATABASE_ID=your-db-id
   CLOUDFLARE_VECTORIZE_INDEX=mcp-memory-index
   ```
2. Clear Cloudflare backend: `python scripts/database/clear_cloudflare.py --confirm`
3. Start server: `uv run memory server --http`

**Execution:**

1. Store 10 test memories via API:
   ```bash
   for i in {1..10}; do
     curl -X POST http://127.0.0.1:8000/api/memories \
       -H "Content-Type: application/json" \
       -d "{\"content\":\"Hybrid test memory $i\",\"tags\":[\"hybrid-test\"]}"
   done
   ```
2. Wait 30 seconds (3x sync interval) for background sync
3. Query Cloudflare backend directly:
   ```bash
   python scripts/sync/check_cloudflare_sync.py --tag hybrid-test
   ```

**Expected Results:**

- ✅ All 10 memories present in SQLite (immediate)
- ✅ All 10 memories synced to Cloudflare (within 30s)
- ✅ Content hashes match between backends
- ✅ No sync errors in server logs

**Evidence Collection:**

```bash
# Check SQLite count
curl -s "http://127.0.0.1:8000/api/search/by-tag" \
  -H "Content-Type: application/json" \
  -d '{"tags":["hybrid-test"]}' | jq '.memories | length'

# Check Cloudflare count
python scripts/sync/check_cloudflare_sync.py --tag hybrid-test --count

# Compare content hashes
python scripts/sync/check_cloudflare_sync.py --tag hybrid-test --verify-hashes
```

**Pass Criteria:**

- ✅ SQLite count == Cloudflare count
- ✅ All content hashes match
- ✅ Sync completes within 30 seconds
- ❌ FAIL if any memory missing or hash mismatch
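
For scripted runs of Test 3, a fixed 30-second sleep can be flaky on slow networks, so a polling loop is more robust. The sketch below assumes `check_cloudflare_sync.py --verify-hashes` exits non-zero until all hashes match; that exit-code behaviour is an assumption about the script, so adjust if it reports results differently.

```python
import subprocess
import time

# Assumption: the sync-check script exits 0 only when all content hashes match.
# Retry it instead of relying on a fixed 30 s sleep.
deadline = time.time() + 60  # 6x the 10 s sync interval configured above
while True:
    result = subprocess.run(
        ["python", "scripts/sync/check_cloudflare_sync.py",
         "--tag", "hybrid-test", "--verify-hashes"],
    )
    if result.returncode == 0:
        print("sync verified")
        break
    if time.time() > deadline:
        raise SystemExit("FAIL: hybrid sync did not converge within 60 s")
    time.sleep(5)
```
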

---

### Test 4: Storage Backend Switching

**Context:** Verify switching backends doesn't corrupt existing data

**Setup:**

1. Start with sqlite-vec backend, store 20 memories
2. Stop server
3. Configure hybrid backend, restart server
4. Verify all memories still accessible

**Execution:**

1. **SQLite-vec phase:**
   ```bash
   export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
   uv run memory server --http &
   SERVER_PID=$!

   # Store 20 memories
   for i in {1..20}; do
     curl -X POST http://127.0.0.1:8000/api/memories \
       -H "Content-Type: application/json" \
       -d "{\"content\":\"Backend switch test $i\",\"tags\":[\"switch-test\"]}"
   done

   kill $SERVER_PID
   ```
2. **Switch to hybrid:**
   ```bash
   export MCP_MEMORY_STORAGE_BACKEND=hybrid
   export MCP_HYBRID_SYNC_INTERVAL=10
   # Set Cloudflare credentials...
   uv run memory server --http &
   SERVER_PID=$!

   # Wait for startup
   sleep 5
   ```
3. **Verify data integrity:**
   ```bash
   curl -s "http://127.0.0.1:8000/api/search/by-tag" \
     -H "Content-Type: application/json" \
     -d '{"tags":["switch-test"]}' | jq '.memories | length'
   ```

**Expected Results:**

- ✅ All 20 memories still accessible after switch
- ✅ Memories begin syncing to Cloudflare
- ✅ No data corruption or loss
- ✅ Health check shows "hybrid" backend

**Pass Criteria:**

- ✅ 20 memories retrieved successfully
- ✅ Backend reported as "hybrid" in health check
- ✅ No errors during backend initialization
- ❌ FAIL if any memory inaccessible or corrupted

---

## Dashboard Performance

### Test 5: Page Load Performance

**Context:** Dashboard should load in <2 seconds (v7.2.2 benchmark: 25ms)

**Setup:**

1. Database with 1000+ memories
2. HTTP server running: `uv run memory server --http`
3. Dashboard at `http://127.0.0.1:8000/`

**Execution:**

```bash
# Measure page load time (10 iterations)
for i in {1..10}; do
  time curl -s "http://127.0.0.1:8000/" > /dev/null
done
```

**Expected Results:**

- ✅ Average load time <500ms
- ✅ All static assets (HTML/CSS/JS) load successfully
- ✅ No JavaScript errors in browser console
- ✅ Dashboard functional on first load

**Evidence Collection:**

```bash
# Browser DevTools → Network tab
# - Check "Load" time in waterfall
# - Verify no 404/500 errors
# - Measure DOMContentLoaded and Load events

# Server-side timing
time curl -s "http://127.0.0.1:8000/" -o /dev/null -w "%{time_total}\n"
```

**Pass Criteria:**

- ✅ Page load <2 seconds (target: <500ms)
- ✅ Zero resource loading errors
- ✅ Dashboard interactive immediately
- ❌ FAIL if >2 seconds or JavaScript errors

---

### Test 6: Memory Operation Performance

**Context:** CRUD operations should complete in <1 second (v7.2.2 benchmark: 26ms)

**Setup:**

1. Clean database: `python scripts/database/reset_database.py --confirm`
2. HTTP server running

**Execution:**

1. **Store operation:**
   ```bash
   time curl -s -X POST http://127.0.0.1:8000/api/memories \
     -H "Content-Type: application/json" \
     -d '{"content":"Performance test memory","tags":["perf-test"]}' \
     -w "\n%{time_total}\n"
   ```
2. **Search operation:**
   ```bash
   time curl -s "http://127.0.0.1:8000/api/search" \
     -H "Content-Type: application/json" \
     -d '{"query":"performance test","limit":10}' \
     -w "\n%{time_total}\n"
   ```
3. **Tag search operation:**
   ```bash
   time curl -s "http://127.0.0.1:8000/api/search/by-tag" \
     -H "Content-Type: application/json" \
     -d '{"tags":["perf-test"]}' \
     -w "\n%{time_total}\n"
   ```
4. **Delete operation:**
   ```bash
   HASH=$(curl -s "http://127.0.0.1:8000/api/search/by-tag" \
     -H "Content-Type: application/json" \
     -d '{"tags":["perf-test"]}' | jq -r '.memories[0].hash')

   time curl -s -X DELETE "http://127.0.0.1:8000/api/memories/$HASH" \
     -w "\n%{time_total}\n"
   ```

**Expected Results:**

- ✅ Store: <100ms
- ✅ Search: <200ms
- ✅ Tag search: <100ms
- ✅ Delete: <100ms

**Pass Criteria:**

- ✅ All operations <1 second
- ✅ HTTP 200 responses
- ✅ Correct response format
- ❌ FAIL if any operation >1 second
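
One-shot `curl` timings are noisy, so it can help to average the Test 6 operations over a few iterations. The sketch below is one way to do that against the store and search endpoints used above; it assumes the `requests` package is installed and is not part of the official benchmark tooling.

```python
import statistics
import time

import requests

API = "http://127.0.0.1:8000"
N = 20  # iterations per operation

store_times, search_times = [], []

for i in range(N):
    # Store a small memory and record wall-clock latency.
    t0 = time.perf_counter()
    requests.post(f"{API}/api/memories",
                  json={"content": f"Performance test memory {i}",
                        "tags": ["perf-test"]},
                  timeout=10).raise_for_status()
    store_times.append(time.perf_counter() - t0)

    # Semantic search over the same data.
    t0 = time.perf_counter()
    requests.post(f"{API}/api/search",
                  json={"query": "performance test", "limit": 10},
                  timeout=10).raise_for_status()
    search_times.append(time.perf_counter() - t0)

for name, samples in [("store", store_times), ("search", search_times)]:
    print(f"{name}: avg {statistics.mean(samples) * 1000:.1f} ms, "
          f"max {max(samples) * 1000:.1f} ms")
    # Pass criterion from Test 6: every operation under 1 second.
    assert max(samples) < 1.0, f"{name} exceeded the 1 s pass criterion"
```
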

---

## Tag Filtering Correctness

### Test 7: Exact Tag Matching (No False Positives)

**Context:** v8.13.0 fixed tag filtering to prevent false positives (e.g., "python" shouldn't match "python3")

**Setup:**

1. Clear database
2. Store memories with similar tags

**Execution:**

```bash
# Store test memories
curl -X POST http://127.0.0.1:8000/api/memories \
  -H "Content-Type: application/json" \
  -d '{"content":"Python programming","tags":["python"]}'

curl -X POST http://127.0.0.1:8000/api/memories \
  -H "Content-Type: application/json" \
  -d '{"content":"Python 3 features","tags":["python3"]}'

curl -X POST http://127.0.0.1:8000/api/memories \
  -H "Content-Type: application/json" \
  -d '{"content":"CPython internals","tags":["cpython"]}'

curl -X POST http://127.0.0.1:8000/api/memories \
  -H "Content-Type: application/json" \
  -d '{"content":"Jython compatibility","tags":["jython"]}'

# Search for exact tag "python"
curl -s "http://127.0.0.1:8000/api/search/by-tag" \
  -H "Content-Type: application/json" \
  -d '{"tags":["python"]}' | jq '.memories | length'
```

**Expected Results:**

- ✅ Searching "python" returns exactly 1 memory
- ✅ Does NOT return python3, cpython, jython
- ✅ Matching is exact, not substring-based

**Evidence Collection:**

```bash
# Test each tag variation
for tag in python python3 cpython jython; do
  echo "Testing tag: $tag"
  curl -s "http://127.0.0.1:8000/api/search/by-tag" \
    -H "Content-Type: application/json" \
    -d "{\"tags\":[\"$tag\"]}" | jq -r '.memories[].tags[]'
done
```

**Pass Criteria:**

- ✅ Each search returns only exact tag matches
- ✅ Zero false positives (substring matches)
- ✅ All 4 memories retrievable individually
- ❌ FAIL if any false positive occurs

---

### Test 8: Tag Index Usage (Performance)

**Context:** v8.13.0 added tag normalization with relational tables for O(log n) performance

**Setup:**

1. Database with 10,000+ memories
2. Verify migration completed: `python scripts/database/validate_migration.py`

**Execution:**

```bash
# Check query plan uses index
sqlite3 ~/Library/Application\ Support/mcp-memory/sqlite_vec.db <<EOF
EXPLAIN QUERY PLAN
SELECT DISTINCT m.* FROM memories m
JOIN memory_tags mt ON m.id = mt.memory_id
JOIN tags t ON mt.tag_id = t.id
WHERE t.name = 'test-tag';
EOF
```

**Expected Results:**

- ✅ Query plan shows `SEARCH` (using index)
- ✅ Query plan does NOT show `SCAN` (table scan)
- ✅ Tag search completes in <200ms even with 10K memories

**Evidence Collection:**

```bash
# Verify index exists
sqlite3 ~/Library/Application\ Support/mcp-memory/sqlite_vec.db \
  "SELECT name FROM sqlite_master WHERE type='index' AND name='idx_memory_tags_tag_id';"

# Benchmark tag search
time curl -s "http://127.0.0.1:8000/api/search/by-tag" \
  -H "Content-Type: application/json" \
  -d '{"tags":["test-tag"]}' -o /dev/null -w "%{time_total}\n"
```

**Pass Criteria:**

- ✅ Index exists and is used (SEARCH in query plan)
- ✅ Tag search <200ms with 10K+ memories
- ✅ Sub-linear scaling (2x data ≠ 2x time)
- ❌ FAIL if SCAN appears or >500ms with 10K memories
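
The Test 8 query-plan check can also be automated in Python, similar in spirit to the referenced `tests/unit/test_query_plan_validation.py` (this is not a copy of that test). A minimal sketch, assuming the macOS database path and the schema shown above; exact plan text varies by SQLite version, so adjust the assertions if needed.

```python
import sqlite3
from pathlib import Path

# Database path from the test setup (macOS); adjust for other platforms.
DB_PATH = Path.home() / "Library/Application Support/mcp-memory/sqlite_vec.db"

QUERY = """
EXPLAIN QUERY PLAN
SELECT DISTINCT m.* FROM memories m
JOIN memory_tags mt ON m.id = mt.memory_id
JOIN tags t ON mt.tag_id = t.id
WHERE t.name = 'test-tag';
"""

conn = sqlite3.connect(str(DB_PATH))
# EXPLAIN QUERY PLAN rows are (id, parent, notused, detail); join the detail text.
plan = " ".join(row[3] for row in conn.execute(QUERY))
conn.close()

print(plan)
# Pass: the tag lookup uses an index (SEARCH) rather than scanning the tag tables.
assert "SEARCH" in plan, "tag lookup is not using an index"
assert "SCAN tags" not in plan and "SCAN memory_tags" not in plan, "table scan detected"
```
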

---

## MCP Protocol Compliance

### Test 9: MCP Tool Schema Validation

**Context:** Ensure all MCP tools conform to protocol schema

**Setup:**

1. Start MCP server: `uv run memory server`
2. Use MCP Inspector: `npx @modelcontextprotocol/inspector uv run memory server`

**Execution:**

1. Connect with MCP Inspector
2. List all tools: `tools/list`
3. Validate each tool schema:
   - Required fields present (name, description, inputSchema)
   - Input schema is valid JSON Schema
   - All parameters documented

**Expected Results:**

- ✅ All 13 core tools listed
- ✅ Each tool has valid JSON Schema
- ✅ No schema validation errors
- ✅ Tool descriptions are concise (<300 tokens each)

**Evidence Collection:**

```bash
# Capture tools/list output
npx @modelcontextprotocol/inspector uv run memory server \
  --command "tools/list" > tools_schema.json

# Validate schema format
cat tools_schema.json | jq '.tools[] | {name, inputSchema}'
```

**Pass Criteria:**

- ✅ 13 tools exposed (consolidated from 26 in v8.13.0)
- ✅ All schemas valid JSON Schema Draft 07
- ✅ No missing required fields
- ❌ FAIL if any tool lacks proper schema

---

### Test 10: MCP Tool Execution

**Context:** Verify all tools execute correctly via MCP protocol

**Setup:**

1. MCP server running
2. MCP Inspector connected

**Execution:**

1. **Test store_memory:**
   ```json
   {
     "name": "store_memory",
     "arguments": {
       "content": "MCP protocol test memory",
       "tags": ["mcp-test", "protocol-validation"],
       "metadata": {"type": "test"}
     }
   }
   ```
2. **Test recall_memory:**
   ```json
   {
     "name": "recall_memory",
     "arguments": {
       "query": "last week",
       "n_results": 5
     }
   }
   ```
3. **Test search_by_tag:**
   ```json
   {
     "name": "search_by_tag",
     "arguments": {
       "tags": ["mcp-test"],
       "match_mode": "any"
     }
   }
   ```
4. **Test delete_by_tag:**
   ```json
   {
     "name": "delete_by_tag",
     "arguments": {
       "tags": ["mcp-test"],
       "match_mode": "all"
     }
   }
   ```

**Expected Results:**

- ✅ All tool calls return valid MCP responses
- ✅ No protocol errors or timeouts
- ✅ Response format matches tool schema
- ✅ Operations reflect in database

**Pass Criteria:**

- ✅ 4/4 tools execute successfully
- ✅ Responses valid JSON
- ✅ Database state matches operations
- ❌ FAIL if any tool returns error or invalid format
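
The schema checks from Test 9 can be automated against the captured `tools_schema.json`. A minimal sketch, assuming the capture has a top-level `tools` array (as the jq command in Test 9 suggests) and that the `jsonschema` package is installed:

```python
import json

from jsonschema import Draft7Validator  # pip install jsonschema

# tools_schema.json is the tools/list capture from Test 9's evidence collection.
with open("tools_schema.json") as f:
    tools = json.load(f)["tools"]

for tool in tools:
    # Every tool must expose the required MCP fields.
    for field in ("name", "description", "inputSchema"):
        assert field in tool, f"{tool.get('name', '?')}: missing {field}"
    # The input schema itself must be valid JSON Schema (Draft 07).
    Draft7Validator.check_schema(tool["inputSchema"])

print(f"validated {len(tools)} tool schemas")
```
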

---

## Test Execution Guide

### Running All Regression Tests

```bash
# 1. Set up test environment
export MCP_MEMORY_STORAGE_BACKEND=sqlite_vec
export MCP_MEMORY_SQLITE_PRAGMAS=busy_timeout=15000,journal_mode=WAL

# 2. Clear test data
python scripts/database/reset_database.py --confirm

# 3. Run automated tests
pytest tests/unit/test_exact_tag_matching.py
pytest tests/unit/test_query_plan_validation.py
pytest tests/unit/test_performance_benchmark.py

# 4. Run manual tests (follow each test's Execution section)
# - Document results in checklist format
# - Capture evidence (logs, screenshots, timing data)
# - Mark pass/fail for each test

# 5. Generate test report
python scripts/testing/generate_regression_report.py \
  --output docs/testing/regression-report-$(date +%Y%m%d).md
```

### Test Frequency

- **Pre-Release:** All regression tests MUST pass
- **Post-PR Merge:** Run affected test categories
- **Weekly:** Automated subset (performance, tag filtering)
- **Monthly:** Full regression suite

### Reporting Issues

If any test fails:

1. Create a GitHub issue with the label `regression`
2. Include test name, evidence, and reproduction steps
3. Link to the commit/PR that may have caused the regression
4. Add to release blockers if critical functionality is affected

---

## Appendix: Test Data Generation

### Create Large Test Dataset

```bash
# Generate 10,000 test memories for performance testing
python scripts/testing/generate_test_data.py \
  --count 10000 \
  --tags-per-memory 3 \
  --output test-data-10k.json

# Import into database
curl -X POST http://127.0.0.1:8000/api/memories/batch \
  -H "Content-Type: application/json" \
  -d @test-data-10k.json
```

### Cleanup Test Data

```bash
# Remove all test data by tag
curl -X POST http://127.0.0.1:8000/api/memories/delete-by-tag \
  -H "Content-Type: application/json" \
  -d '{"tags": ["test", "perf-test", "mcp-test", "hybrid-test", "switch-test"], "match_mode": "any"}'
```

---

**Last Updated:** 2025-11-05
**Version:** 1.0
**Related:** [Release Checklist](release-checklist.md), [PR Review Guide](pr-review-guide.md)
