Rechtsinformationen Bund DE MCP Server

TESTING.md•8.6 KiB

# Testing Guide for rechtsinformationen-bund-de-mcp This document describes the testing infrastructure and how to run different types of tests. ## Test Suite Overview We have four types of tests: 1. **Golden Tests** (`test`) - 17 real-world legal query scenarios 2. **Generic Tests** (`test:generic`) - Direct MCP protocol testing with 9 scenarios 3. **E2E Tests** (`test:e2e`) - End-to-end testing with mock AI agent (12 scenarios) 4. **API Tests** (`test:api`) - Direct API connectivity testing ## Quick Start ```bash # Run all tests npm run test:all # Run individual test suites npm test # Golden tests (17 scenarios, ~30s) npm run test:generic # MCP protocol tests (9 scenarios, ~20s) npm run test:e2e # E2E mock agent tests (12 scenarios, ~60s) npm run test:api # API connectivity test # Run tests with coverage npm run test:coverage ``` ## E2E Testing with Mock Agent ### What It Tests The E2E test suite (`tests/e2e-mock-agent.test.js`) simulates real AI agent behavior: - **Tool Selection**: Verifies agent always starts with `intelligente_rechtssuche` - **Citation Quality**: Checks that responses include properly formatted markdown links - **Answer Completeness**: Validates responses contain required legal references - **URL Formatting**: Ensures URLs are user-friendly (not API endpoints) - **Stopping Behavior**: Confirms recursion limits are respected (max 5 tool calls) ### Test Scenarios 12 scenarios covering: - **Simple law lookups** (BGB, StGB) - Easy - **Paragraph searches** (§ 242 StGB) - Medium - **Complex legal questions** (Mieterhöhung, Kündigung) - Medium - **Court decisions** (Dashcam ruling) - Medium - **Social law** (SGB XII) - Medium - **English translations** (Employee rights) - Medium - **Multi-law queries** (Data protection) - Hard - **Known limitations** (GG, SGB I) - Hard (tests graceful failure) ### Running E2E Tests ```bash # Standard run npm run test:e2e # Verbose output (shows all console logs) npm run test:e2e:verbose # Watch mode (re-runs on file changes) npm run test:e2e:watch # Run single scenario NODE_OPTIONS='--experimental-vm-modules' jest -t "simple_law_lookup_bgb" ``` ### Understanding E2E Test Output ``` 🤖 MockAgent processing query: "Was ist das BGB?" 🔍 Phase 1: Primary search with intelligente_rechtssuche 📞 Tool call 1: intelligente_rechtssuche({"query":"BGB","limit":5}...) ✅ Tool call 1 completed in 1234ms ✅ Primary search looks good (5 results) - no follow-up needed ✅ MockAgent completed in 1450ms with 1 tool calls 📊 Agent Statistics: Tool calls: 1 Duration: 1450ms Stopped: complete Tools used: intelligente_rechtssuche 🧪 Assertion Results: Passed: ✅ Errors: 0 Warnings: 0 ``` ## Test Architecture ### Helper Classes **MockMCPClient** (`tests/helpers/MockMCPClient.js`): - Spawns MCP server as child process - Handles stdio communication - Provides clean async API for tool calls - Reusable across test suites **MockAgent** (`tests/helpers/MockAgent.js`): - Simulates AI agent decision-making - Rule-based tool selection (not ML) - Follows agent instructions: - Always use `intelligente_rechtssuche` first - Max 5 tool calls (configurable) - Conservative vs aggressive modes **Assertion Helpers** (`tests/helpers/assertions.js`): - `assertToolSequence()` - Tool call validation - `assertCitationQuality()` - Markdown link checking - `assertAnswerCompleteness()` - Response validation - `assertUrlFormatting()` - URL format verification - `assertStoppingBehavior()` - Recursion limit checks ### Test Fixtures **E2E Scenarios** (`tests/fixtures/e2e-scenarios.json`): ```json { "id": "simple_law_lookup_bgb", "query": "Was ist das BGB?", "expected": { "max_tool_calls": 2, "must_use_tools": ["intelligente_rechtssuche"], "must_include_in_response": ["Bürgerliches Gesetzbuch", "BGB"], "must_have_citations": true, "min_citations": 1 } } ``` ## Adding New Test Scenarios 1. Add scenario to `tests/fixtures/e2e-scenarios.json`: ```json { "id": "your_scenario_id", "category": "simple|complex|case-law|etc", "difficulty": "easy|medium|hard", "query": "Your legal question here?", "expected": { "max_tool_calls": 3, "must_use_tools": ["intelligente_rechtssuche"], "optional_tools": ["deutsche_gesetze_suchen"], "must_include_in_response": ["Required", "Terms"], "should_include_in_response": ["Recommended", "Terms"], "must_have_citations": true, "min_citations": 1, "must_cite_sources": ["BGB", "StGB"] }, "tags": ["your-tags"] } ``` 2. Run the test: ```bash npm run test:e2e ``` ## Environment Variables For AI-powered tests (future feature), copy `.env.example` to `.env`: ```bash cp .env.example .env ``` Edit `.env` and add your OpenRouter API key: ```env OPEN_ROUTER_API=your-api-key-here MODEL=google/gemini-2.5-flash ``` **Note**: Current E2E tests use a mock agent (no API key required). ## Test Configuration ### Jest Configuration (`jest.config.js`) - **Environment**: Node.js - **Timeout**: 30 seconds per test - **Coverage**: Optional (run with `--coverage`) - **ES Modules**: Enabled via `NODE_OPTIONS` ### Customizing Mock Agent ```javascript import { MockAgent } from './helpers/MockAgent.js'; // Conservative agent (fewer tool calls) const agent = new MockAgent(client, { maxToolCalls: 3, temperature: 0.1, // Very conservative }); // Aggressive agent (more thorough searching) const agent = new MockAgent(client, { maxToolCalls: 5, temperature: 0.7, // More exploratory }); ``` ## Debugging Tests ### Verbose Logging ```bash # E2E tests with full output npm run test:e2e:verbose # Jest debugging NODE_OPTIONS='--experimental-vm-modules --inspect-brk' jest ``` ### Common Issues **Issue**: `Cannot find module 'X'` - **Solution**: Ensure you're using `NODE_OPTIONS='--experimental-vm-modules'` **Issue**: Tests timeout - **Solution**: Increase timeout in test file: `test(..., 60000)` (60 seconds) **Issue**: MCP server doesn't start - **Solution**: Run `npm run build` first to compile TypeScript **Issue**: API errors (500) - **Solution**: rechtsinformationen.bund.de API may be rate limiting or down ## Coverage Reports Generate coverage report: ```bash npm run test:coverage ``` View HTML report: ```bash open coverage/index.html ``` Coverage thresholds (aspirational): - Branches: 50% - Functions: 50% - Lines: 50% - Statements: 50% ## Continuous Integration Currently no CI/CD pipeline. To add GitHub Actions: 1. Create `.github/workflows/test.yml`: ```yaml name: Tests on: [push, pull_request] jobs: test: runs-on: ubuntu-latest steps: - uses: actions/checkout@v3 - uses: actions/setup-node@v3 with: node-version: '20' - run: npm install - run: npm run build - run: npm test - run: npm run test:generic - run: npm run test:e2e ``` ## Performance Benchmarks Typical test run times (on M1 Mac): - **Golden Tests**: ~30 seconds (17 scenarios) - **Generic Tests**: ~20 seconds (9 scenarios) - **E2E Tests**: ~60 seconds (12 scenarios) - **All Tests**: ~2 minutes Tool call latency: - Local API: 500-2000ms per call - 500ms rate limiting delay between calls ## Best Practices 1. **Always run tests before committing**: `npm run test:all` 2. **Add tests for new features**: Update `e2e-scenarios.json` 3. **Check coverage**: Run `npm run test:coverage` periodically 4. **Update expected results**: When database coverage changes 5. **Document known limitations**: Use `note` field in scenarios ## Troubleshooting ### Test Failures If E2E tests fail: 1. Check if MCP server built: `npm run build` 2. Verify API connectivity: `npm run test:api` 3. Check for database changes: Some laws may have been added/removed 4. Review scenario expectations: May need updating ### Performance Issues If tests are slow: 1. Check internet connection (API calls) 2. Reduce number of scenarios: Comment out in `e2e-scenarios.json` 3. Increase rate limit delay: May be hitting API limits 4. Run specific tests: `jest -t "scenario_name"` ## Resources - [Jest Documentation](https://jestjs.io/docs/getting-started) - [MCP Protocol Specification](https://spec.modelcontextprotocol.io) - [rechtsinformationen.bund.de API](https://docs.rechtsinformationen.bund.de) ## Contributing When adding tests: 1. Follow existing naming conventions 2. Add descriptive `tags` to scenarios 3. Include `note` field for known limitations 4. Update this documentation if adding new test types 5. Ensure all tests pass before PR ## Support For issues or questions: - Check existing tests for examples - Review helper class documentation - Open an issue with failing test output - Include environment details (Node version, OS)

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/wolfgangihloff/rechtsinformationen-bund-de-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

TESTING.md•8.6 KiB