SystemPrompt Coding Agent

testing-framework.md•8.52 KiB

# Testing Framework

## Overview

The SystemPrompt Coding Agent includes a comprehensive End-to-End (E2E) testing framework built specifically for testing MCP (Model Context Protocol) servers. The framework validates the complete flow of AI agent orchestration, from task creation through completion.

## Architecture

```
Test Runner
│
├── MCP Client
│   ├── HTTP Transport
│   └── Notification Handlers
│
├── Test Reporter
│   ├── HTML Reports
│   └── Markdown Reports
│
└── Test Utils
    ├── Environment Detection
    ├── Logging
    └── Assertions
```

## Core Components

### 1. **Test Runner**

Main test orchestration and execution.

**Features:**
- Sequential test execution
- Error handling and recovery
- Timeout management
- Result aggregation

### 2. **MCP Client Integration**

Full MCP protocol client for testing.

**Capabilities:**
- Tool invocation
- Resource reading
- Notification handling
- Progress tracking

### 3. **Test Reporter**

Comprehensive test reporting system.

**Output Formats:**
- **HTML Reports** - Interactive, styled reports
- **Markdown Reports** - Git-friendly text reports
- **Console Output** - Real-time test progress

### 4. **Test Utilities**

Helper functions and common patterns.

**Includes:**
- Environment configuration
- URL detection (local/tunnel)
- Assertion helpers
- Timing utilities

## Test Structure

### Basic Test Pattern

```typescript
async function testCreateTaskFlow(
  client: Client,
  reporter: TestReporter
): Promise<void> {
  // 1. Setup
  const timestamp = Date.now();
  const branchName = `e2e-test-${timestamp}`;

  // 2. Execute
  const result = await client.callTool({
    name: 'create_task',
    arguments: {
      tool: 'CLAUDECODE',
      branch: branchName,
      instructions: 'Create hello.html'
    }
  });

  // 3. Verify
  if (result.content?.[0]?.text?.includes('created')) {
    reporter.addSuccess('Task created successfully');
  } else {
    reporter.addError('Task creation failed');
  }

  // 4. Cleanup
  // taskId: the ID of the task created above; how it is read from
  // `result` is not shown in this example.
  await client.callTool({
    name: 'end_task',
    arguments: { task_id: taskId }
  });
}
```

### Notification Handling

```typescript
// Set up notification handlers
client.setNotificationHandler(
  ResourceUpdatedNotificationSchema,
  async (notification) => {
    const { uri } = notification.params;

    // React to task updates
    if (uri.startsWith('task://')) {
      const resource = await client.readResource({ uri });
      const task = JSON.parse(resource.contents[0].text);

      // Track progress
      reporter.addLog(
        taskId,
        `Status: ${task.status}, Progress: ${task.progress}%`
      );
    }
  }
);
```

## Running Tests

### Local Testing

```bash
# Run against local server
npm run test:e2e
```

### Tunnel Testing

```bash
# Terminal 1: Start server with tunnel
npm run tunnel

# Terminal 2: Run tests against tunnel
npm run test:tunnel
```

### Environment Variables

```bash
# .env configuration
MCP_BASE_URL=http://localhost:3000  # Override base URL
TUNNEL_MODE=true                    # Enable tunnel detection
TEST_TIMEOUT=120000                 # Test timeout (ms)
```

## Test Reports

### HTML Report Features

- **Summary Dashboard** - Pass/fail statistics
- **Timeline View** - Execution timeline
- **Detailed Logs** - Step-by-step execution
- **Notification History** - All MCP notifications
- **Error Details** - Stack traces and context

### Report Location

```
e2e-test/typescript/test-reports/
├── report-2024-12-20T10-30-45.html
├── report-2024-12-20T10-30-45.md
└── latest.html -> report-2024-12-20T10-30-45.html
```

## Writing New Tests

### 1. Create Test Function

```typescript
async function testNewFeature(
  client: Client,
  reporter: TestReporter
): Promise<void> {
  const test = reporter.startTest('New Feature Test');

  try {
    // Your test logic here
    test.pass('Feature works correctly');
  } catch (error) {
    // `error` is typed `unknown` under strict TypeScript settings
    const message = error instanceof Error ? error.message : String(error);
    test.fail(`Feature failed: ${message}`);
    throw error;
  }
}
```

### 2. Add to Test Suite

```typescript
// In test-e2e.ts
const tests = [
  testCreateTaskFlow,
  testNewFeature, // Add your test
  // ... other tests
];
```

### 3. Use Test Utilities

```typescript
import {
  createMCPClient,
  log,
  sleep,
  waitForCondition
} from './utils/test-utils.js';

// Wait for task completion
await waitForCondition(
  async () => {
    const task = await getTask(taskId);
    return task.status === 'completed';
  },
  { timeout: 60000, interval: 2000 }
);
```

## Best Practices

### 1. **Test Isolation**
- Use unique branch names with timestamps
- Clean up resources after tests
- Don't depend on previous test state

### 2. **Timeout Management**
- Set appropriate timeouts for AI operations
- Use shorter timeouts for quick operations
- Implement retry logic for flaky operations

### 3. **Assertion Strategy**
- Verify both success responses and side effects
- Check resource states match expectations
- Validate notification sequences

### 4. **Error Handling**
- Catch and report all errors
- Include context in error messages
- Clean up even on failure

### 5. **Reporting**
- Log all significant events
- Include timing information
- Capture notification data

## Common Test Scenarios

### 1. Task Creation and Completion

```typescript
// Create task
const createResult = await client.callTool({
  name: 'create_task',
  arguments: {
    tool: 'CLAUDECODE',
    instructions: 'Implement authentication'
  }
});

// Wait for completion
await waitForTaskCompletion(client, taskId);

// Verify results
const task = await client.readResource({
  uri: `task://${taskId}`
});
```

### 2. Progress Monitoring

```typescript
// Track progress updates
const progressUpdates: number[] = [];

client.setNotificationHandler(
  ResourceUpdatedNotificationSchema,
  (notif) => {
    if (notif.params.uri === `task://${taskId}`) {
      const task = JSON.parse(/* ... */);
      progressUpdates.push(task.progress);
    }
  }
);

// Verify progress increments
expect(progressUpdates).toEqual([0, 25, 50, 75, 100]);
```

### 3. Error Scenarios

```typescript
// Test invalid inputs
const errorResult = await client.callTool({
  name: 'create_task',
  arguments: {
    tool: 'INVALID_TOOL',
    instructions: 'Test error'
  }
});

expect(errorResult.isError).toBe(true);
expect(errorResult.content[0].text).toContain('error');
```

## Debugging Tests

### Enable Verbose Logging

```typescript
// Set debug level
process.env.LOG_LEVEL = 'debug';

// Add custom logging
log.debug('Task state:', taskState);
log.info('Notification received:', notification);
```

### Inspect MCP Traffic

```typescript
// Log all MCP requests/responses
client.on('request', (req) => {
  console.log('MCP Request:', JSON.stringify(req, null, 2));
});

client.on('response', (res) => {
  console.log('MCP Response:', JSON.stringify(res, null, 2));
});
```

### Save Test Artifacts

```typescript
import fs from 'node:fs';

// Save task details for debugging
const taskDetails = await client.readResource({
  uri: `task://${taskId}`
});

fs.writeFileSync(
  `test-artifacts/task-${taskId}.json`,
  taskDetails.contents[0].text
);
```

## CI/CD Integration

### GitHub Actions Example

```yaml
name: E2E Tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3

      - name: Setup Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '18'

      - name: Install dependencies
        run: npm ci

      - name: Start server
        run: |
          docker-compose up -d
          npm run wait-for-ready

      - name: Run E2E tests
        run: npm run test:e2e
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}

      - name: Upload reports
        if: always()
        uses: actions/upload-artifact@v3
        with:
          name: test-reports
          path: e2e-test/typescript/test-reports/
```

## Performance Testing

### Measure Operation Times

```typescript
const timer = reporter.startTimer('Task Creation');

const result = await client.callTool({
  name: 'create_task',
  arguments: { /* ... */ }
});

timer.end();
reporter.addMetric('task_creation_time', timer.duration);
```

### Load Testing

```typescript
// Record the start time so throughput can be computed afterwards
const start = Date.now();

// Parallel task creation
const tasks = await Promise.all(
  Array(10).fill(0).map((_, i) =>
    client.callTool({
      name: 'create_task',
      arguments: {
        branch: `load-test-${i}`,
        instructions: 'Simple task'
      }
    })
  )
);

// Measure throughput
const elapsedSeconds = (Date.now() - start) / 1000;
reporter.addMetric('tasks_per_second', 10 / elapsedSeconds);
```
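
## Example: Polling Helper Sketch

The "Use Test Utilities" section calls `waitForCondition` with a `{ timeout, interval }` options object, but the helper's implementation is not part of this document. The block below is a minimal sketch of how such a polling helper might be written. The `WaitOptions` interface, the local `sleep` function, and the error message are assumptions made for illustration, and the real helper in `./utils/test-utils.js` may behave differently.

```typescript
// Hypothetical sketch only: not the project's actual test-utils implementation.
interface WaitOptions {
  timeout: number;  // maximum total wait, in milliseconds
  interval: number; // delay between polls, in milliseconds
}

// Resolve after `ms` milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Poll `condition` until it resolves to true, or reject once the next
// poll would fall after the deadline. The condition is always checked
// at least once, even with a timeout of 0.
async function waitForCondition(
  condition: () => Promise<boolean>,
  { timeout, interval }: WaitOptions
): Promise<void> {
  const deadline = Date.now() + timeout;
  while (true) {
    if (await condition()) {
      return;
    }
    if (Date.now() + interval > deadline) {
      throw new Error(`Condition not met within ${timeout} ms`);
    }
    await sleep(interval);
  }
}
```

Under these assumptions the helper is called exactly as in "Use Test Utilities": an async predicate plus `{ timeout: 60000, interval: 2000 }`. Rejecting on timeout, rather than returning `false`, lets a test fail through the ordinary `try`/`catch` reporting path.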
