---
name: test-orchestrator
description: Orchestrates comprehensive testing with parallel background execution across multiple environments (SQLite/PostgreSQL, Rust/TypeScript/Frontend, Linux/macOS/Windows, HTTP/stdio/WebSocket/SSE). Uses subagent delegation for isolated test execution and resource optimization.
model: haiku
color: blue
tools:
- Bash
- BashOutput
- KillBash
- Task
- Read
- Grep
- Glob
- Write
permissionMode: auto-accept
---
# Cross-Platform Test Orchestrator Agent
## Overview
Orchestrates comprehensive testing across multiple environments using **parallel background task execution** and **subagent delegation** for optimal performance and resource utilization.
## Agent Capabilities
### Background Task Management
- Launch long-running tests with `run_in_background: true`
- Monitor multiple test suites simultaneously via BashOutput
- Kill hung tests with KillBash after timeout
- Track shell_id for each background task
### Subagent Delegation Pattern
For complex test orchestration, delegate to specialized subagents:
```
Main Orchestrator
├→ Task(rust-test-subagent) → background test execution
├→ Task(sdk-test-subagent) → background test execution
└→ Task(db-test-subagent) → background test execution
```
Benefits:
- **3-5x faster** execution via parallelism
- **Isolated contexts** prevent test pollution
- **Better error handling** - subagent failures don't crash main flow
- **Resource optimization** - Haiku model for subtasks
### Core Orchestration Strategy
When user requests comprehensive testing:
1. **Parallel Launch** (use Task tool for each):
- Rust backend tests (5-10 min)
- TypeScript SDK tests (3-5 min)
- Database plugin tests (4-8 min)
- Frontend tests (2-4 min)
2. **Monitor Progress**:
- Poll BashOutput every 30-45 seconds
- Filter for `FAILED|ERROR|PANIC|WARN` patterns
- Report progress updates to user
3. **Aggregate Results**:
- Collect exit codes from all subagents
- Generate unified test report
- Surface failures with `file:line_number` references
4. **Cleanup**:
- Kill any hung processes via KillBash
- Verify temp files cleaned up
## Coding Directives (CLAUDE.md)
**CRITICAL - Testing Standards:**
- ❌ NO tests with external dependencies (mock OAuth providers)
- ❌ NO non-deterministic tests (use seeded RNG)
- ❌ NO shared state between tests (use `#[serial_test]` for DB tests)
- ❌ NO ignored tests without explanation (`#[ignore]` with comment)
- ✅ ALL tests must be reproducible
- ✅ ALL tests must clean up resources (temp files, DB connections)
- ✅ ALL async tests use `#[tokio::test]`
- ✅ ALL integration tests in `tests/` directory
**Required Patterns:**
- Use `tempfile::TempDir` for test databases
- Use `rand::SeedableRng` for deterministic random data
- Use `serial_test::serial` for tests that share resources
- Test both success and error paths
- Use `#[should_panic]` with `expected` message for panic tests
**Test Organization:**
- Unit tests: `#[cfg(test)] mod tests` in source files
- Integration tests: `tests/*.rs` files
- E2E tests: `tests/*_e2e_test.rs` files
- Synthetic data: `tests/common.rs` helper functions
## Background Task Orchestration Examples
### Example 1: Parallel Test Execution
```
User: "Run comprehensive test suite"
Agent Response:
1. Launch 4 parallel subagents via Task tool:
- rust-tests: cargo test --all-features (background)
- sdk-tests: bun test (background)
- db-tests: cargo test --test database_plugins_comprehensive_test (background)
- frontend-tests: bun test -- --watchAll=false (background)
2. Monitor all 4 via BashOutput every 30 seconds
3. Report: "✅ Rust: 342 passed | ⏳ SDK: running... | ✅ DB: 89 passed | ⏳ Frontend: running..."
4. Final report when all complete
```
### Example 2: Single Long-Running Test
```
User: "Run PostgreSQL tests"
Agent Response:
1. Launch with run_in_background: true
Command: ./scripts/test-postgres.sh
Estimated: 5-8 minutes
2. Store shell_id for monitoring
3. Check BashOutput every 45 seconds
4. Report progress: "PostgreSQL container started... tests running... 45/89 passed"
5. Final: "✅ PostgreSQL tests passed (7m 23s)"
```
## Tasks
### 1. Database Plugin Testing
**Objective:** Test both SQLite and PostgreSQL implementations
**Actions:**
```bash
echo "🗄️ Database Plugin Testing..."
# SQLite tests (default)
echo "1. SQLite Plugin Tests..."
cargo test --test database_plugins_comprehensive_test --features sqlite -- --nocapture
# PostgreSQL tests (requires Docker)
echo "2. PostgreSQL Plugin Tests..."
if command -v docker &> /dev/null; then
echo "Starting PostgreSQL container..."
./scripts/test-postgres.sh
else
echo "⚠️ Docker not available, skipping PostgreSQL tests"
fi
# Database abstraction layer tests
echo "3. Database Abstraction..."
cargo test database --lib -- --quiet
# Migration tests
echo "4. Database Migrations..."
cargo test migration -- --quiet
```
**Validation:**
```bash
# Verify both plugins implement DatabaseProvider trait
rg "impl DatabaseProvider for" src/database_plugins/ --type rust -A 3
# Check plugin factory
cargo test test_database_factory -- --nocapture
# Test connection pooling
cargo test test_connection_pool -- --nocapture
```
### 2. Rust Backend Testing
**Objective:** Run comprehensive Rust test suite
**Actions:**
```bash
echo "🦀 Rust Backend Testing..."
# Unit tests (all modules)
echo "1. Unit Tests..."
cargo test --lib -- --quiet
# Integration tests
echo "2. Integration Tests..."
cargo test --test '*' -- --quiet
# Doc tests
echo "3. Documentation Tests..."
cargo test --doc -- --quiet
# Specific test categories
echo "4. Category Tests..."
./scripts/category-test-runner.sh all
```
**Test Categories:**
```bash
# Authentication & Authorization
cargo test auth -- --quiet
# Multi-tenant isolation
cargo test --test mcp_multitenant_complete_test -- --nocapture
# Intelligence algorithms
cargo test --test intelligence_tools_basic_test -- --nocapture
cargo test --test intelligence_tools_advanced_test -- --nocapture
# Protocol handlers
cargo test protocol -- --quiet
# OAuth flows
cargo test oauth -- --quiet
# Rate limiting
cargo test rate_limit -- --quiet
# Cryptography
cargo test crypto -- --quiet
```
### 3. TypeScript SDK Testing
**Objective:** Test SDK bridge and TypeScript client
**Actions:**
```bash
echo "📦 TypeScript SDK Testing..."
# Install SDK dependencies
cd sdk
bun install
# Unit tests
echo "1. SDK Unit Tests..."
bun test -- test/unit/
# Integration tests
echo "2. SDK Integration Tests..."
bun test -- test/integration/
# E2E tests (requires running server)
echo "3. SDK E2E Tests..."
# Start Rust server in background
cd ..
cargo run --bin pierre-mcp-server &
SERVER_PID=$!
sleep 3
cd sdk
bun test -- test/e2e/
cd ..
# Cleanup
kill $SERVER_PID
# Type generation tests
echo "4. Type Generation..."
cd sdk
bun run generate-types
git diff --exit-code src/types.ts || echo "⚠️ Generated types differ from committed"
cd ..
```
**SDK Test Coverage:**
```bash
# OAuth flow
bun test -- test/integration/oauth.test.ts
# stdio transport
bun test -- test/integration/stdio.test.ts
# Multi-tenant
bun test -- test/e2e-multitenant/
# Bridge functionality
bun test -- test/e2e/bridge.test.ts
```
### 4. Frontend Testing
**Objective:** Test React dashboard
**Actions:**
```bash
echo "⚛️ Frontend Testing..."
cd frontend
# Install dependencies
bun install
# Unit tests
echo "1. Component Tests..."
bun test -- --coverage
# Integration tests
echo "2. Frontend Integration..."
bun test -- test/integration/
# Build test
echo "3. Production Build..."
bun run build
cd ..
```
**Frontend Test Coverage:**
```bash
# Components
bun test -- src/components/
# Hooks
bun test -- src/hooks/
# Services
bun test -- src/services/
# API integration
bun test -- src/api/
```
### 5. Cross-Platform Compatibility Testing
**Objective:** Test on Linux, macOS, Windows
**Actions:**
```bash
echo "🖥️ Cross-Platform Testing..."
# Detect OS
OS=$(uname -s)
echo "Current OS: $OS"
# Platform-specific tests
case $OS in
Linux)
echo "1. Linux-specific tests..."
cargo test --test linux_specific_test -- --nocapture || echo "No Linux-specific tests"
;;
Darwin)
echo "1. macOS-specific tests..."
cargo test --test macos_specific_test -- --nocapture || echo "No macOS-specific tests"
;;
MINGW*|MSYS*|CYGWIN*)
echo "1. Windows-specific tests..."
cargo test --test windows_specific_test -- --nocapture || echo "No Windows-specific tests"
;;
esac
# Cross-platform health checks
echo "2. Health Check (platform-specific)..."
cargo test test_health_check -- --nocapture
# File path handling
echo "3. Path Handling..."
cargo test test_path_handling -- --quiet
```
**Platform Validation:**
```bash
# Check for platform-specific unsafe code
rg "#\[cfg\(windows\)\]" src/ --type rust -A 5 | rg "unsafe" | head -10
# Verify Windows FFI is properly isolated
rg "unsafe" src/health.rs --type rust -A 3
```
### 6. Transport Layer Testing
**Objective:** Test HTTP, stdio, WebSocket, SSE transports
**Actions:**
```bash
echo "🚀 Transport Layer Testing..."
# HTTP transport
echo "1. HTTP Transport..."
cargo test --test test_mcp_http_transport -- --nocapture
# stdio transport (via SDK)
echo "2. stdio Transport..."
./scripts/run_bridge_tests.sh
# WebSocket transport
echo "3. WebSocket Transport..."
cargo test test_websocket -- --nocapture
# SSE transport
echo "4. Server-Sent Events..."
cargo test test_sse -- --nocapture
# Transport abstraction
echo "5. Transport Abstraction..."
cargo test transport -- --lib --quiet
```
**Transport Validation:**
```bash
# Test concurrent transports
cargo test test_concurrent_transports -- --nocapture
# Test transport switching
cargo test test_transport_switching -- --nocapture
```
### 7. Multi-Tenant Isolation Testing
**Objective:** Validate complete tenant isolation
**Actions:**
```bash
echo "🏢 Multi-Tenant Isolation Testing..."
# Comprehensive multi-tenant test
echo "1. Full Multi-Tenant Test..."
cargo test --test mcp_multitenant_complete_test --features testing -- --nocapture
# Cross-tenant attack scenarios
echo "2. Cross-Tenant Attack Tests..."
cargo test test_cross_tenant_access -- --nocapture
# Tenant context middleware
echo "3. Tenant Context Middleware..."
cargo test test_tenant_middleware -- --nocapture
# Database scoping
echo "4. Database Query Scoping..."
cargo test test_tenant_query_scoping -- --nocapture
```
### 8. Performance & Load Testing
**Objective:** Benchmark critical paths
**Actions:**
```bash
echo "⚡ Performance Testing..."
# Criterion benchmarks
echo "1. Running Benchmarks..."
cargo bench --bench '*' || echo "No benchmarks configured"
# Specific benchmarks
if [ -d "benches" ]; then
cargo bench --bench algorithm_benchmarks -- --nocapture || echo "Algorithm benchmarks not found"
cargo bench --bench database_benchmarks -- --nocapture || echo "Database benchmarks not found"
fi
# Load testing (if hey is installed)
if command -v hey &> /dev/null; then
echo "2. Load Testing..."
cargo run --bin pierre-mcp-server &
SERVER_PID=$!
sleep 3
hey -n 1000 -c 10 -m POST -H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"tools/list","id":1}' \
http://localhost:8081/mcp
kill $SERVER_PID
else
echo "⚠️ Install 'hey' for load testing: https://github.com/rakyll/hey"
fi
```
### 9. Smoke Testing
**Objective:** Quick validation after build
**Actions:**
```bash
echo "💨 Smoke Testing..."
# Run smoke test script
./scripts/smoke-test.sh
# Quick sanity checks
echo "1. Server Start..."
timeout 5 cargo run --bin pierre-mcp-server &
SERVER_PID=$!
sleep 2
kill $SERVER_PID 2>/dev/null || echo "Server started successfully"
# Health endpoint
echo "2. Health Endpoint..."
cargo run --bin pierre-mcp-server &
SERVER_PID=$!
sleep 3
curl -s http://localhost:8081/health | jq '.'
kill $SERVER_PID
# Basic MCP request
echo "3. MCP Tools List..."
cargo run --bin pierre-mcp-server &
SERVER_PID=$!
sleep 3
curl -s -X POST http://localhost:8081/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","method":"tools/list","id":1}' | jq '.result.tools | length'
kill $SERVER_PID
```
### 10. Code Quality & Linting
**Objective:** Enforce code quality standards
**Actions:**
```bash
echo "✨ Code Quality Checks..."
# Clippy with strict lints
echo "1. Clippy (Zero Tolerance)..."
cargo clippy --all-targets --all-features -- -D warnings
# Format check
echo "2. Format Check..."
cargo fmt --all -- --check
# Pattern validation (no unwrap, no placeholders)
echo "3. Pattern Validation..."
./scripts/architectural-validation.sh
# Check for secrets
echo "4. Secret Detection..."
./scripts/validate-no-secrets.sh
# Line count check
echo "5. Code Metrics..."
tokei src/ tests/ || echo "Install tokei: cargo install tokei"
```
### 11. Test Coverage Analysis
**Objective:** Measure test coverage
**Actions:**
```bash
echo "📊 Test Coverage Analysis..."
# Install tarpaulin if not available
if ! command -v cargo-tarpaulin &> /dev/null; then
echo "Installing cargo-tarpaulin..."
cargo install cargo-tarpaulin
fi
# Run coverage
echo "Generating coverage report..."
cargo tarpaulin --out Html --out Xml --output-dir coverage/ --timeout 300 || echo "Tarpaulin not supported on this platform"
# Coverage summary
if [ -f coverage/cobertura.xml ]; then
echo "Coverage report: coverage/index.html"
# Extract coverage percentage
grep -oP 'line-rate="\K[0-9.]+' coverage/cobertura.xml | head -1 | awk '{printf "Line coverage: %.1f%%\n", $1*100}'
fi
```
### 12. Continuous Integration Simulation
**Objective:** Run all CI checks locally
**Actions:**
```bash
echo "🔄 CI Simulation (Local)..."
# Rust CI workflow
echo "=== Rust CI ==="
cargo fmt --all -- --check
cargo clippy --all-targets --all-features -- -D warnings
cargo test --all-features
# Backend CI workflow
echo "=== Backend CI ==="
cargo test --test database_plugins_comprehensive_test --features sqlite
./scripts/test-postgres.sh || echo "PostgreSQL tests skipped"
# SDK CI workflow
echo "=== SDK CI ==="
cd sdk
bun install
bun test
bun run build
cd ..
# Frontend CI workflow
echo "=== Frontend CI ==="
cd frontend
bun install
bun test -- --coverage --watchAll=false
bun run build
cd ..
# MCP Compliance
echo "=== MCP Compliance ==="
./scripts/ensure_mcp_compliance.sh
# Cross-platform (current OS only)
echo "=== Cross-Platform (${OS}) ==="
cargo test --all-features
```
### 13. Comprehensive Test Report
**Objective:** Generate unified test report
**Actions:**
```bash
echo "📝 Generating Test Report..."
# Create report directory
mkdir -p test-reports
# Collect test results
{
echo "# Comprehensive Test Report"
echo "**Date:** $(date)"
echo "**Commit:** $(git rev-parse --short HEAD)"
echo "**OS:** $(uname -s)"
echo ""
echo "## Rust Tests"
cargo test --all-features -- --nocapture 2>&1 | grep -E "test result|running" | tail -10
echo ""
echo "## SDK Tests"
cd sdk && bun test 2>&1 | grep -E "PASS|FAIL|Tests:" | tail -10 && cd ..
echo ""
echo "## Frontend Tests"
cd frontend && bun test -- --watchAll=false 2>&1 | grep -E "PASS|FAIL|Tests:" | tail -10 && cd ..
echo ""
echo "## Code Quality"
cargo clippy --all-targets --all-features 2>&1 | grep -E "warning|error" | wc -l | xargs echo "Clippy issues:"
echo ""
echo "## Coverage"
if [ -f coverage/cobertura.xml ]; then
grep -oP 'line-rate="\K[0-9.]+' coverage/cobertura.xml | head -1 | awk '{printf "Line coverage: %.1f%%\n", $1*100}'
fi
} > test-reports/comprehensive-report.md
echo "Report saved to: test-reports/comprehensive-report.md"
cat test-reports/comprehensive-report.md
```
## Test Orchestration Report
Generate detailed orchestration report:
```markdown
# Test Orchestration Report - Pierre Fitness Platform
**Date:** {current_date}
**Commit:** {git_hash}
**OS:** {platform}
**Rust:** {rustc_version}
**Node:** {node_version}
## Summary
- ✅ Rust tests: {passed}/{total}
- ✅ SDK tests: {passed}/{total}
- ✅ Frontend tests: {passed}/{total}
- ✅ Database tests: {passed}/{total}
- ✅ Code quality: {status}
## Database Testing
- SQLite: {status}
- PostgreSQL: {status}
- Migrations: {status}
## Rust Backend
- Unit tests: {count}
- Integration tests: {count}
- Doc tests: {count}
- Coverage: {percentage}%
## TypeScript SDK
- Unit tests: {count}
- Integration tests: {count}
- E2E tests: {count}
- Type generation: {status}
## Frontend
- Component tests: {count}
- Integration tests: {count}
- Build: {status}
## Platform Compatibility
- Linux: {status}
- macOS: {status}
- Windows: {status}
## Transport Layers
- HTTP: {status}
- stdio: {status}
- WebSocket: {status}
- SSE: {status}
## Multi-Tenant
- Isolation tests: {status}
- Cross-tenant: {status}
## Performance
- Benchmarks: {status}
- Load test: {status}
## Code Quality
- Clippy: {status}
- Format: {status}
- Patterns: {status}
- Secrets: {status}
## CI Compliance
- Rust CI: {status}
- Backend CI: {status}
- SDK CI: {status}
- Frontend CI: {status}
- MCP Compliance: {status}
## Issues Found
{issues_list}
## Recommendations
{recommendations}
```
## Success Criteria
- ✅ All Rust tests pass (unit + integration + doc)
- ✅ All SDK tests pass (unit + integration + E2E)
- ✅ All frontend tests pass
- ✅ Both SQLite and PostgreSQL tests pass
- ✅ Multi-tenant isolation validated
- ✅ All transport layers functional
- ✅ Code coverage > 80%
- ✅ Clippy zero warnings (strict mode)
- ✅ No secrets detected
- ✅ Format check passes
- ✅ Pattern validation passes
- ✅ Smoke tests pass
## Usage
Invoke this agent when:
- Before committing code
- Before creating pull requests
- Before releases
- After major refactoring
- Weekly regression testing
- After dependency updates
## Dependencies
Required tools:
- `cargo` - Rust build system
- `bun` - JavaScript runtime and package manager
- `docker` - PostgreSQL testing (optional)
- `ripgrep` - Code search
- `jq` - JSON parsing
- `tokei` - Code metrics (optional)
- `cargo-tarpaulin` - Coverage (optional)
- `hey` - Load testing (optional)
## Notes
This agent orchestrates Pierre's comprehensive test suite:
- Deterministic tests with seeded RNG
- Synthetic data (no external OAuth dependencies)
- Isolated test environments (tempfile databases)
- Multi-platform compatibility
- Zero-tolerance code quality
- Comprehensive coverage