# End-to-End Test Plan: SSO MCP Server
**Version**: 1.2 | **Date**: 2025-12-15 | **Status**: Active
**Maintained by**: Development Team | **Last Reviewed**: 2025-12-15
**Note**: This document defines the end-to-end testing strategy, test scenarios, and execution plan for the SSO MCP Server. It ensures comprehensive coverage of critical user journeys and system integration points for all server capabilities (checklists, processes, and future tools).
---
## Document Control
| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | 2025-12-11 | Development Team | Initial E2E test plan |
| 1.1 | 2025-12-13 | Development Team | Add Process Query feature (003-process-query) test scenarios |
| 1.2 | 2025-12-15 | Development Team | Update title to reflect multi-function server |
**Related Documents**:
- Architecture: `docs/architecture.md`
- Ground Rules: `memory/ground-rules.md`
- Feature Specifications:
- `specs/001-mcp-sso-checklist/spec.md` - Checklist feature
- `specs/003-process-query/spec.md` - Process Query feature
- Standards: `docs/standards.md`
---
## Table of Contents
1. [Introduction](#1-introduction)
2. [Test Strategy](#2-test-strategy)
3. [Test Scope](#3-test-scope)
4. [User Journeys](#4-user-journeys)
5. [Test Scenarios](#5-test-scenarios)
6. [Test Data Management](#6-test-data-management)
7. [Test Environments](#7-test-environments)
8. [Test Framework & Tools](#8-test-framework--tools)
9. [Test Architecture](#9-test-architecture)
10. [Execution Plan](#10-execution-plan)
11. [Reporting & Metrics](#11-reporting--metrics)
12. [Maintenance & Improvement](#12-maintenance--improvement)
13. [Appendices](#13-appendices)
---
## 1. Introduction
### 1.1 Purpose
This document establishes the comprehensive end-to-end (E2E) testing strategy for the SSO MCP Server. E2E tests validate that the entire system works correctly as an integrated whole, from MCP protocol communication through Azure authentication to resource retrieval (checklists, processes, and future tools) from local files.
### 1.2 Goals
- **Validate critical user journeys** across Azure SSO authentication and MCP tool operations for checklists and processes
- **Ensure system integration** between MCP server, Azure Entra ID, and local file system
- **Verify business workflows** function correctly in developer environments
- **Detect integration issues** early, especially around OAuth 2.0 PKCE flow
- **Provide confidence** for releases through automated regression testing
- **Document expected behavior** for AI assistant integrations (GitHub Copilot, Claude Code)
### 1.3 Audience
- Developers implementing and maintaining the MCP server
- QA Engineers validating authentication, checklist, and process tool functionality
- DevOps Engineers setting up test infrastructure and CI/CD
- Users configuring MCP server in their development environment
### 1.4 System Overview
**Product Description**: Local MCP server providing software development checklists and process documentation to AI coding assistants with Azure Entra ID SSO authentication.
**Key Components** (from architecture.md):
- **MCP Server**: FastMCP with HTTP Streamable transport (port 8080)
- **Auth Module**: MSAL Python for OAuth 2.0 PKCE authentication
- **Checklist Module**: Local file system with YAML frontmatter parsing
- **Process Module**: Local file system with YAML frontmatter parsing + keyword search
- **Token Cache**: Encrypted local storage via msal-extensions (~/.sso-mcp-server/token_cache.bin)
- **External Integration**: Azure Entra ID for authentication
**Reference Architecture**: See `docs/architecture.md` for detailed system architecture.
---
## 2. Test Strategy
### 2.1 Testing Approach
**E2E Test Philosophy**:
E2E tests in this project focus on:
- ✅ **Critical user paths**: SSO authentication, checklist retrieval, checklist listing, process retrieval, process listing, process search
- ✅ **System integration points**: MCP protocol ↔ Azure Entra ID ↔ File system (checklists + processes)
- ✅ **Business workflows**: Complete developer interaction flows
- ✅ **User-visible behavior**: MCP tool responses and error messages
- ❌ **NOT unit-level logic** (covered by unit tests)
- ❌ **NOT component-level details** (covered by integration tests)
**Test Pyramid Position**:
```
        /\        ← E2E Tests (Few, slow, system integration)
       /  \       ← Integration Tests (More, faster, module integration)
      /____\      ← Unit Tests (Many, fast, focused)
```
E2E tests provide the **highest confidence** that authentication, checklist, and process tool calls work end-to-end.
### 2.2 Testing Types
**Primary E2E Testing**:
- **API-driven tests**: Validate MCP protocol interactions via HTTP
- **Authentication flow tests**: Test complete OAuth 2.0 PKCE browser-based login
- **Checklist file system validation**: Verify checklist reading and caching
- **Process file system validation**: Verify process reading, listing, and search
- **Token lifecycle tests**: Validate token persistence and refresh
**Supplementary Testing** (within E2E scope):
- **Session duration**: 8-hour session without re-authentication
- **Error handling**: Actionable error messages for all failure scenarios
- **Configuration**: Environment variable and MCP config validation
- **Search functionality**: Process keyword search with relevance ranking
### 2.3 Test Levels
| Level | Focus | Examples | Execution Frequency |
|-------|-------|----------|---------------------|
| **Smoke Tests** | Critical happy paths | Auth + get_checklist + get_process | Every commit |
| **Regression Tests** | Core features | All MCP tools + token refresh + process search | Daily/nightly |
| **Full Suite** | Comprehensive coverage | All scenarios + edge cases | Weekly/pre-release |
| **8-Hour Session Test** | Long session validation | Mocked time progression | Pre-release |
### 2.4 Entry and Exit Criteria
**Entry Criteria** (when to run E2E tests):
- [ ] Unit tests pass ≥ 90%
- [ ] Integration tests pass ≥ 85%
- [ ] Test environment is available (Azure test tenant configured)
- [ ] Test checklist and process files prepared
**Exit Criteria** (when E2E test run is complete):
- [ ] Smoke tests pass 100%
- [ ] Regression tests pass ≥ 95%
- [ ] No P0 (critical) failures
- [ ] All test reports generated
- [ ] Failed tests triaged and documented
---
## 3. Test Scope
### 3.1 In Scope
**✅ What IS Covered**:
1. **Authentication Flows**:
- Browser-based Azure SSO login (OAuth 2.0 PKCE)
- Token persistence across server restarts
- Silent re-authentication with valid cached tokens
- Token refresh before expiration (<5 minutes remaining)
- Re-authentication when refresh fails
2. **MCP Checklist Tool Operations**:
- `get_checklist` tool with valid checklist name
- `get_checklist` tool with invalid checklist name (error handling)
- `list_checklists` tool returning all available checklists
- Tool calls without authentication (auth trigger)
3. **MCP Process Tool Operations** (003-process-query):
- `get_process` tool with valid process name
- `get_process` tool with case-insensitive name matching
- `get_process` tool with invalid process name (error handling)
- `list_processes` tool returning all available processes
- `search_processes` tool with keyword search
- Search result relevance ranking (title > description > content)
- Search result limit (max 50 results)
- Empty search results handling
4. **System Integration**:
- MCP HTTP Streamable protocol communication
- Azure Entra ID token exchange
- Local file system checklist reading
- Local file system process reading (separate directory)
- YAML frontmatter metadata extraction
5. **Configuration Validation**:
- Environment variable loading
- Port configuration (default 8080, custom via MCP_PORT)
- Checklist directory configuration (CHECKLIST_DIR)
- Process directory configuration (PROCESS_DIR, default: ./processes)
- Azure credential validation
### 3.2 Out of Scope
**❌ What IS NOT Covered** (handled by other test types):
1. **Unit-Level Testing**:
- Individual parser function testing → **unit tests**
- MSAL library internals → **unit tests**
- Search engine relevance algorithm → **unit tests**
2. **Component Integration**:
- Auth module internals → **integration tests**
- Checklist service internals → **integration tests**
- Process service internals → **integration tests**
3. **Non-Functional Testing**:
- Load testing with many concurrent users → **performance tests**
- Security penetration testing → **security tests**
4. **Third-Party Services**:
- Azure Entra ID internal behavior
- MSAL library correctness
### 3.3 Testing Boundaries
```mermaid
graph LR
A[AI Assistant<br/>Copilot/Claude] -->|MCP HTTP Request| B[MCP Server<br/>Port 8080]
B --> C[Auth Module<br/>MSAL]
C -->|OAuth 2.0 PKCE| D[Azure Entra ID<br/>Mocked in Tests]
B --> E[Checklist Module]
B --> F[Process Module]
E -->|Read Files| G[Checklist Dir<br/>Test Fixtures]
F -->|Read Files| H[Process Dir<br/>Test Fixtures]
C -->|Read/Write| I[Token Cache<br/>Temp Directory]
style A fill:#e1f5ff
style D fill:#fff3cd
style G fill:#d4edda
style H fill:#d4edda
style I fill:#d4edda
```
**Test Entry Point**: MCP HTTP endpoint (localhost:8080/mcp)
**Test Exit Point**: MCP response validation + file system state
**External Dependencies**: Azure Entra ID (mocked in automated tests, real in manual tests)
---
## 4. User Journeys
### 4.1 User Personas
| Persona | Role | Primary Goals |
|---------|------|---------------|
| **Developer** | Software developer using AI assistant | Access checklists and processes via AI assistant, follow quality standards and procedures |
| **First-Time User** | New developer setting up MCP server | Configure server, complete SSO, verify connection |
| **Returning User** | Developer with existing session | Seamlessly resume work without re-authentication |
### 4.2 Critical User Journeys - Authentication
#### Journey 1: First-Time Authentication - Priority: P0
**Persona**: First-Time User
**Business Value**: Enable secure access to checklists and processes; foundation for all other functionality
**Frequency**: Once per machine/profile, then on token expiration
**Happy Path**:
```mermaid
graph LR
A[Start Server] --> B[No Cached Tokens]
B --> C[Browser Opens<br/>Azure Login]
C --> D[Enter Credentials]
D --> E[Auth Success]
E --> F[Token Cached]
F --> G[Server Ready]
style A fill:#d4edda
style G fill:#d4edda
```
**Steps**:
1. **Step 1**: User starts MCP server (`uv run sso-mcp-server`)
- Expected: Server begins initialization
2. **Step 2**: System detects no cached tokens
- Expected: Browser opens automatically with Azure login page
3. **Step 3**: User enters Azure Entra ID credentials
- Expected: Azure validates credentials, redirects with auth code
4. **Step 4**: Server exchanges code for tokens
- Expected: Access and refresh tokens obtained, encrypted and cached
5. **Step 5**: Server becomes ready
- Expected: Log message "Server ready", HTTP endpoint available
**Alternative Paths**:
- **Alt 1**: User cancels browser authentication → Server logs error, allows retry
- **Alt 2**: Network interruption during auth → Clear error message, retry option
**Failure Scenarios**:
- **Fail 1**: Invalid credentials → Azure error displayed, no tokens cached
- **Fail 2**: Azure service unavailable → Timeout with actionable error message
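The authentication in steps 2-4 maps onto the real MSAL Python API. A minimal sketch follows, with placeholder client/tenant IDs and an assumed `User.Read` scope (the project's actual scopes come from its Azure app registration):
```python
# Sketch only: placeholder credentials and scope, not this project's values.
import msal

app = msal.PublicClientApplication(
    client_id="<AZURE_CLIENT_ID>",
    authority="https://login.microsoftonline.com/<AZURE_TENANT_ID>",
)

result = None
accounts = app.get_accounts()
if accounts:
    # Journey 4 path: silent auth from the cached account, no browser.
    result = app.acquire_token_silent(["User.Read"], account=accounts[0])
if not result:
    # Steps 2-4: opens the system browser; MSAL performs the OAuth 2.0 PKCE
    # exchange and returns access/refresh tokens for caching.
    result = app.acquire_token_interactive(scopes=["User.Read"])
```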
---
#### Journey 4: Returning User (Silent Re-auth) - Priority: P1
**Persona**: Returning User
**Business Value**: Seamless experience, no repeated logins
**Frequency**: Every server restart
**Happy Path**:
```mermaid
graph LR
A[Start Server] --> B[Load Cached<br/>Tokens]
B --> C[Validate Token<br/>Expiry]
C --> D{Token Valid?}
D -->|Yes| E[Server Ready<br/>No Browser]
D -->|Expired| F[Silent Refresh<br/>via MSAL]
F -->|Success| E
F -->|Fail| G[Browser Auth<br/>Required]
style A fill:#d4edda
style E fill:#d4edda
```
---
### 4.3 Critical User Journeys - Checklists
#### Journey 2: Retrieve Checklist - Priority: P0
**Persona**: Developer
**Business Value**: Core value proposition - accessing quality checklists
**Frequency**: Multiple times per day
**Happy Path**:
```mermaid
graph LR
A[AI Assistant<br/>Tool Call] --> B[MCP Server<br/>Receives Request]
B --> C[Auth Check<br/>Valid Token]
C --> D[Read Checklist<br/>File]
D --> E[Parse Frontmatter]
E --> F[Return Content<br/>to AI Assistant]
style A fill:#d4edda
style F fill:#d4edda
```
**Steps**:
1. **Step 1**: AI assistant calls `get_checklist` with name "coding"
- Expected: MCP server receives HTTP request
2. **Step 2**: Server validates authentication
- Expected: Token valid, request proceeds
3. **Step 3**: Server reads `checklists/coding.md`
- Expected: File content loaded
4. **Step 4**: Server parses YAML frontmatter
- Expected: Name and description extracted
5. **Step 5**: Server returns checklist content
- Expected: JSON response with name, description, content
**Failure Scenarios**:
- **Fail 1**: Checklist not found → Error with available checklists list
- **Fail 2**: Not authenticated → Auth middleware triggers authentication
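Step 4's frontmatter extraction can be sketched in a few lines. This illustrative parser assumes PyYAML; it is not necessarily the server's actual implementation:
```python
# Illustrative frontmatter split for step 4; assumes PyYAML.
import yaml

def parse_frontmatter(text: str) -> tuple[dict, str]:
    """Split a markdown document into (metadata, body)."""
    if text.startswith("---"):
        _, raw_meta, body = text.split("---", 2)
        return yaml.safe_load(raw_meta) or {}, body.lstrip("\n")
    return {}, text  # no frontmatter: callers fall back to filename defaults
```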
---
#### Journey 3: List Available Checklists - Priority: P1
**Persona**: Developer
**Business Value**: Discovery of available quality standards
**Frequency**: Occasionally (when exploring available checklists)
**Happy Path**:
```mermaid
graph LR
A[AI Assistant<br/>Tool Call] --> B[MCP Server]
B --> C[Auth Check]
C --> D[Scan Directory]
D --> E[Parse All Frontmatter]
E --> F[Return Metadata<br/>Array]
style A fill:#d4edda
style F fill:#d4edda
```
**Steps**:
1. **Step 1**: AI assistant calls `list_checklists`
2. **Step 2**: Server validates authentication
3. **Step 3**: Server scans checklist directory for .md files
4. **Step 4**: Server parses frontmatter from each file
5. **Step 5**: Server returns array of {name, description} with count
---
### 4.4 Critical User Journeys - Processes (003-process-query)
#### Journey 5: Retrieve Process - Priority: P0
**Persona**: Developer
**Business Value**: Access to development process documentation for following correct procedures
**Frequency**: Multiple times per day during development activities
**Happy Path**:
```mermaid
graph LR
A[AI Assistant<br/>Tool Call] --> B[MCP Server<br/>Receives Request]
B --> C[Auth Check<br/>Valid Token]
C --> D[Read Process<br/>File]
D --> E[Parse Frontmatter]
E --> F[Return Content<br/>to AI Assistant]
style A fill:#d4edda
style F fill:#d4edda
```
**Steps**:
1. **Step 1**: AI assistant calls `get_process` with name "code-review"
- Expected: MCP server receives HTTP request
2. **Step 2**: Server validates authentication
- Expected: Token valid, request proceeds
3. **Step 3**: Server reads `processes/code-review.md` (case-insensitive matching)
- Expected: File content loaded
4. **Step 4**: Server parses YAML frontmatter
- Expected: Name and description extracted
5. **Step 5**: Server returns process content
- Expected: JSON response with name, description, content
**Failure Scenarios**:
- **Fail 1**: Process not found → Error with available processes list (FR-015)
- **Fail 2**: Not authenticated → Auth middleware triggers authentication (FR-013)
---
#### Journey 6: List Available Processes - Priority: P1
**Persona**: Developer
**Business Value**: Discovery of available development procedures
**Frequency**: Occasionally (when exploring available processes)
**Happy Path**:
```mermaid
graph LR
A[AI Assistant<br/>Tool Call] --> B[MCP Server]
B --> C[Auth Check]
C --> D[Scan Process<br/>Directory]
D --> E[Parse All Frontmatter]
E --> F[Return Metadata<br/>Array]
style A fill:#d4edda
style F fill:#d4edda
```
**Steps**:
1. **Step 1**: AI assistant calls `list_processes`
2. **Step 2**: Server validates authentication
3. **Step 3**: Server scans process directory for .md files
4. **Step 4**: Server parses frontmatter from each file
5. **Step 5**: Server returns array of {name, description} with count
---
#### Journey 7: Search Processes by Keyword - Priority: P1
**Persona**: Developer
**Business Value**: Find relevant processes without knowing exact names
**Frequency**: Often (when looking for procedures related to a topic)
**Happy Path**:
```mermaid
graph LR
A[AI Assistant<br/>Tool Call] --> B[MCP Server]
B --> C[Auth Check]
C --> D[Load All<br/>Processes]
D --> E[Search Keyword<br/>in name/desc/content]
E --> F[Rank by<br/>Relevance]
F --> G[Return Top 50<br/>Results]
style A fill:#d4edda
style G fill:#d4edda
```
**Steps**:
1. **Step 1**: AI assistant calls `search_processes` with keyword "deployment"
2. **Step 2**: Server validates authentication
3. **Step 3**: Server loads all process files
4. **Step 4**: Server searches keyword across name, description, and content (FR-010)
5. **Step 5**: Server ranks results by relevance (title matches > content matches) (FR-011)
6. **Step 6**: Server returns up to 50 matching processes with metadata (FR-012a)
**Failure Scenarios**:
- **Fail 1**: No matches found → Clear message "No processes matched the search criteria"
- **Fail 2**: Not authenticated → Auth middleware triggers authentication
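Steps 4-6 describe a weighted search. A minimal sketch of such a scorer follows, where the `ProcessDoc` shape and the weight values are assumptions for illustration (the actual relevance algorithm is unit-tested, per §3.2):
```python
# Hypothetical relevance scorer for FR-010/FR-011/FR-012a; field weights
# and the ProcessDoc shape are assumptions, not the implemented algorithm.
from dataclasses import dataclass

@dataclass
class ProcessDoc:
    name: str
    description: str
    content: str

def rank_processes(docs: list[ProcessDoc], keyword: str, limit: int = 50) -> list[ProcessDoc]:
    """Score title matches above description matches above content matches."""
    kw = keyword.lower()

    def score(doc: ProcessDoc) -> int:
        s = 0
        if kw in doc.name.lower():
            s += 100  # title match ranks highest (FR-011)
        if kw in (doc.description or "").lower():
            s += 10   # description match ranks next
        if kw in doc.content.lower():
            s += 1    # content match ranks lowest
        return s

    matches = [d for d in docs if score(d) > 0]
    matches.sort(key=score, reverse=True)
    return matches[:limit]  # FR-012a: cap at 50 results
```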
---
### 4.5 Journey Priority Matrix
| Priority | Definition | Journeys | Test Frequency |
|----------|------------|----------|----------------|
| **P0 - Critical** | Core functionality, blocks all usage | Auth, Get Checklist, Get Process | Every build |
| **P1 - High** | Important features, user convenience | List Checklists, List Processes, Search Processes, Silent Re-auth | Daily |
| **P2 - Medium** | Secondary features | Token refresh, Error handling, Dynamic discovery | Weekly |
| **P3 - Low** | Edge cases | Malformed files, Concurrent requests, Empty directories | Pre-release |
---
## 5. Test Scenarios
### 5.1 Scenario Structure
Each test scenario follows this structure:
```
Scenario: [Descriptive name]
Priority: [P0/P1/P2/P3]
Tags: [smoke, regression, critical, etc.]
Given [precondition/setup]
When [user action]
Then [expected result]
And [additional validation]
```
### 5.2 Critical Scenarios (P0) - Authentication
#### Scenario 5.2.1: Server Start with No Cached Tokens
**Priority**: P0
**Tags**: `smoke`, `authentication`, `critical`
**User Journey**: First-Time Authentication
**Estimated Duration**: 5-10 seconds (automated), 30 seconds (with browser)
**Preconditions**:
- Token cache does not exist or is empty
- Azure credentials (CLIENT_ID, TENANT_ID) configured
- CHECKLIST_DIR and PROCESS_DIR point to valid directories
**Test Steps**:
```gherkin
Given no token cache exists at ~/.sso-mcp-server/token_cache.bin
And environment variables AZURE_CLIENT_ID, AZURE_TENANT_ID, CHECKLIST_DIR are set
When the MCP server starts
Then server logs "No cached tokens, authentication required"
And server attempts to open system browser (mocked in automated tests)
And server listens on configured port (default 8080)
```
**Expected Results**:
- ✓ Server starts within 5 seconds (SC-007)
- ✓ Browser authentication triggered (or mocked)
- ✓ HTTP endpoint becomes available
- ✓ Log messages indicate authentication status
---
#### Scenario 5.2.2: Server Start with Valid Cached Tokens
**Priority**: P0
**Tags**: `smoke`, `authentication`, `critical`
**User Journey**: Returning User (Silent Re-auth)
**Preconditions**:
- Token cache exists with valid (non-expired) tokens
- Environment variables configured
**Test Steps**:
```gherkin
Given token cache exists with tokens expiring in 1 hour
And environment variables are properly configured
When the MCP server starts
Then server loads cached tokens silently
And server logs "Silent authentication successful"
And server becomes ready without opening browser
And server responds to MCP requests
```
**Expected Results**:
- ✓ No browser window opens
- ✓ Server ready in <5 seconds
- ✓ Token loaded from encrypted cache
- ✓ MCP tools respond to requests
---
### 5.3 Critical Scenarios (P0) - Checklists
#### Scenario 5.3.1: Get Checklist with Valid Name
**Priority**: P0
**Tags**: `smoke`, `mcp-tool`, `critical`, `checklist`
**User Journey**: Retrieve Checklist
**Preconditions**:
- Server is authenticated
- Checklist file `coding.md` exists in CHECKLIST_DIR
**Test Steps**:
```gherkin
Given server is authenticated and ready
And checklist file "coding.md" exists with YAML frontmatter
When client sends MCP request to get_checklist with name="coding"
Then server returns HTTP 200 response
And response contains name="coding"
And response contains description from frontmatter
And response contains markdown content from file
And request completes in <2 seconds (SC-002)
```
**Expected Results**:
- ✓ MCP response with correct structure
- ✓ Name matches requested checklist
- ✓ Description extracted from YAML frontmatter
- ✓ Content contains full markdown (excluding frontmatter)
- ✓ Response time <2 seconds
**Sample Test Data**:
```yaml
# checklists/coding.md
---
name: Coding Standards Checklist
description: Quality checklist for code implementation
---
# Coding Standards
## Naming
- [ ] Variables use descriptive names
```
---
#### Scenario 5.3.2: Get Checklist with Invalid Name
**Priority**: P0
**Tags**: `smoke`, `mcp-tool`, `negative`, `critical`, `checklist`
**User Journey**: Retrieve Checklist (error path)
**Preconditions**:
- Server is authenticated
- Checklist "nonexistent" does not exist
**Test Steps**:
```gherkin
Given server is authenticated and ready
And no checklist named "nonexistent" exists
When client sends MCP request to get_checklist with name="nonexistent"
Then server returns MCP error response
And error code is "CHECKLIST_NOT_FOUND"
And error message contains "nonexistent"
And error message lists available checklists
```
**Expected Results**:
- ✓ Error response (not success)
- ✓ Error code matches contract: CHECKLIST_NOT_FOUND
- ✓ Message is actionable (lists alternatives)
---
#### Scenario 5.3.3: List Checklists Returns All Available
**Priority**: P0
**Tags**: `smoke`, `mcp-tool`, `critical`, `checklist`
**User Journey**: List Available Checklists
**Preconditions**:
- Server is authenticated
- Multiple checklist files exist in CHECKLIST_DIR
**Test Steps**:
```gherkin
Given server is authenticated and ready
And checklist directory contains: coding.md, architecture.md, detailed-design.md
When client sends MCP request to list_checklists
Then server returns HTTP 200 response
And response.checklists is array of length 3
And each checklist has name and description fields
And response.count equals 3
And request completes in <1 second (SC-003)
```
**Expected Results**:
- ✓ All 3 checklists returned
- ✓ Each has name from frontmatter or filename
- ✓ Each has description (may be null if not in frontmatter)
- ✓ Count matches array length
- ✓ Response time <1 second
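For reference, steps 3-5 of this journey reduce to a directory scan like the following sketch (helper names are illustrative; the filename fallback and nullable description match the expected results above):
```python
# Illustrative directory scan behind list_checklists; names are assumed.
import yaml
from pathlib import Path

def _frontmatter(text: str) -> dict:
    if text.startswith("---"):
        _, raw_meta, _ = text.split("---", 2)
        return yaml.safe_load(raw_meta) or {}
    return {}

def list_checklists(checklist_dir: Path) -> dict:
    """Return {checklists: [{name, description}, ...], count: N}."""
    items = []
    for path in sorted(checklist_dir.glob("*.md")):
        meta = _frontmatter(path.read_text(encoding="utf-8"))
        items.append({
            "name": meta.get("name", path.stem),     # fall back to filename
            "description": meta.get("description"),  # may be null
        })
    return {"checklists": items, "count": len(items)}
```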
---
### 5.4 Critical Scenarios (P0) - Processes
#### Scenario 5.4.1: Get Process with Valid Name
**Priority**: P0
**Tags**: `smoke`, `mcp-tool`, `critical`, `process`
**User Journey**: Retrieve Process
**Feature**: 003-process-query
**Preconditions**:
- Server is authenticated
- Process file `code-review.md` exists in PROCESS_DIR
**Test Steps**:
```gherkin
Given server is authenticated and ready
And process file "code-review.md" exists with YAML frontmatter
When client sends MCP request to get_process with name="code-review"
Then server returns HTTP 200 response
And response contains name="code-review"
And response contains description from frontmatter
And response contains markdown content from file
And request completes in <2 seconds (process SC-001)
```
**Expected Results**:
- ✓ MCP response with correct structure
- ✓ Name matches requested process
- ✓ Description extracted from YAML frontmatter
- ✓ Content contains full markdown (excluding frontmatter)
- ✓ Response time <2 seconds
**Sample Test Data**:
```yaml
# processes/code-review.md
---
name: Code Review Process
description: Standard procedure for reviewing code changes
---
# Code Review Process
## Prerequisites
- [ ] PR has description
- [ ] Tests are passing
```
---
#### Scenario 5.4.2: Get Process with Case-Insensitive Matching
**Priority**: P0
**Tags**: `smoke`, `mcp-tool`, `critical`, `process`
**User Journey**: Retrieve Process
**Feature**: 003-process-query (FR-004)
**Preconditions**:
- Server is authenticated
- Process file `Code-Review.md` exists in PROCESS_DIR
**Test Steps**:
```gherkin
Given server is authenticated and ready
And process file "Code-Review.md" exists
When client sends MCP request to get_process with name="code-review"
Then server returns HTTP 200 response
And response contains the process content
And case-insensitive matching succeeds
```
**Expected Results**:
- ✓ Process found regardless of case in request
- ✓ Matching is case-insensitive per FR-004
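A minimal sketch of the FR-004 lookup, assuming the server resolves names by scanning the directory rather than constructing a path directly:
```python
# Illustrative case-insensitive resolution per FR-004; scanning strategy
# is an assumption about the implementation.
from pathlib import Path

def resolve_process_file(process_dir: Path, name: str) -> Path | None:
    """Find <name>.md in process_dir, ignoring case."""
    target = f"{name.lower()}.md"
    for path in process_dir.glob("*.md"):
        if path.name.lower() == target:
            return path
    return None  # caller raises PROCESS_NOT_FOUND
```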
---
#### Scenario 5.4.3: Get Process with Invalid Name
**Priority**: P0
**Tags**: `smoke`, `mcp-tool`, `negative`, `critical`, `process`
**User Journey**: Retrieve Process (error path)
**Feature**: 003-process-query (FR-014, FR-015)
**Preconditions**:
- Server is authenticated
- Process "nonexistent" does not exist
**Test Steps**:
```gherkin
Given server is authenticated and ready
And no process named "nonexistent" exists
When client sends MCP request to get_process with name="nonexistent"
Then server returns MCP error response
And error code is "PROCESS_NOT_FOUND"
And error message contains "nonexistent"
And error message lists available processes (FR-015)
```
**Expected Results**:
- ✓ Error response (not success)
- ✓ Error code: PROCESS_NOT_FOUND
- ✓ Message is actionable (lists available processes)
---
#### Scenario 5.4.4: List Processes Returns All Available
**Priority**: P0
**Tags**: `smoke`, `mcp-tool`, `critical`, `process`
**User Journey**: List Available Processes
**Feature**: 003-process-query (FR-006)
**Preconditions**:
- Server is authenticated
- Multiple process files exist in PROCESS_DIR
**Test Steps**:
```gherkin
Given server is authenticated and ready
And process directory contains: code-review.md, deployment.md, incident-response.md
When client sends MCP request to list_processes
Then server returns HTTP 200 response
And response.processes is array of length 3
And each process has name and description fields
And response.count equals 3
And request completes in <1 second (process SC-002)
```
**Expected Results**:
- ✓ All 3 processes returned
- ✓ Each has name from frontmatter or filename
- ✓ Each has description (may be null if not in frontmatter)
- ✓ Count matches array length
- ✓ Response time <1 second
---
### 5.5 High Priority Scenarios (P1)
#### Scenario 5.5.1: Tool Call Without Authentication
**Priority**: P1
**Tags**: `authentication`, `regression`
**User Journey**: N/A (error handling)
**Test Steps**:
```gherkin
Given server started but not yet authenticated
When client sends MCP request to get_checklist with name="coding"
Then server returns MCP error response
And error code is "NOT_AUTHENTICATED"
And error message provides guidance for authentication
```
**Expected Results**:
- ✓ Clear NOT_AUTHENTICATED error
- ✓ Actionable guidance in message
---
#### Scenario 5.5.2: Token Refresh Before Expiration
**Priority**: P1
**Tags**: `authentication`, `regression`, `session`
**User Journey**: 8-hour session
**Test Steps** (mocked time):
```gherkin
Given server is authenticated with token expiring in 4 minutes
When server checks token before tool call
Then server proactively refreshes token via MSAL
And new token is cached
And tool call proceeds without error
And no browser authentication required
```
**Expected Results**:
- ✓ Token refreshed silently when <5 min remaining
- ✓ No user interaction required
- ✓ Tool calls continue working
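A sketch of the proactive-refresh predicate this scenario exercises; the function name and threshold constant are illustrative, not the server's actual API:
```python
# Illustrative refresh-window check; names and structure are assumptions.
import time

REFRESH_THRESHOLD_SECONDS = 5 * 60  # refresh when <5 minutes remain

def needs_refresh(expires_at: float, now: float | None = None) -> bool:
    """True when the cached access token is inside the refresh window."""
    now = time.time() if now is None else now
    return (expires_at - now) < REFRESH_THRESHOLD_SECONDS
```
In the E2E test the clock is mocked (see freezegun in §8.2), so no real four-minute wait is needed.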
---
#### Scenario 5.5.3: Dynamic Checklist Discovery
**Priority**: P1
**Tags**: `mcp-tool`, `regression`, `checklist`
**User Journey**: List Available Checklists
**Test Steps**:
```gherkin
Given server is running and authenticated
And initial list_checklists returns 3 checklists
When new file "custom.md" is added to CHECKLIST_DIR
And client sends list_checklists request
Then response includes 4 checklists (including "custom")
And no server restart was required (FR-010)
```
---
#### Scenario 5.5.4: Dynamic Process Discovery
**Priority**: P1
**Tags**: `mcp-tool`, `regression`, `process`
**User Journey**: List Available Processes
**Feature**: 003-process-query (FR-007)
**Test Steps**:
```gherkin
Given server is running and authenticated
And initial list_processes returns 3 processes
When new file "new-process.md" is added to PROCESS_DIR
And client sends list_processes request
Then response includes 4 processes (including "new-process")
And no server restart was required (FR-007)
```
---
#### Scenario 5.5.5: Search Processes with Keyword Match
**Priority**: P1
**Tags**: `mcp-tool`, `regression`, `process`, `search`
**User Journey**: Search Processes by Keyword
**Feature**: 003-process-query (FR-009, FR-010, FR-011)
**Preconditions**:
- Server is authenticated
- Multiple process files exist with various content
**Test Steps**:
```gherkin
Given server is authenticated and ready
And process "deployment.md" contains "production deployment" in content
And process "release.md" has "deployment" in title
When client sends MCP request to search_processes with keyword="deployment"
Then server returns search results
And results include both "deployment" and "release" processes
And "release" (title match) ranks higher than "deployment" (content match) (FR-011)
And results include name, description for each match
And request completes in <3 seconds (process SC-003)
```
**Expected Results**:
- ✓ All matching processes returned
- ✓ Title matches ranked higher than content matches
- ✓ Response time <3 seconds
---
#### Scenario 5.5.6: Search Processes with No Matches
**Priority**: P1
**Tags**: `mcp-tool`, `regression`, `process`, `search`, `negative`
**User Journey**: Search Processes by Keyword (no results)
**Feature**: 003-process-query
**Test Steps**:
```gherkin
Given server is authenticated and ready
And no process contains the keyword "xyznonexistent"
When client sends MCP request to search_processes with keyword="xyznonexistent"
Then server returns empty results array
And response message indicates no matches found
```
---
#### Scenario 5.5.7: Search Processes Respects Result Limit
**Priority**: P1
**Tags**: `mcp-tool`, `regression`, `process`, `search`
**User Journey**: Search Processes by Keyword
**Feature**: 003-process-query (FR-012a)
**Preconditions**:
- Server is authenticated
- More than 50 process files exist with matching content
**Test Steps**:
```gherkin
Given server is authenticated and ready
And process directory contains 100 files all containing "process"
When client sends MCP request to search_processes with keyword="process"
Then server returns at most 50 results (FR-012a)
And results are ordered by relevance
```
---
### 5.6 Medium Priority Scenarios (P2)
| Scenario ID | Scenario Name | Tags | Journey |
|-------------|---------------|------|---------|
| 5.6.1 | Port Conflict Detection | `configuration`, `negative` | Setup |
| 5.6.2 | Custom Port Configuration | `configuration` | Setup |
| 5.6.3 | Missing CHECKLIST_DIR | `configuration`, `negative` | Setup |
| 5.6.4 | Missing PROCESS_DIR Uses Default | `configuration`, `process` | Setup |
| 5.6.5 | Malformed Checklist Frontmatter | `error-handling`, `checklist` | Get Checklist |
| 5.6.6 | Malformed Process Frontmatter | `error-handling`, `process` | Get Process |
| 5.6.7 | Empty Checklist Directory | `edge-case`, `checklist` | List Checklists |
| 5.6.8 | Empty Process Directory | `edge-case`, `process` | List Processes |
| 5.6.9 | Process Search Partial Matching | `process`, `search` | Search Processes |
---
#### Scenario 5.6.4: Missing PROCESS_DIR Uses Default
**Priority**: P2
**Tags**: `configuration`, `process`
**Feature**: 003-process-query (FR-017)
**Test Steps**:
```gherkin
Given PROCESS_DIR environment variable is not set
And ./processes directory exists with process files
When server starts
Then server uses ./processes as default directory (FR-017)
And list_processes returns processes from default directory
```
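The FR-017 fallback amounts to a one-line resolution. A sketch, with the helper name assumed (the PROCESS_DIR variable and `./processes` default come from this plan):
```python
# Illustrative default-directory resolution per FR-017.
import os
from pathlib import Path

def resolve_process_dir() -> Path:
    """PROCESS_DIR env var if set, otherwise ./processes."""
    return Path(os.environ.get("PROCESS_DIR", "./processes"))
```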
---
#### Scenario 5.6.6: Malformed Process Frontmatter Handling
**Priority**: P2
**Tags**: `error-handling`, `process`, `edge-case`
**Feature**: 003-process-query
**Test Steps**:
```gherkin
Given process file "malformed.md" has no YAML frontmatter
When client requests get_process with name="malformed"
Then server returns process content
And name defaults to filename "malformed"
And description is null or empty
```
---
#### Scenario 5.6.8: Empty Process Directory
**Priority**: P2
**Tags**: `edge-case`, `process`
**Feature**: 003-process-query
**Test Steps**:
```gherkin
Given process directory exists but is empty
When client sends list_processes request
Then server returns empty array with count=0
And response includes clear message about empty directory
```
---
#### Scenario 5.6.9: Process Search Partial Matching
**Priority**: P2
**Tags**: `process`, `search`
**Feature**: 003-process-query (FR-012)
**Test Steps**:
```gherkin
Given process "deployment.md" exists with content "production deployment steps"
When client searches with keyword="deploy"
Then "deployment.md" is included in results (partial match)
And FR-012 partial keyword matching is validated
```
---
### 5.7 Edge Cases & Negative Scenarios
#### Scenario 5.7.1: Server Start with Port Already in Use
**Priority**: P2
**Tags**: `negative`, `configuration`
**Test Steps**:
```gherkin
Given port 8080 is already in use by another process
When MCP server attempts to start on port 8080
Then server fails with clear error message
And error message indicates port is in use
And error message suggests using MCP_PORT environment variable
```
---
#### Scenario 5.7.2: Checklist File with Invalid Encoding
**Priority**: P3
**Tags**: `negative`, `edge-case`, `checklist`
**Test Steps**:
```gherkin
Given checklist file "broken.md" has invalid UTF-8 encoding
When client requests get_checklist with name="broken"
Then server returns FILE_READ_ERROR
And error message indicates encoding issue
And other checklists remain accessible
```
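Scenarios 5.7.2 and 5.7.3 (below) both hinge on per-request error isolation: a decoding failure in one file must not take down the others. A sketch of the expected handling, where only the FILE_READ_ERROR code comes from the contract and the exception plumbing is an assumption:
```python
# Illustrative per-request encoding failure handling; only the error code
# FILE_READ_ERROR comes from the documented contract.
from pathlib import Path

class FileReadError(Exception):
    """Maps to the FILE_READ_ERROR MCP error code."""
    code = "FILE_READ_ERROR"

def read_markdown(path: Path) -> str:
    try:
        return path.read_text(encoding="utf-8")
    except UnicodeDecodeError as exc:
        # Raised per request, so other checklists/processes stay accessible.
        raise FileReadError(f"{path.name} is not valid UTF-8: {exc}") from exc
```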
---
#### Scenario 5.7.3: Process File with Invalid Encoding
**Priority**: P3
**Tags**: `negative`, `edge-case`, `process`
**Test Steps**:
```gherkin
Given process file "broken.md" has invalid UTF-8 encoding
When client requests get_process with name="broken"
Then server returns FILE_READ_ERROR
And error message indicates encoding issue
And other processes remain accessible
```
---
#### Scenario 5.7.4: Process Directory Does Not Exist
**Priority**: P2
**Tags**: `negative`, `configuration`, `process`
**Feature**: 003-process-query
**Test Steps**:
```gherkin
Given PROCESS_DIR points to non-existent directory "/invalid/path"
When server starts or list_processes is called
Then server returns clear error message
And error indicates directory is missing or not configured
```
---
### 5.8 Cross-Feature Integration Scenarios
#### Scenario 5.8.1: Full Workflow - Auth to Checklist Retrieval
**Priority**: P0
**Tags**: `integration`, `smoke`, `critical`, `checklist`
**Features Involved**: Authentication, MCP Server, Checklist Module
**Workflow**:
```mermaid
sequenceDiagram
participant Client as AI Client
participant Server as MCP Server
participant Auth as Auth Module
participant Azure as Azure (Mocked)
participant FS as File System
Client->>Server: Start server
Server->>Auth: Check tokens
Auth->>Azure: Authenticate (mocked)
Azure-->>Auth: Token response
Auth-->>Server: Authenticated
Server-->>Client: Ready
Client->>Server: get_checklist("coding")
Server->>Auth: Validate token
Auth-->>Server: Valid
Server->>FS: Read coding.md
FS-->>Server: File content
Server-->>Client: Checklist response
```
**Test Steps**:
```gherkin
Given clean test environment with no cached tokens
And test checklist files in place
When server starts and authenticates (mocked Azure)
And client sends get_checklist("coding") request
Then authentication completes successfully
And checklist content is returned
And entire flow completes in <30 seconds (SC-001)
```
---
#### Scenario 5.8.2: Full Workflow - Auth to Process Retrieval
**Priority**: P0
**Tags**: `integration`, `smoke`, `critical`, `process`
**Features Involved**: Authentication, MCP Server, Process Module
**Feature**: 003-process-query
**Workflow**:
```mermaid
sequenceDiagram
participant Client as AI Client
participant Server as MCP Server
participant Auth as Auth Module
participant Azure as Azure (Mocked)
participant FS as File System
Client->>Server: Start server
Server->>Auth: Check tokens
Auth->>Azure: Authenticate (mocked)
Azure-->>Auth: Token response
Auth-->>Server: Authenticated
Server-->>Client: Ready
Client->>Server: get_process("code-review")
Server->>Auth: Validate token
Auth-->>Server: Valid
Server->>FS: Read code-review.md from PROCESS_DIR
FS-->>Server: File content
Server-->>Client: Process response
```
**Test Steps**:
```gherkin
Given clean test environment with no cached tokens
And test process files in place
When server starts and authenticates (mocked Azure)
And client sends get_process("code-review") request
Then authentication completes successfully
And process content is returned
And entire flow completes in <30 seconds
```
---
#### Scenario 5.8.3: Combined Checklist and Process Operations
**Priority**: P1
**Tags**: `integration`, `regression`
**Features Involved**: Authentication, Checklist Module, Process Module
**Test Steps**:
```gherkin
Given server is authenticated and ready
And checklist and process files exist
When client sends get_checklist("coding")
And client sends get_process("code-review")
And client sends list_checklists
And client sends list_processes
And client sends search_processes with keyword="review"
Then all 5 operations succeed
And checklists and processes are returned from separate directories
And search returns relevant processes
```
---
## 6. Test Data Management
### 6.1 Test Data Strategy
**Approach**: Fixture-based with temp directories for isolation
**Principles**:
- **Isolation**: Each test uses its own temp CHECKLIST_DIR, PROCESS_DIR, and TOKEN_CACHE
- **Repeatability**: Fixed fixture files produce consistent results
- **Cleanup**: Temp directories deleted after test completion
- **Privacy**: No real Azure credentials in test data
- **Realism**: Test checklists and processes mimic production format
### 6.2 Test Data Types
| Data Type | Source | Storage | Lifecycle |
|-----------|--------|---------|-----------|
| **Checklist Files** | Fixed fixtures | Temp directory | Created before test, deleted after |
| **Process Files** | Fixed fixtures | Temp directory | Created before test, deleted after |
| **Token Cache** | Mocked MSAL response | Temp file | Created during test, deleted after |
| **Azure Responses** | Mock data | In-memory | Per test |
| **Configuration** | Environment variables | Test setup | Per test session |
### 6.3 Test Data Generation
**Checklist Fixture Factory**:
```python
# tests/fixtures/checklist_factory.py
import tempfile
from pathlib import Path


def create_checklist(
    name: str = "test-checklist",
    description: str = "Test checklist for E2E tests",
    content: str = "- [ ] Test item",
) -> str:
    """Generate checklist file content with frontmatter."""
    return f"""---
name: {name}
description: {description}
---
# {name}
{content}
"""


def create_temp_checklist_dir(checklists: list[dict]) -> Path:
    """Create temp directory with checklist fixtures."""
    temp_dir = Path(tempfile.mkdtemp())
    for checklist in checklists:
        fields = dict(checklist)
        filename = fields.pop("filename")  # filename keys the file, not the factory
        filepath = temp_dir / f"{filename}.md"
        filepath.write_text(create_checklist(**fields))
    return temp_dir
```
**Process Fixture Factory**:
```python
# tests/fixtures/process_factory.py
import tempfile
from pathlib import Path


def create_process(
    name: str = "test-process",
    description: str = "Test process for E2E tests",
    content: str = "## Steps\n1. Step one\n2. Step two",
) -> str:
    """Generate process file content with frontmatter."""
    return f"""---
name: {name}
description: {description}
---
# {name}
{content}
"""


def create_temp_process_dir(processes: list[dict]) -> Path:
    """Create temp directory with process fixtures."""
    temp_dir = Path(tempfile.mkdtemp())
    for process in processes:
        fields = dict(process)
        filename = fields.pop("filename")  # filename keys the file, not the factory
        filepath = temp_dir / f"{filename}.md"
        filepath.write_text(create_process(**fields))
    return temp_dir
```
### 6.4 Test Data Cleanup
**Cleanup Strategy**:
```python
import os
import shutil
import tempfile
from pathlib import Path

import pytest

from tests.fixtures.checklist_factory import create_temp_checklist_dir
from tests.fixtures.process_factory import create_temp_process_dir


@pytest.fixture
def temp_checklist_dir():
    """Fixture providing temp checklist directory with cleanup."""
    temp_dir = create_temp_checklist_dir([
        {"filename": "coding", "name": "Coding Standards"},
        {"filename": "architecture", "name": "Architecture Review"},
    ])
    yield temp_dir
    shutil.rmtree(temp_dir)


@pytest.fixture
def temp_process_dir():
    """Fixture providing temp process directory with cleanup."""
    temp_dir = create_temp_process_dir([
        {"filename": "code-review", "name": "Code Review Process"},
        {"filename": "deployment", "name": "Deployment Process"},
        {"filename": "incident-response", "name": "Incident Response"},
    ])
    yield temp_dir
    shutil.rmtree(temp_dir)


@pytest.fixture
def temp_token_cache():
    """Fixture providing temp token cache path with cleanup."""
    # tempfile.mktemp() is deprecated and racy; create the file safely instead.
    fd, name = tempfile.mkstemp(suffix=".bin")
    os.close(fd)
    temp_file = Path(name)
    yield temp_file
    if temp_file.exists():
        temp_file.unlink()
```
### 6.5 Sensitive Data Handling
**Rules**:
- ❌ **NEVER** use real Azure credentials in automated tests
- ❌ **NEVER** use real user tokens
- ❌ **NEVER** commit Azure credentials to git
- ✅ **ALWAYS** use mocked MSAL responses
- ✅ **ALWAYS** use environment variables for real Azure testing
- ✅ **ALWAYS** use test Azure tenant for manual E2E tests
**Test Azure Configuration** (for manual testing only):
```bash
# .env.test (NOT committed to git)
AZURE_CLIENT_ID=test-app-client-id
AZURE_TENANT_ID=test-tenant-id
CHECKLIST_DIR=./tests/fixtures/checklists
PROCESS_DIR=./tests/fixtures/processes
```
---
## 7. Test Environments
### 7.1 Environment Configurations
| Environment | Purpose | Configuration | External Services |
|-------------|---------|---------------|-------------------|
| **Local** | Development testing | localhost:8080 | Azure mocked |
| **CI** | Automated test runs | Ephemeral port | Azure mocked |
| **Manual QA** | Human validation | localhost:8080 | Real Azure test tenant |
### 7.2 Environment Setup
**Local Development Environment**:
```bash
# Clone repository
git clone <repository-url>
cd sso-mcp-server
# Install dependencies
uv sync
# Create test fixtures
mkdir -p tests/fixtures/checklists tests/fixtures/processes
cat > tests/fixtures/checklists/coding.md << 'EOF'
---
name: Coding Standards
description: Test checklist
---
# Coding Standards
- [ ] Test item
EOF
cat > tests/fixtures/processes/code-review.md << 'EOF'
---
name: Code Review Process
description: Test process
---
# Code Review Process
- [ ] Check tests pass
EOF
# Run E2E tests with mocked Azure
uv run pytest tests/e2e/ -v
```
**CI Environment**:
```yaml
# .github/workflows/e2e-tests.yml (fragment)
- name: Run E2E tests
  env:
    AZURE_CLIENT_ID: mock-client-id
    AZURE_TENANT_ID: mock-tenant-id
    CHECKLIST_DIR: ./tests/fixtures/checklists
    PROCESS_DIR: ./tests/fixtures/processes
    MCP_PORT: 8080
  run: uv run pytest tests/e2e/ -v --timeout=60
```
### 7.3 Service Mocking Strategy
**What to Mock**:
- Azure Entra ID OAuth endpoints
- Browser interactions (webbrowser.open)
- System keychain (msal-extensions encryption)
**Mocking Approach**:
```python
# tests/e2e/conftest.py
from unittest.mock import MagicMock, patch

import pytest


@pytest.fixture
def mock_msal():
    """Mock MSAL PublicClientApplication for automated E2E tests."""
    with patch("msal.PublicClientApplication") as mock_app:
        instance = MagicMock()
        instance.get_accounts.return_value = []
        instance.acquire_token_interactive.return_value = {
            "access_token": "mock-access-token",
            "refresh_token": "mock-refresh-token",
            "expires_in": 3600,
            "token_type": "Bearer",
        }
        instance.acquire_token_silent.return_value = None
        mock_app.return_value = instance
        yield instance


@pytest.fixture
def mock_browser():
    """Mock browser opening for automated tests."""
    with patch("webbrowser.open") as mock_open:
        yield mock_open
```
### 7.4 Infrastructure Requirements
**Compute** (for CI):
- CPU: 2 cores minimum
- Memory: 4GB RAM minimum
- Storage: 10GB
**Dependencies**:
- Python 3.11+
- uv package manager
- pytest, pytest-asyncio
**Network**:
- No external network required (all mocked)
- For manual testing: Internet access to Azure Entra ID
---
## 8. Test Framework & Tools
### 8.1 Framework Selection
**Primary Testing Framework**: pytest + pytest-asyncio
**Justification**:
- ✅ Native Python integration (matches project stack)
- ✅ Excellent async support via pytest-asyncio
- ✅ Rich fixture system for test isolation
- ✅ Extensive plugin ecosystem
- ✅ Already used for unit/integration tests (consistency)
**Alternative Considered**: Robot Framework
- ❌ Why not chosen: over-engineered for this small local server, and the team is unfamiliar with it
### 8.2 Tool Stack
| Category | Tool | Purpose | Version |
|----------|------|---------|---------|
| **Test Runner** | pytest | Test execution | ^8.0.0 |
| **Async Support** | pytest-asyncio | Async test support | ^0.23.0 |
| **HTTP Testing** | httpx | MCP HTTP requests | ^0.27.0 |
| **Mocking** | unittest.mock | Service mocking | stdlib |
| **Time Mocking** | freezegun | Token expiry tests | ^1.2.0 |
| **Coverage** | pytest-cov | Coverage reporting | ^4.0.0 |
| **Timeouts** | pytest-timeout | Prevent hanging tests | ^2.3.0 |
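As an example of the time-mocking entry above, freezegun lets token-expiry tests jump the clock instead of waiting; the timestamps below are illustrative:
```python
# Sketch of a freezegun-driven expiry test; token timestamps are illustrative.
from datetime import datetime, timedelta

from freezegun import freeze_time

def test_token_expires_after_time_travel():
    issued_at = datetime(2025, 12, 15, 9, 0, 0)
    expires_at = issued_at + timedelta(hours=1)
    with freeze_time(issued_at) as frozen:
        assert datetime.now() < expires_at  # token still fresh
        frozen.tick(timedelta(hours=8))     # simulate the 8-hour session
        assert datetime.now() > expires_at  # refresh path must now trigger
```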
### 8.3 Framework Configuration
**pytest Configuration**:
```toml
# pyproject.toml
[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests"]
addopts = "-v --tb=short --timeout=60"
markers = [
"smoke: Quick critical path tests",
"regression: Full regression suite",
"e2e: End-to-end integration tests",
"checklist: Checklist-related tests",
"process: Process-related tests",
"search: Search functionality tests",
]
[tool.pytest-timeout]
timeout = 60
method = "thread"
```
### 8.4 Custom Utilities
**MCP Client Helper**:
```python
# tests/e2e/helpers/mcp_client.py
from typing import Any

import httpx


class MCPTestClient:
    """Test client for MCP HTTP requests."""

    def __init__(self, base_url: str = "http://localhost:8080"):
        self.base_url = base_url
        self.client = httpx.AsyncClient()

    async def call_tool(self, tool_name: str, arguments: dict[str, Any] | None = None) -> dict:
        """Call MCP tool and return response."""
        payload = {
            "jsonrpc": "2.0",
            "method": "tools/call",
            "params": {
                "name": tool_name,
                "arguments": arguments or {},
            },
            "id": 1,
        }
        response = await self.client.post(f"{self.base_url}/mcp", json=payload)
        return response.json()

    async def get_checklist(self, name: str) -> dict:
        """Convenience method for get_checklist tool."""
        return await self.call_tool("get_checklist", {"name": name})

    async def list_checklists(self) -> dict:
        """Convenience method for list_checklists tool."""
        return await self.call_tool("list_checklists")

    async def get_process(self, name: str) -> dict:
        """Convenience method for get_process tool."""
        return await self.call_tool("get_process", {"name": name})

    async def list_processes(self) -> dict:
        """Convenience method for list_processes tool."""
        return await self.call_tool("list_processes")

    async def search_processes(self, keyword: str) -> dict:
        """Convenience method for search_processes tool."""
        return await self.call_tool("search_processes", {"keyword": keyword})
```
---
## 9. Test Architecture
### 9.1 Design Pattern
**Pattern**: Fixture-based with Helper Modules
**Structure**:
```
tests/
├── e2e/
│   ├── __init__.py
│   ├── conftest.py                    # Shared E2E fixtures
│   ├── helpers/
│   │   ├── __init__.py
│   │   ├── mcp_client.py              # MCP HTTP client
│   │   ├── server.py                  # Server lifecycle management
│   │   └── fixtures.py                # Test data generators
│   ├── scenarios/
│   │   ├── __init__.py
│   │   ├── test_authentication.py     # Auth flow scenarios
│   │   ├── test_get_checklist.py      # Get checklist scenarios
│   │   ├── test_list_checklists.py    # List checklists scenarios
│   │   ├── test_get_process.py        # Get process scenarios
│   │   ├── test_list_processes.py     # List processes scenarios
│   │   ├── test_search_processes.py   # Search processes scenarios
│   │   └── test_smoke.py              # Smoke test suite
│   └── fixtures/
│       ├── checklists/
│       │   ├── coding.md
│       │   ├── architecture.md
│       │   └── detailed-design.md
│       └── processes/
│           ├── code-review.md
│           ├── deployment.md
│           └── incident-response.md
```
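`helpers/server.py` provides the `start_test_server` context manager used by the examples in §9.2-9.4 but is not shown in full in this plan. A hedged sketch follows, assuming the server can be launched as a module subprocess and that the token cache path is configurable via a `TOKEN_CACHE_PATH` variable (both assumptions; the real helper may run the ASGI app in-process instead):
```python
# tests/e2e/helpers/server.py -- hedged sketch, not the actual helper.
# Assumes `python -m sso_mcp_server` starts the server and TOKEN_CACHE_PATH
# overrides the cache location (assumption).
import asyncio
import contextlib
import os
import sys
from pathlib import Path

import httpx


@contextlib.asynccontextmanager
async def start_test_server(
    checklist_dir: Path,
    process_dir: Path | None = None,
    token_cache: Path | None = None,
    port: int = 8080,
):
    """Start the MCP server with test config, yield its base URL, then stop it."""
    env = os.environ | {"CHECKLIST_DIR": str(checklist_dir), "MCP_PORT": str(port)}
    if process_dir is not None:
        env["PROCESS_DIR"] = str(process_dir)
    if token_cache is not None:
        env["TOKEN_CACHE_PATH"] = str(token_cache)
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-m", "sso_mcp_server", env=env
    )
    base_url = f"http://localhost:{port}"
    try:
        # Poll the endpoint until the server answers (bounded wait, ~5s).
        async with httpx.AsyncClient() as client:
            for _ in range(50):
                try:
                    await client.get(f"{base_url}/mcp")
                    break
                except httpx.ConnectError:
                    await asyncio.sleep(0.1)
        yield base_url
    finally:
        proc.terminate()
        await proc.wait()
```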
### 9.2 Test Example - Checklists
```python
# tests/e2e/scenarios/test_get_checklist.py
import pytest

from tests.e2e.helpers.mcp_client import MCPTestClient
from tests.e2e.helpers.server import start_test_server


@pytest.mark.e2e
@pytest.mark.smoke
@pytest.mark.checklist
async def test_get_checklist_returns_content(
    mock_msal,
    temp_checklist_dir,
    temp_token_cache,
):
    """Scenario 5.3.1: Get checklist with valid name."""
    # ARRANGE
    async with start_test_server(
        checklist_dir=temp_checklist_dir,
        token_cache=temp_token_cache,
    ) as server_url:
        client = MCPTestClient(server_url)

        # ACT
        response = await client.get_checklist("coding")

        # ASSERT
        assert "error" not in response
        result = response["result"]
        assert result["name"] == "coding"
        assert result["description"] is not None
        assert "# Coding Standards" in result["content"]


@pytest.mark.e2e
@pytest.mark.smoke
@pytest.mark.checklist
async def test_get_checklist_not_found_returns_error(
    mock_msal,
    temp_checklist_dir,
    temp_token_cache,
):
    """Scenario 5.3.2: Get checklist with invalid name."""
    async with start_test_server(
        checklist_dir=temp_checklist_dir,
        token_cache=temp_token_cache,
    ) as server_url:
        client = MCPTestClient(server_url)
        response = await client.get_checklist("nonexistent")

        assert "error" in response
        assert response["error"]["code"] == "CHECKLIST_NOT_FOUND"
        assert "nonexistent" in response["error"]["message"]
```
### 9.3 Test Example - Processes
```python
# tests/e2e/scenarios/test_get_process.py
import pytest

from tests.e2e.helpers.mcp_client import MCPTestClient
from tests.e2e.helpers.server import start_test_server


@pytest.mark.e2e
@pytest.mark.smoke
@pytest.mark.process
async def test_get_process_returns_content(
    mock_msal,
    temp_checklist_dir,
    temp_process_dir,
    temp_token_cache,
):
    """Scenario 5.4.1: Get process with valid name."""
    # ARRANGE
    async with start_test_server(
        checklist_dir=temp_checklist_dir,
        process_dir=temp_process_dir,
        token_cache=temp_token_cache,
    ) as server_url:
        client = MCPTestClient(server_url)

        # ACT
        response = await client.get_process("code-review")

        # ASSERT
        assert "error" not in response
        result = response["result"]
        assert result["name"] == "code-review"
        assert result["description"] is not None
        assert "Code Review" in result["content"]


@pytest.mark.e2e
@pytest.mark.smoke
@pytest.mark.process
async def test_get_process_case_insensitive(
    mock_msal,
    temp_checklist_dir,
    temp_process_dir,
    temp_token_cache,
):
    """Scenario 5.4.2: Get process with case-insensitive matching."""
    async with start_test_server(
        checklist_dir=temp_checklist_dir,
        process_dir=temp_process_dir,
        token_cache=temp_token_cache,
    ) as server_url:
        client = MCPTestClient(server_url)

        # Request with different case
        response = await client.get_process("CODE-REVIEW")

        assert "error" not in response
        result = response["result"]
        assert "code-review" in result["name"].lower()
```
### 9.4 Test Example - Process Search
```python
# tests/e2e/scenarios/test_search_processes.py
import pytest

from tests.e2e.helpers.mcp_client import MCPTestClient
from tests.e2e.helpers.server import start_test_server


@pytest.mark.e2e
@pytest.mark.regression
@pytest.mark.process
@pytest.mark.search
async def test_search_processes_returns_matches(
    mock_msal,
    temp_checklist_dir,
    temp_process_dir,
    temp_token_cache,
):
    """Scenario 5.5.5: Search processes with keyword match."""
    async with start_test_server(
        checklist_dir=temp_checklist_dir,
        process_dir=temp_process_dir,
        token_cache=temp_token_cache,
    ) as server_url:
        client = MCPTestClient(server_url)
        response = await client.search_processes("review")

        assert "error" not in response
        result = response["result"]
        assert len(result["results"]) > 0
        # code-review should be in results
        names = [r["name"] for r in result["results"]]
        assert any("review" in name.lower() for name in names)


@pytest.mark.e2e
@pytest.mark.regression
@pytest.mark.process
@pytest.mark.search
async def test_search_processes_no_matches(
    mock_msal,
    temp_checklist_dir,
    temp_process_dir,
    temp_token_cache,
):
    """Scenario 5.5.6: Search processes with no matches."""
    async with start_test_server(
        checklist_dir=temp_checklist_dir,
        process_dir=temp_process_dir,
        token_cache=temp_token_cache,
    ) as server_url:
        client = MCPTestClient(server_url)
        response = await client.search_processes("xyznonexistent")

        assert "error" not in response
        result = response["result"]
        assert len(result["results"]) == 0
```
### 9.5 Test Organization
**Naming Conventions**:
- Test files: `test_*.py`
- Test functions: `test_<scenario>_<condition>_<expected>`
- Fixtures: `<resource>_fixture` or `temp_<resource>`
- Helpers: `<action>_<resource>.py`
**Test Tags**:
```python
@pytest.mark.smoke # Quick critical path tests
@pytest.mark.regression # Full regression suite
@pytest.mark.e2e # All E2E tests
@pytest.mark.slow # Tests >10 seconds
@pytest.mark.checklist # Checklist-related tests
@pytest.mark.process # Process-related tests
@pytest.mark.search # Search functionality tests
```
---
## 10. Execution Plan
### 10.1 Test Suites
| Suite | Purpose | Scenarios | Duration | Frequency |
|-------|---------|-----------|----------|-----------|
| **Smoke** | Critical paths | All P0 (5.2.1-5.4.4, 5.8.1-5.8.2) | ~3 min | Every commit |
| **Regression** | Core functionality | All P0 + P1 | ~15 min | Daily |
| **Full** | Comprehensive | All scenarios | ~45 min | Weekly |
| **Session** | 8-hour validation | SC-003 with mocked time | ~5 min | Pre-release |
### 10.2 Execution Schedule
```mermaid
gantt
title E2E Test Execution Schedule
dateFormat HH:mm
section Per Commit
Smoke Tests :00:00, 3m
section Daily
Regression Tests :02:00, 15m
section Weekly
Full Suite :22:00, 45m
```
### 10.3 CI/CD Integration
**GitHub Actions**:
```yaml
name: E2E Tests

on:
  push:
    branches: [main, 001-mcp-sso-checklist, 003-process-query]
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 2 * * *'  # Daily at 2 AM

jobs:
  e2e-smoke:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v1
      - name: Install dependencies
        run: uv sync --dev
      - name: Run smoke tests
        run: uv run pytest tests/e2e/ -m smoke -v --timeout=120
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: e2e-test-results
          path: test-results/

  e2e-regression:
    runs-on: ubuntu-latest
    if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v1
      - name: Install dependencies
        run: uv sync --dev
      - name: Run regression tests
        run: uv run pytest tests/e2e/ -m regression -v --timeout=600
```
### 10.4 Execution Commands
```bash
# Smoke tests (every commit)
uv run pytest tests/e2e/ -m smoke -v
# Regression tests (daily)
uv run pytest tests/e2e/ -m regression -v
# Full suite (weekly)
uv run pytest tests/e2e/ -v
# Checklist tests only
uv run pytest tests/e2e/ -m checklist -v
# Process tests only
uv run pytest tests/e2e/ -m process -v
# Search tests only
uv run pytest tests/e2e/ -m search -v
# 8-hour session test
uv run pytest tests/e2e/scenarios/test_session.py -v
# With coverage
uv run pytest tests/e2e/ -v --cov=src/sso_mcp_server --cov-report=html
```
---
## 11. Reporting & Metrics
### 11.1 Test Reports
**Report Types**:
- **Console Output**: pytest verbose output for CI
- **JUnit XML**: For CI integration (`--junitxml=test-results/results.xml`)
- **HTML Report**: For detailed local review (`--html=test-results/report.html`)
- **Coverage Report**: Code coverage visualization
**Report Generation**:
```bash
# Generate all reports
uv run pytest tests/e2e/ -v \
--junitxml=test-results/results.xml \
--html=test-results/report.html \
--cov=src/sso_mcp_server \
--cov-report=html:test-results/coverage
```
### 11.2 Key Metrics
| Metric | Definition | Target |
|--------|------------|--------|
| **Pass Rate** | (Passed / Total) × 100 | ≥ 95% |
| **Smoke Pass Rate** | Smoke tests passing | 100% |
| **Execution Time** | Time to run full suite | ≤ 45 min |
| **Flaky Rate** | Intermittent failures | ≤ 2% |
### 11.3 Failure Notifications
**CI Failure Handling**:
- PR blocked on smoke test failure
- GitHub Actions shows failure details
- Test artifacts uploaded for debugging
---
## 12. Maintenance & Improvement
### 12.1 Flaky Test Management
**Identification**:
- Track test history via CI
- Monitor tests with >1 retry
**Resolution**:
1. Investigate timing issues (add explicit waits)
2. Improve test isolation (ensure cleanup)
3. Document and track in backlog
4. Quarantine if not quickly fixable
### 12.2 Test Maintenance Schedule
| Task | Frequency | Owner |
|------|-----------|-------|
| Review failed tests | Per failure | Developer |
| Update test fixtures | When spec changes | Developer |
| Review test coverage | Monthly | Team |
| Update E2E test plan | Per feature release | Team Lead |
### 12.3 Continuous Improvement
**Review Process**:
1. **Per PR**: Ensure new features have E2E coverage
2. **Monthly**: Analyze flaky tests and failures
3. **Quarterly**: Evaluate test strategy effectiveness
---
## 13. Appendices
### 13.1 Glossary
| Term | Definition |
|------|------------|
| **E2E Test** | End-to-end test validating entire system workflow |
| **MCP** | Model Context Protocol for AI assistant tool integration |
| **PKCE** | Proof Key for Code Exchange (OAuth 2.0 security extension) |
| **Smoke Test** | Quick test of critical functionality |
| **Fixture** | Fixed test data used for testing |
| **Process** | Development procedure document (code-review, deployment, etc.) |
| **Checklist** | Quality standard document with verification items |
### 13.2 References
- **Architecture**: `docs/architecture.md`
- **Standards**: `docs/standards.md`
- **Ground Rules**: `memory/ground-rules.md`
- **Feature Specs**:
- `specs/001-mcp-sso-checklist/spec.md` - Checklist feature
- `specs/003-process-query/spec.md` - Process Query feature
- **MCP Tool Contracts**: `specs/001-mcp-sso-checklist/contracts/mcp-tools.json`
### 13.3 Test Scenario Catalog
**Scenario Coverage by Priority**:
| Priority | Scenarios | Coverage |
|----------|-----------|----------|
| P0 (Critical) | 11 | Auth + Checklist Tools + Process Tools + Full Workflows |
| P1 (High) | 8 | Session + Dynamic Discovery + Search + Combined Operations |
| P2 (Medium) | 11 | Configuration + Edge Cases |
| P3 (Low) | 2 | Rare Edge Cases |
| **Total** | **32** | **100%** |
**Scenario Coverage by Feature**:
| Feature | Scenarios | Priority Range |
|---------|-----------|----------------|
| Authentication | 5.2.1, 5.2.2, 5.5.1, 5.5.2 | P0-P1 |
| Checklists (001) | 5.3.1-5.3.3, 5.5.3, 5.8.1 | P0-P1 |
| Processes (003) | 5.4.1-5.4.4, 5.5.4-5.5.7, 5.8.2-5.8.3 | P0-P1 |
| Configuration & Edge Cases | 5.6.1-5.6.9, 5.7.1-5.7.4 | P2-P3 |
### 13.4 Success Criteria Mapping
**Checklist Feature (001-mcp-sso-checklist)**:
| Success Criteria | E2E Scenario |
|------------------|--------------|
| SC-001: Auth <30s | 5.8.1: Full Workflow |
| SC-002: Retrieval <2s | 5.3.1: Get Checklist Valid |
| SC-003: 8-hour session | 5.5.2: Token Refresh |
| SC-005: List all checklists | 5.3.3: List Checklists |
| SC-006: Actionable errors | 5.3.2, 5.7.1, 5.7.2 |
| SC-007: Server start <5s | 5.2.1, 5.2.2 |
**Process Feature (003-process-query)**:
| Success Criteria | E2E Scenario |
|------------------|--------------|
| SC-001: Process retrieval <2s | 5.4.1: Get Process Valid |
| SC-002: Process listing <1s | 5.4.4: List Processes |
| SC-003: Search <3s | 5.5.5: Search Processes |
| SC-005: List all processes | 5.4.4: List Processes |
| SC-006: Search returns matches | 5.5.5: Search with Matches |
| SC-007: Actionable errors | 5.4.3: Process Not Found |
---
**END OF E2E TEST PLAN DOCUMENT**
---
## Maintenance Notes
This document should be:
- **Reviewed monthly** by development team
- **Updated** when new features are added
- **Referenced** when implementing E2E tests
- **Shared** with QA for test validation
- **Version controlled** alongside test code
For questions or suggestions, contact the Development Team.