# End-to-End Test Plan: SSO MCP Server
**Version**: 1.2 | **Date**: 2025-12-15 | **Status**: Active
**Maintained by**: Development Team | **Last Reviewed**: 2025-12-15
**Note**: This document defines the end-to-end testing strategy, test scenarios, and execution plan for the SSO MCP Server. It ensures comprehensive coverage of critical user journeys and system integration points for all server capabilities (checklists, processes, and future tools).
---
## Document Control
| Version | Date | Author | Changes |
|---------|------|--------|---------|
| 1.0 | 2025-12-11 | Development Team | Initial E2E test plan |
| 1.1 | 2025-12-13 | Development Team | Add Process Query feature (003-process-query) test scenarios |
| 1.2 | 2025-12-15 | Development Team | Update title to reflect multi-function server |
**Related Documents**:
- Architecture: `docs/architecture.md`
- Ground Rules: `memory/ground-rules.md`
- Feature Specifications:
- `specs/001-mcp-sso-checklist/spec.md` - Checklist feature
- `specs/003-process-query/spec.md` - Process Query feature
- Standards: `docs/standards.md`
---
## Table of Contents
1. [Introduction](#1-introduction)
2. [Test Strategy](#2-test-strategy)
3. [Test Scope](#3-test-scope)
4. [User Journeys](#4-user-journeys)
5. [Test Scenarios](#5-test-scenarios)
6. [Test Data Management](#6-test-data-management)
7. [Test Environments](#7-test-environments)
8. [Test Framework & Tools](#8-test-framework--tools)
9. [Test Architecture](#9-test-architecture)
10. [Execution Plan](#10-execution-plan)
11. [Reporting & Metrics](#11-reporting--metrics)
12. [Maintenance & Improvement](#12-maintenance--improvement)
13. [Appendices](#13-appendices)
---
## 1. Introduction
### 1.1 Purpose
This document establishes the comprehensive end-to-end (E2E) testing strategy for the SSO MCP Server. E2E tests validate that the entire system works correctly as an integrated whole, from MCP protocol communication through Azure authentication to resource retrieval (checklists, processes, and future tools) from local files.
### 1.2 Goals
- **Validate critical user journeys** across Azure SSO authentication and MCP tool operations for checklists and processes
- **Ensure system integration** between MCP server, Azure Entra ID, and local file system
- **Verify business workflows** function correctly in developer environments
- **Detect integration issues** early, especially around OAuth 2.0 PKCE flow
- **Provide confidence** for releases through automated regression testing
- **Document expected behavior** for AI assistant integrations (GitHub Copilot, Claude Code)
### 1.3 Audience
- Developers implementing and maintaining the MCP server
- QA Engineers validating authentication, checklist, and process tool functionality
- DevOps Engineers setting up test infrastructure and CI/CD
- Users configuring MCP server in their development environment
### 1.4 System Overview
**Product Description**: Local MCP server providing software development checklists and process documentation to AI coding assistants with Azure Entra ID SSO authentication.
**Key Components** (from architecture.md):
- **MCP Server**: FastMCP with HTTP Streamable transport (port 8080)
- **Auth Module**: MSAL Python for OAuth 2.0 PKCE authentication
- **Checklist Module**: Local file system with YAML frontmatter parsing
- **Process Module**: Local file system with YAML frontmatter parsing + keyword search
- **Token Cache**: Encrypted local storage via msal-extensions (~/.sso-mcp-server/token_cache.bin)
- **External Integration**: Azure Entra ID for authentication
**Reference Architecture**: See `docs/architecture.md` for detailed system architecture.
---
## 2. Test Strategy
### 2.1 Testing Approach
**E2E Test Philosophy**:
E2E tests in this project focus on:
- ✅ **Critical user paths**: SSO authentication, checklist retrieval, checklist listing, process retrieval, process listing, process search
- ✅ **System integration points**: MCP protocol ↔ Azure Entra ID ↔ File system (checklists + processes)
- ✅ **Business workflows**: Complete developer interaction flows
- ✅ **User-visible behavior**: MCP tool responses and error messages
- ❌ **NOT unit-level logic** (covered by unit tests)
- ❌ **NOT component-level details** (covered by integration tests)
**Test Pyramid Position**:
```
        /\        ← E2E Tests (Few, slow, system integration)
       /  \       ← Integration Tests (More, faster, module integration)
      /____\      ← Unit Tests (Many, fast, focused)
```
E2E tests provide the **highest confidence** that authentication, checklist, and process tool calls work end-to-end.
### 2.2 Testing Types
**Primary E2E Testing**:
- **API-driven tests**: Validate MCP protocol interactions via HTTP
- **Authentication flow tests**: Test complete OAuth 2.0 PKCE browser-based login
- **Checklist file system validation**: Verify checklist reading and caching
- **Process file system validation**: Verify process reading, listing, and search
- **Token lifecycle tests**: Validate token persistence and refresh
**Supplementary Testing** (within E2E scope):
- **Session duration**: 8-hour session without re-authentication
- **Error handling**: Actionable error messages for all failure scenarios
- **Configuration**: Environment variable and MCP config validation
- **Search functionality**: Process keyword search with relevance ranking
### 2.3 Test Levels
| Level | Focus | Examples | Execution Frequency |
|-------|-------|----------|---------------------|
| **Smoke Tests** | Critical happy paths | Auth + get_checklist + get_process | Every commit |
| **Regression Tests** | Core features | All MCP tools + token refresh + process search | Daily/nightly |
| **Full Suite** | Comprehensive coverage | All scenarios + edge cases | Weekly/pre-release |
| **8-Hour Session Test** | Long session validation | Mocked time progression | Pre-release |
### 2.4 Entry and Exit Criteria
**Entry Criteria** (when to run E2E tests):
- [ ] Unit tests pass ≥ 90%
- [ ] Integration tests pass ≥ 85%
- [ ] Test environment is available (Azure test tenant configured)
- [ ] Test checklist and process files prepared
**Exit Criteria** (when E2E test run is complete):
- [ ] Smoke tests pass 100%
- [ ] Regression tests pass ≥ 95%
- [ ] No P0 (critical) failures
- [ ] All test reports generated
- [ ] Failed tests triaged and documented
---
## 3. Test Scope
### 3.1 In Scope
**✅ What IS Covered**:
1. **Authentication Flows**:
- Browser-based Azure SSO login (OAuth 2.0 PKCE)
- Token persistence across server restarts
- Silent re-authentication with valid cached tokens
- Token refresh before expiration (<5 minutes remaining)
- Re-authentication when refresh fails
2. **MCP Checklist Tool Operations**:
- `get_checklist` tool with valid checklist name
- `get_checklist` tool with invalid checklist name (error handling)
- `list_checklists` tool returning all available checklists
- Tool calls without authentication (auth trigger)
3. **MCP Process Tool Operations** (003-process-query):
- `get_process` tool with valid process name
- `get_process` tool with case-insensitive name matching
- `get_process` tool with invalid process name (error handling)
- `list_processes` tool returning all available processes
- `search_processes` tool with keyword search
- Search result relevance ranking (title > description > content)
- Search result limit (max 50 results)
- Empty search results handling
4. **System Integration**:
- MCP HTTP Streamable protocol communication
- Azure Entra ID token exchange
- Local file system checklist reading
- Local file system process reading (separate directory)
- YAML frontmatter metadata extraction
5. **Configuration Validation**:
- Environment variable loading
- Port configuration (default 8080, custom via MCP_PORT)
- Checklist directory configuration (CHECKLIST_DIR)
- Process directory configuration (PROCESS_DIR, default: ./processes)
- Azure credential validation
### 3.2 Out of Scope
**❌ What IS NOT Covered** (handled by other test types):
1. **Unit-Level Testing**:
- Individual parser function testing → **unit tests**
- MSAL library internals → **unit tests**
- Search engine relevance algorithm → **unit tests**
2. **Component Integration**:
- Auth module internals → **integration tests**
- Checklist service internals → **integration tests**
- Process service internals → **integration tests**
3. **Non-Functional Testing**:
- Load testing with many concurrent users → **performance tests**
- Security penetration testing → **security tests**
4. **Third-Party Services**:
- Azure Entra ID internal behavior
- MSAL library correctness
### 3.3 Testing Boundaries
```mermaid
graph LR
A[AI Assistant<br/>Copilot/Claude] -->|MCP HTTP Request| B[MCP Server<br/>Port 8080]
B --> C[Auth Module<br/>MSAL]
C -->|OAuth 2.0 PKCE| D[Azure Entra ID<br/>Mocked in Tests]
B --> E[Checklist Module]
B --> F[Process Module]
E -->|Read Files| G[Checklist Dir<br/>Test Fixtures]
F -->|Read Files| H[Process Dir<br/>Test Fixtures]
C -->|Read/Write| I[Token Cache<br/>Temp Directory]
style A fill:#e1f5ff
style D fill:#fff3cd
style G fill:#d4edda
style H fill:#d4edda
style I fill:#d4edda
```
**Test Entry Point**: MCP HTTP endpoint (localhost:8080/mcp)
**Test Exit Point**: MCP response validation + file system state
**External Dependencies**: Azure Entra ID (mocked in automated tests, real in manual tests)
---
## 4. User Journeys
### 4.1 User Personas
| Persona | Role | Primary Goals |
|---------|------|---------------|
| **Developer** | Software developer using AI assistant | Access checklists and processes via AI assistant, follow quality standards and procedures |
| **First-Time User** | New developer setting up MCP server | Configure server, complete SSO, verify connection |
| **Returning User** | Developer with existing session | Seamlessly resume work without re-authentication |
### 4.2 Critical User Journeys - Authentication
#### Journey 1: First-Time Authentication - Priority: P0
**Persona**: First-Time User
**Business Value**: Enable secure access to checklists and processes; foundation for all other functionality
**Frequency**: Once per machine/profile, then on token expiration
**Happy Path**:
```mermaid
graph LR
A[Start Server] --> B[No Cached Tokens]
B --> C[Browser Opens<br/>Azure Login]
C --> D[Enter Credentials]
D --> E[Auth Success]
E --> F[Token Cached]
F --> G[Server Ready]
style A fill:#d4edda
style G fill:#d4edda
```
**Steps**:
1. **Step 1**: User starts MCP server (`uv run sso-mcp-server`)
- Expected: Server begins initialization
2. **Step 2**: System detects no cached tokens
- Expected: Browser opens automatically with Azure login page
3. **Step 3**: User enters Azure Entra ID credentials
- Expected: Azure validates credentials, redirects with auth code
4. **Step 4**: Server exchanges code for tokens
- Expected: Access and refresh tokens obtained, encrypted and cached
5. **Step 5**: Server becomes ready
- Expected: Log message "Server ready", HTTP endpoint available
**Alternative Paths**:
- **Alt 1**: User cancels browser authentication → Server logs error, allows retry
- **Alt 2**: Network interruption during auth → Clear error message, retry option
**Failure Scenarios**:
- **Fail 1**: Invalid credentials → Azure error displayed, no tokens cached
- **Fail 2**: Azure service unavailable → Timeout with actionable error message
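The authentication in steps 2-4 maps onto the real MSAL Python API. A minimal sketch follows, with placeholder client/tenant IDs and an assumed `User.Read` scope (the project's actual scopes come from its Azure app registration):
```python
# Sketch only: placeholder credentials and scope, not this project's values.
import msal

app = msal.PublicClientApplication(
    client_id="<AZURE_CLIENT_ID>",
    authority="https://login.microsoftonline.com/<AZURE_TENANT_ID>",
)

result = None
accounts = app.get_accounts()
if accounts:
    # Journey 4 path: silent auth from the cached account, no browser.
    result = app.acquire_token_silent(["User.Read"], account=accounts[0])
if not result:
    # Steps 2-4: opens the system browser; MSAL performs the OAuth 2.0 PKCE
    # exchange and returns access/refresh tokens for caching.
    result = app.acquire_token_interactive(scopes=["User.Read"])
```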
---
#### Journey 4: Returning User (Silent Re-auth) - Priority: P1
**Persona**: Returning User
**Business Value**: Seamless experience, no repeated logins
**Frequency**: Every server restart
**Happy Path**:
```mermaid
graph LR
A[Start Server] --> B[Load Cached<br/>Tokens]
B --> C[Validate Token<br/>Expiry]
C --> D{Token Valid?}
D -->|Yes| E[Server Ready<br/>No Browser]
D -->|Expired| F[Silent Refresh<br/>via MSAL]
F -->|Success| E
F -->|Fail| G[Browser Auth<br/>Required]
style A fill:#d4edda
style E fill:#d4edda
```
---
### 4.3 Critical User Journeys - Checklists
#### Journey 2: Retrieve Checklist - Priority: P0
**Persona**: Developer
**Business Value**: Core value proposition - accessing quality checklists
**Frequency**: Multiple times per day
**Happy Path**:
```mermaid
graph LR
A[AI Assistant<br/>Tool Call] --> B[MCP Server<br/>Receives Request]
B --> C[Auth Check<br/>Valid Token]
C --> D[Read Checklist<br/>File]
D --> E[Parse Frontmatter]
E --> F[Return Content<br/>to AI Assistant]
style A fill:#d4edda
style F fill:#d4edda
```
**Steps**:
1. **Step 1**: AI assistant calls `get_checklist` with name "coding"
- Expected: MCP server receives HTTP request
2. **Step 2**: Server validates authentication
- Expected: Token valid, request proceeds
3. **Step 3**: Server reads `checklists/coding.md`
- Expected: File content loaded
4. **Step 4**: Server parses YAML frontmatter
- Expected: Name and description extracted
5. **Step 5**: Server returns checklist content
- Expected: JSON response with name, description, content
**Failure Scenarios**:
- **Fail 1**: Checklist not found → Error with available checklists list
- **Fail 2**: Not authenticated → Auth middleware triggers authentication
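Step 4's frontmatter extraction can be sketched in a few lines. This illustrative parser assumes PyYAML; it is not necessarily the server's actual implementation:
```python
# Illustrative frontmatter split for step 4; assumes PyYAML.
import yaml

def parse_frontmatter(text: str) -> tuple[dict, str]:
    """Split a markdown document into (metadata, body)."""
    if text.startswith("---"):
        _, raw_meta, body = text.split("---", 2)
        return yaml.safe_load(raw_meta) or {}, body.lstrip("\n")
    return {}, text  # no frontmatter: callers fall back to filename defaults
```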
---
#### Journey 3: List Available Checklists - Priority: P1
**Persona**: Developer
**Business Value**: Discovery of available quality standards
**Frequency**: Occasionally (when exploring available checklists)
**Happy Path**:
```mermaid
graph LR
A[AI Assistant<br/>Tool Call] --> B[MCP Server]
B --> C[Auth Check]
C --> D[Scan Directory]
D --> E[Parse All Frontmatter]
E --> F[Return Metadata<br/>Array]
style A fill:#d4edda
style F fill:#d4edda
```
**Steps**:
1. **Step 1**: AI assistant calls `list_checklists`
2. **Step 2**: Server validates authentication
3. **Step 3**: Server scans checklist directory for .md files
4. **Step 4**: Server parses frontmatter from each file
5. **Step 5**: Server returns array of {name, description} with count
---
### 4.4 Critical User Journeys - Processes (003-process-query)
#### Journey 5: Retrieve Process - Priority: P0
**Persona**: Developer
**Business Value**: Access to development process documentation for following correct procedures
**Frequency**: Multiple times per day during development activities
**Happy Path**:
```mermaid
graph LR
A[AI Assistant<br/>Tool Call] --> B[MCP Server<br/>Receives Request]
B --> C[Auth Check<br/>Valid Token]
C --> D[Read Process<br/>File]
D --> E[Parse Frontmatter]
E --> F[Return Content<br/>to AI Assistant]
style A fill:#d4edda
style F fill:#d4edda
```
**Steps**:
1. **Step 1**: AI assistant calls `get_process` with name "code-review"
- Expected: MCP server receives HTTP request
2. **Step 2**: Server validates authentication
- Expected: Token valid, request proceeds
3. **Step 3**: Server reads `processes/code-review.md` (case-insensitive matching)
- Expected: File content loaded
4. **Step 4**: Server parses YAML frontmatter
- Expected: Name and description extracted
5. **Step 5**: Server returns process content
- Expected: JSON response with name, description, content
**Failure Scenarios**:
- **Fail 1**: Process not found → Error with available processes list (FR-015)
- **Fail 2**: Not authenticated → Auth middleware triggers authentication (FR-013)
---
#### Journey 6: List Available Processes - Priority: P1
**Persona**: Developer
**Business Value**: Discovery of available development procedures
**Frequency**: Occasionally (when exploring available processes)
**Happy Path**:
```mermaid
graph LR
A[AI Assistant<br/>Tool Call] --> B[MCP Server]
B --> C[Auth Check]
C --> D[Scan Process<br/>Directory]
D --> E[Parse All Frontmatter]
E --> F[Return Metadata<br/>Array]
style A fill:#d4edda
style F fill:#d4edda
```
**Steps**:
1. **Step 1**: AI assistant calls `list_processes`
2. **Step 2**: Server validates authentication
3. **Step 3**: Server scans process directory for .md files
4. **Step 4**: Server parses frontmatter from each file
5. **Step 5**: Server returns array of {name, description} with count
---
#### Journey 7: Search Processes by Keyword - Priority: P1
**Persona**: Developer
**Business Value**: Find relevant processes without knowing exact names
**Frequency**: Often (when looking for procedures related to a topic)
**Happy Path**:
```mermaid
graph LR
A[AI Assistant<br/>Tool Call] --> B[MCP Server]
B --> C[Auth Check]
C --> D[Load All<br/>Processes]
D --> E[Search Keyword<br/>in name/desc/content]
E --> F[Rank by<br/>Relevance]
F --> G[Return Top 50<br/>Results]
style A fill:#d4edda
style G fill:#d4edda
```
**Steps**:
1. **Step 1**: AI assistant calls `search_processes` with keyword "deployment"
2. **Step 2**: Server validates authentication
3. **Step 3**: Server loads all process files
4. **Step 4**: Server searches keyword across name, description, and content (FR-010)
5. **Step 5**: Server ranks results by relevance (title matches > content matches) (FR-011)
6. **Step 6**: Server returns up to 50 matching processes with metadata (FR-012a)
**Failure Scenarios**:
- **Fail 1**: No matches found → Clear message "No processes matched the search criteria"
- **Fail 2**: Not authenticated → Auth middleware triggers authentication
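Steps 4-6 describe a weighted search. A minimal sketch of such a scorer follows, where the `ProcessDoc` shape and the weight values are assumptions for illustration (the actual relevance algorithm is unit-tested, per §3.2):
```python
# Hypothetical relevance scorer for FR-010/FR-011/FR-012a; field weights
# and the ProcessDoc shape are assumptions, not the implemented algorithm.
from dataclasses import dataclass

@dataclass
class ProcessDoc:
    name: str
    description: str
    content: str

def rank_processes(docs: list[ProcessDoc], keyword: str, limit: int = 50) -> list[ProcessDoc]:
    """Score title matches above description matches above content matches."""
    kw = keyword.lower()

    def score(doc: ProcessDoc) -> int:
        s = 0
        if kw in doc.name.lower():
            s += 100  # title match ranks highest (FR-011)
        if kw in (doc.description or "").lower():
            s += 10   # description match ranks next
        if kw in doc.content.lower():
            s += 1    # content match ranks lowest
        return s

    matches = [d for d in docs if score(d) > 0]
    matches.sort(key=score, reverse=True)
    return matches[:limit]  # FR-012a: cap at 50 results
```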
---
### 4.5 Journey Priority Matrix
| Priority | Definition | Journeys | Test Frequency |
|----------|------------|----------|----------------|
| **P0 - Critical** | Core functionality, blocks all usage | Auth, Get Checklist, Get Process | Every build |
| **P1 - High** | Important features, user convenience | List Checklists, List Processes, Search Processes, Silent Re-auth | Daily |
| **P2 - Medium** | Secondary features | Token refresh, Error handling, Dynamic discovery | Weekly |
| **P3 - Low** | Edge cases | Malformed files, Concurrent requests, Empty directories | Pre-release |
---
## 5. Test Scenarios
### 5.1 Scenario Structure
Each test scenario follows this structure:
```
Scenario: [Descriptive name]
Priority: [P0/P1/P2/P3]
Tags: [smoke, regression, critical, etc.]
Given [precondition/setup]
When [user action]
Then [expected result]
And [additional validation]
```
### 5.2 Critical Scenarios (P0) - Authentication
#### Scenario 5.2.1: Server Start with No Cached Tokens
**Priority**: P0
**Tags**: `smoke`, `authentication`, `critical`
**User Journey**: First-Time Authentication
**Estimated Duration**: 5-10 seconds (automated), 30 seconds (with browser)
**Preconditions**:
- Token cache does not exist or is empty
- Azure credentials (CLIENT_ID, TENANT_ID) configured
- CHECKLIST_DIR and PROCESS_DIR point to valid directories
**Test Steps**:
```gherkin
Given no token cache exists at ~/.sso-mcp-server/token_cache.bin
And environment variables AZURE_CLIENT_ID, AZURE_TENANT_ID, CHECKLIST_DIR are set
When the MCP server starts
Then server logs "No cached tokens, authentication required"
And server attempts to open system browser (mocked in automated tests)
And server listens on configured port (default 8080)
```
**Expected Results**:
- ✓ Server starts within 5 seconds (SC-007)
- ✓ Browser authentication triggered (or mocked)
- ✓ HTTP endpoint becomes available
- ✓ Log messages indicate authentication status
---
#### Scenario 5.2.2: Server Start with Valid Cached Tokens
**Priority**: P0
**Tags**: `smoke`, `authentication`, `critical`
**User Journey**: Returning User (Silent Re-auth)
**Preconditions**:
- Token cache exists with valid (non-expired) tokens
- Environment variables configured
**Test Steps**:
```gherkin
Given token cache exists with tokens expiring in 1 hour
And environment variables are properly configured
When the MCP server starts
Then server loads cached tokens silently
And server logs "Silent authentication successful"
And server becomes ready without opening browser
And server responds to MCP requests
```
**Expected Results**:
- ✓ No browser window opens
- ✓ Server ready in <5 seconds
- ✓ Token loaded from encrypted cache
- ✓ MCP tools respond to requests
---
### 5.3 Critical Scenarios (P0) - Checklists
#### Scenario 5.3.1: Get Checklist with Valid Name
**Priority**: P0
**Tags**: `smoke`, `mcp-tool`, `critical`, `checklist`
**User Journey**: Retrieve Checklist
**Preconditions**:
- Server is authenticated
- Checklist file `coding.md` exists in CHECKLIST_DIR
**Test Steps**:
```gherkin
Given server is authenticated and ready
And checklist file "coding.md" exists with YAML frontmatter
When client sends MCP request to get_checklist with name="coding"
Then server returns HTTP 200 response
And response contains name="coding"
And response contains description from frontmatter
And response contains markdown content from file
And request completes in <2 seconds (SC-002)
```
**Expected Results**:
- ✓ MCP response with correct structure
- ✓ Name matches requested checklist
- ✓ Description extracted from YAML frontmatter
- ✓ Content contains full markdown (excluding frontmatter)
- ✓ Response time <2 seconds
**Sample Test Data**:
```yaml
# checklists/coding.md
---
name: Coding Standards Checklist
description: Quality checklist for code implementation
---
# Coding Standards
## Naming
- [ ] Variables use descriptive names
```
---
#### Scenario 5.3.2: Get Checklist with Invalid Name
**Priority**: P0
**Tags**: `smoke`, `mcp-tool`, `negative`, `critical`, `checklist`
**User Journey**: Retrieve Checklist (error path)
**Preconditions**:
- Server is authenticated
- Checklist "nonexistent" does not exist
**Test Steps**:
```gherkin
Given server is authenticated and ready
And no checklist named "nonexistent" exists
When client sends MCP request to get_checklist with name="nonexistent"
Then server returns MCP error response
And error code is "CHECKLIST_NOT_FOUND"
And error message contains "nonexistent"
And error message lists available checklists
```
**Expected Results**:
- ✓ Error response (not success)
- ✓ Error code matches contract: CHECKLIST_NOT_FOUND
- ✓ Message is actionable (lists alternatives)
---
#### Scenario 5.3.3: List Checklists Returns All Available
**Priority**: P0
**Tags**: `smoke`, `mcp-tool`, `critical`, `checklist`
**User Journey**: List Available Checklists
**Preconditions**:
- Server is authenticated
- Multiple checklist files exist in CHECKLIST_DIR
**Test Steps**:
```gherkin
Given server is authenticated and ready
And checklist directory contains: coding.md, architecture.md, detailed-design.md
When client sends MCP request to list_checklists
Then server returns HTTP 200 response
And response.checklists is array of length 3
And each checklist has name and description fields
And response.count equals 3
And request completes in <1 second (SC-003)
```
**Expected Results**:
- ✓ All 3 checklists returned
- ✓ Each has name from frontmatter or filename
- ✓ Each has description (may be null if not in frontmatter)
- ✓ Count matches array length
- ✓ Response time <1 second
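For reference, steps 3-5 of this journey reduce to a directory scan like the following sketch (helper names are illustrative; the filename fallback and nullable description match the expected results above):
```python
# Illustrative directory scan behind list_checklists; names are assumed.
import yaml
from pathlib import Path

def _frontmatter(text: str) -> dict:
    if text.startswith("---"):
        _, raw_meta, _ = text.split("---", 2)
        return yaml.safe_load(raw_meta) or {}
    return {}

def list_checklists(checklist_dir: Path) -> dict:
    """Return {checklists: [{name, description}, ...], count: N}."""
    items = []
    for path in sorted(checklist_dir.glob("*.md")):
        meta = _frontmatter(path.read_text(encoding="utf-8"))
        items.append({
            "name": meta.get("name", path.stem),     # fall back to filename
            "description": meta.get("description"),  # may be null
        })
    return {"checklists": items, "count": len(items)}
```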
---
### 5.4 Critical Scenarios (P0) - Processes
#### Scenario 5.4.1: Get Process with Valid Name
**Priority**: P0
**Tags**: `smoke`, `mcp-tool`, `critical`, `process`
**User Journey**: Retrieve Process
**Feature**: 003-process-query
**Preconditions**:
- Server is authenticated
- Process file `code-review.md` exists in PROCESS_DIR
**Test Steps**:
```gherkin
Given server is authenticated and ready
And process file "code-review.md" exists with YAML frontmatter
When client sends MCP request to get_process with name="code-review"
Then server returns HTTP 200 response
And response contains name="code-review"
And response contains description from frontmatter
And response contains markdown content from file
And request completes in <2 seconds (process SC-001)
```
**Expected Results**:
- ✓ MCP response with correct structure
- ✓ Name matches requested process
- ✓ Description extracted from YAML frontmatter
- ✓ Content contains full markdown (excluding frontmatter)
- ✓ Response time <2 seconds
**Sample Test Data**:
```yaml
# processes/code-review.md
---
name: Code Review Process
description: Standard procedure for reviewing code changes
---
# Code Review Process
## Prerequisites
- [ ] PR has description
- [ ] Tests are passing
```
---
#### Scenario 5.4.2: Get Process with Case-Insensitive Matching
**Priority**: P0
**Tags**: `smoke`, `mcp-tool`, `critical`, `process`
**User Journey**: Retrieve Process
**Feature**: 003-process-query (FR-004)
**Preconditions**:
- Server is authenticated
- Process file `Code-Review.md` exists in PROCESS_DIR
**Test Steps**:
```gherkin
Given server is authenticated and ready
And process file "Code-Review.md" exists
When client sends MCP request to get_process with name="code-review"
Then server returns HTTP 200 response
And response contains the process content
And case-insensitive matching succeeds
```
**Expected Results**:
- ✓ Process found regardless of case in request
- ✓ Matching is case-insensitive per FR-004
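A minimal sketch of the FR-004 lookup, assuming the server resolves names by scanning the directory rather than constructing a path directly:
```python
# Illustrative case-insensitive resolution per FR-004; scanning strategy
# is an assumption about the implementation.
from pathlib import Path

def resolve_process_file(process_dir: Path, name: str) -> Path | None:
    """Find <name>.md in process_dir, ignoring case."""
    target = f"{name.lower()}.md"
    for path in process_dir.glob("*.md"):
        if path.name.lower() == target:
            return path
    return None  # caller raises PROCESS_NOT_FOUND
```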
---
#### Scenario 5.4.3: Get Process with Invalid Name
**Priority**: P0
**Tags**: `smoke`, `mcp-tool`, `negative`, `critical`, `process`
**User Journey**: Retrieve Process (error path)
**Feature**: 003-process-query (FR-014, FR-015)
**Preconditions**:
- Server is authenticated
- Process "nonexistent" does not exist
**Test Steps**:
```gherkin
Given server is authenticated and ready
And no process named "nonexistent" exists
When client sends MCP request to get_process with name="nonexistent"
Then server returns MCP error response
And error code is "PROCESS_NOT_FOUND"
And error message contains "nonexistent"
And error message lists available processes (FR-015)
```
**Expected Results**:
- ✓ Error response (not success)
- ✓ Error code: PROCESS_NOT_FOUND
- ✓ Message is actionable (lists available processes)
---
#### Scenario 5.4.4: List Processes Returns All Available
**Priority**: P0
**Tags**: `smoke`, `mcp-tool`, `critical`, `process`
**User Journey**: List Available Processes
**Feature**: 003-process-query (FR-006)
**Preconditions**:
- Server is authenticated
- Multiple process files exist in PROCESS_DIR
**Test Steps**:
```gherkin
Given server is authenticated and ready
And process directory contains: code-review.md, deployment.md, incident-response.md
When client sends MCP request to list_processes
Then server returns HTTP 200 response
And response.processes is array of length 3
And each process has name and description fields
And response.count equals 3
And request completes in <1 second (process SC-002)
```
**Expected Results**:
- ✓ All 3 processes returned
- ✓ Each has name from frontmatter or filename
- ✓ Each has description (may be null if not in frontmatter)
- ✓ Count matches array length
- ✓ Response time <1 second
---
### 5.5 High Priority Scenarios (P1)
#### Scenario 5.5.1: Tool Call Without Authentication
**Priority**: P1
**Tags**: `authentication`, `regression`
**User Journey**: N/A (error handling)
**Test Steps**:
```gherkin
Given server started but not yet authenticated
When client sends MCP request to get_checklist with name="coding"
Then server returns MCP error response
And error code is "NOT_AUTHENTICATED"
And error message provides guidance for authentication
```
**Expected Results**:
- ✓ Clear NOT_AUTHENTICATED error
- ✓ Actionable guidance in message
---
#### Scenario 5.5.2: Token Refresh Before Expiration
**Priority**: P1
**Tags**: `authentication`, `regression`, `session`
**User Journey**: 8-hour session
**Test Steps** (mocked time):
```gherkin
Given server is authenticated with token expiring in 4 minutes
When server checks token before tool call
Then server proactively refreshes token via MSAL
And new token is cached
And tool call proceeds without error
And no browser authentication required
```
**Expected Results**:
- ✓ Token refreshed silently when <5 min remaining
- ✓ No user interaction required
- ✓ Tool calls continue working
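A sketch of the proactive-refresh predicate this scenario exercises; the function name and threshold constant are illustrative, not the server's actual API:
```python
# Illustrative refresh-window check; names and structure are assumptions.
import time

REFRESH_THRESHOLD_SECONDS = 5 * 60  # refresh when <5 minutes remain

def needs_refresh(expires_at: float, now: float | None = None) -> bool:
    """True when the cached access token is inside the refresh window."""
    now = time.time() if now is None else now
    return (expires_at - now) < REFRESH_THRESHOLD_SECONDS
```
In the E2E test the clock is mocked (see freezegun in §8.2), so no real four-minute wait is needed.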
---
#### Scenario 5.5.3: Dynamic Checklist Discovery
**Priority**: P1
**Tags**: `mcp-tool`, `regression`, `checklist`
**User Journey**: List Available Checklists
**Test Steps**:
```gherkin
Given server is running and authenticated
And initial list_checklists returns 3 checklists
When new file "custom.md" is added to CHECKLIST_DIR
And client sends list_checklists request
Then response includes 4 checklists (including "custom")
And no server restart was required (FR-010)
```
---
#### Scenario 5.5.4: Dynamic Process Discovery
**Priority**: P1
**Tags**: `mcp-tool`, `regression`, `process`
**User Journey**: List Available Processes
**Feature**: 003-process-query (FR-007)
**Test Steps**:
```gherkin
Given server is running and authenticated
And initial list_processes returns 3 processes
When new file "new-process.md" is added to PROCESS_DIR
And client sends list_processes request
Then response includes 4 processes (including "new-process")
And no server restart was required (FR-007)
```
---
#### Scenario 5.5.5: Search Processes with Keyword Match
**Priority**: P1
**Tags**: `mcp-tool`, `regression`, `process`, `search`
**User Journey**: Search Processes by Keyword
**Feature**: 003-process-query (FR-009, FR-010, FR-011)
**Preconditions**:
- Server is authenticated
- Multiple process files exist with various content
**Test Steps**:
```gherkin
Given server is authenticated and ready
And process "deployment.md" contains "production deployment" in content
And process "release.md" has "deployment" in title
When client sends MCP request to search_processes with keyword="deployment"
Then server returns search results
And results include both "deployment" and "release" processes
And "release" (title match) ranks higher than "deployment" (content match) (FR-011)
And results include name, description for each match
And request completes in <3 seconds (process SC-003)
```
**Expected Results**:
- ✓ All matching processes returned
- ✓ Title matches ranked higher than content matches
- ✓ Response time <3 seconds
---
#### Scenario 5.5.6: Search Processes with No Matches
**Priority**: P1
**Tags**: `mcp-tool`, `regression`, `process`, `search`, `negative`
**User Journey**: Search Processes by Keyword (no results)
**Feature**: 003-process-query
**Test Steps**:
```gherkin
Given server is authenticated and ready
And no process contains the keyword "xyznonexistent"
When client sends MCP request to search_processes with keyword="xyznonexistent"
Then server returns empty results array
And response message indicates no matches found
```
---
#### Scenario 5.5.7: Search Processes Respects Result Limit
**Priority**: P1
**Tags**: `mcp-tool`, `regression`, `process`, `search`
**User Journey**: Search Processes by Keyword
**Feature**: 003-process-query (FR-012a)
**Preconditions**:
- Server is authenticated
- More than 50 process files exist with matching content
**Test Steps**:
```gherkin
Given server is authenticated and ready
And process directory contains 100 files all containing "process"
When client sends MCP request to search_processes with keyword="process"
Then server returns at most 50 results (FR-012a)
And results are ordered by relevance
```
---
### 5.6 Medium Priority Scenarios (P2)
| Scenario ID | Scenario Name | Tags | Journey |
|-------------|---------------|------|---------|
| 5.6.1 | Port Conflict Detection | `configuration`, `negative` | Setup |
| 5.6.2 | Custom Port Configuration | `configuration` | Setup |
| 5.6.3 | Missing CHECKLIST_DIR | `configuration`, `negative` | Setup |
| 5.6.4 | Missing PROCESS_DIR Uses Default | `configuration`, `process` | Setup |
| 5.6.5 | Malformed Checklist Frontmatter | `error-handling`, `checklist` | Get Checklist |
| 5.6.6 | Malformed Process Frontmatter | `error-handling`, `process` | Get Process |
| 5.6.7 | Empty Checklist Directory | `edge-case`, `checklist` | List Checklists |
| 5.6.8 | Empty Process Directory | `edge-case`, `process` | List Processes |
| 5.6.9 | Process Search Partial Matching | `process`, `search` | Search Processes |
---
#### Scenario 5.6.4: Missing PROCESS_DIR Uses Default
**Priority**: P2
**Tags**: `configuration`, `process`
**Feature**: 003-process-query (FR-017)
**Test Steps**:
```gherkin
Given PROCESS_DIR environment variable is not set
And ./processes directory exists with process files
When server starts
Then server uses ./processes as default directory (FR-017)
And list_processes returns processes from default directory
```
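The FR-017 fallback amounts to a one-line resolution. A sketch, with the helper name assumed (the PROCESS_DIR variable and `./processes` default come from this plan):
```python
# Illustrative default-directory resolution per FR-017.
import os
from pathlib import Path

def resolve_process_dir() -> Path:
    """PROCESS_DIR env var if set, otherwise ./processes."""
    return Path(os.environ.get("PROCESS_DIR", "./processes"))
```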
---
#### Scenario 5.6.6: Malformed Process Frontmatter Handling
**Priority**: P2
**Tags**: `error-handling`, `process`, `edge-case`
**Feature**: 003-process-query
**Test Steps**:
```gherkin
Given process file "malformed.md" has no YAML frontmatter
When client requests get_process with name="malformed"
Then server returns process content
And name defaults to filename "malformed"
And description is null or empty
```
---
#### Scenario 5.6.8: Empty Process Directory
**Priority**: P2
**Tags**: `edge-case`, `process`
**Feature**: 003-process-query
**Test Steps**:
```gherkin
Given process directory exists but is empty
When client sends list_processes request
Then server returns empty array with count=0
And response includes clear message about empty directory
```
---
#### Scenario 5.6.9: Process Search Partial Matching
**Priority**: P2
**Tags**: `process`, `search`
**Feature**: 003-process-query (FR-012)
**Test Steps**:
```gherkin
Given process "deployment.md" exists with content "production deployment steps"
When client searches with keyword="deploy"
Then "deployment.md" is included in results (partial match)
And FR-012 partial keyword matching is validated
```
---
### 5.7 Edge Cases & Negative Scenarios
#### Scenario 5.7.1: Server Start with Port Already in Use
**Priority**: P2
**Tags**: `negative`, `configuration`
**Test Steps**:
```gherkin
Given port 8080 is already in use by another process
When MCP server attempts to start on port 8080
Then server fails with clear error message
And error message indicates port is in use
And error message suggests using MCP_PORT environment variable
```
---
#### Scenario 5.7.2: Checklist File with Invalid Encoding
**Priority**: P3
**Tags**: `negative`, `edge-case`, `checklist`
**Test Steps**:
```gherkin
Given checklist file "broken.md" has invalid UTF-8 encoding
When client requests get_checklist with name="broken"
Then server returns FILE_READ_ERROR
And error message indicates encoding issue
And other checklists remain accessible
```
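Scenarios 5.7.2 and 5.7.3 (below) both hinge on per-request error isolation: a decoding failure in one file must not take down the others. A sketch of the expected handling, where only the FILE_READ_ERROR code comes from the contract and the exception plumbing is an assumption:
```python
# Illustrative per-request encoding failure handling; only the error code
# FILE_READ_ERROR comes from the documented contract.
from pathlib import Path

class FileReadError(Exception):
    """Maps to the FILE_READ_ERROR MCP error code."""
    code = "FILE_READ_ERROR"

def read_markdown(path: Path) -> str:
    try:
        return path.read_text(encoding="utf-8")
    except UnicodeDecodeError as exc:
        # Raised per request, so other checklists/processes stay accessible.
        raise FileReadError(f"{path.name} is not valid UTF-8: {exc}") from exc
```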
---
#### Scenario 5.7.3: Process File with Invalid Encoding
**Priority**: P3
**Tags**: `negative`, `edge-case`, `process`
**Test Steps**:
```gherkin
Given process file "broken.md" has invalid UTF-8 encoding
When client requests get_process with name="broken"
Then server returns FILE_READ_ERROR
And error message indicates encoding issue
And other processes remain accessible
```
---
#### Scenario 5.7.4: Process Directory Does Not Exist
**Priority**: P2
**Tags**: `negative`, `configuration`, `process`
**Feature**: 003-process-query
**Test Steps**:
```gherkin
Given PROCESS_DIR points to non-existent directory "/invalid/path"
When server starts or list_processes is called
Then server returns clear error message
And error indicates directory is missing or not configured
```
---
### 5.8 Cross-Feature Integration Scenarios
#### Scenario 5.8.1: Full Workflow - Auth to Checklist Retrieval
**Priority**: P0
**Tags**: `integration`, `smoke`, `critical`, `checklist`
**Features Involved**: Authentication, MCP Server, Checklist Module
**Workflow**:
```mermaid
sequenceDiagram
participant Client as AI Client
participant Server as MCP Server
participant Auth as Auth Module
participant Azure as Azure (Mocked)
participant FS as File System
Client->>Server: Start server
Server->>Auth: Check tokens
Auth->>Azure: Authenticate (mocked)
Azure-->>Auth: Token response
Auth-->>Server: Authenticated
Server-->>Client: Ready
Client->>Server: get_checklist("coding")
Server->>Auth: Validate token
Auth-->>Server: Valid
Server->>FS: Read coding.md
FS-->>Server: File content
Server-->>Client: Checklist response
```
**Test Steps**:
```gherkin
Given clean test environment with no cached tokens
And test checklist files in place
When server starts and authenticates (mocked Azure)
And client sends get_checklist("coding") request
Then authentication completes successfully
And checklist content is returned
And entire flow completes in <30 seconds (SC-001)
```
---
#### Scenario 5.8.2: Full Workflow - Auth to Process Retrieval
**Priority**: P0
**Tags**: `integration`, `smoke`, `critical`, `process`
**Features Involved**: Authentication, MCP Server, Process Module
**Feature**: 003-process-query
**Workflow**:
```mermaid
sequenceDiagram
participant Client as AI Client
participant Server as MCP Server
participant Auth as Auth Module
participant Azure as Azure (Mocked)
participant FS as File System
Client->>Server: Start server
Server->>Auth: Check tokens
Auth->>Azure: Authenticate (mocked)
Azure-->>Auth: Token response
Auth-->>Server: Authenticated
Server-->>Client: Ready
Client->>Server: get_process("code-review")
Server->>Auth: Validate token
Auth-->>Server: Valid
Server->>FS: Read code-review.md from PROCESS_DIR
FS-->>Server: File content
Server-->>Client: Process response
```
**Test Steps**:
```gherkin
Given clean test environment with no cached tokens
And test process files in place
When server starts and authenticates (mocked Azure)
And client sends get_process("code-review") request
Then authentication completes successfully
And process content is returned
And entire flow completes in <30 seconds
```
---
#### Scenario 5.8.3: Combined Checklist and Process Operations
**Priority**: P1
**Tags**: `integration`, `regression`
**Features Involved**: Authentication, Checklist Module, Process Module
**Test Steps**:
```gherkin
Given server is authenticated and ready
And checklist and process files exist
When client sends get_checklist("coding")
And client sends get_process("code-review")
And client sends list_checklists
And client sends list_processes
And client sends search_processes with keyword="review"
Then all 5 operations succeed
And checklists and processes are returned from separate directories
And search returns relevant processes
```
---
## 6. Test Data Management
### 6.1 Test Data Strategy
**Approach**: Fixture-based with temp directories for isolation
**Principles**:
- **Isolation**: Each test uses its own temp CHECKLIST_DIR, PROCESS_DIR, and TOKEN_CACHE
- **Repeatability**: Fixed fixture files produce consistent results
- **Cleanup**: Temp directories deleted after test completion
- **Privacy**: No real Azure credentials in test data
- **Realism**: Test checklists and processes mimic production format
### 6.2 Test Data Types
| Data Type | Source | Storage | Lifecycle |
|-----------|--------|---------|-----------|
| **Checklist Files** | Fixed fixtures | Temp directory | Created before test, deleted after |
| **Process Files** | Fixed fixtures | Temp directory | Created before test, deleted after |
| **Token Cache** | Mocked MSAL response | Temp file | Created during test, deleted after |
| **Azure Responses** | Mock data | In-memory | Per test |
| **Configuration** | Environment variables | Test setup | Per test session |
### 6.3 Test Data Generation
**Checklist Fixture Factory**:
```python
# tests/fixtures/checklist_factory.py
import tempfile
from pathlib import Path


def create_checklist(
    name: str = "test-checklist",
    description: str = "Test checklist for E2E tests",
    content: str = "- [ ] Test item",
) -> str:
    """Generate checklist file content with frontmatter."""
    return f"""---
name: {name}
description: {description}
---
# {name}
{content}
"""


def create_temp_checklist_dir(checklists: list[dict]) -> Path:
    """Create temp directory with checklist fixtures."""
    temp_dir = Path(tempfile.mkdtemp())
    for checklist in checklists:
        fields = dict(checklist)
        filename = fields.pop("filename")  # filename keys the file, not the factory
        filepath = temp_dir / f"{filename}.md"
        filepath.write_text(create_checklist(**fields))
    return temp_dir
```
**Process Fixture Factory**:
```python
# tests/fixtures/process_factory.py
import tempfile
from pathlib import Path


def create_process(
    name: str = "test-process",
    description: str = "Test process for E2E tests",
    content: str = "## Steps\n1. Step one\n2. Step two",
) -> str:
    """Generate process file content with frontmatter."""
    return f"""---
name: {name}
description: {description}
---
# {name}
{content}
"""


def create_temp_process_dir(processes: list[dict]) -> Path:
    """Create temp directory with process fixtures."""
    temp_dir = Path(tempfile.mkdtemp())
    for process in processes:
        fields = dict(process)
        filename = fields.pop("filename")  # filename keys the file, not the factory
        filepath = temp_dir / f"{filename}.md"
        filepath.write_text(create_process(**fields))
    return temp_dir
```
### 6.4 Test Data Cleanup
**Cleanup Strategy**:
```python
import os
import shutil
import tempfile
from pathlib import Path

import pytest

from tests.fixtures.checklist_factory import create_temp_checklist_dir
from tests.fixtures.process_factory import create_temp_process_dir


@pytest.fixture
def temp_checklist_dir():
    """Fixture providing temp checklist directory with cleanup."""
    temp_dir = create_temp_checklist_dir([
        {"filename": "coding", "name": "Coding Standards"},
        {"filename": "architecture", "name": "Architecture Review"},
    ])
    yield temp_dir
    shutil.rmtree(temp_dir)


@pytest.fixture
def temp_process_dir():
    """Fixture providing temp process directory with cleanup."""
    temp_dir = create_temp_process_dir([
        {"filename": "code-review", "name": "Code Review Process"},
        {"filename": "deployment", "name": "Deployment Process"},
        {"filename": "incident-response", "name": "Incident Response"},
    ])
    yield temp_dir
    shutil.rmtree(temp_dir)


@pytest.fixture
def temp_token_cache():
    """Fixture providing temp token cache path with cleanup."""
    # tempfile.mktemp() is deprecated and racy; create the file safely instead.
    fd, name = tempfile.mkstemp(suffix=".bin")
    os.close(fd)
    temp_file = Path(name)
    yield temp_file
    if temp_file.exists():
        temp_file.unlink()
```
### 6.5 Sensitive Data Handling
**Rules**:
- ❌ **NEVER** use real Azure credentials in automated tests
- ❌ **NEVER** use real user tokens
- ❌ **NEVER** commit Azure credentials to git
- ✅ **ALWAYS** use mocked MSAL responses
- ✅ **ALWAYS** use environment variables for real Azure testing
- ✅ **ALWAYS** use test Azure tenant for manual E2E tests
**Test Azure Configuration** (for manual testing only):
```bash
# .env.test (NOT committed to git)
AZURE_CLIENT_ID=test-app-client-id
AZURE_TENANT_ID=test-tenant-id
CHECKLIST_DIR=./tests/fixtures/checklists
PROCESS_DIR=./tests/fixtures/processes
```
---
## 7. Test Environments
### 7.1 Environment Configurations
| Environment | Purpose | Configuration | External Services |
|-------------|---------|---------------|-------------------|
| **Local** | Development testing | localhost:8080 | Azure mocked |
| **CI** | Automated test runs | Ephemeral port | Azure mocked |
| **Manual QA** | Human validation | localhost:8080 | Real Azure test tenant |
### 7.2 Environment Setup
**Local Development Environment**:
```bash
# Clone repository
git clone <repository-url>
cd sso-mcp-server
# Install dependencies
uv sync
# Create test fixtures
mkdir -p tests/fixtures/checklists tests/fixtures/processes
cat > tests/fixtures/checklists/coding.md << 'EOF'
---
name: Coding Standards
description: Test checklist
---
# Coding Standards
- [ ] Test item
EOF
cat > tests/fixtures/processes/code-review.md << 'EOF'
---
name: Code Review Process
description: Test process
---
# Code Review Process
- [ ] Check tests pass
EOF
# Run E2E tests with mocked Azure
uv run pytest tests/e2e/ -v
```
**CI Environment**:
```yaml
# .github/workflows/e2e-tests.yml (fragment)
- name: Run E2E tests
  env:
    AZURE_CLIENT_ID: mock-client-id
    AZURE_TENANT_ID: mock-tenant-id
    CHECKLIST_DIR: ./tests/fixtures/checklists
    PROCESS_DIR: ./tests/fixtures/processes
    MCP_PORT: 8080
  run: uv run pytest tests/e2e/ -v --timeout=60
```
### 7.3 Service Mocking Strategy
**What to Mock**:
- Azure Entra ID OAuth endpoints
- Browser interactions (webbrowser.open)
- System keychain (msal-extensions encryption)
**Mocking Approach**:
```python
# tests/e2e/conftest.py
from unittest.mock import MagicMock, patch

import pytest


@pytest.fixture
def mock_msal():
    """Mock MSAL PublicClientApplication for automated E2E tests."""
    with patch("msal.PublicClientApplication") as mock_app:
        instance = MagicMock()
        instance.get_accounts.return_value = []
        instance.acquire_token_interactive.return_value = {
            "access_token": "mock-access-token",
            "refresh_token": "mock-refresh-token",
            "expires_in": 3600,
            "token_type": "Bearer",
        }
        instance.acquire_token_silent.return_value = None
        mock_app.return_value = instance
        yield instance


@pytest.fixture
def mock_browser():
    """Mock browser opening for automated tests."""
    with patch("webbrowser.open") as mock_open:
        yield mock_open
```
### 7.4 Infrastructure Requirements
**Compute** (for CI):
- CPU: 2 cores minimum
- Memory: 4GB RAM minimum
- Storage: 10GB
**Dependencies**:
- Python 3.11+
- uv package manager
- pytest, pytest-asyncio
**Network**:
- No external network required (all mocked)
- For manual testing: Internet access to Azure Entra ID
---
## 8. Test Framework & Tools
### 8.1 Framework Selection
**Primary Testing Framework**: pytest + pytest-asyncio
**Justification**:
- ✅ Native Python integration (matches project stack)
- ✅ Excellent async support via pytest-asyncio
- ✅ Rich fixture system for test isolation
- ✅ Extensive plugin ecosystem
- ✅ Already used for unit/integration tests (consistency)
**Alternative Considered**: Robot Framework
- ❌ Why not chosen: over-engineered for this small local server, and the team is unfamiliar with it
### 8.2 Tool Stack
| Category | Tool | Purpose | Version |
|----------|------|---------|---------|
| **Test Runner** | pytest | Test execution | ^8.0.0 |
| **Async Support** | pytest-asyncio | Async test support | ^0.23.0 |
| **HTTP Testing** | httpx | MCP HTTP requests | ^0.27.0 |
| **Mocking** | unittest.mock | Service mocking | stdlib |
| **Time Mocking** | freezegun | Token expiry tests | ^1.2.0 |
| **Coverage** | pytest-cov | Coverage reporting | ^4.0.0 |
| **Timeouts** | pytest-timeout | Prevent hanging tests | ^2.3.0 |
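As an example of the time-mocking entry above, freezegun lets token-expiry tests jump the clock instead of waiting; the timestamps below are illustrative:
```python
# Sketch of a freezegun-driven expiry test; token timestamps are illustrative.
from datetime import datetime, timedelta

from freezegun import freeze_time

def test_token_expires_after_time_travel():
    issued_at = datetime(2025, 12, 15, 9, 0, 0)
    expires_at = issued_at + timedelta(hours=1)
    with freeze_time(issued_at) as frozen:
        assert datetime.now() < expires_at  # token still fresh
        frozen.tick(timedelta(hours=8))     # simulate the 8-hour session
        assert datetime.now() > expires_at  # refresh path must now trigger
```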
### 8.3 Framework Configuration
**pytest Configuration**:
```toml
# pyproject.toml
[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests"]
addopts = "-v --tb=short --timeout=60"
markers = [
"smoke: Quick critical path tests",
"regression: Full regression suite",
"e2e: End-to-end integration tests",
"checklist: Checklist-related tests",
"process: Process-related tests",
"search: Search functionality tests",
]
[tool.pytest-timeout]
timeout = 60
method = "thread"
```
### 8.4 Custom Utilities
**MCP Client Helper**:
```python
# tests/e2e/helpers/mcp_client.py
from typing import Any

import httpx


class MCPTestClient:
    """Test client for MCP HTTP requests."""

    def __init__(self, base_url: str = "http://localhost:8080"):
        self.base_url = base_url
        self.client = httpx.AsyncClient()

    async def call_tool(self, tool_name: str, arguments: dict[str, Any] | None = None) -> dict:
        """Call MCP tool and return response."""
        payload = {
            "jsonrpc": "2.0",
            "method": "tools/call",
            "params": {
                "name": tool_name,
                "arguments": arguments or {},
            },
            "id": 1,
        }
        response = await self.client.post(f"{self.base_url}/mcp", json=payload)
        return response.json()

    async def get_checklist(self, name: str) -> dict:
        """Convenience method for get_checklist tool."""
        return await self.call_tool("get_checklist", {"name": name})

    async def list_checklists(self) -> dict:
        """Convenience method for list_checklists tool."""
        return await self.call_tool("list_checklists")

    async def get_process(self, name: str) -> dict:
        """Convenience method for get_process tool."""
        return await self.call_tool("get_process", {"name": name})

    async def list_processes(self) -> dict:
        """Convenience method for list_processes tool."""
        return await self.call_tool("list_processes")

    async def search_processes(self, keyword: str) -> dict:
        """Convenience method for search_processes tool."""
        return await self.call_tool("search_processes", {"keyword": keyword})
```
---
## 9. Test Architecture
### 9.1 Design Pattern
**Pattern**: Fixture-based with Helper Modules
**Structure**:
```
tests/
├── e2e/
│   ├── __init__.py
│   ├── conftest.py                    # Shared E2E fixtures
│   ├── helpers/
│   │   ├── __init__.py
│   │   ├── mcp_client.py              # MCP HTTP client
│   │   ├── server.py                  # Server lifecycle management
│   │   └── fixtures.py                # Test data generators
│   ├── scenarios/
│   │   ├── __init__.py
│   │   ├── test_authentication.py     # Auth flow scenarios
│   │   ├── test_get_checklist.py      # Get checklist scenarios
│   │   ├── test_list_checklists.py    # List checklists scenarios
│   │   ├── test_get_process.py        # Get process scenarios
│   │   ├── test_list_processes.py     # List processes scenarios
│   │   ├── test_search_processes.py   # Search processes scenarios
│   │   └── test_smoke.py              # Smoke test suite
│   └── fixtures/
│       ├── checklists/
│       │   ├── coding.md
│       │   ├── architecture.md
│       │   └── detailed-design.md
│       └── processes/
│           ├── code-review.md
│           ├── deployment.md
│           └── incident-response.md
```
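`helpers/server.py` provides the `start_test_server` context manager used by the examples in §9.2-9.4 but is not shown in full in this plan. A hedged sketch follows, assuming the server can be launched as a module subprocess and that the token cache path is configurable via a `TOKEN_CACHE_PATH` variable (both assumptions; the real helper may run the ASGI app in-process instead):
```python
# tests/e2e/helpers/server.py -- hedged sketch, not the actual helper.
# Assumes `python -m sso_mcp_server` starts the server and TOKEN_CACHE_PATH
# overrides the cache location (assumption).
import asyncio
import contextlib
import os
import sys
from pathlib import Path

import httpx


@contextlib.asynccontextmanager
async def start_test_server(
    checklist_dir: Path,
    process_dir: Path | None = None,
    token_cache: Path | None = None,
    port: int = 8080,
):
    """Start the MCP server with test config, yield its base URL, then stop it."""
    env = os.environ | {"CHECKLIST_DIR": str(checklist_dir), "MCP_PORT": str(port)}
    if process_dir is not None:
        env["PROCESS_DIR"] = str(process_dir)
    if token_cache is not None:
        env["TOKEN_CACHE_PATH"] = str(token_cache)
    proc = await asyncio.create_subprocess_exec(
        sys.executable, "-m", "sso_mcp_server", env=env
    )
    base_url = f"http://localhost:{port}"
    try:
        # Poll the endpoint until the server answers (bounded wait, ~5s).
        async with httpx.AsyncClient() as client:
            for _ in range(50):
                try:
                    await client.get(f"{base_url}/mcp")
                    break
                except httpx.ConnectError:
                    await asyncio.sleep(0.1)
        yield base_url
    finally:
        proc.terminate()
        await proc.wait()
```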
### 9.2 Test Example - Checklists
```python
# tests/e2e/scenarios/test_get_checklist.py
import pytest

from tests.e2e.helpers.mcp_client import MCPTestClient
from tests.e2e.helpers.server import start_test_server


@pytest.mark.e2e
@pytest.mark.smoke
@pytest.mark.checklist
async def test_get_checklist_returns_content(
    mock_msal,
    temp_checklist_dir,
    temp_token_cache,
):
    """Scenario 5.3.1: Get checklist with valid name."""
    # ARRANGE
    async with start_test_server(
        checklist_dir=temp_checklist_dir,
        token_cache=temp_token_cache,
    ) as server_url:
        client = MCPTestClient(server_url)

        # ACT
        response = await client.get_checklist("coding")

        # ASSERT
        assert "error" not in response
        result = response["result"]
        assert result["name"] == "coding"
        assert result["description"] is not None
        assert "# Coding Standards" in result["content"]


@pytest.mark.e2e
@pytest.mark.smoke
@pytest.mark.checklist
async def test_get_checklist_not_found_returns_error(
    mock_msal,
    temp_checklist_dir,
    temp_token_cache,
):
    """Scenario 5.3.2: Get checklist with invalid name."""
    async with start_test_server(
        checklist_dir=temp_checklist_dir,
        token_cache=temp_token_cache,
    ) as server_url:
        client = MCPTestClient(server_url)
        response = await client.get_checklist("nonexistent")

        assert "error" in response
        assert response["error"]["code"] == "CHECKLIST_NOT_FOUND"
        assert "nonexistent" in response["error"]["message"]
```
### 9.3 Test Example - Processes
```python
# tests/e2e/scenarios/test_get_process.py
import pytest

from tests.e2e.helpers.mcp_client import MCPTestClient
from tests.e2e.helpers.server import start_test_server


@pytest.mark.e2e
@pytest.mark.smoke
@pytest.mark.process
async def test_get_process_returns_content(
    mock_msal,
    temp_checklist_dir,
    temp_process_dir,
    temp_token_cache,
):
    """Scenario 5.4.1: Get process with valid name."""
    # ARRANGE
    async with start_test_server(
        checklist_dir=temp_checklist_dir,
        process_dir=temp_process_dir,
        token_cache=temp_token_cache,
    ) as server_url:
        client = MCPTestClient(server_url)

        # ACT
        response = await client.get_process("code-review")

        # ASSERT
        assert "error" not in response
        result = response["result"]
        assert result["name"] == "code-review"
        assert result["description"] is not None
        assert "Code Review" in result["content"]


@pytest.mark.e2e
@pytest.mark.smoke
@pytest.mark.process
async def test_get_process_case_insensitive(
    mock_msal,
    temp_checklist_dir,
    temp_process_dir,
    temp_token_cache,
):
    """Scenario 5.4.2: Get process with case-insensitive matching."""
    async with start_test_server(
        checklist_dir=temp_checklist_dir,
        process_dir=temp_process_dir,
        token_cache=temp_token_cache,
    ) as server_url:
        client = MCPTestClient(server_url)

        # Request with different case
        response = await client.get_process("CODE-REVIEW")

        assert "error" not in response
        result = response["result"]
        assert "code-review" in result["name"].lower()
```
### 9.4 Test Example - Process Search
```python
# tests/e2e/scenarios/test_search_processes.py
import pytest

from tests.e2e.helpers.mcp_client import MCPTestClient
from tests.e2e.helpers.server import start_test_server


@pytest.mark.e2e
@pytest.mark.regression
@pytest.mark.process
@pytest.mark.search
async def test_search_processes_returns_matches(
    mock_msal,
    temp_checklist_dir,
    temp_process_dir,
    temp_token_cache,
):
    """Scenario 5.5.5: Search processes with keyword match."""
    async with start_test_server(
        checklist_dir=temp_checklist_dir,
        process_dir=temp_process_dir,
        token_cache=temp_token_cache,
    ) as server_url:
        client = MCPTestClient(server_url)
        response = await client.search_processes("review")

        assert "error" not in response
        result = response["result"]
        assert len(result["results"]) > 0
        # code-review should be in results
        names = [r["name"] for r in result["results"]]
        assert any("review" in name.lower() for name in names)


@pytest.mark.e2e
@pytest.mark.regression
@pytest.mark.process
@pytest.mark.search
async def test_search_processes_no_matches(
    mock_msal,
    temp_checklist_dir,
    temp_process_dir,
    temp_token_cache,
):
    """Scenario 5.5.6: Search processes with no matches."""
    async with start_test_server(
        checklist_dir=temp_checklist_dir,
        process_dir=temp_process_dir,
        token_cache=temp_token_cache,
    ) as server_url:
        client = MCPTestClient(server_url)
        response = await client.search_processes("xyznonexistent")

        assert "error" not in response
        result = response["result"]
        assert len(result["results"]) == 0
```
### 9.5 Test Organization
**Naming Conventions**:
- Test files: `test_*.py`
- Test functions: `test_<scenario>_<condition>_<expected>`
- Fixtures: `<resource>_fixture` or `temp_<resource>`
- Helpers: `<action>_<resource>.py`
**Test Tags**:
```python
@pytest.mark.smoke # Quick critical path tests
@pytest.mark.regression # Full regression suite
@pytest.mark.e2e # All E2E tests
@pytest.mark.slow # Tests >10 seconds
@pytest.mark.checklist # Checklist-related tests
@pytest.mark.process # Process-related tests
@pytest.mark.search # Search functionality tests
```
---
## 10. Execution Plan
### 10.1 Test Suites
| Suite | Purpose | Scenarios | Duration | Frequency |
|-------|---------|-----------|----------|-----------|
| **Smoke** | Critical paths | All P0 (5.2.1-5.4.4, 5.8.1-5.8.2) | ~3 min | Every commit |
| **Regression** | Core functionality | All P0 + P1 | ~15 min | Daily |
| **Full** | Comprehensive | All scenarios | ~45 min | Weekly |
| **Session** | 8-hour validation | SC-003 with mocked time | ~5 min | Pre-release |
### 10.2 Execution Schedule
```mermaid
gantt
title E2E Test Execution Schedule
dateFormat HH:mm
section Per Commit
Smoke Tests :00:00, 3m
section Daily
Regression Tests :02:00, 15m
section Weekly
Full Suite :22:00, 45m
```
### 10.3 CI/CD Integration
**GitHub Actions**:
```yaml
name: E2E Tests

on:
  push:
    branches: [main, 001-mcp-sso-checklist, 003-process-query]
  pull_request:
    branches: [main]
  schedule:
    - cron: '0 2 * * *'  # Daily at 2 AM

jobs:
  e2e-smoke:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v1
      - name: Install dependencies
        run: uv sync --dev
      - name: Run smoke tests
        run: uv run pytest tests/e2e/ -m smoke -v --timeout=120
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: e2e-test-results
          path: test-results/

  e2e-regression:
    runs-on: ubuntu-latest
    if: github.event_name == 'schedule' || github.event_name == 'workflow_dispatch'
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v1
      - name: Install dependencies
        run: uv sync --dev
      - name: Run regression tests
        run: uv run pytest tests/e2e/ -m regression -v --timeout=600
```
### 10.4 Execution Commands
```bash
# Smoke tests (every commit)
uv run pytest tests/e2e/ -m smoke -v
# Regression tests (daily)
uv run pytest tests/e2e/ -m regression -v
# Full suite (weekly)
uv run pytest tests/e2e/ -v
# Checklist tests only
uv run pytest tests/e2e/ -m checklist -v
# Process tests only
uv run pytest tests/e2e/ -m process -v
# Search tests only
uv run pytest tests/e2e/ -m search -v
# 8-hour session test
uv run pytest tests/e2e/scenarios/test_session.py -v
# With coverage
uv run pytest tests/e2e/ -v --cov=src/sso_mcp_server --cov-report=html
```
---
## 11. Reporting & Metrics
### 11.1 Test Reports
**Report Types**:
- **Console Output**: pytest verbose output for CI
- **JUnit XML**: For CI integration (`--junitxml=test-results/results.xml`)
- **HTML Report**: For detailed local review (`--html=test-results/report.html`)
- **Coverage Report**: Code coverage visualization
**Report Generation**:
```bash
# Generate all reports
uv run pytest tests/e2e/ -v \
--junitxml=test-results/results.xml \
--html=test-results/report.html \
--cov=src/sso_mcp_server \
--cov-report=html:test-results/coverage
```
### 11.2 Key Metrics
| Metric | Definition | Target |
|--------|------------|--------|
| **Pass Rate** | (Passed / Total) × 100 | ≥ 95% |
| **Smoke Pass Rate** | Smoke tests passing | 100% |
| **Execution Time** | Time to run full suite | ≤ 45 min |
| **Flaky Rate** | Intermittent failures | ≤ 2% |
### 11.3 Failure Notifications
**CI Failure Handling**:
- PR blocked on smoke test failure
- GitHub Actions shows failure details
- Test artifacts uploaded for debugging
---
## 12. Maintenance & Improvement
### 12.1 Flaky Test Management
**Identification**:
- Track test history via CI
- Monitor tests with >1 retry
**Resolution**:
1. Investigate timing issues (add explicit waits)
2. Improve test isolation (ensure cleanup)
3. Document and track in backlog
4. Quarantine if not quickly fixable
### 12.2 Test Maintenance Schedule
| Task | Frequency | Owner |
|------|-----------|-------|
| Review failed tests | Per failure | Developer |
| Update test fixtures | When spec changes | Developer |
| Review test coverage | Monthly | Team |
| Update E2E test plan | Per feature release | Team Lead |
### 12.3 Continuous Improvement
**Review Process**:
1. **Per PR**: Ensure new features have E2E coverage
2. **Monthly**: Analyze flaky tests and failures
3. **Quarterly**: Evaluate test strategy effectiveness
---
## 13. Appendices
### 13.1 Glossary
| Term | Definition |
|------|------------|
| **E2E Test** | End-to-end test validating entire system workflow |
| **MCP** | Model Context Protocol for AI assistant tool integration |
| **PKCE** | Proof Key for Code Exchange (OAuth 2.0 security extension) |
| **Smoke Test** | Quick test of critical functionality |
| **Fixture** | Fixed test data used for testing |
| **Process** | Development procedure document (code-review, deployment, etc.) |
| **Checklist** | Quality standard document with verification items |
### 13.2 References
- **Architecture**: `docs/architecture.md`
- **Standards**: `docs/standards.md`
- **Ground Rules**: `memory/ground-rules.md`
- **Feature Specs**:
- `specs/001-mcp-sso-checklist/spec.md` - Checklist feature
- `specs/003-process-query/spec.md` - Process Query feature
- **MCP Tool Contracts**: `specs/001-mcp-sso-checklist/contracts/mcp-tools.json`
### 13.3 Test Scenario Catalog
**Scenario Coverage by Priority**:
| Priority | Scenarios | Coverage |
|----------|-----------|----------|
| P0 (Critical) | 11 | Auth + Checklist Tools + Process Tools + Full Workflows |
| P1 (High) | 8 | Session + Dynamic Discovery + Search + Combined Operations |
| P2 (Medium) | 11 | Configuration + Edge Cases |
| P3 (Low) | 2 | Rare Edge Cases |
| **Total** | **32** | **100%** |
**Scenario Coverage by Feature**:
| Feature | Scenarios | Priority Range |
|---------|-----------|----------------|
| Authentication | 5.2.1, 5.2.2, 5.5.1, 5.5.2 | P0-P1 |
| Checklists (001) | 5.3.1-5.3.3, 5.5.3, 5.8.1 | P0-P1 |
| Processes (003) | 5.4.1-5.4.4, 5.5.4-5.5.7, 5.8.2-5.8.3 | P0-P1 |
| Configuration & Edge Cases | 5.6.1-5.6.9, 5.7.1-5.7.4 | P2-P3 |
### 13.4 Success Criteria Mapping
**Checklist Feature (001-mcp-sso-checklist)**:
| Success Criteria | E2E Scenario |
|------------------|--------------|
| SC-001: Auth <30s | 5.8.1: Full Workflow |
| SC-002: Retrieval <2s | 5.3.1: Get Checklist Valid |
| SC-003: 8-hour session | 5.5.2: Token Refresh |
| SC-005: List all checklists | 5.3.3: List Checklists |
| SC-006: Actionable errors | 5.3.2, 5.7.1, 5.7.2 |
| SC-007: Server start <5s | 5.2.1, 5.2.2 |
**Process Feature (003-process-query)**:
| Success Criteria | E2E Scenario |
|------------------|--------------|
| SC-001: Process retrieval <2s | 5.4.1: Get Process Valid |
| SC-002: Process listing <1s | 5.4.4: List Processes |
| SC-003: Search <3s | 5.5.5: Search Processes |
| SC-005: List all processes | 5.4.4: List Processes |
| SC-006: Search returns matches | 5.5.5: Search with Matches |
| SC-007: Actionable errors | 5.4.3: Process Not Found |
---
**END OF E2E TEST PLAN DOCUMENT**
---
## Maintenance Notes
This document should be:
- **Reviewed monthly** by development team
- **Updated** when new features are added
- **Referenced** when implementing E2E tests
- **Shared** with QA for test validation
- **Version controlled** alongside test code
For questions or suggestions, contact the Development Team.