---
applyTo: "tests/**/*.cs"
---
# Testing Strategy - Quick Reference
> **Two tiers: Integration (Excel) → OnDemand (session cleanup)**
>
> **⚠️ No Unit Tests**: ExcelMcp has no traditional unit tests. Integration tests ARE our unit tests because Excel COM cannot be meaningfully mocked. See `docs/ADR-001-NO-UNIT-TESTS.md` for full rationale.
## Test Naming Standard
**Pattern**: `MethodName_StateUnderTest_ExpectedBehavior`
- **MethodName**: Command being tested (no "Async" suffix)
- **StateUnderTest**: Specific scenario/condition (not generic like "Valid")
- **ExpectedBehavior**: Clear outcome (Returns*, Creates*, Removes*, etc.)
**Examples**:
```csharp
✅ List_EmptyWorkbook_ReturnsEmptyList
✅ Create_UniqueName_ReturnsSuccess
✅ Delete_NonActiveSheet_ReturnsSuccess
✅ ImportThenDelete_UniqueQuery_RemovedFromList
❌ List_WithValidFile_ReturnsSuccessResult // Too generic
❌ CreateAsync_ValidName_Success // Has Async suffix
❌ Delete // Missing state and behavior
```
**Full Standard**: See `docs/TEST-NAMING-STANDARD.md` for complete guide with pattern catalog and examples.
## Test Class Templates
### Integration Test Template
```csharp
[Trait("Category", "Integration")]
[Trait("Speed", "Medium")]
[Trait("Layer", "Core")]
[Trait("Feature", "FeatureName")] // PowerQuery, DataModel, Tables, PivotTables, Ranges, Connections, Parameters, Worksheets
[Trait("RequiresExcel", "true")]
public partial class FeatureCommandsTests : IClassFixture<TempDirectoryFixture>
{
    private readonly IFeatureCommands _commands;
    private readonly string _tempDir;

    public FeatureCommandsTests(TempDirectoryFixture fixture)
    {
        _commands = new FeatureCommands();
        _tempDir = fixture.TempDir;
    }

    [Fact]
    public async Task Operation_Scenario_ExpectedResult()
    {
        // Arrange - Each test gets unique file
        var testFile = CoreTestHelper.CreateUniqueTestFile(
            nameof(FeatureCommandsTests),
            nameof(Operation_Scenario_ExpectedResult),
            _tempDir,
            ".xlsx"); // Use ".xlsm" for VBA tests

        // Act
        await using var batch = await ExcelSession.BeginBatchAsync(testFile);
        var result = await _commands.OperationAsync(batch, args);

        // Assert - Verify actual Excel state, not just success flag
        Assert.True(result.Success, $"Operation failed: {result.ErrorMessage}");

        // Verify object exists/updated in Excel (round-trip validation)
        var verifyResult = await _commands.ListAsync(batch);
        Assert.Contains(verifyResult.Items, i => i.Name == "Expected");

        // No Save unless testing persistence (see examples below)
    }
}
```
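The template assumes a `TempDirectoryFixture` that provides one temp directory per test class. The real fixture lives in the test project; the sketch below only illustrates the usual xUnit shape (directory naming and cleanup behavior are assumptions):
```csharp
using System;
using System.IO;

// Hypothetical sketch - the repo's actual TempDirectoryFixture may differ in details.
public sealed class TempDirectoryFixture : IDisposable
{
    // One unique directory per test class; xUnit creates a single fixture instance
    // for all tests in the class (IClassFixture<T>).
    public string TempDir { get; } =
        Path.Combine(Path.GetTempPath(), "ExcelMcpTests", Guid.NewGuid().ToString("N"));

    public TempDirectoryFixture() => Directory.CreateDirectory(TempDir);

    public void Dispose()
    {
        // Best-effort cleanup after the last test in the class finishes.
        try { Directory.Delete(TempDir, recursive: true); }
        catch (IOException) { /* locked files are left for OS temp cleanup */ }
    }
}
```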
## Essential Rules
### File Isolation
- ✅ Each test creates unique file via `CoreTestHelper.CreateUniqueTestFile()`
- ❌ **NEVER** share test files between tests
- ✅ Use `.xlsm` for VBA tests, `.xlsx` otherwise
### Assertions
- ✅ Binary assertions: `Assert.True(result.Success, $"Reason: {result.ErrorMessage}")`
- ❌ **NEVER** "accept both" patterns (see the sketch after this list)
- ✅ **ALWAYS verify actual Excel state** after create/update operations
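A minimal sketch of the difference, reusing the command/result shapes from the template above (names like `CreateAsync` and `ErrorMessage` follow the template and may differ per feature):
```csharp
// ❌ WRONG: "accept both" - passes whether or not the sheet was actually created
var result = await _commands.CreateAsync(batch, "Report");
Assert.True(result.Success || result.ErrorMessage != null); // always true, proves nothing

// ✅ CORRECT: binary assertion with diagnostics, then verify actual Excel state
Assert.True(result.Success, $"Create failed: {result.ErrorMessage}");
var list = await _commands.ListAsync(batch);
Assert.Contains(list.Items, i => i.Name == "Report");
```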
### Diagnostic Output
- ✅ Use `ILogger` with `ITestOutputHelper` for diagnostic messages
- ✅ Pattern: Test constructor receives `ITestOutputHelper`, creates logger via `MartinCostello.Logging.XUnit` (see the sketch after this list)
- ✅ Pass logger to `ExcelBatch` constructor (requires `InternalsVisibleTo` for accessing internal ExcelBatch)
- ✅ Logger messages appear in test output automatically (success or failure)
- ❌ **NEVER** use `Console.WriteLine()` - output is suppressed by test runner
- ❌ **NEVER** use `Debug.WriteLine()` - only visible with debugger attached, not in test output
- ❌ **NEVER** write to files for diagnostics - use proper logging infrastructure
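A sketch of that wiring, assuming the `ToLogger<T>()` extension from `MartinCostello.Logging.XUnit`; how the logger reaches the internal `ExcelBatch` is repo-specific, so that step is shown only as a comment:
```csharp
using Microsoft.Extensions.Logging;
using Xunit;
using Xunit.Abstractions;

public partial class FeatureCommandsTests : IClassFixture<TempDirectoryFixture>
{
    private readonly ILogger _logger;
    private readonly string _tempDir;

    // xUnit injects ITestOutputHelper alongside class fixtures.
    public FeatureCommandsTests(TempDirectoryFixture fixture, ITestOutputHelper output)
    {
        _tempDir = fixture.TempDir;

        // Provided by MartinCostello.Logging.XUnit: anything logged through this
        // ILogger shows up in the test output, for passing and failing tests alike.
        _logger = output.ToLogger<FeatureCommandsTests>();

        // Hand the logger to the internal ExcelBatch via InternalsVisibleTo;
        // the exact constructor overload is project-specific, so it is not shown here.
    }
}
```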
### Save
- ❌ **FORBIDDEN** unless explicitly testing persistence
- ✅ **ONLY** for round-trip tests: Create → Save → Re-open → Verify
- ❌ **NEVER** call in middle of test (breaks subsequent operations)
- See CRITICAL-RULES.md Rule 14 for details
**Examples:**
```csharp
// ❌ WRONG: Save in middle breaks next operation
await _commands.CreateAsync(batch, "Sheet1");
await batch.Save(); // ❌ Breaks subsequent operations!
await _commands.RenameAsync(batch, "Sheet1", "New"); // FAILS!
// ✅ CORRECT: Save only at end
await _commands.CreateAsync(batch, "Sheet1");
await _commands.RenameAsync(batch, "Sheet1", "New");
await batch.Save(); // ✅ After all operations
// ✅ CORRECT: Persistence test with re-open
await using var batch1 = await ExcelSession.BeginBatchAsync(testFile);
await _commands.CreateAsync(batch1, "Sheet1");
await batch1.Save(); // Save for persistence
await using var batch2 = await ExcelSession.BeginBatchAsync(testFile);
var list = await _commands.ListAsync(batch2);
Assert.Contains(list.Items, i => i.Name == "Sheet1"); // ✅ Verify persisted
```
### Batch Pattern
- All Core commands accept `IExcelBatch batch` as first parameter
- Use `await using var batch` for automatic disposal
- **NEVER** call `Save()` in middle of test
### Required Traits
- `[Trait("Category", "Integration")]` - All tests are integration tests
- `[Trait("Speed", "Medium|Slow")]`
- `[Trait("Layer", "Core|CLI|McpServer|ComInterop|Diagnostics")]`
- `[Trait("Feature", "<feature-name>")]` - See valid values below
- `[Trait("RequiresExcel", "true")]` - All integration tests require Excel
- `[Trait("RunType", "OnDemand")]` - For session/lifecycle tests and diagnostic tests (slow, run only when explicitly requested)
### Valid Feature Values
- **PowerQuery** - Power Query M code operations
- **DataModel** - Data Model / DAX operations
- **Tables** - Excel Table (ListObject) operations
- **PivotTables** - PivotTable operations
- **Ranges** - Range data operations
- **Connections** - Connection management
- **Parameters** - Named range parameters
- **Worksheets** - Worksheet lifecycle
- **VBA** - VBA script operations
- **VBATrust** - VBA trust detection/configuration
## Test Execution
**⚠️ CRITICAL: Always specify the test project explicitly to avoid running all test projects!**
### Core.Tests (Business Logic)
```bash
# Development (fast - excludes VBA)
dotnet test tests/ExcelMcp.Core.Tests/ExcelMcp.Core.Tests.csproj --filter "Category=Integration&RunType!=OnDemand&Feature!=VBA&Feature!=VBATrust"
# Diagnostic tests (validate patterns, slow ~20s each)
dotnet test tests/ExcelMcp.Core.Tests/ExcelMcp.Core.Tests.csproj --filter "RunType=OnDemand&Layer=Diagnostics"
# VBA tests (manual only - requires VBA trust)
dotnet test tests/ExcelMcp.Core.Tests/ExcelMcp.Core.Tests.csproj --filter "(Feature=VBA|Feature=VBATrust)&RunType!=OnDemand"
# Specific feature
dotnet test tests/ExcelMcp.Core.Tests/ExcelMcp.Core.Tests.csproj --filter "Feature=PowerQuery"
```
### ComInterop.Tests (Session/Batch Infrastructure)
```bash
# Session/batch changes (MANDATORY - see CRITICAL-RULES.md Rule 3)
dotnet test tests/ExcelMcp.ComInterop.Tests/ExcelMcp.ComInterop.Tests.csproj --filter "RunType=OnDemand"
```
### McpServer.Tests (End-to-End Tool Tests)
```bash
# All MCP tool tests
dotnet test tests/ExcelMcp.McpServer.Tests/ExcelMcp.McpServer.Tests.csproj
# Specific tool
dotnet test tests/ExcelMcp.McpServer.Tests/ExcelMcp.McpServer.Tests.csproj --filter "FullyQualifiedName~PowerQueryTool"
```
### CLI.Tests (Command-Line Interface)
```bash
# All CLI tests
dotnet test tests/ExcelMcp.CLI.Tests/ExcelMcp.CLI.Tests.csproj
# Specific command
dotnet test tests/ExcelMcp.CLI.Tests/ExcelMcp.CLI.Tests.csproj --filter "FullyQualifiedName~PowerQuery"
```
### Run Specific Test by Name
```bash
# Use full project path + filter
dotnet test tests/ExcelMcp.Core.Tests/ExcelMcp.Core.Tests.csproj --filter "FullyQualifiedName~TestMethodName"
```
## Round-Trip Validation Pattern
**Always verify actual Excel state after operations:**
```csharp
// ✅ CREATE → Verify exists
var createResult = await _commands.CreateAsync(batch, "TestTable");
Assert.True(createResult.Success);
var listResult = await _commands.ListAsync(batch);
Assert.Contains(listResult.Items, i => i.Name == "TestTable"); // ✅ Proves it exists!
// ✅ UPDATE → Verify changes applied
var updateResult = await _commands.RenameAsync(batch, "TestTable", "NewName");
Assert.True(updateResult.Success);
var viewResult = await _commands.GetAsync(batch, "NewName");
Assert.Equal("NewName", viewResult.Name); // ✅ Proves rename worked!
// ✅ DELETE → Verify removed
var deleteResult = await _commands.DeleteAsync(batch, "NewName");
Assert.True(deleteResult.Success);
var finalList = await _commands.ListAsync(batch);
Assert.DoesNotContain(finalList.Items, i => i.Name == "NewName"); // ✅ Proves deletion!
```
### Content Replacement Validation (CRITICAL)
**For operations that replace content (Update, Set, etc.), ALWAYS verify content was replaced, not merged/appended:**
```csharp
// ❌ WRONG: Only checks operation completed
var updateResult = await _commands.UpdateAsync(batch, queryName, newFile);
Assert.True(updateResult.Success); // Doesn't prove content was replaced!
// ✅ CORRECT: Verify content was replaced, not merged
var updateResult = await _commands.UpdateAsync(batch, queryName, newFile);
Assert.True(updateResult.Success);
var viewResult = await _commands.ViewAsync(batch, queryName);
Assert.Equal(expectedContent, viewResult.Content); // ✅ Content matches expected
Assert.DoesNotContain("OldContent", viewResult.Content); // ✅ Old content gone!
// ✅ EVEN BETTER: Test multiple sequential updates (exposes merging bugs)
await _commands.UpdateAsync(batch, queryName, file1);
await _commands.UpdateAsync(batch, queryName, file2);
var viewResult = await _commands.ViewAsync(batch, queryName);
Assert.Equal(file2Content, viewResult.Content); // ✅ Only file2 content present
Assert.DoesNotContain(file1Content, viewResult.Content); // ✅ file1 content gone!
```
**Why Critical:** A bug report showed that UpdateAsync was **merging** M code instead of replacing it. Tests passed because they only checked `Success = true`, not the actual content. The bug compounded with each update, progressively corrupting the queries.
**Lesson:** "Operation completed" ≠ "Operation did the right thing". Always verify the actual result.
## Common Mistakes
| Mistake | Fix |
|---------|-----|
| Shared test file | Each test creates unique file |
| Only test success flag | Verify actual Excel state |
| Save before assertions | Remove Save entirely |
| Save in middle of test | Only at end or in persistence test |
| Manual IDisposable | Use `IClassFixture<TempDirectoryFixture>` |
| .xlsx for VBA tests | Use `.xlsm` |
| "Accept both" assertions | Binary assertions only |
| Missing Feature trait | Add from valid feature list above |
## When Tests Fail
1. Run individually: `--filter "FullyQualifiedName=Namespace.Class.Method"`
2. Check file isolation (unique files?)
3. Check assertions (binary, not conditional?)
4. Check Save (removed unless persistence test?)
5. Verify Excel state (not just success flag?)
**Full checklist**: See CRITICAL-RULES.md Rule 12
---
## LLM Integration Tests
**Location**: `tests/ExcelMcp.McpServer.LLM.Tests/`
**Purpose**: Validate that AI agents use Excel MCP Server tools correctly, via [agent-benchmark](https://github.com/mykhaliev/agent-benchmark).
### When to Run
- **Manual/on-demand only** - Not part of CI/CD
- After changing tool descriptions or adding new tools
- To validate LLM behavior patterns (e.g., incremental updates vs rebuild)
### Running LLM Tests
```powershell
# Navigate to the LLM tests directory first
cd d:\source\mcp-server-excel\tests\ExcelMcp.McpServer.LLM.Tests
# Run specific scenario (most common)
.\Run-LLMTests.ps1 -Scenario excel-powerquery-datamodel-test.yaml -Verbose
# Build MCP server first, then run tests
.\Run-LLMTests.ps1 -Scenario excel-file-worksheet-test.yaml -Build
# Run all scenarios in Scenarios folder
.\Run-LLMTests.ps1 -Build
```
**Important:** Always `cd` to the LLM tests directory first, then run `.\Run-LLMTests.ps1`.
### Prerequisites
- `AZURE_OPENAI_ENDPOINT` environment variable
- Windows desktop with Excel installed
- agent-benchmark (auto-downloaded on first run, or use go-run mode)
### Configuration
The test runner uses `llm-tests.config.json` for defaults. Create `llm-tests.config.local.json` (git-ignored) for personal settings:
```json
{
"$schema": "./llm-tests.config.schema.json",
"agentBenchmarkPath": "../../../../agent-benchmark",
"agentBenchmarkMode": "go-run",
"verbose": false,
"build": false
}
```
**Modes:**
- `"executable"` - Uses pre-built `agent-benchmark.exe`
- `"go-run"` - Uses `go run .` from agent-benchmark source (recommended for development)
### Template Variables
Test YAML files support these template variables (substituted by Run-LLMTests.ps1):
| Variable | Description |
|----------|-------------|
| `{{TEST_DIR}}` | Directory containing the test YAML file (Scenarios folder) |
| `{{TEST_RESULTS_PATH}}` | Path to TestResults folder for output files |
| `{{TEMP_DIR}}` | System temp directory |
| `{{SERVER_COMMAND}}` | Full path to MCP server executable |
| `{{AZURE_OPENAI_ENDPOINT}}` | Azure OpenAI endpoint from environment |
| `{{randomValue type='UUID'}}` | Generates unique UUID (handled by agent-benchmark) |
**Path Notes:**
- Use `{{TEST_DIR}}/../Fixtures/` to access test data files in Fixtures folder
- Use `{{TEST_RESULTS_PATH}}/filename.xlsx` to save Excel files alongside reports
### Test Scenarios
| Scenario | Tools | Purpose |
|----------|-------|---------|
| `excel-file-worksheet-test.yaml` | excel_file, excel_worksheet | File lifecycle, worksheet operations |
| `excel-range-test.yaml` | excel_range | 2D array format, get/set values |
| `excel-table-test.yaml` | excel_table | Table creation, structured data |
| `excel-powerquery-datamodel-test.yaml` | excel_powerquery, excel_datamodel, excel_pivottable | Full BI workflow |
### Writing LLM Tests
- Use natural language prompts (don't mention tool names)
- Consolidate multiple steps into single prompts for token optimization
- Each test uses `{{randomValue type='UUID'}}` in file paths for isolation
- Enable `clarification_detection: enabled: true` on agents
- Assert `no_clarification_questions` to verify agent doesn't ask questions
**Assertion Types:**
- `no_hallucinated_tools` - Agent only uses available tools
- `no_clarification_questions` - Agent acts without asking questions (requires clarification_detection)
- `tool_called` - Verifies specific tool was called
- `tool_param_equals` - Verifies tool parameter value
- `output_regex` - Matches pattern in final output
### Test Results
Reports are generated in `TestResults/`:
- `<scenario-name>-report.html` - Visual HTML report
- `<scenario-name>-report.json` - Machine-readable JSON
- `*.xlsx` - Generated Excel files (when using TEST_RESULTS_PATH)
---
## Agent-Benchmark Development
**Location**: `d:\source\agent-benchmark` (separate repository)
When developing agent-benchmark alongside mcp-server-excel tests:
### Running Agent-Benchmark Tests
```powershell
# Run all agent-benchmark tests
cd d:\source\agent-benchmark
go test ./... -v
# Run specific test
go test ./test -run "TestGenerateContentWithConfig" -v
# Run tests matching pattern
go test ./test -run "TestClarification" -v
```
### Building After Changes
```powershell
# Rebuild executable (if using executable mode)
cd d:\source\agent-benchmark
go build .
# Or just use go-run mode (auto-rebuilds)
# Set in llm-tests.config.local.json: "agentBenchmarkMode": "go-run"
```
### Key Test Files
| File | Purpose |
|------|---------|
| `test/agent_test.go` | Agent execution tests (GenerateContentWithConfig) |
| `test/model_test.go` | Assertion evaluator tests |
| `test/engine_test.go` | Test engine/orchestration tests |
| `test/httpclient_test.go` | HTTP client and rate limiting tests |
See `tests/ExcelMcp.McpServer.LLM.Tests/README.md` for complete documentation.