# Comprehensive Guide to CTE Unit Testing in dbt
## Table of Contents
1. [The Problem: Testing Complex SQL CTEs](#the-problem)
2. [The Solution: Automated CTE Test Generation](#the-solution)
3. [Technical Implementation](#technical-implementation)
4. [CTE Query Tool: Debug Models Step-by-Step](#cte-query-tool)
5. [Usage Guide](#usage-guide)
6. [Design Decisions](#design-decisions)
7. [Edge Cases and Robustness](#edge-cases)
8. [Benefits and Trade-offs](#benefits-tradeoffs)
9. [Proposal for dbt Standard](#proposal-for-dbt-standard)
---
## The Problem: Testing Complex SQL CTEs {#the-problem}
### Why Test CTEs?
Modern dbt models often contain multiple CTEs (Common Table Expressions) forming complex transformation pipelines:
```sql
with customers as (
    select * from {{ ref('stg_customers') }}
),

orders as (
    select * from {{ ref('stg_orders') }}
),

customer_agg as (
    select
        customer_id,
        count(*) as order_count,
        sum(amount) as total_amount
    from orders
    group by customer_id
),

final as (
    select
        c.customer_id,
        c.first_name,
        coalesce(ca.order_count, 0) as number_of_orders,
        coalesce(ca.total_amount, 0) as lifetime_value
    from customers c
    left join customer_agg ca on c.customer_id = ca.customer_id
)

select * from final
```
### The Testing Challenge
**Problem 1: Black Box Testing**
Traditional dbt unit tests only test the final output. If a test fails, you don't know which CTE caused the failure.
**Problem 2: Complex Fixture Requirements**
To test the final output, you need fixtures for ALL upstream dependencies:
- `ref('stg_customers')`
- `ref('stg_orders')`
- Plus any other refs/sources in the entire chain
**Problem 3: Limited Test Isolation**
You can't easily test intermediate transformations (`customer_agg`) without running the entire model.
**Problem 4: Poor Developer Experience**
When refactoring a CTE deep in the chain, you have to:
1. Run the entire model
2. Debug which CTE broke
3. Add print statements or manually extract CTEs
4. Repeat until fixed
### Real-World Impact
In production dbt projects:
- Models with 5-10 CTEs are common
- Each CTE may represent complex business logic
- Bugs in intermediate CTEs are hard to debug
- Refactoring becomes risky without granular tests
- New team members struggle to understand complex models
---
## The Solution: Automated CTE Test Generation {#the-solution}
### Core Concept
**Idea**: Treat each CTE as a testable unit by automatically generating isolated test models.
**How it works**:
1. Developer writes a unit test targeting a specific CTE using `model: customers::customer_agg` syntax
2. Generator extracts everything UP TO that CTE (preserving all upstream dependencies)
3. Generator creates a new model that selects from just that CTE
4. Generator creates an enabled unit test for the isolated model
5. Tests run automatically alongside regular tests
### The `::` Convention
We use `::` syntax to denote CTE-level testing:
```yaml
unit_tests:
  - name: test_customer_aggregation
    model: customers::customer_agg  # model_name::cte_name
    config:
      cte_test: true   # Marks test for automatic generation
      enabled: false   # Disable original (runs via generated version)
    given:
      - input: ref('stg_orders')
        rows:
          - {customer_id: 1, order_id: 100, amount: 50}
          - {customer_id: 1, order_id: 101, amount: 75}
    expect:
      rows:
        - {customer_id: 1, order_count: 2, total_amount: 125}
```
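Parsing the `::` spec is a one-liner. A minimal helper to illustrate (the function name and the `None`-model convention for the mock-input form are illustrative, not the generator's actual API):

```python
def parse_cte_spec(spec: str):
    """Split a '::' test spec into (model_name, cte_name).

    'customers::customer_agg' -> ('customers', 'customer_agg')
    '::customer_agg'          -> (None, 'customer_agg')  # mock-input form
    """
    if "::" not in spec:
        raise ValueError(f"not a CTE spec: {spec!r}")
    model_name, cte_name = spec.split("::", 1)
    return (model_name or None), cte_name
```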
### Generated Artifacts
**Generated Model** (`models/marts/__cte_tests/customers__customer_agg__abc123.sql`):
```sql
-- sqlfluff:disable
with customers as (
    select * from {{ ref('stg_customers') }}
),

orders as (
    select * from {{ ref('stg_orders') }}
),

customer_agg as (
    select
        customer_id,
        count(*) as order_count,
        sum(amount) as total_amount
    from orders
    group by customer_id
)

select * from customer_agg
```
**Generated Test** (`unit_tests/marts/__cte_tests/customers__customer_agg__abc123_unit_tests.yml`):
```yaml
version: 2
unit_tests:
  - name: test_customer_aggregation
    model: customers__customer_agg__abc123  # Points to generated model
    # NOTE: No 'enabled: false' here - generated test runs normally
    given:
      - input: ref('stg_orders')
        rows:
          - {customer_id: 1, order_id: 100, amount: 50}
          - {customer_id: 1, order_id: 101, amount: 75}
      - input: ref('stg_customers')  # Auto-added empty fixture
        rows: []
    expect:
      rows:
        - {customer_id: 1, order_count: 2, total_amount: 125}
```
### Workflow Integration
**Developer workflow**:
1. Write CTE test in original test file (disabled via `enabled: false`)
2. Run `test_models` (with `EXPERIMENTAL_FEATURES=true`)
3. Generator automatically creates isolated test
4. All tests run together (regular + generated)
5. Clear feedback on which CTE failed
**No manual steps** - completely automated!
---
## Technical Implementation {#technical-implementation}
### Architecture Overview
```
┌─────────────────────────────────────┐
│ test_models tool invoked │
└──────────────┬──────────────────────┘
│
├─ If EXPERIMENTAL_FEATURES=true
│
v
┌─────────────────────────────────────┐
│ CTE Generator Workflow │
├─────────────────────────────────────┤
│ 1. Scan unit_tests/**/*.yml │
│ 2. Find tests with cte_test: true │
│ 3. Parse model::cte_name spec │
│ 4. Extract CTE from source model │
│ 5. Generate isolated model │
│ 6. Generate enabled test │
│ 7. Clean up old generated files │
└──────────────┬──────────────────────┘
│
v
┌─────────────────────────────────────┐
│ Run dbt test (includes generated) │
└─────────────────────────────────────┘
```
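The scan-and-select portion of this workflow (steps 2-3) reduces to a filter over each parsed test file. A minimal sketch, assuming the YAML has already been loaded into a dict (the function name and returned shape are illustrative):

```python
def select_cte_tests(unit_test_doc: dict) -> list:
    """Pick tests flagged cte_test: true and split their model spec
    into (model, cte) -- steps 2-3 of the generator workflow."""
    selected = []
    for test in unit_test_doc.get("unit_tests", []):
        config = test.get("config", {})
        model_spec = test.get("model", "")
        if config.get("cte_test") and "::" in model_spec:
            model_name, cte_name = model_spec.split("::", 1)
            selected.append({"test": test["name"], "model": model_name, "cte": cte_name})
    return selected
```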
### Key Components
**1. SQL Parsing with Comment Awareness**
The generator handles:
- **SQL line comments**: `-- this is a comment`
- **SQL block comments**: `/* multi-line comment */`
- **Jinja comments**: `{# template comment #}`
- **Nested comments**: Block comments within block comments
- **Comments containing parens**: `/* count(distinct customer_id) */`
**Critical for robustness**: When finding CTEs or counting parentheses, comments are skipped:
```python
def is_position_in_comment(sql: str, pos: int) -> bool:
    """Check if a position in SQL is inside a comment."""
    # Check line comment: is there '--' before pos on same line?
    line_start = sql.rfind("\n", 0, pos) + 1
    line_content = sql[line_start:pos]
    if "--" in line_content:
        return True
    # Track SQL block comments (/* */) and Jinja comments ({# #})
    block_comment_depth = 0
    jinja_comment_depth = 0
    i = 0
    while i < pos:
        if i + 1 < len(sql):
            two_char = sql[i : i + 2]
            if two_char == "/*":
                block_comment_depth += 1
                i += 2
                continue
            elif two_char == "*/":
                block_comment_depth -= 1
                i += 2
                continue
            elif two_char == "{#":
                jinja_comment_depth += 1
                i += 2
                continue
            elif two_char == "#}":
                jinja_comment_depth -= 1
                i += 2
                continue
        i += 1
    return block_comment_depth > 0 or jinja_comment_depth > 0
```
**Why this matters**: During refactoring, developers often comment out old CTE versions:
```sql
-- Old version (commented out during refactoring)
-- customer_agg as (
--     select customer_id, count(*) as orders
--     from orders
--     group by customer_id
-- )

/* Trying different approach - keeping old version for reference
customer_agg as (
    select customer_id, sum(amount) as total
    from orders
    group by customer_id
)
*/

{# Yet another version - testing different logic
customer_agg as (
    select customer_id, avg(amount) as avg_order
    from orders
    group by customer_id
)
#}

-- Active version
customer_agg as (
    select
        customer_id,
        count(*) as order_count,
        sum(amount) as total_amount
    from orders
    group by customer_id
)
```
The generator finds ALL matches but **only uses the first non-commented version**.
**2. Paren Counting with String and Comment Awareness**
To extract a CTE, we need to find its matching closing parenthesis. But we must skip:
- Parens inside strings: `'Product (Large)'`
- Parens inside comments: `/* count(*) */`
- Both single and double quotes (MySQL/SQLite compatibility)
```python
# paren_pos is the index of the CTE's opening '(' found earlier;
# scan forward for its matching ')'
paren_count = 1
i = paren_pos + 1
in_string = False
string_char = None
in_line_comment = False
in_block_comment = False

while i < len(sql) and paren_count > 0:
    char = sql[i]
    next_char = sql[i + 1] if i + 1 < len(sql) else ""

    # Handle line comments: -- until newline
    if not in_string and not in_block_comment and char == "-" and next_char == "-":
        in_line_comment = True
        i += 2
        continue
    if in_line_comment:
        if char == "\n":
            in_line_comment = False
        i += 1
        continue

    # Handle block comments: /* until */
    if not in_string and not in_line_comment and char == "/" and next_char == "*":
        in_block_comment = True
        i += 2
        continue
    if in_block_comment:
        if char == "*" and next_char == "/":
            in_block_comment = False
            i += 2
        else:
            i += 1
        continue

    # Handle string literals (both single and double quotes)
    if char in ('"', "'"):
        if not in_string:
            in_string = True
            string_char = char
        elif char == string_char:
            in_string = False
            string_char = None

    # Count parens only outside strings and comments
    if not in_string and not in_line_comment and not in_block_comment:
        if char == "(":
            paren_count += 1
        elif char == ")":
            paren_count -= 1
    i += 1
```
**3. CTE Mocking with `::` Syntax**
**Problem**: When testing a downstream CTE, you may want to mock intermediate CTEs with specific data.
**Solution**: Use `::cte_name` in the `input` field:
```yaml
unit_tests:
  - name: test_final_with_mocked_aggregation
    model: customers::final
    config:
      cte_test: true
      enabled: false
    given:
      - input: ::customer_agg  # Mock this CTE
        rows:
          - {customer_id: 1, order_count: 5, total_amount: 500}
      - input: ref('stg_customers')
        rows:
          - {customer_id: 1, first_name: 'Alice', last_name: 'Smith'}
    expect:
      rows:
        - {customer_id: 1, first_name: 'Alice', number_of_orders: 5, lifetime_value: 500}
```
**Generator behavior**:
1. Extracts all CTEs up to `final`
2. Finds `customer_agg` CTE definition
3. Replaces its content with mocked fixture data
4. Preserves all other CTEs unchanged
**Generated model**:
```sql
with customers as (
    select * from {{ ref('stg_customers') }}
),

orders as (
    select * from {{ ref('stg_orders') }}
),

customer_agg AS (
    SELECT 1 as customer_id, 5 as order_count, 500 as total_amount
),

final as (
    select
        c.customer_id,
        c.first_name,
        coalesce(ca.order_count, 0) as number_of_orders,
        coalesce(ca.total_amount, 0) as lifetime_value
    from customers c
    left join customer_agg ca on c.customer_id = ca.customer_id
)

select * from final
```
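The mocked `customer_agg` body above can be rendered from the fixture rows with a small helper. A sketch, assuming fixtures arrive as lists of dicts (the function name and the zero-row fallback are illustrative):

```python
def build_mock_cte_body(rows: list) -> str:
    """Render unit-test fixture rows as a SELECT ... UNION ALL ... body,
    like the mocked customer_agg CTE shown above."""
    if not rows:
        # Empty fixture: a SELECT that yields zero rows
        return "SELECT NULL WHERE 1 = 0"
    selects = []
    for row in rows:
        exprs = []
        for col, val in row.items():
            if val is None:
                exprs.append(f"NULL as {col}")
            elif isinstance(val, (int, float)) and not isinstance(val, bool):
                exprs.append(f"{val} as {col}")
            else:
                escaped = str(val).replace("'", "''")
                exprs.append(f"'{escaped}' as {col}")
        selects.append("SELECT " + ", ".join(exprs))
    return " UNION ALL ".join(selects)
```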
**4. Auto-Fixture Generation**
**Problem**: The generated model may reference additional `ref()` or `source()` calls that weren't in the original test fixtures.
**Solution**: Generator scans the generated model and auto-adds empty fixtures:
```python
# Find all ref() calls
ref_pattern = r"ref\(['\"](\w+)['\"]\)"
refs = re.findall(ref_pattern, generated_sql)

# Find all source() calls
source_pattern = r"source\(['\"](\w+)['\"],\s*['\"](\w+)['\"]\)"
sources = re.findall(source_pattern, generated_sql)

# Auto-add missing fixtures
for ref_name in refs:
    ref_input = f"ref('{ref_name}')"
    if ref_input not in existing_inputs:
        test["given"].append({"input": ref_input, "rows": []})
```
**5. Hash-Based Naming for Uniqueness**
Each generated test gets a unique hash suffix to avoid collisions:
```python
test_hash = hashlib.md5(test_name.encode()).hexdigest()[:6]
gen_model_name = f"{base_model}__{cte_name}__{test_hash}"
```
Example: `customers__customer_agg__21ff0f.sql`
**Why**: Multiple tests can target the same CTE with different fixtures. The hash ensures each gets a unique model.
**6. Jinja Preservation**
**Critical**: The generator preserves ALL Jinja syntax:
- Inside CTEs: `{{ ref('model') }}`, `{{ var('setting') }}`, `{{ source('src', 'tbl') }}`
- Wrapping CTEs: `{% if condition %} cte as (...) {% endif %}`
- Configuration: `{{ config(...) }}`
**Why it works**: The generator never evaluates Jinja itself. It operates on the raw template text and copies it through verbatim, and dbt then resolves the Jinja in the generated model exactly as it would in the original.
```sql
{% if true %}
orders as (
    select * from {{ ref('stg_orders') }}
    where status = '{{ var("order_status") }}'
),
{% endif %}
```
This works because:
1. The generator copies the Jinja text into the generated model unchanged
2. `dbt test` compiles the generated model, resolving the Jinja at that point
3. Same Jinja → Same SQL → Consistent behavior
---
## CTE Query Tool: Debug Models Step-by-Step {#cte-query-tool}
### The Debugging Problem
When a dbt model fails or produces unexpected results, developers face a painful debugging workflow:
1. **Manual CTE extraction**: Copy-paste CTE definitions from the model file
2. **Dependency resolution**: Figure out which upstream CTEs are needed and extract those too
3. **Template resolution**: Manually replace `{{ ref() }}` and `{{ source() }}` with actual table names
4. **Query composition**: Wrap everything in a runnable query
5. **Result inspection**: Execute and analyze output
6. **Fixture creation**: If building unit tests, manually shape test data to match the CTE output
This process is error-prone, time-consuming, and breaks flow state. It's especially painful when:
- The CTE is deep in the transformation chain (requires many upstream CTEs)
- The model uses complex Jinja templating
- You're creating unit test fixtures and need realistic data shapes
### The Solution: `query_database` with CTE Extraction
The `query_database` tool can extract and query individual CTEs directly, handling all the complexity automatically:
```python
# Query a specific CTE with optional filtering
query_database(
    cte_name="customer_agg",
    model_name="customers",
    sql="SELECT * FROM __cte__ WHERE order_count > 5 LIMIT 10"
)
```
**What happens**:
1. Extracts `customer_agg` CTE and all its upstream dependencies from `customers.sql`
2. Generates executable SQL: `WITH upstream_ctes..., customer_agg AS (...) SELECT * FROM customer_agg WHERE order_count > 5 LIMIT 10`
3. Runs the query through `dbt show` (resolves all `{{ ref() }}` and `{{ source() }}` automatically)
4. Returns the results formatted and ready to inspect
### Natural Language Workflow
**Via Copilot Chat**:
> **You**: "Show me the customer_agg CTE from customers with order_count > 5"
>
> **Copilot**: *Executes query and displays results*
> ```
> customer_id | order_count | total_amount
> ------------|-------------|-------------
> 123 | 10 | 1250.00
> 456 | 8 | 890.50
> ```
> **You**: "Great, now use that to shape my unit test fixture"
>
> **Copilot**: *Generates realistic test data based on the actual CTE output structure*
> ```yaml
> given:
>   - input: ref('stg_orders')
>     rows:
>       - {customer_id: 123, order_id: 1, amount: 125.00}
>       - {customer_id: 123, order_id: 2, amount: 125.00}
>       # ... 8 more rows to match order_count: 10
> ```
### Key Features
**1. Automatic Dependency Resolution**
No need to manually figure out which CTEs are needed:
```sql
-- Your model has this chain:
with base_customers as (...),
enriched_customers as (select * from base_customers),
customer_agg as (select * from enriched_customers)
-- Just query the CTE you care about:
query_database(cte_name="customer_agg", model_name="customers", sql="SELECT * FROM __cte__")
-- Tool automatically includes base_customers and enriched_customers
```
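Dependency tracing can be sketched as a depth-first walk over the CTE bodies. A minimal illustration (the function name and the word-boundary matching heuristic are assumptions; the real extraction is comment- and string-aware as described elsewhere in this guide):

```python
import re

def upstream_ctes(cte_bodies: dict, target: str) -> list:
    """Return target plus every CTE it transitively references, ordered
    so each CTE appears after its dependencies (ready for a WITH clause)."""
    ordered = []

    def visit(name):
        if name in ordered:
            return
        # Recurse into any other CTE mentioned in this CTE's body
        for other in cte_bodies:
            if other != name and re.search(rf"\b{re.escape(other)}\b", cte_bodies[name]):
                visit(other)
        ordered.append(name)

    visit(target)
    return ordered
```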
**2. Template Resolution via dbt**
All Jinja templating is handled by dbt's own compilation:
```sql
-- Your CTE uses refs and sources:
customer_agg as (
    select
        c.customer_id,
        count(o.order_id) as order_count
    from {{ ref('stg_customers') }} c
    left join {{ source('raw', 'orders') }} o
        on c.customer_id = o.customer_id
    group by c.customer_id
)
-- Tool resolves these automatically at query time - you don't need to know the actual table names
```
**3. Optional SQL Composition**
Apply filters, ordering, or limits without editing the model:
```python
# Just the CTE
query_database(cte_name="customer_agg", model_name="customers", sql="SELECT * FROM __cte__")
# With filtering
query_database(
    cte_name="customer_agg",
    model_name="customers",
    sql="SELECT * FROM __cte__ WHERE order_count > 10"
)

# With sorting and limiting
query_database(
    cte_name="customer_agg",
    model_name="customers",
    sql="SELECT * FROM __cte__ ORDER BY order_count DESC LIMIT 5"
)

# Complex filtering
query_database(
    cte_name="customer_agg",
    model_name="customers",
    sql="SELECT * FROM __cte__ WHERE order_count BETWEEN 5 AND 10 ORDER BY total_amount DESC"
)
```
**4. Fast Feedback Loop**
- First query: ~2s (manifest load)
- Subsequent queries: < 1s (warm manifest)
- No need to run the entire model
- No need to modify the source file
### Use Cases
#### Use Case 1: Debug Data Quality Issues
```python
# Model produces wrong totals - which CTE is the problem?
# Check first transformation:
query_database(cte_name="base_orders", model_name="orders", sql="SELECT * FROM __cte__ LIMIT 5")
# ✓ Looks good
# Check aggregation step:
query_database(cte_name="order_agg", model_name="orders", sql="SELECT * FROM __cte__ LIMIT 5")
# ❌ Missing some orders - found the bug!
```
#### Use Case 2: Understand Complex Logic
```python
# New team member exploring a 10-CTE model
# Start at the beginning:
query_database(cte_name="filtered_events", model_name="user_analytics", sql="SELECT * FROM __cte__")
# Walk through the chain:
query_database(cte_name="sessionized_events", model_name="user_analytics", sql="SELECT * FROM __cte__")
query_database(cte_name="session_metrics", model_name="user_analytics", sql="SELECT * FROM __cte__")
query_database(cte_name="user_summary", model_name="user_analytics", sql="SELECT * FROM __cte__")
# Each step shows the transformation clearly
```
#### Use Case 3: Create Realistic Test Fixtures
```python
# Building unit tests for customer_agg CTE
# Query actual data to understand the shape:
query_database(
    cte_name="customer_agg",
    model_name="customers",
    sql="SELECT * FROM __cte__ WHERE order_count IN (0, 1, 5, 10) LIMIT 20"
)
# Results show:
# - Customers with 0 orders (edge case)
# - Customers with 1 order (common case)
# - Customers with many orders (heavy user case)
# - Actual column names and data types
# Now write accurate fixtures:
# given:
# - input: ref('stg_orders')
# rows:
# - {customer_id: 1, order_id: 100, amount: 50.00} # Matches real schema
# - {customer_id: 1, order_id: 101, amount: 75.00}
```
#### Use Case 4: Validate Refactoring
```python
# Before refactoring customer_agg CTE, capture current output:
query_database(
    cte_name="customer_agg",
    model_name="customers",
    sql="SELECT * FROM __cte__ ORDER BY customer_id LIMIT 100",
    output_file="temp_auto/customer_agg_before.csv",
    output_format="csv"
)

# Refactor the CTE...

# After refactoring, compare:
query_database(
    cte_name="customer_agg",
    model_name="customers",
    sql="SELECT * FROM __cte__ ORDER BY customer_id LIMIT 100",
    output_file="temp_auto/customer_agg_after.csv",
    output_format="csv"
)
# Diff the CSVs - should be identical if refactoring was correct
```
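The final diff step is a few lines of stdlib Python. A sketch (function names are illustrative; rows are compared positionally, which is why both captures above use the same `ORDER BY`):

```python
import csv

def csv_rows(path: str) -> list:
    """Load one capture file as a list of row dicts."""
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def diff_captures(before_path: str, after_path: str) -> list:
    """Pair rows positionally and return the pairs that differ;
    an empty result means the refactoring preserved the output."""
    before, after = csv_rows(before_path), csv_rows(after_path)
    if len(before) != len(after):
        raise ValueError(f"row count changed: {len(before)} -> {len(after)}")
    return [(b, a) for b, a in zip(before, after) if b != a]
```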
### How It Works Under the Hood
The `query_database` tool uses the same CTE extraction logic as the unit test generator:
1. **Parse Model File**: Read the raw SQL from `models/{model_name}.sql`
2. **Extract Target CTE**: Find the CTE definition by name
3. **Trace Dependencies**: Recursively find all upstream CTEs referenced
4. **Generate Query**: Compose SQL with all dependencies and a final `SELECT * FROM {cte_name}`
5. **Apply User SQL**: Replace the final SELECT with your full query (e.g., `SELECT ... FROM __cte__ ...`)
6. **Execute via dbt**: Run through `dbt show --inline` (handles all template resolution)
7. **Return Results**: Parse JSON output and format for display
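Steps 4-5 above reduce to assembling a WITH clause from the extracted chain and substituting the `__cte__` placeholder. A minimal sketch (the signature is illustrative):

```python
def compose_cte_query(cte_defs: list, target: str, user_sql: str) -> str:
    """Wrap the extracted CTE chain in a WITH clause, then substitute
    the __cte__ placeholder in the user's SQL with the target CTE name."""
    with_clause = ",\n".join(f"{name} as (\n{body}\n)" for name, body in cte_defs)
    return f"with {with_clause}\n{user_sql.replace('__cte__', target)}"
```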
**Key Technical Details**:
- Uses the CTE generator from `cte_generator.py` (same as unit test generation)
- Parses raw SQL files (not compiled output) to extract CTEs
- Writes to temporary file (system temp dir to avoid dbt detecting it as a model)
- dbt's `show` command handles all `{{ ref() }}` and `{{ source() }}` resolution
- Supports all output formats: JSON (default), CSV, TSV
- Can save large results to files to avoid overwhelming the conversation
### Integration with Unit Testing Workflow
The CTE query tool complements the CTE unit test generator:
**Step 1: Explore with queries**
```python
# Understand what customer_agg produces:
query_database(cte_name="customer_agg", model_name="customers", sql="SELECT * FROM __cte__ LIMIT 10")
```
**Step 2: Write unit test based on real data**
```yaml
unit_tests:
  - name: test_customer_aggregation
    model: customers::customer_agg
    config:
      cte_test: true
      enabled: false
    given:
      - input: ref('stg_orders')
        rows:  # Shaped from query results above
          - {customer_id: 1, order_id: 100, amount: 50}
          - {customer_id: 1, order_id: 101, amount: 75}
    expect:
      rows:  # Expected output matches query structure
        - {customer_id: 1, order_count: 2, total_amount: 125}
```
**Step 3: Run tests** (generator creates isolated test automatically)
```python
test_models()
```
**The workflow loop**:
- **Query** → Understand current behavior
- **Test** → Document expected behavior
- **Refactor** → Change implementation
- **Query again** → Verify changes
- **Test again** → Catch regressions
### Performance Characteristics
**Query Execution Time**:
- First query: ~2s (manifest load + query execution)
- Subsequent queries: < 1s (warm manifest, just query execution)
- Much faster than running entire model (only executes the requested CTE chain)
**When to Use**:
- ✅ Debugging specific transformation steps
- ✅ Understanding complex models
- ✅ Creating realistic test fixtures
- ✅ Validating refactoring changes
- ✅ Exploring data shapes for new models
**When NOT to Use**:
- ❌ Final model output (just use `query_database(sql="SELECT * FROM {{ ref('model') }}")`)
- ❌ Simple pass-through CTEs (not worth the overhead)
- ❌ Models without CTEs (tool requires CTE syntax)
### Comparison to Alternatives
| Approach | Manual Extraction | dbt compile + manual query | query_database tool |
|----------|------------------|---------------------------|---------------------|
| **Extract CTE** | ✋ Manual copy-paste | ✋ Manual copy-paste | ✅ Automatic |
| **Resolve dependencies** | ✋ Manual tracing | ✋ Manual tracing | ✅ Automatic |
| **Resolve templates** | ✋ Manual replacement | ✅ dbt compile | ✅ dbt show (automatic) |
| **Apply filters** | ✋ Edit SQL manually | ✋ Edit SQL manually | ✅ `sql` parameter |
| **Cleanup** | ✋ Manual | ⚠️ Leaves compiled files | ✅ Automatic temp file cleanup |
| **Time to result** | 5-10 minutes | 2-3 minutes | < 10 seconds |
### Error Handling
**Clear errors for common issues**:
```python
# CTE doesn't exist
query_database(cte_name="nonexistent", model_name="customers", sql="SELECT * FROM __cte__")
# Error: "CTE 'nonexistent' not found in model 'customers'"
# Model doesn't exist
query_database(cte_name="customer_agg", model_name="nonexistent", sql="SELECT * FROM __cte__")
# Error: "Model file 'nonexistent.sql' not found in models directory"
# Invalid SQL composition
query_database(cte_name="customer_agg", model_name="customers", sql="SELECT * FROM __cte__ WHERE")
# Error: "Query execution failed: [detailed dbt parser error]"
```
### Future Enhancements
**Potential additions**:
- **CTE dependency visualization**: Show which CTEs depend on which
- **Diff mode**: Compare CTE output before/after changes automatically
- **Sample data injection**: Override upstream CTEs with sample data (like unit test mocking)
- **Performance profiling**: Show execution time per CTE in the chain
---
## Technical Implementation (continued)
### Data Type Handling
**CSV Parsing**:
```python
def parse_csv_fixture(csv_text: str) -> tuple[list[str], list[dict[str, Any]]]:
    """Parse CSV fixture into (columns, rows_as_dicts)."""
    sio = StringIO(csv_text.strip("\n"))
    reader = csv.DictReader(line for line in sio if line.strip() != "")
    columns = list(reader.fieldnames) if reader.fieldnames else []
    rows = [dict(row) for row in reader]
    return columns, rows
```
**Numeric Detection**:
```python
# Numeric literal: optional leading '-', digits, at most one '.'
if v.removeprefix("-").replace(".", "", 1).isdigit():
    # It's a number - use as-is
    exprs.append(f"{v} as {col}")
else:
    # It's a string - escape and quote
    escaped = v.replace("'", "''")
    exprs.append(f"'{escaped}' as {col}")
```
**Handles**:
- Integers: `123`
- Decimals: `45.67`
- Negatives: `-10`
- Strings with quotes: `O'Brien` → `'O''Brien'`
- NULL values: `None` → `NULL`
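As a self-contained sketch of that conversion (the helper name is illustrative):

```python
def sql_literal(v) -> str:
    """Render one CSV cell as a SQL literal per the rules above:
    numbers pass through, strings are quoted with '' escaping."""
    if v is None or v == "None":
        return "NULL"
    # Numeric literal: optional leading '-', digits, at most one '.'
    if v.removeprefix("-").replace(".", "", 1).isdigit():
        return v
    return "'" + v.replace("'", "''") + "'"
```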
### File Organization
```
examples/jaffle_shop/
├── models/
│ └── marts/
│ ├── customers.sql # Original model
│ └── __cte_tests/ # Generated models
│ ├── customers__customer_agg__21ff0f.sql
│ ├── customers__customer_agg__87f086.sql
│ └── customers__final__b810c3.sql
└── unit_tests/
└── marts/
├── customers_unit_tests.yml # Original tests (disabled)
└── __cte_tests/ # Generated tests (enabled)
├── customers__customer_agg__21ff0f_unit_tests.yml
├── customers__customer_agg__87f086_unit_tests.yml
└── customers__final__b810c3_unit_tests.yml
```
**Key decisions**:
- `__cte_tests/` subdirectory keeps generated files separate
- Cleaned up on every run (fresh generation)
- Hash suffixes prevent name collisions
- Mirror structure of original models
---
## Usage Guide {#usage-guide}
### Prerequisites
**CRITICAL**: This feature currently requires the dbt-core-mcp MCP server. It does NOT work with the standalone `dbt test` CLI.
### Basic Setup
**1. Install dbt-core-mcp** (if not already installed):
```bash
pip install dbt-core-mcp
```
**2. Enable experimental features** in MCP server config (`.vscode/mcp.json`):
```json
{
  "servers": {
    "dbt-core": {
      "env": {
        "EXPERIMENTAL_FEATURES": "true"
      }
    }
  }
}
```
**3. Write a CTE test** (`unit_tests/marts/customers_unit_tests.yml`):
```yaml
version: 2
unit_tests:
  - name: test_customer_aggregation
    model: customers::customer_agg
    config:
      cte_test: true   # Marks for generation
      enabled: false   # CRITICAL: Disable original test (runs via generated version)
    given:
      - input: ref('stg_orders')
        rows:
          - {customer_id: 1, order_id: 100, amount: 50}
          - {customer_id: 1, order_id: 101, amount: 75}
    expect:
      rows:
        - {customer_id: 1, order_count: 2, total_amount: 125}
```
**4. Run tests via MCP server**:
```python
# Via dbt-core-mcp MCP tool (ONLY way to trigger generation currently)
test_models()
```
**IMPORTANT**: Generation currently ONLY works through dbt-core-mcp MCP server. Running `dbt test` directly from CLI will NOT generate CTE tests - they'll simply be skipped (due to `enabled: false`).
**5. Tests run automatically** - no manual generation needed! The generator is integrated into the MCP server's `test_models` tool.
### Advanced Patterns
#### Pattern 1: Testing Edge Cases
```yaml
- name: test_customer_with_no_orders
  model: customers::final
  config:
    enabled: false
    cte_test: true
  given:
    - input: ref('stg_customers')
      rows:
        - {customer_id: 99, first_name: 'Newbie', last_name: 'User'}
    - input: ref('stg_orders')
      rows: []  # No orders
  expect:
    rows:
      - {customer_id: 99, first_name: 'Newbie', number_of_orders: 0, lifetime_value: 0}
```
#### Pattern 2: Multiple CTE Mocks
```yaml
- name: test_final_with_multiple_mocks
  model: customers::final
  config:
    enabled: false
    cte_test: true
  given:
    - input: ::orders  # Mock orders CTE
      rows:
        - {customer_id: 1, order_id: 100}
    - input: ::customer_agg  # Mock customer_agg CTE
      rows:
        - {customer_id: 1, order_count: 10, total_amount: 1000}
    - input: ref('stg_customers')
      rows:
        - {customer_id: 1, first_name: 'Alice', last_name: 'Smith'}
  expect:
    rows:
      - {customer_id: 1, first_name: 'Alice', number_of_orders: 10, lifetime_value: 1000}
```
#### Pattern 3: CSV Fixtures for Complex Data
```yaml
- name: test_customer_agg_with_csv
  model: customers::customer_agg
  config:
    enabled: false
    cte_test: true
  given:
    - input: ref('stg_orders')
      format: csv
      rows: |
        customer_id,order_id,amount,status
        1,100,50.00,completed
        1,101,75.50,completed
        2,102,100.00,cancelled
  expect:
    rows:
      - {customer_id: 1, order_count: 2, total_amount: 125.50}
      - {customer_id: 2, order_count: 1, total_amount: 100.00}  # CTE does not filter on status
```
#### Pattern 4: Testing NULL Handling
```yaml
- name: test_null_amounts
  model: customers::customer_agg
  config:
    cte_test: true
    enabled: false
  given:
    - input: ref('stg_orders')
      rows:
        - {customer_id: 1, order_id: 100, amount: null}
        - {customer_id: 1, order_id: 101, amount: 50}
  expect:
    rows:
      - {customer_id: 1, order_count: 2, total_amount: 50}
```
### Testing Workflow
**Development cycle**:
1. Write model with multiple CTEs
2. Write CTE tests targeting intermediate transformations
3. Run `test_models` - generator creates isolated tests automatically
4. Tests fail? Debug specific CTE (not entire model)
5. Fix CTE, re-run tests
6. Refactor with confidence - granular tests catch regressions
**Debugging with CTE tests**:
```
❌ test_customer_aggregation FAILED
Expected: {customer_id: 1, order_count: 2}
Actual: {customer_id: 1, order_count: 1}
👉 Isolated to customer_agg CTE - check aggregation logic
```
vs traditional approach:
```
❌ test_customers_model FAILED
Expected: {customer_id: 1, number_of_orders: 2}
Actual: {customer_id: 1, number_of_orders: 1}
👉 Could be ANY CTE in the chain - manual debugging required
```
---
## Design Decisions {#design-decisions}
### 1. Why Automatic Generation vs Manual?
**Considered**: Requiring developers to manually create isolated CTE models.
**Rejected**: Too much boilerplate. Defeats the purpose of unit testing CTEs.
**Chosen**: Automatic generation on every `test_models` run.
**Rationale**:
- Zero manual work - just write the test
- Generated files are ephemeral (cleaned up each run)
- Always in sync with source models
- No risk of stale generated code
### 2. Why `::` Syntax?
**Considered alternatives**:
- `cte:customer_agg` (config field)
- `model: {name: customers, cte: customer_agg}` (nested structure)
- `model: customers/customer_agg` (path-like)
**Chosen**: `customers::customer_agg` (double colon)
**Rationale**:
- Familiar from other languages (C++, Rust namespacing)
- Visually distinct from file paths
- Works with existing dbt YAML schema (just a string)
- Parsing is trivial: `model_name, cte_name = spec.split("::")`
### 3. Why `cte_test: true` and `enabled: false`?
**Purpose**: Mark tests for generation vs regular tests, and prevent double execution.
**Two required config flags**:
```yaml
config:
  cte_test: true   # Marks test for CTE generation
  enabled: false   # Prevents original test from running
```
**Why both are needed**:
- `cte_test: true` - Generator detects and processes this test
- `enabled: false` - Prevents dbt from running the original (it runs via generated version)
- Without `enabled: false` - Test would run twice (original + generated)
**Alternatives considered**:
- Separate `cte_tests:` section in YAML
- Naming convention (`test_cte_*`)
- File location (`cte_tests/` directory)
**Chosen**: Config flags in existing unit test structure.
**Rationale**:
- Keeps all tests in one place
- Standard dbt YAML schema
- Easy to toggle on/off
- Clear intent in test definition
- Works with existing dbt test framework
### 4. Why Clean Generated Files Each Run?
**Considered**: Persistent generated files (commit to git).
**Rejected**: Creates merge conflicts, stale code, confusion.
**Chosen**: Clean `__cte_tests/` on every generation.
**Rationale**:
- Generated files are artifacts, not source code
- Always fresh and in sync
- No merge conflicts
- Clear separation (underscore prefix)
- Fast regeneration (< 1 second for dozens of tests)
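The cleanup step can be as simple as removing every `__cte_tests/` directory before regeneration. A sketch (the directory name matches the layout shown earlier; the function name is illustrative):

```python
import shutil
from pathlib import Path

def clean_generated(project_dir: str) -> None:
    """Remove every __cte_tests/ directory under the project before
    regeneration, so generated artifacts can never go stale."""
    # Materialize the walk first so deletions don't disturb traversal
    for path in list(Path(project_dir).rglob("__cte_tests")):
        if path.is_dir():
            shutil.rmtree(path)
```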
### 5. Why Auto-Add Missing Fixtures?
**Problem**: Generated model may ref additional dependencies not in original test.
**Example**:
```yaml
# Original test only provides stg_orders
given:
  - input: ref('stg_orders')
    rows: [...]
```
But generated model also has:
```sql
with customers as (
    select * from {{ ref('stg_customers') }}  -- Missing fixture!
)
```
**Solution**: Auto-add empty fixtures for missing refs/sources.
**Rationale**:
- Makes tests work out of the box
- Developer can override with real data if needed
- Empty fixtures often sufficient for isolated CTE testing
- Reduces boilerplate
### 6. Why Hash-Based Naming?
**Problem**: Multiple tests can target same CTE.
**Example**:
```yaml
- name: test_customer_agg_empty_orders
  model: customers::customer_agg
  config:
    cte_test: true
    enabled: false
  given:
    - input: ref('stg_orders')
      rows: []  # Empty case
  expect:
    rows: []

- name: test_customer_agg_multiple_orders
  model: customers::customer_agg
  config:
    cte_test: true
    enabled: false
  given:
    - input: ref('stg_orders')
      rows:
        - {customer_id: 1, order_id: 100, amount: 50}
        - {customer_id: 1, order_id: 101, amount: 75}
  expect:
    rows:
      - {customer_id: 1, order_count: 2, total_amount: 125}
```
Both target `customer_agg` but need separate models.
**Solution**: Use hash of test name as suffix.
```
customers__customer_agg__21ff0f.sql # hash of test_customer_agg_empty_orders
customers__customer_agg__87f086.sql # hash of test_customer_agg_multiple_orders
```
**Rationale**:
- Guaranteed uniqueness
- Deterministic (same test → same hash → stable names)
- Short (6 characters sufficient)
- No name collisions
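The naming scheme can be sketched in a few lines. The choice of hash function is an implementation detail; `md5` is used here purely for illustration, and the helper name is ours:

```python
import hashlib

def generated_model_filename(model: str, cte: str, test_name: str) -> str:
    # Any stable hash works; truncating to 6 hex chars keeps names short
    suffix = hashlib.md5(test_name.encode("utf-8")).hexdigest()[:6]
    return f"{model}__{cte}__{suffix}.sql"
```

Because the suffix derives only from the test name, regeneration always produces the same filename for the same test, while distinct tests targeting the same CTE get distinct files.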
---
## Edge Cases and Robustness {#edge-cases}
### Commented-Out CTEs
**Scenario**: Multiple versions of a CTE exist (commented out during refactoring).
**Handling**:
```python
# Find ALL matches
matches = list(re.finditer(pattern, sql, re.IGNORECASE))

# Filter to first non-commented match
for m in matches:
    if not is_position_in_comment(sql, m.start()):
        match = m
        break
```
**Result**: Only the active (non-commented) CTE is used.
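A minimal sketch of what the `is_position_in_comment` check needs to cover (it deliberately ignores the rare case of comment markers inside string literals):

```python
def is_position_in_comment(sql: str, pos: int) -> bool:
    # Line comment: a -- earlier on the same line comments out pos
    line_start = sql.rfind("\n", 0, pos) + 1
    if "--" in sql[line_start:pos]:
        return True
    # Block comments (/* */) and Jinja comments ({# #}):
    # an opener before pos without a matching closer means pos is inside one
    for open_tok, close_tok in (("/*", "*/"), ("{#", "#}")):
        if sql.count(open_tok, 0, pos) > sql.count(close_tok, 0, pos):
            return True
    return False
```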
### Comments Containing Special Characters
**Scenario**: Comments with parens, quotes, or other SQL syntax.
```sql
/* This counts distinct customers (important!) */
-- Status: 'pending' or 'completed'
{# TODO: Fix this aggregation #}
```
**Handling**: Comment detection happens BEFORE paren counting, so these are safely ignored.
### Jinja Conditional Blocks
**Scenario**: CTE with same name in multiple Jinja branches.
```sql
{% if var('use_advanced_logic') %}
customer_agg as (
    select customer_id, sum(amount) as total
    from orders
    group by customer_id
)
{% else %}
customer_agg as (
    select customer_id, count(*) as order_count
    from orders
    group by customer_id
)
{% endif %}
```
**Current Behavior**:
- Generator works with **raw Jinja/SQL** (Jinja NOT compiled yet)
- Finds **first non-commented match** of `customer_agg as (...)`
- Extracts/mocks only that first occurrence
- Generated model **still contains both Jinja branches** with original conditional logic
- At runtime, whichever branch is active determines what actually executes
**The Problem with CTE Mocking**:
If you mock `::customer_agg`:
```yaml
given:
  - input: ::customer_agg
    rows: [{customer_id: 1, total: 500}]
```
**What happens**:
1. First branch gets replaced with mock: `customer_agg as (SELECT 500 ...)`
2. Second branch remains unchanged: `customer_agg as (SELECT count(*) ...)`
3. At runtime:
- If `var('use_advanced_logic') = true` → Mock applies ✅
- If `var('use_advanced_logic') = false` → Mock DOESN'T apply ❌ (uses unmocked second branch)
**Current Limitation**: No way to target specific branch or mock all branches simultaneously.
**Recommendation**: Avoid having the same CTE name in multiple Jinja branches when using CTE mocking. Use unique names per branch:
```sql
{% if var('use_advanced_logic') %}
customer_agg_advanced as (...)
{% else %}
customer_agg_simple as (...)
{% endif %}

final as (
    select * from
    {% if var('use_advanced_logic') %}
        customer_agg_advanced
    {% else %}
        customer_agg_simple
    {% endif %}
)
```
**Future Enhancement**: Could support explicit branch targeting syntax like `::customer_agg@if` or `::customer_agg[2]`, or automatically mock all instances of a CTE name across branches.
### Deeply Nested CTEs
**Scenario**: Testing a CTE that depends on 5+ upstream CTEs.
**Example**:
```yaml
- name: test_final_cte
  model: customers::final
  config:
    cte_test: true
    enabled: false
  given:
    - input: ::orders
      rows: [...]
    - input: ::customer_agg
      rows: [...]
    - input: ::recent_completed
      rows: [...]
```
**Handling**: Generator extracts entire upstream chain, applies all CTE mocks, generates complete model.
**Validated**: the test suite includes `test_enriched_final_cte`, which applies 4 CTE mocks and passes.
### Empty Fixtures
**For ref() with known schema** (auto-added by generator):
```yaml
given:
  - input: ref('stg_customers')
    rows: []  # Empty dict format works - generator knows schema from model
```
**For CTE mocks and source() with unknown schema**:
```yaml
# ❌ WRONG - dict format doesn't specify columns
given:
  - input: ::customer_agg
    rows: []

# ✅ CORRECT - CSV format defines columns even with no data rows
given:
  - input: ::customer_agg
    format: csv
    rows: |
      customer_id,order_count,total_amount
```
**Generated SQL (from CSV format)**:
```sql
SELECT NULL as customer_id, NULL as order_count, NULL as total_amount WHERE 1=0
```
**Why this matters**:
- **ref() auto-fixtures**: Generator knows model schema, can add empty fixtures automatically
- **CTE mocks (`::cte_name`)**: No schema available - must use CSV format to specify columns
- **source() calls**: dbt doesn't know external schema - must use CSV format to specify columns
**Critical rule**: When mocking CTEs or sources, always use CSV format with column headers, even for empty data. The dict format only works when dbt can infer the schema (like with `ref()` to known models).
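The zero-row SELECT shown above can be produced from nothing but the CSV header line. A sketch of that step; the function name is illustrative, not the generator's actual API:

```python
def empty_fixture_sql(csv_header: str) -> str:
    # One NULL column per CSV header; WHERE 1=0 guarantees zero rows
    cols = [c.strip() for c in csv_header.split(",")]
    select_list = ", ".join(f"NULL as {col}" for col in cols)
    return f"SELECT {select_list} WHERE 1=0"
```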
### String Escaping
**Scenario**: Fixture data contains single quotes.
```yaml
expect:
  rows:
    - {name: "O'Brien", company: "Pat's Bakery"}
```
**Generated SQL**:
```sql
SELECT 'O''Brien' as name, 'Pat''s Bakery' as company
```
**Handling**: Double single quotes (SQL standard escaping).
### Numeric vs String Detection
**Challenge**: CSV parser returns everything as strings.
**Solution**: Heuristic detection.
```python
if v.isdigit() or v.replace(".", "", 1).replace("-", "", 1).isdigit():
    # Treat as number
    exprs.append(f"{v} as {col}")
else:
    # Treat as string
    exprs.append(f"'{v}' as {col}")
```
**Handles**:
- `"123"` → `123` (integer)
- `"45.67"` → `45.67` (decimal)
- `"-10"` → `-10` (negative)
- `"not_a_number"` → `'not_a_number'` (string)
**Edge case**: Scientific notation like `"1e10"` treated as string (rare in CSV fixtures).
### CTE Name Conflicts
**Scenario**: Model has `customer_summary` and test targets `customer_sum` (substring match).
**Handling**: Use word boundary regex `\b{cte_name}\s+AS\s*\(`.
**Result**: `customer_sum` won't match `customer_summary`.
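The boundary behavior can be demonstrated directly. A sketch, with an illustrative function name wrapping the regex described above:

```python
import re

def find_cte_definition(sql: str, cte_name: str):
    # \b plus required whitespace before AS keeps customer_sum
    # from matching inside customer_summary
    pattern = rf"\b{re.escape(cte_name)}\s+as\s*\("
    return re.search(pattern, sql, re.IGNORECASE)
```

`re.escape` guards against CTE names that happen to contain regex metacharacters.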
---
## Benefits and Trade-offs {#benefits-tradeoffs}
### Benefits
**1. Granular Test Isolation**
- Test individual transformations without running entire model
- Clear failure messages ("customer_agg failed" vs "model failed")
- Easier debugging and faster iteration
**2. Refactoring Confidence**
- Change intermediate CTE with targeted tests
- Catch regressions immediately
- Refactor without fear
**3. Better Code Understanding**
- CTEs become self-documenting with tests
- New team members understand logic through examples
- Living documentation of business rules
**4. Test-Driven Development for SQL**
- Write CTE test first
- Implement CTE logic
- Iterate until test passes
- Classic TDD workflow in dbt
**5. Zero Maintenance Overhead**
- Generated files are ephemeral
- Always in sync with source models
- No manual updates needed
**6. Performance**
- Generated tests run in parallel (dbt native)
- Isolated CTEs often faster than full model
- Fast feedback loop
### Trade-offs
**1. Increased Test Count**
- More tests = longer test runs
- Mitigated by: Parallel execution, targeted selectors
**2. Generated Code Clutter**
- `__cte_tests/` directories in project
- Mitigated by: Clear naming, gitignore if desired, ephemeral (cleaned each run)
**3. Learning Curve**
- New syntax (`::` convention)
- New concept (CTE-level testing)
- Mitigated by: Clear documentation, examples, intuitive design
**4. Edge Cases**
- Jinja conditionals with same CTE name can be ambiguous
- Mitigated by: Comment-based disambiguation, clear error messages
**5. Generator Dependency**
- Requires experimental feature flag
- Adds complexity to test workflow
- Mitigated by: Graceful degradation (disabled tests work as regular tests), clear enablement process
### When to Use CTE Tests
**Good candidates**:
- Complex aggregations
- Multi-step transformations
- Business logic in CTEs
- Models with 3+ CTEs
- Frequently modified CTEs
**Skip CTE tests for**:
- Simple `select *` pass-through CTEs
- Single-CTE models
- Already well-tested models
- Rarely changed legacy code
---
## Proposal for dbt Standard {#proposal-for-dbt-standard}
### Why This Should Be in dbt Core
**1. Universal Problem**
Every dbt project with complex models faces this challenge. It's not specific to one company or use case.
**2. Standards-Based Approach**
Uses existing dbt unit test v2 format. Minimal new syntax (`::` convention). Natural extension of current testing paradigm.
**3. Low Barrier to Adoption**
- Opt-in via config flag
- Works alongside existing tests
- No breaking changes
- Incremental adoption (add CTE tests as needed)
**4. Proven Implementation**
- Comprehensive test coverage (119 tests passing)
- Handles edge cases (comments, Jinja, quotes, etc.)
- Production-ready code quality
### Proposed dbt Core Integration
**Phase 1: Experimental Feature** (Current State)
- Behind `EXPERIMENTAL_FEATURES` flag
- Community feedback and iteration
- Refinement of `::` syntax and conventions
**Phase 2: Core Integration**
```yaml
# dbt_project.yml
unit-tests:
  cte-tests:
    enabled: true                           # Enable CTE test generation
    target-directory: "models/.cte_tests"   # Configurable output location
    cleanup: true                           # Clean generated files on each run
```
**Phase 3: Enhanced Features**
- IDE support (syntax highlighting for `model::cte`)
- dbt Cloud integration
- Performance optimizations
- Expanded test selectors
### Standard Syntax Proposal
**1. Test Marker**
```yaml
config:
  cte_test: true   # Marks for generation
  enabled: false   # Prevents double execution
```
**2. Model Specification**
```yaml
model: model_name::cte_name
```
**3. CTE Mocking**
```yaml
given:
  - input: ::cte_name
```
**4. Generated Files**
```
models/.cte_tests/
  {model}__{cte}__{hash}.sql
unit_tests/.cte_tests/
  {model}__{cte}__{hash}_unit_tests.yml
```
### Backwards Compatibility
**Original test remains valid**:
```yaml
- name: test_customer_agg
  model: customers::customer_agg
  config:
    cte_test: true
    enabled: false
```
**Behavior**:
- With `EXPERIMENTAL_FEATURES=true`: Generates and runs CTE test automatically
- With `EXPERIMENTAL_FEATURES=false`: Test is disabled (skipped) - no generation
- No feature flag set: Same as false - test skipped
**No Breaking Changes**: Existing unit tests without `cte_test: true` work exactly as before.
### Community Value
**For Organizations**:
- Standardized approach to SQL testing
- Easier onboarding (standard patterns)
- Better code quality across teams
**For Individual Developers**:
- Faster development iterations
- Confidence in refactoring
- Clear documentation through tests
**For dbt Ecosystem**:
- Raises quality bar for dbt projects
- Encourages test-driven development
- Strengthens dbt's position as analytics engineering platform
---
## Implementation Reference
### Core Files
**Generator Module** (`src/dbt_core_mcp/cte_generator.py`):
- `generate_cte_tests(project_dir)` - Main entry point
- `generate_cte_model()` - Extract and truncate model
- `generate_cte_test()` - Create test YAML
- `replace_cte_with_mock()` - Apply CTE mocking
- `is_position_in_comment()` - Comment detection
- `rows_to_sql()` - Fixture to SQL conversion
- `parse_csv_fixture()` - CSV parsing
**Integration** (`src/dbt_core_mcp/tools/test_models.py`):
```python
if state.experimental_features and state.project_dir:
    try:
        cte_count = generate_cte_tests(state.project_dir)
        if cte_count > 0:
            logger.info(f"Generated {cte_count} CTE tests")
    except Exception as e:
        logger.warning(f"CTE test generation failed: {e}")
        # Continue with regular tests
```
**Server Config** (`src/dbt_core_mcp/server.py`):
```python
experimental_features = os.getenv("EXPERIMENTAL_FEATURES", "false").lower() == "true"
```
### Test Coverage
**21 unit tests** (`tests/test_cte_generator.py`):
- SQL generation (7 tests)
- CSV parsing (3 tests)
- Comment detection (7 tests)
- Integration (4 tests)
**Plus 11 integration tests** in `examples/jaffle_shop/`:
- Simple CTE test
- CTE mocking
- Multiple CTE mocks
- Empty fixtures (dict and CSV)
- CSV with data
- Partial ref override
- Different CTE depths
- Final CTE (last in chain)
### Performance Metrics
**Generator Speed**:
- 11 CTE tests generated in < 1 second
- Minimal impact on `test_models` runtime
**Test Execution**:
- 18 total tests (3 generic + 4 manual unit + 11 generated CTE)
- Runtime: ~1.7 seconds (parallel execution)
---
## Conclusion
CTE unit testing represents a significant advancement in dbt testing capabilities. By automatically generating isolated test models from simple YAML definitions, developers can:
1. Test complex SQL transformations granularly
2. Debug failures faster with clear isolation
3. Refactor confidently with comprehensive coverage
4. Follow TDD practices in analytics engineering
The implementation is production-ready, well-tested, and designed for inclusion in dbt Core as a standard feature. The `::` syntax convention is intuitive, the integration is seamless, and the benefits are universal.
**This is the future of SQL testing in dbt.**