Skip to main content
Glama
NiclasOlofsson

DBT Core MCP Server

Server Configuration

Describes the environment variables required to run the server.

NameRequiredDescriptionDefault

No arguments

Capabilities

Features and capabilities supported by this server

CapabilityDetails
tools
{
  "listChanged": true
}
prompts
{
  "listChanged": false
}
resources
{
  "subscribe": false,
  "listChanged": false
}
experimental
{}

Tools

Functions exposed to the LLM to take actions

NameDescription
analyze_impactA

Analyze the impact of changing any dbt resource with auto-detection.

This unified tool works across all resource types (models, sources, seeds, snapshots, etc.) showing all downstream dependencies that would be affected by changes. Provides actionable recommendations for running affected resources.

Args: name: Resource name. For sources, use "source_name.table_name" or just "table_name" Examples: "stg_customers", "jaffle_shop.orders", "raw_customers" resource_type: Optional filter to narrow search: - "model": Data transformation models - "source": External data sources - "seed": CSV reference data files - "snapshot": SCD Type 2 historical tables - "test": Data quality tests - "analysis": Ad-hoc analysis queries - None: Auto-detect (searches all types)

Returns: Impact analysis with: - List of affected models by distance - Count of affected tests and other resources - Total impact statistics - Resources grouped by distance from changed resource - Recommended dbt command to run affected resources - Human-readable impact assessment message If multiple matches found, returns all matches for LLM to process.

Raises: ValueError: If resource not found

Examples: analyze_impact("stg_customers") -> auto-detect and show impact analyze_impact("jaffle_shop.orders", "source") -> impact of source change analyze_impact("raw_customers", "seed") -> impact of seed data change

build_modelsA

Run dbt build (execute models and tests together in correct dependency order).

When to use: This is the recommended "do everything" command that runs seeds, models, snapshots, and tests in the correct order based on your DAG. It automatically handles dependencies, so you don't need to run load_seeds() → run_models() → test_models() separately.

How it works: Executes resources in dependency order:

  1. Seeds (if selected)

  2. Models (with their upstream dependencies)

  3. Tests (after their parent models complete)

  4. Snapshots (if selected)

State-based selection modes (uses dbt state:modified selector):

  • select_state_modified: Build only resources modified since last successful run (state:modified)

  • select_state_modified_plus_downstream: Build modified + downstream dependencies (state:modified+) Note: Requires select_state_modified=True

Manual selection (alternative to state-based):

  • select: dbt selector syntax (e.g., "customers", "tag:mart", "stg_*")

  • exclude: Exclude specific models

Args: select: Manual selector exclude: Exclude selector select_state_modified: Use state:modified selector (changed resources only) select_state_modified_plus_downstream: Extend to state:modified+ (changed + downstream) full_refresh: Force full refresh of incremental models resource_types: Filter by resource types (model, test, seed, snapshot) fail_fast: Stop execution on first failure state: Shared state object injected by FastMCP

Returns: Build results with status, models run/tested, and timing info

See also: - run_models(): Run only models (no tests) - test_models(): Run only tests - load_seeds(): Run only seeds

Examples: # Full project build (first-time setup or comprehensive run) build_models()

# Build only what changed (efficient incremental workflow)
build_models(select_state_modified=True)

# Build changed resources + everything downstream
build_models(select_state_modified=True, select_state_modified_plus_downstream=True)

# Build specific model and its dependencies + tests
build_models(select="customers")

# Build all marts (includes their seed dependencies automatically)
build_models(select="tag:mart")

# Quick feedback: stop on first test failure
build_models(fail_fast=True)
demo_uiA

Render the demo UI resource (resource://demo/hello).

get_column_lineageA

Trace column-level lineage through SQL transformations.

Uses sqlglot to parse compiled SQL and track how columns flow through:

  • CTEs and subqueries

  • JOINs and aggregations

  • Transformations (calculations, CASE statements, etc.)

  • Window functions

This provides detailed column-to-column dependencies that model-level lineage cannot capture.

Args: model_name: Name or unique_id of the dbt model to analyze column_name: Name of the column to trace direction: Direction to trace lineage: - "upstream": Which source columns feed into this column - "downstream": Which downstream columns use this column - "both": Full bidirectional column lineage depth: Maximum levels to traverse (None for unlimited) - depth=1: Immediate column dependencies only - depth=2: Dependencies + their dependencies - None: Full dependency tree

Returns: Column lineage information including: - Source columns this column depends on (upstream) - Downstream columns that depend on this column - Transformations and derivations - CTE transformation paths (via_ctes, transformations) - dbt resource mapping where available

Each dependency includes:
- column: Column name
- table: Source table name
- schema: Source schema (if available)
- database: Source database (if available)
- via_ctes: List of CTE names in transformation order
- transformations: Transformation details per CTE step
  - cte: CTE name
  - column: Column name at this step
  - expression: SQL expression (truncated to 200 chars)

Raises: ValueError: If model not found, column not found, or SQL parse fails RuntimeError: If sqlglot is not installed

Examples: # Find which source columns feed into revenue get_column_lineage("fct_sales", "revenue", "upstream")

# See what downstream models use customer_id
get_column_lineage("dim_customers", "customer_id", "downstream")

# Full bidirectional lineage for a column
get_column_lineage("fct_orders", "order_total", "both")

Note: Requires sqlglot package. Install with: pip install sqlglot The model must be compiled (run 'dbt compile' first).

get_lineageA

Get lineage (dependency tree) for any dbt resource with auto-detection.

This unified tool works across all resource types (models, sources, seeds, snapshots, etc.) showing upstream and/or downstream dependencies with configurable depth.

Args: name: Resource name. For sources, use "source_name.table_name" or just "table_name" Examples: "customers", "jaffle_shop.orders", "raw_customers" resource_type: Optional filter to narrow search: - "model": Data transformation models - "source": External data sources - "seed": CSV reference data files - "snapshot": SCD Type 2 historical tables - "test": Data quality tests - "analysis": Ad-hoc analysis queries - None: Auto-detect (searches all types) direction: Lineage direction: - "upstream": Show where data comes from (parents) - "downstream": Show what depends on this resource (children) - "both": Show full lineage (default) depth: Maximum levels to traverse (None for unlimited) - depth=1: Immediate dependencies only - depth=2: Dependencies + their dependencies - None: Full dependency tree

Returns: Lineage information with upstream/downstream nodes and statistics. If multiple matches found, returns all matches for LLM to process.

Raises: ValueError: If resource not found or invalid direction

Examples: get_lineage("customers") -> auto-detect and show full lineage get_lineage("customers", "model", "upstream") -> where customers model gets data get_lineage("jaffle_shop.orders", "source", "downstream", 2) -> 2 levels of dependents

get_project_infoB

Get information about the dbt project with optional diagnostics.

Args: ctx: MCP context (provided by FastMCP) run_debug: Run dbt debug to validate environment and test connection (default: True) state: Shared state object injected by FastMCP

Returns: Dictionary with project information and diagnostic results

get_resource_infoA

Get detailed information about any dbt resource (model, source, seed, snapshot, test, etc.).

This unified tool works across all resource types, auto-detecting the resource or filtering by type. Designed for LLM consumption - returns complete data even when multiple matches exist.

Args: name: Resource name. For sources, use "source_name.table_name" or just "table_name" resource_type: Optional filter to narrow search: - "model": Data transformation models - "source": External data sources - "seed": CSV reference data files - "snapshot": SCD Type 2 historical tables - "test": Data quality tests - "analysis": Ad-hoc analysis queries - None: Auto-detect (searches all types) include_database_schema: If True (default), query actual database table schema for models/seeds/snapshots/sources and add as 'database_columns' field include_compiled_sql: If True (default), include compiled SQL with Jinja resolved ({{ ref() }}, {{ source() }} → actual table names). Only applicable to models. Will trigger dbt compile if not already compiled. Set to False to skip compilation. state: Shared state object injected by FastMCP

Returns: Resource information dictionary. If multiple matches found, returns: {"multiple_matches": True, "matches": [...], "message": "..."}

Raises: ValueError: If resource not found

install_depsA

Install dbt packages defined in packages.yml.

This tool enables interactive workflow where an LLM can:

  1. Suggest using a dbt package (e.g., dbt_utils)

  2. Edit packages.yml to add the package

  3. Run install_deps() to install it

  4. Write code that uses the package's macros

This completes the recommendation workflow without breaking conversation flow.

When to use:

  • After adding/modifying packages.yml

  • Before using macros from external packages

  • When setting up a new dbt project

Package Discovery: After installation, use list_resources(resource_type="macro") to verify installed packages and discover available macros.

Returns: Installation results with status and installed packages

Example workflow: User: "Create a date dimension table" LLM: 1. Checks: list_resources(type="macro") -> no dbt_utils 2. Edits: packages.yml (adds dbt_utils package) 3. Runs: install_deps() (installs package) 4. Creates: models/date_dim.sql (uses dbt_utils.date_spine)

Note: This is an interactive development tool, not infrastructure automation. It enables the LLM to act on its own recommendations mid-conversation.

list_resourcesA

List all resources in the dbt project with optional filtering by type.

This unified tool provides a consistent view across all dbt resource types. Returns simplified resource information optimized for LLM consumption.

Args: resource_type: Optional filter to narrow results: - "model": Data transformation models - "source": External data sources - "seed": CSV reference data files - "snapshot": SCD Type 2 historical tables - "test": Data quality tests - "analysis": Ad-hoc analysis queries - "macro": Jinja macros (includes macros from installed packages) - None: Return all resources (default)

Returns: List of resource dictionaries with consistent structure across types. Each resource includes: name, unique_id, resource_type, description, tags, etc.

Package Discovery: Use resource_type="macro" to discover installed dbt packages. Macros follow the naming pattern: macro.{package_name}.{macro_name}

Example - Check if dbt_utils is installed:
    macros = list_resources("macro")
    has_dbt_utils = any(m["unique_id"].startswith("macro.dbt_utils.") for m in macros)

Example - List all installed packages:
    macros = list_resources("macro")
    packages = {m["unique_id"].split(".")[1] for m in macros
               if m["unique_id"].startswith("macro.") and
               m["unique_id"].split(".")[1] != "dbt"}

Examples: list_resources() -> all resources list_resources("model") -> only models list_resources("source") -> only sources list_resources("test") -> only tests list_resources("macro") -> all macros (discover installed packages)

load_seedsA

Load seed data (CSV files) from seeds/ directory into database tables.

When to use: Run this before building models or tests that depend on reference data. Seeds must be loaded before models that reference them can execute.

What are seeds: CSV files containing static reference data (country codes, product categories, lookup tables, etc.). Unlike models (which are .sql files), seeds are CSV files that are loaded directly into database tables.

State-based selection modes (detects changed CSV files):

  • select_state_modified: Load only seeds modified since last successful run (state:modified)

  • select_state_modified_plus_downstream: Load modified + downstream dependencies (state:modified+) Note: Requires select_state_modified=True

Manual selection (alternative to state-based):

  • select: dbt selector syntax (e.g., "raw_customers", "tag:lookup")

  • exclude: Exclude specific seeds

Important: Change detection for seeds works via file hash comparison:

  • Seeds < 1 MiB: Content hash is compared (recommended)

  • Seeds >= 1 MiB: Only file path changes are detected (content changes ignored) For large seeds, use manual selection or run all seeds.

Args: select: Manual selector for seeds exclude: Exclude selector select_state_modified: Use state:modified selector (changed seeds only) select_state_modified_plus_downstream: Extend to state:modified+ (changed + downstream) full_refresh: Truncate and reload seed tables (default behavior) show: Show preview of loaded data state: Shared state object injected by FastMCP

Returns: Seed results with status and loaded seed info

See also: - run_models(): Execute .sql model files (not CSV seeds) - build_models(): Runs both seeds and models together in DAG order - test_models(): Run tests (requires seeds to be loaded first if tests reference them)

Examples: # Before running tests that depend on reference data load_seeds() test_models(select="test_customer_country_code")

# After adding a new CSV lookup table
load_seeds(select="new_product_categories")

# Fix "relation does not exist" errors from models referencing seeds
load_seeds()  # Load missing seed tables first
run_models(select="stg_orders")

# Incremental workflow: only reload what changed
load_seeds(select_state_modified=True)

# Full refresh of a specific seed
load_seeds(select="country_codes", full_refresh=True)
query_databaseA

Execute a SQL query against the dbt project's database.

This tool compiles and runs SQL with Jinja templating support, allowing you to use {{ ref('model') }} and {{ source('src', 'table') }} in your queries.

SQL Templating:

  • Use {{ ref('model_name') }} to reference dbt models

  • Use {{ source('source_name', 'table_name') }} to reference source tables

  • dbt compiles these to actual table names before execution

CTE Querying (LLM quick reference)

  • Always pass cte_name + model_name in parameters (not in SQL)

  • Always write a normal SELECT ... FROM __cte__ ...

    • __cte__ or {{ cte }} is replaced with the CTE name

  • What happens under the hood

    • Extracts the target CTE plus upstream CTEs from the model

    • Runs your query against that extracted CTE

  • Templating

    • dbt resolves all {{ ref() }} / {{ source() }} automatically; no manual table names

  • Invalid syntax to avoid

    • {{ ref('model', cte='name') }} does not exist; always use cte_name + model_name

Output Management:

  • For large result sets (>100 rows), use output_file to save results

  • If output_file is omitted, all data returns inline (may consume large context)

  • output_file is automatically created with parent directories

  • Relative paths are resolved relative to the workspace root

Output Formats:

  • json (default): Returns data as JSON array of objects

  • csv: Returns comma-separated values with header row

  • tsv: Returns tab-separated values with header row

  • CSV/TSV formats use proper quoting (only when necessary) and are Excel-compatible

Args: sql: SQL query with Jinja templating: {{ ref('model') }}, {{ source('src', 'table') }} For exploratory queries, include LIMIT. For aggregations/counts, omit it. When using cte_name/model_name, provide a full SELECT/WITH query that selects from __cte__ (or {{ cte }}), which is replaced with the CTE name. output_file: Optional file path to save results. Recommended for large result sets (>100 rows). Relative paths are resolved relative to the workspace root. If provided, only metadata is returned (no preview for CSV/TSV). If omitted, all data is returned inline (may consume large context). output_format: Output format - "json" (default), "csv", or "tsv" cte_name: Optional CTE name to query from a model (requires model_name) model_name: Optional model name containing the CTE (required when cte_name is specified) state: Shared state object injected by FastMCP

Returns: JSON inline: {"status": "success", "row_count": N, "rows": [...], "elapsed_time": X.XX} JSON file: {"status": "success", "row_count": N, "saved_to": "path", "preview": [...], "elapsed_time": X.XX} CSV/TSV inline: {"status": "success", "row_count": N, "format": "csv", "csv": "...", "elapsed_time": X.XX} CSV/TSV file: {"status": "success", "row_count": N, "format": "csv", "saved_to": "path", "elapsed_time": X.XX}

Note: elapsed_time is in seconds and represents the total query execution time including compilation

Raises: RuntimeError: If query execution fails ValueError: If invalid CTE/model parameters provided

Examples: # Simple query with ref() query_database(sql="SELECT * FROM {{ ref('customers') }} LIMIT 10")

# Query with source()
query_database(sql="SELECT * FROM {{ source('jaffle_shop', 'orders') }} LIMIT 5")

# Aggregation (no LIMIT needed)
query_database(sql="SELECT COUNT(*) as total FROM {{ ref('customers') }}")

# Query a specific CTE from a model
query_database(
    cte_name="customer_agg",
    model_name="customers",
    sql="SELECT * FROM __cte__ LIMIT 10"
)

# Query a CTE with filtering
query_database(
    cte_name="customer_agg",
    model_name="customers",
    sql="SELECT * FROM __cte__ WHERE order_count > 5 LIMIT 20"
)

# Query a CTE with aggregation (full SELECT)
query_database(
    cte_name="customer_agg",
    model_name="customers",
    sql="SELECT customer_id, COUNT(*) AS cnt FROM __cte__ GROUP BY customer_id"
)

# WRONG - Do NOT use ref() with cte parameter (does not exist):
# query_database(sql="SELECT * FROM {{ ref('model', cte='cte_name') }}")
#
# CORRECT - Use cte_name and model_name parameters instead:
# query_database(cte_name="cte_name", model_name="model", sql="SELECT * FROM __cte__ LIMIT 10")

# Save large results to file
query_database(
    sql="SELECT * FROM {{ ref('orders') }}",
    output_file="temp_auto/orders_export.json"
)

# Export as CSV
query_database(
    sql="SELECT * FROM {{ ref('customers') }}",
    output_file="temp_auto/customers.csv",
    output_format="csv"
)
run_modelsA

Run dbt models (compile SQL and execute against database).

What are models: SQL files (.sql) containing SELECT statements that define data transformations. Models are compiled and executed to create/update tables and views in your database.

Important: This tool runs models only (SQL files). For CSV seed files, use load_seeds(). For running everything together (seeds + models + tests), use build_models().

State-based selection modes (uses dbt state:modified selector):

  • select_state_modified: Run only models modified since last successful run (state:modified)

  • select_state_modified_plus_downstream: Run modified + downstream dependencies (state:modified+) Note: Requires select_state_modified=True

Manual selection (alternative to state-based):

  • select: dbt selector syntax (e.g., "customers", "tag:mart", "stg_*")

  • exclude: Exclude specific models

Args: select: Manual selector (e.g., "customers", "tag:mart", "path:marts/*") exclude: Exclude selector (e.g., "tag:deprecated") select_state_modified: Use state:modified selector (changed models only) select_state_modified_plus_downstream: Extend to state:modified+ (changed + downstream) full_refresh: Force full refresh of incremental models fail_fast: Stop execution on first failure check_schema_changes: Detect schema changes and recommend downstream runs cache_selected_only: Only cache schemas for selected models (default True for performance) state: Shared state object injected by FastMCP

Returns: Execution results with status, models run, timing info, and optional schema_changes

See also: - seed_data(): Load CSV files (must run before models that reference them) - build_models(): Run models + tests together in DAG order - test_models(): Run tests after models complete

Examples: # Run a specific model run_models(select="customers")

# After loading seeds, run dependent models
seed_data()
run_models(select="stg_orders")

# Incremental: run only what changed
run_models(select_state_modified=True)

# Run changed models + everything downstream
run_models(select_state_modified=True, select_state_modified_plus_downstream=True)

# Full refresh marts (rebuild from scratch)
run_models(select="tag:mart", full_refresh=True)
snapshot_modelsA

Snapshot models (capture historical changes - SCD Type 2).

Snapshots capture historical changes in data, enabling you to track slowly changing dimensions over time. This is particularly useful for maintaining accurate historical records in data warehouses.

When to use: To track changes in slowly changing dimensions (SCD Type 2). For example, tracking customer address changes over time while preserving history.

How it works: dbt compares current source data with existing snapshot table, identifies changes, and inserts new rows with validity timestamps (dbt_valid_from, dbt_valid_to, dbt_updated_at). Original rows are closed by setting dbt_valid_to.

Args: select: dbt selector syntax (e.g., "snapshot_name", "tag:daily") exclude: Exclude specific snapshots state: Shared state object injected by FastMCP

Returns: Snapshot results with status and timing info

Examples: # Run all snapshots snapshot_models()

# Run specific snapshot
snapshot_models(select="customers_snapshot")

# Run tagged snapshots
snapshot_models(select="tag:daily")
test_modelsA

Run dbt tests on models and sources.

When to use: After running models to validate data quality. Tests check constraints like uniqueness, not-null, relationships, and custom data quality rules.

Important: Ensure seeds and models are built before running tests that depend on them.

State-based selection modes (uses dbt state:modified selector):

  • select_state_modified: Test only models modified since last successful run (state:modified)

  • select_state_modified_plus_downstream: Test modified + downstream dependencies (state:modified+) Note: Requires select_state_modified=True

Manual selection (alternative to state-based):

  • select: dbt selector syntax (e.g., "customers", "tag:mart", "test_type:generic")

  • exclude: Exclude specific tests

Args: select: Manual selector for tests/models to test exclude: Exclude selector select_state_modified: Use state:modified selector (changed models only) select_state_modified_plus_downstream: Extend to state:modified+ (changed + downstream) fail_fast: Stop execution on first failure keep_cte_tests: Keep generated CTE test files for debugging (default: False) state: Shared state object injected by FastMCP

Returns: Test results with status and failures

See also: - run_models(): Execute models before testing them - build_models(): Run models + tests together automatically - load_seeds(): Load seeds if tests reference seed data

Examples: # After building a model, test it run_models(select="customers") test_models(select="customers")

# Test only generic tests (not singular)
test_models(select="test_type:generic")

# Test everything that changed
test_models(select_state_modified=True)

# Stop on first failure for quick feedback
test_models(fail_fast=True)

# Keep CTE test files for debugging
test_models(keep_cte_tests=True)

Note: Unit test failures show diffs in the "daff" tabular format: @@ = column headers +++ = row in actual, not in expected (extra row) --- = row in expected, not in actual (missing row) → = row with modified cell(s), shown as old_value→new_value ... = omitted matching rows Full format spec: https://paulfitz.github.io/daff-doc/spec.html

Prompts

Interactive templates invoked by user choice

NameDescription

No prompts

Resources

Contextual data attached and managed by the client

NameDescription
Demo UI
Demo UI (Legacy)

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/NiclasOlofsson/dbt-core-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server