NiclasOlofsson

DBT Core MCP Server

load_seeds

Load CSV seed files into database tables to provide reference data for models and tests. Use state-based selection to only load changed seeds.

Instructions

Load seed data (CSV files) from seeds/ directory into database tables.

When to use: Run this before building models or tests that depend on reference data. Seeds must be loaded before models that reference them can execute.

What are seeds: CSV files containing static reference data (country codes, product categories, lookup tables, etc.). Unlike models (which are .sql files), seeds are CSV files that are loaded directly into database tables.
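For illustration, a hypothetical seed file such as seeds/country_codes.csv (the name is an example, not one shipped with the server) would simply be a plain CSV:

```
country_code,country_name
DK,Denmark
NO,Norway
SE,Sweden
```

Running load_seeds would materialize this file as a database table that models can reference.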

State-based selection modes (detects changed CSV files):

  • select_state_modified: Load only seeds modified since last successful run (state:modified)

  • select_state_modified_plus_downstream: Load modified seeds plus downstream dependencies (state:modified+). Note: requires select_state_modified=True

Manual selection (alternative to state-based):

  • select: dbt selector syntax (e.g., "raw_customers", "tag:lookup")

  • exclude: Exclude specific seeds

Important: Change detection for seeds works via file hash comparison:

  • Seeds < 1 MiB: Content hash is compared (recommended)

  • Seeds >= 1 MiB: Only file path changes are detected (content changes are ignored). For large seeds, use manual selection or run all seeds.
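The caveat above can be made concrete with a small sketch. This is an assumption about how size-thresholded change detection might work, not dbt's actual implementation; `seed_fingerprint` and `changed_seeds` are hypothetical names:

```python
import hashlib
from pathlib import Path

SIZE_LIMIT = 1 << 20  # 1 MiB -- the threshold described above (assumed)

def seed_fingerprint(path: Path) -> str:
    """Change-detection key for one seed CSV.

    Seeds under 1 MiB get a content hash, so edits are detected.
    Larger seeds fall back to the file path alone, so content-only
    changes are invisible -- matching the large-seed caveat above.
    """
    if path.stat().st_size < SIZE_LIMIT:
        return "content:" + hashlib.sha256(path.read_bytes()).hexdigest()
    return "path:" + str(path)

def changed_seeds(paths, previous_state):
    """Seeds whose fingerprint differs from a saved {path: fingerprint} map."""
    return [p for p in paths
            if previous_state.get(str(p)) != seed_fingerprint(p)]
```

Under this scheme, editing a row in a 2 MiB seed changes nothing in its fingerprint, which is why the description recommends manual selection for large seeds.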

Args:

  • select: Manual selector for seeds

  • exclude: Exclude selector

  • select_state_modified: Use state:modified selector (changed seeds only)

  • select_state_modified_plus_downstream: Extend to state:modified+ (changed + downstream)

  • full_refresh: Truncate and reload seed tables (default behavior)

  • show: Show preview of loaded data

  • state: Shared state object injected by FastMCP

Returns: Seed results with status and loaded seed info

See also:

  • run_models(): Execute .sql model files (not CSV seeds)

  • build_models(): Run both seeds and models together in DAG order

  • test_models(): Run tests (requires seeds to be loaded first if the tests reference them)

Examples:

# Before running tests that depend on reference data
load_seeds()
test_models(select="test_customer_country_code")

# After adding a new CSV lookup table
load_seeds(select="new_product_categories")

# Fix "relation does not exist" errors from models referencing seeds
load_seeds()  # Load missing seed tables first
run_models(select="stg_orders")

# Incremental workflow: only reload what changed
load_seeds(select_state_modified=True)

# Full refresh of a specific seed
load_seeds(select="country_codes", full_refresh=True)

Input Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| select | No | | |
| exclude | No | | |
| select_state_modified | No | | |
| select_state_modified_plus_downstream | No | | |
| full_refresh | No | | |
| show | No | | |
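Since the table above records only parameter names and required flags, here is a plausible reconstruction of the input schema. The types are assumptions inferred from the description (selectors as strings, flags as booleans), not taken from the server itself:

```json
{
  "type": "object",
  "properties": {
    "select": { "type": "string" },
    "exclude": { "type": "string" },
    "select_state_modified": { "type": "boolean" },
    "select_state_modified_plus_downstream": { "type": "boolean" },
    "full_refresh": { "type": "boolean" },
    "show": { "type": "boolean" }
  },
  "required": []
}
```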

Output Schema

No fields documented.

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It details state-based selection modes, file hash change detection, limitations for large seeds (>=1 MiB), and the effect of full_refresh. This fully discloses behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with sections, bullet points, and examples. Every sentence is informative without unnecessary verbosity. It earns its length by providing comprehensive guidance.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and 0% schema description coverage, the description covers all necessary context: what seeds are, when to use, selection mechanisms, large file caveats, examples, and return value. It is complete for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description explains every parameter in detail: select, exclude, select_state_modified, select_state_modified_plus_downstream, full_refresh, show, and the injected state object. It adds meaning beyond the schema's type/default information.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool loads seed data (CSV files) into database tables. It distinguishes from siblings like run_models, build_models, test_models by explicitly noting seeds are CSV files and providing a 'See also' section with alternatives.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Includes a 'When to use' section advising to run before building models or tests that depend on reference data. Also provides examples and mentions alternatives like run_models and build_models, giving clear context for when to use this tool vs. others.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
