# Claude Code Integration Guide - Modern Data Stack with MCP
This document provides comprehensive guidance for Claude Code to understand and interact with the modern data stack through specialized MCP server integration.
## Architecture Overview
### Production-Ready Data Stack
This platform implements a **production-ready modern data stack** combining:
**Core Components**:
- **Meltano**: ELT orchestration platform for data pipelines
- **DuckDB**: High-performance in-process analytics database
- **dbt Core**: Data transformation framework with best practices
- **MCP Integration**: Claude Code server for intelligent development assistance
**Architecture Flow**:
```
CSV Data → Meltano (Extract/Load) → DuckDB → dbt Core (Transform) → Analytics Tables
                                       ↓
                      Claude Code MCP Server (Intelligent Development Assistance)
```
## MCP Server Capabilities
### dbt MCP Server Integration
**Primary Server**: `dbt-data-stack`
**Location**: `/Users/ajdoyle/claude-data-stack-mcp/mcp_servers/dbt_server.py`
**Available Tools**:
- `dbt_run`: Execute dbt models and transformations
- `dbt_test`: Run data quality tests and validations
- `dbt_compile`: Compile models without execution for validation
- `dbt_build`: Complete build including models and tests
- `dbt_list`: Project discovery and model metadata
- `dbt_show`: Execute arbitrary SQL queries against warehouse
**Configuration**:
```json
{
  "mcpServers": {
    "dbt-data-stack": {
      "command": "python",
      "args": ["/Users/ajdoyle/claude-data-stack-mcp/mcp_servers/dbt_server.py"],
      "env": {
        "DBT_PROJECT_DIR": "/Users/ajdoyle/claude-data-stack-mcp/transform",
        "DBT_PROFILES_DIR": "/Users/ajdoyle/claude-data-stack-mcp/transform/profiles/duckdb",
        "DBT_PROFILE": "data_stack",
        "DBT_TARGET": "dev",
        "DUCKDB_PATH": "/Users/ajdoyle/claude-data-stack-mcp/data/warehouse/data_stack.duckdb"
      }
    }
  }
}
```
## Data Platform Structure
### Key Directories
```
├── data/
│   ├── sample_data.csv              # Sample datasets for pipeline testing
│   └── warehouse/                   # DuckDB database files
│       └── data_stack.duckdb        # Main analytics warehouse
├── transform/                       # dbt project root
│   ├── models/                      # dbt models
│   │   ├── staging/                 # Raw data cleaning
│   │   │   ├── stg_employees.sql
│   │   │   └── sources.yml
│   │   └── marts/                   # Business logic
│   │       ├── dim_employees.sql
│   │       └── agg_department_stats.sql
│   ├── profiles/duckdb/             # Project-contained profiles
│   │   └── profiles.yml
│   └── dbt_project.yml              # dbt configuration
├── mcp_servers/                     # MCP server implementations
│   ├── dbt_server.py                # Claude Code dbt integration
│   └── claude_config.json           # MCP server configuration
└── meltano.yml                      # ELT pipeline configuration
```
## Usage Patterns
### Data Pipeline Operations
```bash
# Complete data pipeline execution
meltano run tap-csv target-duckdb
cd transform && DBT_PROFILES_DIR=./profiles/duckdb dbt run --profile data_stack
```
### MCP-Enabled Development Workflows
**Through Claude Code**:
- **Model Development**: "Create a new dbt model for customer analytics"
- **Data Quality**: "Add data quality tests for the employee dimension"
- **Query Optimization**: "Analyze and optimize this SQL query performance"
- **Pipeline Debugging**: "Troubleshoot this dbt test failure"
### Direct MCP Tool Usage
```python
# Execute dbt models
await mcp_dbt.dbt_run(selector="marts", full_refresh=False)
# Run data quality tests
await mcp_dbt.dbt_test(selector="staging")
# Execute ad-hoc SQL queries
await mcp_dbt.dbt_show(
    sql_query="SELECT department, COUNT(*) FROM {{ ref('stg_employees') }} GROUP BY department",
    limit=100
)
```
## Data Models & Quality
### Staging Layer
- **`stg_employees`**: Clean, typed employee data from raw CSV source
### Analytics Layer
- **`dim_employees`**: Employee dimension with calculated salary tiers
- **`agg_department_stats`**: Department-level metrics and aggregations
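The salary-tier bucketing in `dim_employees` is the kind of logic a SQL CASE expression would express; a Python rendering for illustration, where the tier names and thresholds are assumptions, not taken from the actual model SQL:

```python
# Illustrative only: mirrors a hypothetical CASE expression that buckets
# salaries into tiers for the dim_employees model. Thresholds are invented.

def salary_tier(salary: float) -> str:
    """Bucket a salary into a coarse tier."""
    if salary >= 150_000:
        return "executive"
    if salary >= 100_000:
        return "senior"
    if salary >= 60_000:
        return "mid"
    return "junior"
```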
### Data Quality Framework
- **Unique constraints**: Employee IDs must be unique across all records
- **Not null constraints**: Critical fields cannot contain null values
- **Referential integrity**: Foreign key relationships maintained
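In dbt these constraints are typically declared in a schema YAML file alongside the models; a sketch assuming column names such as `employee_id` and `department` (the project's actual schema file is not shown above):

```yaml
version: 2

models:
  - name: dim_employees
    columns:
      - name: employee_id
        tests:
          - unique
          - not_null
  - name: agg_department_stats
    columns:
      - name: department
        tests:
          - not_null
          - relationships:
              to: ref('dim_employees')
              field: department
```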
## Quick Start for Claude Code
### 1. MCP Server Activation
The dbt MCP server is configured in `mcp_servers/claude_config.json`; Claude Code detects and connects to it automatically.
### 2. Common Operations
```
# Analyze data model relationships
"Show me the lineage for the dim_employees model"
# Execute transformations
"Run the marts layer models and validate data quality"
# Query the warehouse
"Show me department statistics from the analytics tables"
# Debug pipeline issues
"Why is the agg_department_stats model failing tests?"
```
### 3. Development Workflow
1. **Extract & Load**: Data flows from CSV through Meltano to DuckDB
2. **Transform**: dbt models create staging views and analytics tables
3. **Validate**: Data quality tests ensure integrity
4. **Query**: Ad-hoc analysis through Claude Code MCP integration
## Configuration & Security
### Environment Variables
All sensitive configuration is handled through environment variables:
```env
DBT_PROJECT_DIR=/Users/ajdoyle/claude-data-stack-mcp/transform
DBT_PROFILES_DIR=/Users/ajdoyle/claude-data-stack-mcp/transform/profiles/duckdb
DUCKDB_PATH=/Users/ajdoyle/claude-data-stack-mcp/data/warehouse/data_stack.duckdb
```
### MCP Server Security
- **Local execution**: All MCP servers run locally for security
- **File system isolation**: Database files isolated in `data/warehouse/`
- **Profile isolation**: dbt profiles contained within project structure
## Documentation & Support
### Available Documentation
- [Quick Start Guide](docs/QUICK_START.md) - 5-minute setup
- [API Reference](docs/API_REFERENCE.md) - Complete tool documentation
- [Usage Examples](docs/USAGE_EXAMPLES.md) - Practical workflows
- [Integration Guide](docs/MCP_INTEGRATION.md) - Detailed configuration
### Tool Versions
- **Meltano 3.8.0** - DataOps platform for ELT pipelines
- **DuckDB 1.3.2** - High-performance analytics database
- **dbt Core 1.10.4** - Data transformation framework
- **dbt-duckdb 1.9.4** - DuckDB adapter for dbt
## Status: Production Ready + MCP Enhanced
This modern data stack has been systematically implemented and validated through comprehensive testing. All components use the latest compatible versions and follow industry best practices. Enhanced with Claude Code MCP integration for intelligent development assistance and automated workflows.
**Key Achievements**:
- Complete ELT pipeline implementation
- Data quality framework with automated testing
- MCP server integration for Claude Code assistance
- Project-contained configuration for reproducibility
- Production-ready with comprehensive documentation