Which integrations are available for this server?

Allows investigation of dbt test failures by reading the dbt manifest and run_results, enabling root cause analysis through lineage tracing and data profiling.

How do I use dbt-investigator?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@dbt-investigator Why did not_null_fct_transactions_merchant fail?"

That's it! The server will respond to your query, and you can continue using it as needed.

dbt-investigator

by ARAVINDHRAJA123

Data Quality Agent

An agentic AI system that automatically investigates dbt test failures, traces the root cause through BigQuery lineage, and generates a plain-English incident report — cutting investigation time from hours to minutes.

📄 Sample incident report

Architecture

[Figure: Data Quality Agent architecture diagram]

What it does

When a dbt test fails, you normally get a cryptic error message. This agent:

1. Fetches the failing rows from BigQuery to see the actual bad data
2. Reads the dbt manifest to understand the full lineage graph
3. Traces upstream, profiling columns in parent models and source tables
4. Identifies the root cause by finding where the bad data entered the pipeline
5. Writes an incident report: plain-English root cause, lineage trace, recommended fix, severity

not_null_fct_transactions_merchant failed (23 rows)
        │
        ▼
Agent fetches failing rows → reads fct lineage → traces to int_ → traces to stg_ → checks raw source
        │
        ▼
Root cause: 23 rows in raw.bank_transactions have NULL narration.
Merchant extraction returns NULL when narration is NULL.
Fix: Add COALESCE(narration, '') in stg_bank__transactions.
Severity: HIGH
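
The upstream trace in the example above can be illustrated with a short sketch. It assumes only that manifest.json carries a `parent_map` key mapping each node's unique_id to its direct parents, which dbt manifests do. The function name `trace_upstream`, the `dbt_bank` project prefix, and the intermediate model name `int_transactions` are hypothetical, and this is not the agent's actual implementation.

```python
from collections import deque


def trace_upstream(manifest: dict, unique_id: str) -> list[str]:
    """Return every ancestor of unique_id, nearest first (breadth-first)."""
    parent_map = manifest.get("parent_map", {})
    seen = {unique_id}
    order = []
    queue = deque([unique_id])
    while queue:
        node = queue.popleft()
        for parent in parent_map.get(node, []):
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


# Hypothetical miniature manifest; a real one would be loaded with
# json.load(open(os.environ["DBT_MANIFEST_PATH"])).
manifest = {
    "parent_map": {
        "model.dbt_bank.fct_transactions": ["model.dbt_bank.int_transactions"],
        "model.dbt_bank.int_transactions": ["model.dbt_bank.stg_bank__transactions"],
        "model.dbt_bank.stg_bank__transactions": ["source.dbt_bank.raw.bank_transactions"],
        "source.dbt_bank.raw.bank_transactions": [],
    }
}

lineage = trace_upstream(manifest, "model.dbt_bank.fct_transactions")
for node in lineage:
    print(node)
```

Breadth-first order visits the nearest parents first, which matches the fct to int_ to stg_ to raw source order shown in the example.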

Three trigger modes

1 — CLI

python agent.py \
  --test not_null_fct_transactions_merchant \
  --model fct_transactions \
  --column merchant \
  --verbose

2 — Webhook (Airflow or any HTTP caller)

python server.py   # starts on port 5051

curl -X POST http://localhost:5051/investigate \
  -H "Content-Type: application/json" \
  -d '{"test_name": "not_null_fct_transactions_merchant", "model": "fct_transactions", "column": "merchant"}'

Point your Airflow DAG's on_failure_callback at this endpoint.
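
As a rough sketch, a callback for this could look like the following. It uses only the Python standard library and the endpoint and JSON fields from the curl example above. The helper names `build_request` and `notify_investigator` are hypothetical, and reading the test details from the DAG's `params` is an assumption; how a real DAG exposes the failing test name may differ.

```python
import json
import urllib.request

INVESTIGATE_URL = "http://localhost:5051/investigate"


def build_request(test_name, model, column, url=INVESTIGATE_URL):
    """Build the POST request with the same JSON body as the curl example."""
    payload = {"test_name": test_name, "model": model, "column": column}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def notify_investigator(context):
    """Candidate Airflow on_failure_callback.

    Assumption: the DAG stores test_name, model and column in its params.
    """
    params = context.get("params", {})
    req = build_request(params["test_name"], params["model"], params["column"])
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.status


# Building the request does not touch the network, so it can be inspected offline.
req = build_request("not_null_fct_transactions_merchant", "fct_transactions", "merchant")
print(req.get_method(), req.full_url)
```

In a DAG, `notify_investigator` would be passed as `on_failure_callback` on the task or in `default_args`.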

3 — MCP (any AI client)

The MCP server exposes three tools to any MCP-compatible client — Claude Code, OpenClaw (ChatGPT / Gemini / any client), Cursor, Zed:

| Tool | What it does |
| --- | --- |
| `investigate_failure` | Full agentic investigation → incident report |
| `list_failures` | List failing tests from run_results.json |
| `get_report` | Read a saved incident report |
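
To illustrate what listing failures from run_results.json involves, here is a minimal sketch. It relies on the standard layout of that dbt artifact (a top-level `results` list whose entries carry `unique_id`, `status`, `failures`, and `message`). The function name `list_failing_tests` and the sample data are hypothetical; the server's own `list_failures` tool may filter or format differently.

```python
def list_failing_tests(run_results: dict) -> list[dict]:
    """Return the test results whose status is 'fail' or 'error'."""
    failing = []
    for result in run_results.get("results", []):
        unique_id = result.get("unique_id", "")
        if unique_id.startswith("test.") and result.get("status") in {"fail", "error"}:
            failing.append(
                {
                    "unique_id": unique_id,
                    "status": result["status"],
                    "failures": result.get("failures"),
                    "message": result.get("message"),
                }
            )
    return failing


# Hypothetical sample; a real file would be loaded with
# json.load(open(os.environ["DBT_RUN_RESULTS_PATH"])).
sample = {
    "results": [
        {"unique_id": "model.dbt_bank.fct_transactions", "status": "success"},
        {"unique_id": "test.dbt_bank.unique_fct_transactions_id", "status": "pass", "failures": 0},
        {
            "unique_id": "test.dbt_bank.not_null_fct_transactions_merchant",
            "status": "fail",
            "failures": 23,
            "message": "Got 23 results, configured to fail if != 0",
        },
    ]
}

failing = list_failing_tests(sample)
for item in failing:
    print(item["unique_id"], item["failures"])
```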

Claude Code:

claude mcp add -s user \
  -e GCP_PROJECT=your-project \
  -e BQ_LOCATION=asia-south1 \
  -e DBT_MANIFEST_PATH=/path/to/dbt_bank/target/manifest.json \
  -e DBT_RUN_RESULTS_PATH=/path/to/dbt_bank/target/run_results.json \
  -e GEMINI_API_KEY=your-key \
  dbt-investigator \
  -- /path/to/venv/bin/python /path/to/mcp_server.py

OpenClaw (ChatGPT, Gemini, or any other client):

openclaw mcp set dbt-investigator '{
  "command": "/path/to/venv/bin/python",
  "args": ["/path/to/mcp_server.py"],
  "cwd": "/path/to/data-quality-agent",
  "env": {
    "GCP_PROJECT": "your-project",
    "GEMINI_API_KEY": "your-key",
    "DBT_MANIFEST_PATH": "/path/to/manifest.json",
    "DBT_RUN_RESULTS_PATH": "/path/to/run_results.json"
  }
}'
openclaw mcp probe   # → dbt-investigator: 3 tools ✔

Agent tools

| Tool | What it does |
| --- | --- |
| `get_failing_rows` | Queries BigQuery for the actual bad rows |
| `get_model_lineage` | Reads manifest.json for upstream and downstream models |
| `get_model_sql` | Gets the compiled SQL for any model |
| `get_column_profile` | Profiles a column: null count, distinct count, min, max |
| `run_query` | Runs a custom read-only BigQuery query |
| `get_source_freshness` | Checks the staleness of source tables |
| `write_report` | Writes the final incident report |
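
The column profile listed for `get_column_profile` (null count, distinct count, min, max) maps naturally onto a single aggregate query. The sketch below only builds that SQL string; it assumes BigQuery standard SQL, where `COUNTIF` is available. The function name `build_profile_sql` and the identifier checks are illustrative assumptions, not the tool's actual code.

```python
import re

# Conservative identifier patterns: dataset.table or project.dataset.table.
_COLUMN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
_TABLE = re.compile(r"[A-Za-z0-9_\-]+(\.[A-Za-z0-9_]+){1,2}")


def build_profile_sql(table: str, column: str) -> str:
    """Build a read-only profiling query for one column of one table."""
    if not _TABLE.fullmatch(table):
        raise ValueError(f"unexpected table name: {table!r}")
    if not _COLUMN.fullmatch(column):
        raise ValueError(f"unexpected column name: {column!r}")
    col = f"`{column}`"
    return (
        "SELECT\n"
        "  COUNT(*) AS row_count,\n"
        f"  COUNTIF({col} IS NULL) AS null_count,\n"
        f"  COUNT(DISTINCT {col}) AS distinct_count,\n"
        f"  MIN({col}) AS min_value,\n"
        f"  MAX({col}) AS max_value\n"
        f"FROM `{table}`"
    )


sql = build_profile_sql("your-project.raw.bank_transactions", "narration")
print(sql)
```

Validating identifiers before interpolating them keeps the generated statement a plain SELECT even when the table or column name comes from an LLM tool call.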

Safety wall: all BigQuery queries are read-only (SELECT/WITH only). DML and DDL statements are rejected before execution.
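
One way such a guard could work is sketched below. This is an assumption about the general technique, not the project's actual check: it accepts a statement only if it starts with SELECT or WITH, contains no second statement, and contains no DML/DDL keyword anywhere. It deliberately checks the raw text, so a query with a leading comment or a semicolon inside a string literal is rejected too. The name `is_read_only` is hypothetical.

```python
import re

_FORBIDDEN = re.compile(
    r"\b(INSERT|UPDATE|DELETE|MERGE|DROP|CREATE|ALTER|TRUNCATE|GRANT|REVOKE|CALL)\b",
    re.IGNORECASE,
)


def is_read_only(sql: str) -> bool:
    """Conservatively decide whether sql is a single SELECT/WITH statement."""
    body = sql.strip()
    if body.endswith(";"):
        body = body[:-1].rstrip()
    # Any remaining semicolon could start a second statement.
    if not body or ";" in body:
        return False
    first_word = re.match(r"[A-Za-z]+", body)
    if first_word is None or first_word.group(0).upper() not in {"SELECT", "WITH"}:
        return False
    # Whole-word match, so a column such as updated_at is still allowed.
    return _FORBIDDEN.search(body) is None


for candidate in (
    "SELECT * FROM raw.bank_transactions WHERE narration IS NULL",
    "DELETE FROM raw.bank_transactions",
    "SELECT 1; DROP TABLE raw.bank_transactions",
):
    print(is_read_only(candidate), "-", candidate)
```

The check errs on the side of false rejections: a legitimate query that mentions a forbidden keyword in a string literal is refused, which is usually the safer trade-off for an agent-issued query.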

Setup

git clone https://github.com/ARAVINDHRAJA123/data-quality-agent.git
cd data-quality-agent

python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# Auth
gcloud auth application-default login

# Set environment
export GCP_PROJECT=your-project
export BQ_LOCATION=asia-south1
export DBT_MANIFEST_PATH=/path/to/dbt_bank/target/manifest.json
export DBT_RUN_RESULTS_PATH=/path/to/dbt_bank/target/run_results.json

# LLM (pick one)
export GEMINI_API_KEY=your-key      # free
export ANTHROPIC_API_KEY=your-key  # paid

Generate the manifest first (from your dbt project):

cd /path/to/dbt_project && dbt compile
# manifest.json is now at target/manifest.json
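
Before starting the agent, it can help to confirm that the environment from the setup steps is complete. The following pre-flight sketch uses the variable names listed above; the function name `check_environment` is hypothetical, and treating `BQ_LOCATION` as optional is an assumption based on its absence from the OpenClaw example.

```python
import os

REQUIRED_VARS = ("GCP_PROJECT", "DBT_MANIFEST_PATH", "DBT_RUN_RESULTS_PATH")
PATH_VARS = ("DBT_MANIFEST_PATH", "DBT_RUN_RESULTS_PATH")
LLM_KEY_VARS = ("GEMINI_API_KEY", "ANTHROPIC_API_KEY")


def check_environment(env) -> list[str]:
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")
    for name in PATH_VARS:
        path = env.get(name)
        if path and not os.path.isfile(path):
            problems.append(f"{name} points to a missing file: {path}")
    if not any(env.get(name) for name in LLM_KEY_VARS):
        problems.append("set GEMINI_API_KEY or ANTHROPIC_API_KEY")
    return problems


if __name__ == "__main__":
    for problem in check_environment(os.environ):
        print("config problem:", problem)
```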

Stack

Claude / Gemini — LLM provider (auto-detected, free Gemini supported)
BigQuery — data warehouse (GCP)
dbt manifest.json — lineage graph and compiled SQL
FastMCP — MCP server (any AI client)
Flask — webhook server (Airflow integration)
pytest — test suite

Project structure

data-quality-agent/
├── agent.py          ← agentic investigation loop (Claude + Gemini)
├── server.py         ← Flask webhook server
├── mcp_server.py     ← FastMCP server (any MCP client)
├── report.py         ← incident report formatter
├── tools/
│   ├── bq_tools.py   ← BigQuery: failing rows, queries, freshness
│   └── dbt_tools.py  ← manifest: lineage, SQL, test results
├── tests/
│   └── test_tools.py ← 11 unit tests (no BQ/LLM needed)
├── reports/          ← saved incident reports (markdown)
└── requirements.txt
