NiclasOlofsson

DBT Core MCP Server

analyze_impact

Analyze downstream dependencies affected by changes to any dbt resource, with auto-detection and actionable recommendations for running impacted models, tests, and other resources.

Instructions

Analyze the impact of changing any dbt resource with auto-detection.

This unified tool works across all resource types (models, sources, seeds, snapshots, etc.) showing all downstream dependencies that would be affected by changes. Provides actionable recommendations for running affected resources.

Args:
    name: Resource name. For sources, use "source_name.table_name" or just "table_name".
        Examples: "stg_customers", "jaffle_shop.orders", "raw_customers"
    resource_type: Optional filter to narrow search:
        - "model": Data transformation models
        - "source": External data sources
        - "seed": CSV reference data files
        - "snapshot": SCD Type 2 historical tables
        - "test": Data quality tests
        - "analysis": Ad-hoc analysis queries
        - None: Auto-detect (searches all types)

Returns:
    Impact analysis with:
    - List of affected models by distance
    - Count of affected tests and other resources
    - Total impact statistics
    - Resources grouped by distance from changed resource
    - Recommended dbt command to run affected resources
    - Human-readable impact assessment message
    If multiple matches are found, returns all matches for the LLM to process.

Raises:
    ValueError: If resource not found

Examples:
    analyze_impact("stg_customers") -> auto-detect and show impact
    analyze_impact("jaffle_shop.orders", "source") -> impact of source change
    analyze_impact("raw_customers", "seed") -> impact of seed data change
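A caller has to handle two result shapes: the full impact analysis and the multiple-matches response. The following is a minimal sketch of that branching, not part of dbt-core-mcp. The helper name `summarize_impact` and both sample payloads are hypothetical; the key names are taken from the `ManifestLoader.analyze_impact()` docstring in the Implementation Reference.

```python
from typing import Any


def summarize_impact(result: dict[str, Any]) -> str:
    """Condense an analyze_impact result into one line of text.

    Hypothetical helper: it only reads keys that the tool's docstring documents.
    """
    # An ambiguous name yields a list of candidates instead of an analysis.
    if result.get("multiple_matches"):
        count = len(result["matches"])
        return f"Ambiguous name ({count} candidates); retry with resource_type set."

    impact = result["impact"]
    return (
        f"{result['message']} "
        f"({impact['total_affected']} resources in total) "
        f"Suggested command: {result['recommendation']}"
    )


# Made-up payloads shaped like the documented return values.
single = {
    "resource": {"name": "stg_customers", "resource_type": "model"},
    "impact": {
        "models_affected": [{"name": "customers", "distance": 1}],
        "models_affected_count": 1,
        "tests_affected_count": 2,
        "other_affected_count": 0,
        "total_affected": 3,
    },
    "affected_by_distance": {"1": [{"name": "customers", "distance": 1}]},
    "recommendation": "dbt run -s stg_customers+",
    "message": "Low impact: 1 downstream model(s) affected.",
}
ambiguous = {
    "multiple_matches": True,
    "matches": [{}, {}],  # entry contents are not documented, so left empty here
    "message": "Multiple resources matched",
}

print(summarize_impact(single))
print(summarize_impact(ambiguous))
```

Checking `multiple_matches` first matters because that response carries no `impact` or `recommendation` keys.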

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| name | Yes | | |
| resource_type | No | | |

Output Schema


No arguments

Implementation Reference

  • The _implementation function that executes the analyze_impact tool logic. It ensures state is initialized, then delegates to state.manifest.analyze_impact(name, resource_type).
    async def _implementation(ctx: Context | None, name: str, resource_type: str | None, state: DbtCoreServerContext, force_parse: bool = True) -> dict[str, Any]:
        """Implementation function for analyze_impact tool.
    
        Separated for testing purposes - tests call this directly with explicit state.
        The @tool() decorated analyze_impact() function calls this with injected dependencies.
        """
        # Initialize state if needed (metadata tool uses force_parse=True)
        await state.ensure_initialized(ctx, force_parse)
    
        # Delegate to manifest helper for downstream impact calculation
        try:
            return state.manifest.analyze_impact(name, resource_type)  # type: ignore
        except ValueError as e:
            raise ValueError(f"Impact analysis error: {e}") from e
  • The @dbtTool() decorated analyze_impact function with full schema: parameter definitions, return type documentation, and examples. This is the MCP tool entry point.
    @dbtTool()
    async def analyze_impact(
        ctx: Context,
        name: str,
        resource_type: str | None = None,
        state: DbtCoreServerContext = Depends(get_state),
    ) -> dict[str, Any]:
        """Analyze the impact of changing any dbt resource with auto-detection.
    
        This unified tool works across all resource types (models, sources, seeds, snapshots, etc.)
        showing all downstream dependencies that would be affected by changes. Provides actionable
        recommendations for running affected resources.
    
        Args:
            name: Resource name. For sources, use "source_name.table_name" or just "table_name"
                Examples: "stg_customers", "jaffle_shop.orders", "raw_customers"
            resource_type: Optional filter to narrow search:
                - "model": Data transformation models
                - "source": External data sources
                - "seed": CSV reference data files
                - "snapshot": SCD Type 2 historical tables
                - "test": Data quality tests
                - "analysis": Ad-hoc analysis queries
                - None: Auto-detect (searches all types)
    
        Returns:
            Impact analysis with:
            - List of affected models by distance
            - Count of affected tests and other resources
            - Total impact statistics
            - Resources grouped by distance from changed resource
            - Recommended dbt command to run affected resources
            - Human-readable impact assessment message
            If multiple matches found, returns all matches for LLM to process.
    
        Raises:
            ValueError: If resource not found
    
        Examples:
            analyze_impact("stg_customers") -> auto-detect and show impact
            analyze_impact("jaffle_shop.orders", "source") -> impact of source change
            analyze_impact("raw_customers", "seed") -> impact of seed data change
        """
        return await _implementation(ctx, name, resource_type, state)
  • The ManifestLoader.analyze_impact() method which performs the actual impact analysis: finds downstream dependents, categorizes by type (model/test/other), groups by distance, builds recommendations, and returns the structured result.
    def analyze_impact(
        self,
        name: str,
        resource_type: str | None = None,
    ) -> dict[str, Any]:
        """
        Analyze the impact of changing a resource across all resource types.
    
        Shows all downstream dependencies that would be affected by changes,
        including models, tests, and other resources. Provides actionable
        recommendations for running affected resources.
    
        Args:
            name: Resource name. For sources, use "source_name.table_name" or just "table_name"
            resource_type: Optional filter (model, source, seed, snapshot, test, analysis).
                          If None, auto-detects resource type.
    
        Returns:
            Dictionary with impact analysis:
            {
                "resource": {...},  # The target resource info
                "impact": {
                    "models_affected": [...],  # Downstream models by distance
                    "models_affected_count": int,
                    "tests_affected_count": int,
                    "other_affected_count": int,
                    "total_affected": int
                },
                "affected_by_distance": {
                    "1": [...],  # Immediate dependents
                    "2": [...],  # Second-level dependents
                    ...
                },
                "recommendation": str,  # Suggested dbt command
                "message": str  # Human-readable impact assessment
            }
    
            If multiple matches found, returns:
            {"multiple_matches": True, "matches": [...], "message": "..."}
    
        Raises:
            RuntimeError: If manifest not loaded
            ValueError: If resource not found
    
        Examples:
            analyze_impact("stg_customers") -> impact of changing staging model
            analyze_impact("jaffle_shop.orders", "source") -> impact of source change
            analyze_impact("raw_customers", "seed") -> impact of seed change
        """
        if not self._manifest:
            raise RuntimeError("Manifest not loaded. Call load() first.")
    
        # Get the resource (auto-detect if resource_type not specified)
        resource = self.get_resource_node(name, resource_type)
    
        # Handle multiple matches - return for LLM to process
        if resource.get("multiple_matches"):
            return resource
    
        # Extract unique_id for impact traversal
        unique_id = resource.get("unique_id")
        if not unique_id:
            raise ValueError(f"Resource '{name}' does not have a unique_id")
    
        # Get all downstream dependencies (no depth limit for impact)
        downstream = self.get_downstream_nodes(unique_id, max_depth=None)
    
        # Categorize by resource type
        models_affected: list[dict[str, Any]] = []
        tests_affected: list[dict[str, Any]] = []
        other_affected: list[dict[str, Any]] = []
        affected_by_distance: dict[str, list[dict[str, Any]]] = {}
    
        for dep in downstream:
            dep_type = str(dep["type"])
            distance = str(dep["distance"])
    
            # Group by distance
            if distance not in affected_by_distance:
                affected_by_distance[distance] = []
            affected_by_distance[distance].append(dep)
    
            # Categorize by type
            if dep_type == "model":
                models_affected.append(dep)
            elif dep_type == "test":
                tests_affected.append(dep)
            else:
                other_affected.append(dep)
    
        # Sort models by distance for better readability
        models_affected_sorted = sorted(models_affected, key=lambda x: (int(x["distance"]), str(x["name"])))
    
        # Build recommendation based on resource type
        resource_name = resource.get("name", name)
        current_resource_type = resource.get("resource_type")
    
        if current_resource_type == "source":
            # For sources, recommend running downstream models
            if len(models_affected) == 0:
                recommendation = f"dbt test -s source:{resource.get('source_name')}.{resource_name}"
            else:
                recommendation = f"dbt run -s {resource_name}+"
        elif current_resource_type == "seed":
            # For seeds, recommend seeding + downstream
            if len(models_affected) == 0:
                recommendation = f"dbt seed -s {resource_name} && dbt test -s {resource_name}"
            else:
                recommendation = f"dbt seed -s {resource_name} && dbt run -s {resource_name}+"
        else:
            # For models, snapshots, etc.
            if len(models_affected) == 0:
                recommendation = f"dbt run -s {resource_name}"
            else:
                recommendation = f"dbt run -s {resource_name}+"
    
        # Build result
        result: dict[str, Any] = {
            "resource": {
                "name": resource_name,
                "unique_id": unique_id,
                "resource_type": current_resource_type,
                "package_name": resource.get("package_name"),
            },
            "impact": {
                "models_affected": models_affected_sorted,
                "models_affected_count": len(models_affected),
                "tests_affected_count": len(tests_affected),
                "other_affected_count": len(other_affected),
                "total_affected": len(downstream),
            },
            "affected_by_distance": affected_by_distance,
            "recommendation": recommendation,
        }
    
        # Add helpful message based on impact size
        if len(models_affected) == 0:
            result["message"] = "No downstream models affected. Only this resource needs to be run/tested."
        elif len(models_affected) <= 3:
            result["message"] = f"Low impact: {len(models_affected)} downstream model(s) affected."
        elif len(models_affected) <= 10:
            result["message"] = f"Medium impact: {len(models_affected)} downstream models affected."
        else:
            result["message"] = f"High impact: {len(models_affected)} downstream models affected. Consider incremental changes."
    
        return result
  • Tools are dynamically registered in _register_tools() via discover_tools_in_package which scans all modules in dbt_core_mcp.tools for @dbtTool()-decorated functions.
    def _register_tools(self) -> None:
        """Dynamically register all dbt Core MCP tools."""
        from .tools import discover_tools_in_package, get_tool_metadata
    
        tool_functions = discover_tools_in_package("dbt_core_mcp.tools")
        for tool_func in tool_functions:
            metadata = get_tool_metadata(tool_func, default=None)
            if metadata:
                allowed_keys = {
                    "name",
                    "description",
                    "tags",
                    "enabled",
                    "icons",
                    "annotations",
                    "meta",
                }
                tool_kwargs = {key: value for key, value in metadata.items() if key in allowed_keys}
                self.app.tool(**tool_kwargs)(tool_func)
                logger.info("Registered tool metadata for %s: %s", tool_func.__name__, metadata)
            else:
                self.app.tool()(tool_func)
  • The dbtTool() decorator used to mark the analyze_impact function as a discoverable MCP tool.
    def dbtTool(**metadata: Any) -> Callable[[F], F]:
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Despite no annotations, the description fully discloses behavioral traits: it shows downstream dependencies, provides recommendations, returns all matches for LLM to process if multiple found, raises ValueError, and explains resource_type filtering.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (Args, Returns, Raises, Examples) and no wasted words. It is appropriately sized for the complexity of the tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the output schema exists and the description explains the return structure in detail, the description is fully complete for an impact analysis tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, so the description carries full burden. It provides detailed parameter semantics: name format (with source_name.table_name examples) and resource_type options with descriptions, adding significant meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool analyzes the impact of changing dbt resources with auto-detection, and it distinguishes from sibling tools like get_lineage or get_resource_info by focusing on downstream dependencies and actionable recommendations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use this tool (when changing dbt resources) and provides examples, but it does not explicitly mention when not to use it or compare to alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/NiclasOlofsson/dbt-core-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.