
dbt_ls

List available models, tests, sources, and other resources within a dbt project to understand project structure, identify dependencies, and select resources for operations.

Instructions

List dbt resources. An AI agent should use this tool when it needs to discover available models, tests, sources, and other resources within a dbt project. This helps the agent understand the project structure, identify dependencies, and select specific resources for other operations like running or testing.

    Returns:
        When output_format is 'json' (default):
          - With verbose=False (default): returns a simplified JSON with only name, resource_type, and depends_on.nodes
          - With verbose=True: returns a full JSON with all resource details
        When output_format is 'name', 'path', or 'selector', returns plain text with the respective format.
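As a rough illustration of how an agent might consume the default (simplified) JSON, the sketch below inverts `depends_on.nodes` to find which resources depend on a given node. The sample payload, the node IDs, and the `downstream_of` helper are hypothetical and not part of the tool; only the three keys (`name`, `resource_type`, `depends_on.nodes`) come from the description above.

```python
import json
from collections import defaultdict
from typing import Dict, List

# Hypothetical sample of the simplified output (verbose=False); real node IDs will differ.
sample_output = json.dumps([
    {"name": "stg_orders", "resource_type": "model",
     "depends_on": {"nodes": ["source.shop.raw.orders"]}},
    {"name": "fct_orders", "resource_type": "model",
     "depends_on": {"nodes": ["model.shop.stg_orders"]}},
])


def downstream_of(ls_json: str) -> Dict[str, List[str]]:
    """Map each upstream node ID to the names of resources that depend on it."""
    downstream = defaultdict(list)
    for resource in json.loads(ls_json):
        for upstream in resource.get("depends_on", {}).get("nodes", []):
            downstream[upstream].append(resource["name"])
    return dict(downstream)


graph = downstream_of(sample_output)
print(graph["model.shop.stg_orders"])  # ['fct_orders']
```

Inverting the edges this way lets a caller answer "what breaks if this node changes?" without requesting the full verbose payload.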
    

Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| models | No | Specific models to list, using the dbt selection syntax. Note that you probably want to specify your selection here, e.g. silver.fact | |
| selector | No | Named selector to use | |
| exclude | No | Models to exclude | |
| resource_type | No | Type of resource to list (model, test, source, etc.) | |
| project_dir | No | ABSOLUTE PATH to the directory containing the dbt project (e.g. '/Users/username/projects/dbt_project' not '.'). | |
| profiles_dir | No | Directory containing the profiles.yml file (defaults to project_dir if not specified) | |
| output_format | No | Output format (json, name, path, or selector) | json |
| verbose | No | Return full JSON output instead of simplified version | |
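
To make the parameter-to-flag mapping concrete, here is a minimal sketch that mirrors how the handler listed under Implementation Reference assembles the `dbt ls` arguments. `build_ls_command` is a hypothetical standalone helper written for illustration; the flags themselves (`-s`, `--selector`, `--exclude`, `--resource-type`, `--output`, `--quiet`) are the ones the handler passes. `project_dir` and `profiles_dir` are omitted because the handler hands them to `execute_dbt_command` separately rather than adding them to the argument list, and `verbose` only affects formatting after the command runs.

```python
from typing import List, Optional


def build_ls_command(
    models: Optional[str] = None,
    selector: Optional[str] = None,
    exclude: Optional[str] = None,
    resource_type: Optional[str] = None,
    output_format: str = "json",
) -> List[str]:
    """Assemble the argument list for `dbt ls` (hypothetical helper)."""
    command = ["ls"]
    if models:
        command += ["-s", models]
    if selector:
        command += ["--selector", selector]
    if exclude:
        command += ["--exclude", exclude]
    if resource_type:
        command += ["--resource-type", resource_type]
    # The output format and --quiet are always appended, as in the handler.
    command += ["--output", output_format, "--quiet"]
    return command


print(build_ls_command(models="silver.fact", resource_type="model"))
# ['ls', '-s', 'silver.fact', '--resource-type', 'model', '--output', 'json', '--quiet']
```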

Implementation Reference

  • The core handler function for the 'dbt_ls' tool, decorated with @mcp.tool(). It constructs the 'dbt ls' command based on input parameters, executes it via execute_dbt_command, and formats the output using ls_formatter via process_command_result.
    @mcp.tool()
    async def dbt_ls(
        models: Optional[str] = Field(
            default=None,
            description="Specific models to list, using the dbt selection syntax. Note that you probably want to specify your selection here e.g. silver.fact"
        ),
        selector: Optional[str] = Field(
            default=None,
            description="Named selector to use"
        ),
        exclude: Optional[str] = Field(
            default=None,
            description="Models to exclude"
        ),
        resource_type: Optional[str] = Field(
            default=None,
            description="Type of resource to list (model, test, source, etc.)"
        ),
        project_dir: str = Field(
            default=".",
            description="ABSOLUTE PATH to the directory containing the dbt project (e.g. '/Users/username/projects/dbt_project' not '.')"
        ),
        profiles_dir: Optional[str] = Field(
            default=None,
            description="Directory containing the profiles.yml file (defaults to project_dir if not specified)"
        ),
        output_format: str = Field(
            default="json",
            description="Output format (json, name, path, or selector)"
        ),
        verbose: bool = Field(
            default=False,
            description="Return full JSON output instead of simplified version"
        )
    ) -> str:
        """List dbt resources. An AI agent should use this tool when it needs to discover available models, tests, sources, and other resources within a dbt project. This helps the agent understand the project structure, identify dependencies, and select specific resources for other operations like running or testing.
    
        Returns:
            When output_format is 'json' (default):
              - With verbose=False (default): returns a simplified JSON with only name, resource_type, and depends_on.nodes
              - With verbose=True: returns a full JSON with all resource details
            When output_format is 'name', 'path', or 'selector', returns plain text with the respective format.
        """
        # Log diagnostic information
        logger.info(f"Starting dbt_ls with project_dir={project_dir}, output_format={output_format}")
    
        command = ["ls"]
    
        if models:
            command.extend(["-s", models])
    
        if selector:
            command.extend(["--selector", selector])
    
        if exclude:
            command.extend(["--exclude", exclude])
    
        if resource_type:
            command.extend(["--resource-type", resource_type])
    
        command.extend(["--output", output_format])
    
        command.extend(["--quiet"])
    
        logger.info(f"Executing dbt command: dbt {' '.join(command)}")
        result = await execute_dbt_command(command, project_dir, profiles_dir)
        logger.info(f"dbt command result: success={result['success']}, returncode={result.get('returncode')}")
    
        # Use the centralized result processor with ls_formatter
        formatter = partial(ls_formatter, output_format=output_format, verbose=verbose)
    
        return await process_command_result(
            result,
            command_name="ls",
            output_formatter=formatter,
            include_debug_info=True  # Include extra debug info for this command
        )
  • src/server.py:88-89 (registration)
    The call to register_tools(mcp) which registers the dbt_ls tool (along with others) with the FastMCP server instance.
    # Register tools
    register_tools(mcp)
  • The ls_formatter function used specifically by dbt_ls to format the output from 'dbt ls', handling JSON parsing, filtering, sorting, and simplification based on verbose flag.
    def ls_formatter(output: Any, output_format: str = "json", verbose: bool = False) -> str:
        """
        Formatter for dbt ls command output.
    
        Args:
            output: The command output
            output_format: The output format (json, name, path, or selector)
            verbose: Whether to return full JSON output (True) or simplified version (False)
    
        Returns:
            Formatted output string
        """
        # For name, path, or selector formats, return the raw output as string
        if output_format != "json":
            logger.info(f"Returning raw output as string for format: {output_format}")
            return str(output)
    
        # For json format, parse the output and return as JSON
        logger.info("Parsing dbt ls output as JSON")
    
        # Return an empty JSON array if the output is an empty string or None
        if not output:
            logger.warning("dbt ls returned empty output")
            return "[]"
    
        # Parse the output
        parsed = parse_dbt_list_output(output)
    
        # Keep only dict entries whose resource_type is model, seed, test, source, or snapshot
        filtered_parsed = [item for item in parsed if isinstance(item, dict) and
                          item.get("resource_type") in ["model", "seed", "test", "source", "snapshot"]]
    
        # Sort the results by resource_type and name for better readability
        filtered_parsed.sort(key=lambda x: (x.get("resource_type", ""), x.get("name", "")))
    
        # Return full parsed output if filtering removed everything
        if not filtered_parsed and parsed:
            logger.warning("Filtering removed all items, returning original parsed output")
            json_output = json.dumps(parsed, indent=2)
            logger.info(f"Final JSON output length: {len(json_output)}")
            return json_output
    
        # If not verbose, simplify the output to only include name, resource_type, and depends_on.nodes
        if not verbose and filtered_parsed:
            logger.info("Simplifying output (verbose=False)")
            simplified = []
            for item in filtered_parsed:
                simplified.append({
                    "name": item.get("name"),
                    "resource_type": item.get("resource_type"),
                    "depends_on": {
                        "nodes": item.get("depends_on", {}).get("nodes", [])
                    }
                })
            filtered_parsed = simplified
    
        json_output = json.dumps(filtered_parsed, indent=2)
        logger.info(f"Final JSON output length: {len(json_output)}")
        return json_output
  • Helper function parse_dbt_list_output used by ls_formatter to parse various formats of dbt ls output into a standardized list of resource dictionaries.
    def parse_dbt_list_output(output: Union[str, Dict, List]) -> List[Dict[str, Any]]:
        """
        Parse the output from dbt list command.
        
        Args:
            output: Output from dbt list command (string or parsed JSON)
            
        Returns:
            List of resources
        """
        logger.debug(f"Parsing dbt list output with type: {type(output)}")
        
        # If already parsed as JSON dictionary with nodes
        if isinstance(output, dict) and "nodes" in output:
            return [
                {"name": name, **details}
                for name, details in output["nodes"].items()
            ]
        
        # Handle dbt Cloud CLI output format - an array of objects with name property containing embedded JSON
        if isinstance(output, list) and all(isinstance(item, dict) and "name" in item for item in output):
            logger.debug(f"Found dbt Cloud CLI output format with {len(output)} items")
            extracted_models = []
            
            for item in output:
                name_value = item["name"]
                
                # Skip log messages that don't contain model data
                if any(log_msg in name_value for log_msg in [
                    "Sending project", "Created invocation", "Waiting for",
                    "Streaming", "Running dbt", "Invocation has finished"
                ]):
                    continue
                
                # Check if the name value is a JSON string
                if name_value.startswith('{') and '"name":' in name_value and '"resource_type":' in name_value:
                    try:
                        # Parse the JSON string directly
                        model_data = json.loads(name_value)
                        if isinstance(model_data, dict) and "name" in model_data and "resource_type" in model_data:
                            extracted_models.append(model_data)
                            continue
                    except json.JSONDecodeError:
                        logger.debug(f"Failed to parse JSON from: {name_value[:30]}...")
                
                # Extract model data from timestamped JSON lines (e.g., "00:59:06 {json}")
                timestamp_prefix_match = re.match(r'^(\d\d:\d\d:\d\d)\s+(.+)$', name_value)
                if timestamp_prefix_match:
                    json_string = timestamp_prefix_match.group(2)
                    try:
                        model_data = json.loads(json_string)
                        if isinstance(model_data, dict):
                            # Only add entries that have both name and resource_type
                            if "name" in model_data and "resource_type" in model_data:
                                extracted_models.append(model_data)
                    except json.JSONDecodeError:
                        # Not valid JSON, skip this line
                        logger.debug(f"Failed to parse JSON from: {json_string[:30]}...")
                        continue
            
            # If we found model data, return it
            if extracted_models:
                logger.debug(f"Successfully extracted {len(extracted_models)} models from dbt Cloud CLI output")
                return extracted_models
            
            # If no model data found, return empty list
            logger.warning("No valid model data found in dbt Cloud CLI output")
            return []
        
        # If already parsed as a regular JSON list (any list not handled above), return it as is
        if isinstance(output, list):
            return output
        
        # If string, try to parse as JSON
        if isinstance(output, str):
            try:
                parsed = json.loads(output)
                if isinstance(parsed, dict) and "nodes" in parsed:
                    return [
                        {"name": name, **details}
                        for name, details in parsed["nodes"].items()
                    ]
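                elif isinstance(parsed, list):
                    return parsed
                elif isinstance(parsed, dict) and "name" in parsed and "resource_type" in parsed:
                    # A single resource emitted as one JSON object; without this branch
                    # it would fall through to the empty-list fallback below
                    return [parsed]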
            except json.JSONDecodeError:
                # Not JSON, parse text format (simplified)
                models = []
                for line in output.splitlines():
                    line = line.strip()
                    if not line:
                        continue
                        
                    # Check if the line is a JSON string
                    if line.startswith('{') and '"name":' in line and '"resource_type":' in line:
                        try:
                            model_data = json.loads(line)
                            if isinstance(model_data, dict) and "name" in model_data and "resource_type" in model_data:
                                models.append(model_data)
                                continue
                        except json.JSONDecodeError:
                            pass
                    
                    # Check for dbt Cloud CLI format with timestamps (e.g., "00:59:06 {json}")
                    timestamp_match = re.match(r'^(\d\d:\d\d:\d\d)\s+(.+)$', line)
                    if timestamp_match:
                        json_part = timestamp_match.group(2)
                        try:
                            model_data = json.loads(json_part)
                            if isinstance(model_data, dict) and "name" in model_data and "resource_type" in model_data:
                                models.append(model_data)
                                continue
                        except json.JSONDecodeError:
                            pass
                    
                    # Fall back to simple name-only format
                    models.append({"name": line})
                return models
        
        # Fallback: return empty list
        logger.warning("Could not parse dbt list output in any recognized format")
        return []
  • Centralized result processor called by dbt_ls handler to format the command result using the provided ls_formatter.
    async def process_command_result(
        result: Dict[str, Any],
        command_name: str,
        output_formatter: Optional[Callable] = None,
        include_debug_info: bool = False
    ) -> str:
        """
        Process the result of a dbt command execution.
        
        Args:
            result: The result dictionary from execute_dbt_command
            command_name: The name of the dbt command (e.g. "run", "test")
            output_formatter: Optional function to format successful output
            include_debug_info: Whether to include additional debug info in error messages
            
        Returns:
            Formatted output or error message
        """
        logger.info(f"Processing command result for {command_name}")
        logger.info(f"Result success: {result['success']}, returncode: {result.get('returncode')}")
        
        # Log the output type and a sample
        if "output" in result:
            if isinstance(result["output"], str):
                logger.info(f"Output type: str, first 100 chars: {result['output'][:100]}")
            elif isinstance(result["output"], (dict, list)):
                logger.info(f"Output type: {type(result['output'])}, sample: {json.dumps(result['output'])[:100]}")
            else:
                logger.info(f"Output type: {type(result['output'])}")
        
        # For errors, simply return the raw command output if available
        if not result["success"]:
            logger.warning(f"Command {command_name} failed with returncode {result.get('returncode')}")
            
            # If we have command output, return it directly
            if "output" in result and result["output"]:
                logger.info(f"Returning error output: {str(result['output'])[:100]}...")
                return str(result["output"])
            
            # If no command output, return the error message
            if result.get("error"):
                logger.info(f"Returning error message: {str(result['error'])[:100]}...")
                return str(result["error"])
                
            # If neither output nor error is available, return a generic message
            logger.info("No output or error available, returning generic message")
            return f"Command failed with exit code {result.get('returncode', 'unknown')}"
        
        # Format successful output
        if output_formatter:
            logger.info(f"Using custom formatter for {command_name}")
            formatted_result = output_formatter(result["output"])
            logger.info(f"Formatted result type: {type(formatted_result)}, first 100 chars: {str(formatted_result)[:100]}")
            return formatted_result
        
        # Default output formatting
        logger.info(f"Using default formatting for {command_name}")
        if isinstance(result["output"], (dict, list)):
            json_result = json.dumps(result["output"])
            logger.info(f"JSON result length: {len(json_result)}, first 100 chars: {json_result[:100]}")
            return json_result
        else:
            str_result = str(result["output"])
            logger.info(f"String result length: {len(str_result)}, first 100 chars: {str_result[:100]}")
            return str_result
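
The line-by-line fallback in `parse_dbt_list_output` can be seen in isolation in the sketch below: each non-empty line is tried as JSON (after stripping an optional `HH:MM:SS` prefix), and anything that is not a resource object is kept as a bare name. `parse_lines` is a simplified, hypothetical stand-in rather than the project's function, and the sample text is made up for illustration.

```python
import json
import re
from typing import Any, Dict, List

# Matches lines such as '00:59:06 {"name": ...}' and captures the part after the timestamp.
TIMESTAMP_PREFIX = re.compile(r"^\d\d:\d\d:\d\d\s+(.+)$")


def parse_lines(text: str) -> List[Dict[str, Any]]:
    """Parse line-oriented `dbt ls` output into a list of resource dicts."""
    resources: List[Dict[str, Any]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = TIMESTAMP_PREFIX.match(line)
        candidate = match.group(1) if match else line
        try:
            data = json.loads(candidate)
        except json.JSONDecodeError:
            data = None
        if isinstance(data, dict) and "name" in data and "resource_type" in data:
            resources.append(data)
        else:
            # Fall back to treating the whole line as a resource name.
            resources.append({"name": line})
    return resources


sample = "\n".join([
    '{"name": "orders", "resource_type": "model"}',
    '00:59:06 {"name": "customers", "resource_type": "seed"}',
    "my_model",
])
parsed = parse_lines(sample)
print([item["name"] for item in parsed])  # ['orders', 'customers', 'my_model']
```

Entries produced by the name-only fallback carry no `resource_type`, so a downstream filter on resource type (as in `ls_formatter`) would drop them.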

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/MammothGrowth/dbt-cli-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.