dbt_ls
List available models, tests, sources, and other resources within a dbt project to understand project structure, identify dependencies, and select resources for operations.
Instructions
List dbt resources. An AI agent should use this tool when it needs to discover available models, tests, sources, and other resources within a dbt project. This helps the agent understand the project structure, identify dependencies, and select specific resources for other operations like running or testing.
Returns:
When output_format is 'json' (default):
- With verbose=False (default): returns a simplified JSON with only name, resource_type, and depends_on.nodes
- With verbose=True: returns a full JSON with all resource details
When output_format is 'name', 'path', or 'selector', returns plain text with the respective format.
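
To make the difference between the two JSON shapes concrete, the sketch below applies the same reduction that ls_formatter performs when verbose=False. The sample record and its field values are hypothetical, and `simplify` is an illustrative stand-in rather than a function in the project; only the three retained keys come from the description above.

```python
import json

# Hypothetical record, shaped like one entry of the full (verbose) output.
# The field values are invented for illustration.
full_record = {
    "name": "fact_orders",
    "resource_type": "model",
    "package_name": "analytics",
    "original_file_path": "models/silver/fact_orders.sql",
    "depends_on": {"macros": [], "nodes": ["model.analytics.stg_orders"]},
}


def simplify(record: dict) -> dict:
    """Keep only name, resource_type, and depends_on.nodes."""
    return {
        "name": record.get("name"),
        "resource_type": record.get("resource_type"),
        "depends_on": {"nodes": record.get("depends_on", {}).get("nodes", [])},
    }


simplified = simplify(full_record)
print(json.dumps(simplified, indent=2))
```

Records without a depends_on entry come back with an empty nodes list, so callers can iterate over the dependencies without checking for missing keys.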
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| models | No | Specific models to list, using the dbt selection syntax. Note that you probably want to specify your selection here e.g. silver.fact | |
| selector | No | Named selector to use | |
| exclude | No | Models to exclude | |
| resource_type | No | Type of resource to list (model, test, source, etc.) | |
| project_dir | No | ABSOLUTE PATH to the directory containing the dbt project (e.g. '/Users/username/projects/dbt_project' not '.') | . |
| profiles_dir | No | Directory containing the profiles.yml file (defaults to project_dir if not specified) | |
| output_format | No | Output format (json, name, path, or selector) | json |
| verbose | No | Return full JSON output instead of simplified version | False |
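
As a rough guide to how these inputs reach the CLI, the sketch below mirrors the argument-building steps of the dbt_ls handler listed under Implementation Reference. `build_ls_command` is a hypothetical standalone helper written for illustration; it is not part of the server.

```python
from typing import List, Optional


def build_ls_command(
    models: Optional[str] = None,
    selector: Optional[str] = None,
    exclude: Optional[str] = None,
    resource_type: Optional[str] = None,
    output_format: str = "json",
) -> List[str]:
    """Hypothetical helper: map the tool inputs to `dbt ls` arguments."""
    command = ["ls"]
    if models:
        command.extend(["-s", models])
    if selector:
        command.extend(["--selector", selector])
    if exclude:
        command.extend(["--exclude", exclude])
    if resource_type:
        command.extend(["--resource-type", resource_type])
    # The handler always sets the output format and suppresses log noise.
    command.extend(["--output", output_format])
    command.append("--quiet")
    return command


example = build_ls_command(models="silver.fact", resource_type="model")
print("dbt " + " ".join(example))
# prints: dbt ls -s silver.fact --resource-type model --output json --quiet
```

project_dir and profiles_dir do not appear in the argument list because the handler passes them to execute_dbt_command separately, and verbose only affects how the JSON result is formatted afterwards.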
Implementation Reference
- src/tools.py:134-210 (handler) The core handler function for the 'dbt_ls' tool, decorated with @mcp.tool(). It constructs the 'dbt ls' command based on input parameters, executes it via execute_dbt_command, and formats the output using ls_formatter via process_command_result.

  ```python
  @mcp.tool()
  async def dbt_ls(
      models: Optional[str] = Field(
          default=None,
          description="Specific models to list, using the dbt selection syntax. Note that you probably want to specify your selection here e.g. silver.fact"
      ),
      selector: Optional[str] = Field(
          default=None,
          description="Named selector to use"
      ),
      exclude: Optional[str] = Field(
          default=None,
          description="Models to exclude"
      ),
      resource_type: Optional[str] = Field(
          default=None,
          description="Type of resource to list (model, test, source, etc.)"
      ),
      project_dir: str = Field(
          default=".",
          description="ABSOLUTE PATH to the directory containing the dbt project (e.g. '/Users/username/projects/dbt_project' not '.')"
      ),
      profiles_dir: Optional[str] = Field(
          default=None,
          description="Directory containing the profiles.yml file (defaults to project_dir if not specified)"
      ),
      output_format: str = Field(
          default="json",
          description="Output format (json, name, path, or selector)"
      ),
      verbose: bool = Field(
          default=False,
          description="Return full JSON output instead of simplified version"
      )
  ) -> str:
      """List dbt resources. An AI agent should use this tool when it needs to discover available models, tests, sources, and other resources within a dbt project. This helps the agent understand the project structure, identify dependencies, and select specific resources for other operations like running or testing.

      Returns:
          When output_format is 'json' (default):
          - With verbose=False (default): returns a simplified JSON with only name, resource_type, and depends_on.nodes
          - With verbose=True: returns a full JSON with all resource details
          When output_format is 'name', 'path', or 'selector', returns plain text with the respective format.
      """
      # Log diagnostic information
      logger.info(f"Starting dbt_ls with project_dir={project_dir}, output_format={output_format}")

      command = ["ls"]

      if models:
          command.extend(["-s", models])

      if selector:
          command.extend(["--selector", selector])

      if exclude:
          command.extend(["--exclude", exclude])

      if resource_type:
          command.extend(["--resource-type", resource_type])

      command.extend(["--output", output_format])

      command.extend(["--quiet"])

      logger.info(f"Executing dbt command: dbt {' '.join(command)}")
      result = await execute_dbt_command(command, project_dir, profiles_dir)
      logger.info(f"dbt command result: success={result['success']}, returncode={result.get('returncode')}")

      # Use the centralized result processor with ls_formatter
      formatter = partial(ls_formatter, output_format=output_format, verbose=verbose)

      return await process_command_result(
          result,
          command_name="ls",
          output_formatter=formatter,
          include_debug_info=True  # Include extra debug info for this command
      )
  ```
- src/server.py:88-89 (registration) The call to register_tools(mcp), which registers the dbt_ls tool (along with others) with the FastMCP server instance.

  ```python
  # Register tools
  register_tools(mcp)
  ```
- src/formatters.py:32-90 (helper) The ls_formatter function used specifically by dbt_ls to format the output from 'dbt ls', handling JSON parsing, filtering, sorting, and simplification based on the verbose flag.

  ```python
  def ls_formatter(output: Any, output_format: str = "json", verbose: bool = False) -> str:
      """
      Formatter for dbt ls command output.

      Args:
          output: The command output
          output_format: The output format (json, name, path, or selector)
          verbose: Whether to return full JSON output (True) or simplified version (False)

      Returns:
          Formatted output string
      """
      # For name, path, or selector formats, return the raw output as string
      if output_format != "json":
          logger.info(f"Returning raw output as string for format: {output_format}")
          return str(output)

      # For json format, parse the output and return as JSON
      logger.info("Parsing dbt ls output as JSON")

      # Return raw output if it's an empty string or None
      if not output:
          logger.warning("dbt ls returned empty output")
          return "[]"

      # Parse the output
      parsed = parse_dbt_list_output(output)

      # Filter out any empty or non-model entries
      filtered_parsed = [
          item for item in parsed
          if isinstance(item, dict)
          and item.get("resource_type") in ["model", "seed", "test", "source", "snapshot"]
      ]

      # Sort the results by resource_type and name for better readability
      filtered_parsed.sort(key=lambda x: (x.get("resource_type", ""), x.get("name", "")))

      # Return full parsed output if filtering removed everything
      if not filtered_parsed and parsed:
          logger.warning("Filtering removed all items, returning original parsed output")
          json_output = json.dumps(parsed, indent=2)
          logger.info(f"Final JSON output length: {len(json_output)}")
          return json_output

      # If not verbose, simplify the output to only include name, resource_type, and depends_on.nodes
      if not verbose and filtered_parsed:
          logger.info("Simplifying output (verbose=False)")
          simplified = []
          for item in filtered_parsed:
              simplified.append({
                  "name": item.get("name"),
                  "resource_type": item.get("resource_type"),
                  "depends_on": {
                      "nodes": item.get("depends_on", {}).get("nodes", [])
                  }
              })
          filtered_parsed = simplified

      json_output = json.dumps(filtered_parsed, indent=2)
      logger.info(f"Final JSON output length: {len(json_output)}")
      return json_output
  ```
- src/command.py:191-316 (helper) Helper function parse_dbt_list_output, used by ls_formatter to parse various formats of dbt ls output into a standardized list of resource dictionaries.

  ```python
  def parse_dbt_list_output(output: Union[str, Dict, List]) -> List[Dict[str, Any]]:
      """
      Parse the output from dbt list command.

      Args:
          output: Output from dbt list command (string or parsed JSON)

      Returns:
          List of resources
      """
      logger.debug(f"Parsing dbt list output with type: {type(output)}")

      # If already parsed as JSON dictionary with nodes
      if isinstance(output, dict) and "nodes" in output:
          return [
              {"name": name, **details}
              for name, details in output["nodes"].items()
          ]

      # Handle dbt Cloud CLI output format - an array of objects with name property containing embedded JSON
      if isinstance(output, list) and all(isinstance(item, dict) and "name" in item for item in output):
          logger.debug(f"Found dbt Cloud CLI output format with {len(output)} items")
          extracted_models = []

          for item in output:
              name_value = item["name"]

              # Skip log messages that don't contain model data
              if any(log_msg in name_value for log_msg in [
                  "Sending project", "Created invocation", "Waiting for",
                  "Streaming", "Running dbt", "Invocation has finished"
              ]):
                  continue

              # Check if the name value is a JSON string
              if name_value.startswith('{') and '"name":' in name_value and '"resource_type":' in name_value:
                  try:
                      # Parse the JSON string directly
                      model_data = json.loads(name_value)
                      if isinstance(model_data, dict) and "name" in model_data and "resource_type" in model_data:
                          extracted_models.append(model_data)
                          continue
                  except json.JSONDecodeError:
                      logger.debug(f"Failed to parse JSON from: {name_value[:30]}...")

              # Extract model data from timestamped JSON lines (e.g., "00:59:06 {json}")
              timestamp_prefix_match = re.match(r'^(\d\d:\d\d:\d\d)\s+(.+)$', name_value)
              if timestamp_prefix_match:
                  json_string = timestamp_prefix_match.group(2)
                  try:
                      model_data = json.loads(json_string)
                      if isinstance(model_data, dict):
                          # Only add entries that have both name and resource_type
                          if "name" in model_data and "resource_type" in model_data:
                              extracted_models.append(model_data)
                  except json.JSONDecodeError:
                      # Not valid JSON, skip this line
                      logger.debug(f"Failed to parse JSON from: {json_string[:30]}...")
                      continue

          # If we found model data, return it
          if extracted_models:
              logger.debug(f"Successfully extracted {len(extracted_models)} models from dbt Cloud CLI output")
              return extracted_models

          # If no model data found, return empty list
          logger.warning("No valid model data found in dbt Cloud CLI output")
          return []

      # If already parsed as regular JSON list
      if isinstance(output, list):
          # For test compatibility
          if all(isinstance(item, dict) and "name" in item for item in output):
              return output
          # For empty lists or other list types, return as is
          return output

      # If string, try to parse as JSON
      if isinstance(output, str):
          try:
              parsed = json.loads(output)
              if isinstance(parsed, dict) and "nodes" in parsed:
                  return [
                      {"name": name, **details}
                      for name, details in parsed["nodes"].items()
                  ]
              elif isinstance(parsed, list):
                  return parsed
          except json.JSONDecodeError:
              # Not JSON, parse text format (simplified)
              models = []
              for line in output.splitlines():
                  line = line.strip()
                  if not line:
                      continue

                  # Check if the line is a JSON string
                  if line.startswith('{') and '"name":' in line and '"resource_type":' in line:
                      try:
                          model_data = json.loads(line)
                          if isinstance(model_data, dict) and "name" in model_data and "resource_type" in model_data:
                              models.append(model_data)
                              continue
                      except json.JSONDecodeError:
                          pass

                  # Check for dbt Cloud CLI format with timestamps (e.g., "00:59:06 {json}")
                  timestamp_match = re.match(r'^(\d\d:\d\d:\d\d)\s+(.+)$', line)
                  if timestamp_match:
                      json_part = timestamp_match.group(2)
                      try:
                          model_data = json.loads(json_part)
                          if isinstance(model_data, dict) and "name" in model_data and "resource_type" in model_data:
                              models.append(model_data)
                              continue
                      except json.JSONDecodeError:
                          pass

                  # Fall back to simple name-only format
                  models.append({"name": line})
              return models

      # Fallback: return empty list
      logger.warning("Could not parse dbt list output in any recognized format")
      return []
  ```
- src/command.py:318-382 (helper) Centralized result processor called by the dbt_ls handler to format the command result using the provided ls_formatter.

  ```python
  async def process_command_result(
      result: Dict[str, Any],
      command_name: str,
      output_formatter: Optional[Callable] = None,
      include_debug_info: bool = False
  ) -> str:
      """
      Process the result of a dbt command execution.

      Args:
          result: The result dictionary from execute_dbt_command
          command_name: The name of the dbt command (e.g. "run", "test")
          output_formatter: Optional function to format successful output
          include_debug_info: Whether to include additional debug info in error messages

      Returns:
          Formatted output or error message
      """
      logger.info(f"Processing command result for {command_name}")
      logger.info(f"Result success: {result['success']}, returncode: {result.get('returncode')}")

      # Log the output type and a sample
      if "output" in result:
          if isinstance(result["output"], str):
              logger.info(f"Output type: str, first 100 chars: {result['output'][:100]}")
          elif isinstance(result["output"], (dict, list)):
              logger.info(f"Output type: {type(result['output'])}, sample: {json.dumps(result['output'])[:100]}")
          else:
              logger.info(f"Output type: {type(result['output'])}")

      # For errors, simply return the raw command output if available
      if not result["success"]:
          logger.warning(f"Command {command_name} failed with returncode {result.get('returncode')}")

          # If we have command output, return it directly
          if "output" in result and result["output"]:
              logger.info(f"Returning error output: {str(result['output'])[:100]}...")
              return str(result["output"])

          # If no command output, return the error message
          if result["error"]:
              logger.info(f"Returning error message: {str(result['error'])[:100]}...")
              return str(result["error"])

          # If neither output nor error is available, return a generic message
          logger.info("No output or error available, returning generic message")
          return f"Command failed with exit code {result.get('returncode', 'unknown')}"

      # Format successful output
      if output_formatter:
          logger.info(f"Using custom formatter for {command_name}")
          formatted_result = output_formatter(result["output"])
          logger.info(f"Formatted result type: {type(formatted_result)}, first 100 chars: {str(formatted_result)[:100]}")
          return formatted_result

      # Default output formatting
      logger.info(f"Using default formatting for {command_name}")
      if isinstance(result["output"], (dict, list)):
          json_result = json.dumps(result["output"])
          logger.info(f"JSON result length: {len(json_result)}, first 100 chars: {json_result[:100]}")
          return json_result
      else:
          str_result = str(result["output"])
          logger.info(f"String result length: {len(str_result)}, first 100 chars: {str_result[:100]}")
          return str_result
  ```
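
The line-by-line text fallback in parse_dbt_list_output can be tried in isolation. The sketch below is a trimmed, hypothetical version (`parse_lines` is not a function in the project) that keeps three of its rules: bare JSON lines, timestamp-prefixed JSON lines, and name-only lines. The sample input is invented, and the sketch skips the logging and the dbt Cloud CLI list handling of the original.

```python
import json
import re
from typing import Any, Dict, List

# Same pattern as the source: "HH:MM:SS <rest of line>"
TIMESTAMP_LINE = re.compile(r"^(\d\d:\d\d:\d\d)\s+(.+)$")


def parse_lines(text: str) -> List[Dict[str, Any]]:
    """Hypothetical, simplified take on the text fallback."""
    resources: List[Dict[str, Any]] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Strip a leading timestamp, if any, before attempting JSON.
        match = TIMESTAMP_LINE.match(line)
        candidate = match.group(2) if match else line
        try:
            data = json.loads(candidate)
        except json.JSONDecodeError:
            data = None
        if isinstance(data, dict) and "name" in data and "resource_type" in data:
            resources.append(data)
        else:
            # Anything else is kept as a bare resource name.
            resources.append({"name": line})
    return resources


sample = "\n".join([
    '{"name": "stg_orders", "resource_type": "model"}',
    '00:59:06 {"name": "fact_orders", "resource_type": "model"}',
    "my_seed",
])
parsed = parse_lines(sample)
print([r["name"] for r in parsed])
# prints: ['stg_orders', 'fact_orders', 'my_seed']
```

Name-only entries carry no resource_type, so they would not pass the resource-type filter in ls_formatter unless every entry lacks one, in which case the formatter returns the unfiltered list.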