Skip to main content
Glama

diagnose_dag_failure

Automatically diagnose failed Airflow DAG runs by identifying failed tasks, analyzing logs, extracting EMR application IDs, and providing root cause analysis in one step.

Instructions

One-shot diagnosis of a failed DAG run.

This tool does everything automatically:

  1. Finds the most recent failed run (today or specified date)

  2. Identifies which task(s) failed

  3. Reads the failed task logs

  4. Extracts EMR application ID from the 'initialise' task

  5. Reads the Spark driver logs (stdout) for Python app errors

  6. Returns a complete failure analysis

This replaces the need to call 5-6 tools manually.

Args: dag_id: The DAG to diagnose (e.g. 'ttdcustom_processing'). env: Target environment — 'dev', 'uat', 'test', or 'prod'. IMPORTANT: Do NOT guess or default. Ask the user which environment if not specified. date: Optional date to check (ISO format or 'yesterday'). Default: today.

Returns a comprehensive failure report with root cause analysis.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
dag_idYes
envNo
dateNo

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/SrujanReddyKallu2024/MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server