get_task_log
Retrieve Airflow task logs to diagnose pipeline failures and extract EMR Serverless application or job IDs for debugging.
Instructions
Read the Airflow log for a specific task attempt.
IMPORTANT for EMR debugging:
1. Read the 'initialise' (or 'ae_initialize_emr_application') task log to find the EMR application ID. Look for: 'EMR serverless application created: 00gXXXXXXXXX' or 'Created EMR application 00gXXXXXXXXX'.
2. Read the failed processing task log to find the job_run_id. Look for: 'EMR serverless job started: 00gXXXXXXXXX'.
3. Then call read_spark_driver_log(application_id, job_run_id, log_type='stdout') to get the Python application's output.
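The ID-extraction steps above can be sketched as follows. This is a minimal illustration, not the tool's implementation: the sample log excerpts and IDs are invented, and only the marker strings come from the description above.

```python
import re

# Hypothetical log excerpts as returned by get_task_log; only the
# marker phrases ('EMR serverless application created', 'Created EMR
# application', 'EMR serverless job started') come from this doc.
init_log = "... EMR serverless application created: 00gabc123xyz ..."
task_log = "... EMR serverless job started: 00gjob456run ..."

# Match either application-ID phrasing, then capture the ID token.
APP_RE = re.compile(
    r"(?:EMR serverless application created|Created EMR application)[: ]+(\S+)"
)
JOB_RE = re.compile(r"EMR serverless job started[: ]+(\S+)")

def extract_id(pattern, log_text):
    """Return the first captured ID from the log, or None if absent."""
    match = pattern.search(log_text)
    return match.group(1) if match else None

application_id = extract_id(APP_RE, init_log)
job_run_id = extract_id(JOB_RE, task_log)
print(application_id, job_run_id)
```

With both IDs in hand, read_spark_driver_log(application_id, job_run_id, log_type='stdout') retrieves the driver output.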
Args:
- dag_id: The DAG identifier.
- dag_run_id: The run ID.
- task_id: The task identifier.
- env: Target environment: 'dev', 'uat', 'test', or 'prod'. IMPORTANT: Do NOT guess or default. Ask the user which environment if not specified.
- try_number: Which attempt (default 1).
- tail_lines: Number of lines to return from the end (default 200).
Returns the raw log text, trimmed to the last N lines.
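The tail-trimming behaviour can be sketched with a local stand-in. The function below is a stub for illustration, not the real tool; the synthetic 500-line log and the call arguments are invented, while the parameter names and defaults follow the Args above.

```python
def get_task_log(dag_id, dag_run_id, task_id, env,
                 try_number=1, tail_lines=200):
    """Stub: a real call would fetch the Airflow log for this attempt."""
    # Stand-in log body; the real tool reads the task's log file.
    full_log = "\n".join(f"line {i}" for i in range(1, 501))
    lines = full_log.splitlines()
    # Return only the last tail_lines lines, as documented.
    return "\n".join(lines[-tail_lines:])

out = get_task_log("daily_ingest", "manual__2024-01-01", "initialise",
                   env="dev", tail_lines=3)
print(out)
```

A larger tail_lines simply returns more of the end of the log; the whole log is never returned unless tail_lines covers it.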
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| dag_id | Yes | The DAG identifier. | |
| dag_run_id | Yes | The run ID. | |
| task_id | Yes | The task identifier. | |
| env | No | Target environment: 'dev', 'uat', 'test', or 'prod'. Ask the user if not specified. | |
| try_number | No | Which attempt. | 1 |
| tail_lines | No | Number of lines to return from the end. | 200 |