astro-airflow-mcp
Server Configuration
Describes the environment variables used to configure the server (all are optional).
| Name | Required | Description | Default |
|---|---|---|---|
| MCP_HOST | No | Host to bind to (HTTP mode only). | localhost |
| MCP_PORT | No | Port to bind to (HTTP mode only). | 8000 |
| MCP_TRANSPORT | No | Transport mode (stdio or http). | stdio |
| AIRFLOW_API_URL | No | Airflow webserver URL. Defaults to http://localhost:8080 or auto-discovery from .astro/config.yaml. | http://localhost:8080 |
| AIRFLOW_PASSWORD | No | Password for authentication. | |
| AIRFLOW_USERNAME | No | Username for authentication (Airflow 3.x uses OAuth2 token exchange). | |
| AIRFLOW_AUTH_TOKEN | No | Bearer token for authentication (alternative to username/password). | |
| AIRFLOW_PROJECT_DIR | No | Astro project directory for auto-discovering Airflow URL from .astro/config.yaml. | $PWD |
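As an illustration of how these variables combine, the following sketch resolves each setting with the defaults from the table above. This is hypothetical code for clarity, not the server's actual implementation (auto-discovery from .astro/config.yaml is omitted):

```python
import os

def resolve_config(env=None):
    """Resolve server settings from environment variables,
    mirroring the defaults listed in the table above.
    Illustrative sketch only, not the server's actual code."""
    env = os.environ if env is None else env
    return {
        "host": env.get("MCP_HOST", "localhost"),
        "port": int(env.get("MCP_PORT", "8000")),
        "transport": env.get("MCP_TRANSPORT", "stdio"),
        "airflow_api_url": env.get("AIRFLOW_API_URL", "http://localhost:8080"),
        "project_dir": env.get("AIRFLOW_PROJECT_DIR", os.getcwd()),
    }
```

For example, with an empty environment every value falls back to the table's defaults, while setting MCP_TRANSPORT=http switches the server to HTTP mode on MCP_HOST:MCP_PORT.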
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | `{"listChanged": true}` |
| prompts | `{"listChanged": false}` |
| resources | `{"subscribe": false, "listChanged": false}` |
| experimental | `{}` |
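A client can read these flags from the server's initialize response to decide, for example, whether to re-fetch the tool list when a change notification arrives. A minimal sketch (the dictionary simply mirrors the table above; the helper name is hypothetical):

```python
# Capability flags as advertised in the table above.
CAPABILITIES = {
    "tools": {"listChanged": True},
    "prompts": {"listChanged": False},
    "resources": {"subscribe": False, "listChanged": False},
    "experimental": {},
}

def supports(caps, feature, flag):
    """Return True if the given capability flag is advertised."""
    return bool(caps.get(feature, {}).get(flag, False))
```

Here, for instance, tool-list change notifications are supported but resource subscriptions are not.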
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| get_dag_details | Get detailed information about a specific Apache Airflow DAG. Args: dag_id: the ID of the DAG to get details for. Returns: JSON with complete details about the specified DAG. |
| list_dags | Get information about all Apache Airflow DAGs (Directed Acyclic Graphs). Returns: JSON with a list of all DAGs and their complete metadata. |
| get_dag_source | Get the source code for a specific Apache Airflow DAG. Args: dag_id: the ID of the DAG to get source code for. Returns: JSON with DAG source code and metadata. |
| get_dag_stats | Get statistics about DAG runs (success/failure counts by state). Args: dag_ids: optional list of DAG IDs to filter by; if not provided, returns stats for all DAGs. Returns: JSON with DAG run statistics organized by DAG and state. |
| list_dag_warnings | Get warnings and issues detected in DAG definitions. Returns: JSON with a list of DAG warnings and their details. |
| list_import_errors | Get import errors from DAG files that failed to parse or load. Import errors occur when DAG files have problems that prevent Airflow from parsing them. Returns: JSON with a list of import errors and their stack traces. |
| get_task | Get detailed information about a specific task definition in a DAG. Args: dag_id: the ID of the DAG containing the task; task_id: the ID of the task to get details for. Returns: JSON with complete task definition details. |
| list_tasks | Get all tasks defined in a specific DAG. Args: dag_id: the ID of the DAG to list tasks for. Returns: JSON with a list of all tasks in the DAG and their configurations. |
| get_task_instance | Get detailed information about a specific task instance execution. Args: dag_id: the ID of the DAG; dag_run_id: the ID of the DAG run (e.g., "manual__2024-01-01T00:00:00+00:00"); task_id: the ID of the task within the DAG. Returns: JSON with complete task instance details. |
| get_task_logs | Get logs for a specific task instance execution. This is essential for debugging failed tasks or understanding what happened during task execution. Args: dag_id: the ID of the DAG (e.g., "example_dag"); dag_run_id: the ID of the DAG run (e.g., "manual__2024-01-01T00:00:00+00:00"); task_id: the ID of the task within the DAG (e.g., "extract_data"); try_number: the task try/attempt number, 1-indexed (default: 1), where higher numbers fetch logs from retry attempts; map_index: for mapped tasks, which map index to get logs for, with -1 for non-mapped tasks (default: -1). Returns: JSON with the task logs content. |
| list_dag_runs | Get execution history and status of DAG runs (workflow executions). Returns: JSON with a list of DAG runs across all DAGs, sorted by most recent. |
| get_dag_run | Get detailed information about a specific DAG run execution. Args: dag_id: the ID of the DAG (e.g., "example_dag"); dag_run_id: the ID of the DAG run (e.g., "manual__2024-01-01T00:00:00+00:00"). Returns: JSON with complete details about the specified DAG run. |
| trigger_dag | Trigger a new DAG run (start a workflow execution manually). This creates a new DAG run that will be picked up by the scheduler and executed. You can optionally pass configuration parameters that will be available to the DAG during execution. IMPORTANT: this is a write operation that modifies Airflow state by creating a new DAG run; use with caution. Args: dag_id: the ID of the DAG to trigger (e.g., "example_dag"); conf: optional configuration dictionary to pass to the DAG run, available in the DAG via context['dag_run'].conf. Returns: JSON with details about the newly triggered DAG run. |
| trigger_dag_and_wait | Trigger a DAG run and wait for it to complete before returning. This is a BLOCKING operation. IMPORTANT: this tool blocks until the DAG completes or times out; for long-running DAGs, consider using trigger_dag instead. The default timeout is 60 minutes; adjust the timeout argument as needed. Args: dag_id: the ID of the DAG to trigger (e.g., "example_dag"); conf: optional configuration dictionary to pass to the DAG run, available in the DAG via context['dag_run'].conf; timeout: maximum time to wait in seconds (default: 3600.0, i.e., 60 minutes). Returns: JSON with final DAG run status and any failed task details. |
| pause_dag | Pause a DAG to prevent new scheduled runs from starting. IMPORTANT: this is a write operation that modifies Airflow state; the DAG will remain paused until explicitly unpaused. Args: dag_id: the ID of the DAG to pause (e.g., "example_dag"). Returns: JSON with updated DAG details showing is_paused=True. |
| unpause_dag | Unpause a DAG to allow scheduled runs to resume. IMPORTANT: this is a write operation that modifies Airflow state; new DAG runs will be scheduled according to the DAG's schedule_interval. Args: dag_id: the ID of the DAG to unpause (e.g., "example_dag"). Returns: JSON with updated DAG details showing is_paused=False. |
| list_assets | Get data assets and datasets tracked by Airflow (data lineage). Assets represent datasets or files that DAGs produce or consume, enabling data-driven scheduling where DAGs wait for data availability. Returns: JSON with a list of all assets and their producing/consuming relationships. |
| list_asset_events | List asset/dataset events with optional filtering. Asset events are produced when a task updates an asset/dataset; these events can trigger downstream DAGs that depend on those assets (data-aware scheduling). Args: source_dag_id: filter events by the DAG that produced them; source_run_id: filter events by the DAG run that produced them; source_task_id: filter events by the task that produced them; limit: maximum number of events to return (default: 100). Returns: JSON with a list of asset events. |
| get_upstream_asset_events | Get asset events that triggered a specific DAG run. This is useful for understanding causation in data-aware scheduling: when a DAG is scheduled based on asset updates, this tool shows which specific asset events triggered the run. Args: dag_id: the ID of the DAG; dag_run_id: the ID of the DAG run (e.g., "scheduled__2024-01-01T00:00:00+00:00"). Returns: JSON with the asset events that triggered this DAG run. |
| list_connections | Get connection configurations for external systems (databases, APIs, services). Connections store credentials and connection info for external systems that DAGs interact with (databases, S3, APIs, etc.). IMPORTANT: passwords are NEVER returned, for security reasons. Returns: JSON with a list of all connections (credentials excluded). |
| get_pool | Get detailed information about a specific resource pool. Pools are used to limit parallelism for specific sets of tasks; this returns detailed real-time information about a specific pool's capacity and utilization. Args: pool_name: the name of the pool to get details for (e.g., "default_pool"). Returns: JSON with complete details about the specified pool. |
| list_pools | Get resource pools for managing task concurrency and resource allocation. Pools limit parallelism for specific sets of tasks: each pool has a certain number of slots, and tasks assigned to a pool will only run if there are available slots. This is useful for limiting concurrent access to resources like databases or external APIs. Returns: JSON with a list of all pools and their current utilization. |
| list_plugins | Get information about installed Airflow plugins. Plugins extend Airflow functionality by adding custom operators, hooks, views, menu items, or other components; this returns information about all plugins discovered by Airflow's plugin system. Returns: JSON with a list of all installed plugins and their components. |
| list_providers | Get information about installed Airflow provider packages. Returns: JSON with a list of all installed provider packages and their details. |
| get_variable | Get a specific Airflow variable by key. Variables are key-value pairs stored in Airflow's metadata database that can be accessed by DAGs at runtime; they're commonly used for configuration values, API keys, or other settings shared across DAGs. Args: variable_key: the key/name of the variable to retrieve. Returns: JSON with the variable's key, value, and metadata. |
| list_variables | Get all Airflow variables (key-value configuration pairs). Variables are key-value pairs stored in Airflow's metadata database that can be accessed by DAGs at runtime, commonly used for configuration values, environment-specific settings, or other data shared across DAGs without hardcoding in the DAG files. IMPORTANT: sensitive variables (like passwords or API keys) may have their values masked in the response for security reasons. Returns: JSON with a list of all variables and their values. |
| get_airflow_version | Get version information for the Airflow instance. This is useful, for example, for checking compatibility before using version-specific features. Returns: JSON with Airflow version information. |
| get_airflow_config | Get Airflow instance configuration and settings, organized by sections. Returns: JSON with the complete Airflow configuration organized by sections. |
| explore_dag | Comprehensive investigation of a DAG: get all relevant info in one call. USE THIS TOOL WHEN you need to understand a DAG completely; instead of making multiple calls, this returns everything about a DAG in a single response, making it the preferred first tool. Args: dag_id: the ID of the DAG to explore. Returns: JSON with comprehensive DAG information. |
| diagnose_dag_run | Diagnose issues with a specific DAG run: get run details and failed tasks. USE THIS TOOL WHEN troubleshooting a failed or problematic DAG run; it returns all the information you need to understand what went wrong. Args: dag_id: the ID of the DAG; dag_run_id: the ID of the DAG run (e.g., "manual__2024-01-01T00:00:00+00:00"). Returns: JSON with diagnostic information about the DAG run. |
| get_system_health | Get overall Airflow system health: import errors, warnings, and DAG stats. USE THIS TOOL WHEN you need a quick health check of the Airflow system; it returns a consolidated view of potential issues across the entire system. Returns: JSON with a system health overview. |
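The write tools above map onto Airflow's stable REST API; trigger_dag, for instance, roughly corresponds to a POST against the dagRuns endpoint. The sketch below only builds such a request without sending it. The endpoint path and payload shape follow Airflow's public REST API, not this server's internals, and the helper name is hypothetical:

```python
import json

def build_trigger_request(base_url, dag_id, conf=None):
    """Build (but do not send) the HTTP request that a
    trigger_dag-style call would issue against Airflow's
    stable REST API. Illustrative sketch only."""
    return {
        "method": "POST",
        "url": f"{base_url}/api/v1/dags/{dag_id}/dagRuns",
        "headers": {"Content-Type": "application/json"},
        # The conf dictionary becomes context['dag_run'].conf in the DAG.
        "body": json.dumps({"conf": conf or {}}),
    }
```

For example, build_trigger_request("http://localhost:8080", "example_dag", {"date": "2024-01-01"}) targets http://localhost:8080/api/v1/dags/example_dag/dagRuns with the conf embedded in the JSON body.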
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| troubleshoot_failed_dag | Step-by-step guide to troubleshoot a failed DAG. Args: dag_id: The DAG ID to troubleshoot |
| daily_health_check | Morning health check workflow for Airflow. |
| onboard_new_dag | Guide to understanding a new DAG. Args: dag_id: The DAG ID to learn about |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| resource_version | Get Airflow version information as a resource. |
| resource_providers | Get installed Airflow providers as a resource. |
| resource_plugins | Get installed Airflow plugins as a resource. |
| resource_config | Get Airflow configuration as a resource. |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/astronomer/astro-airflow-mcp'
If you have feedback or need assistance with the MCP directory API, please join our Discord server.