Apache Airflow MCP Server

by madamak

Server Configuration

Environment variables used to configure the server. Only AIRFLOW_MCP_INSTANCES_FILE is required; the others are optional.

Name | Required | Description | Default
AIRFLOW_MCP_LOG_FILE | No | Optional path to a log file for server operations. | (none)
AIRFLOW_MCP_HTTP_HOST | No | The host address for the HTTP server. | 127.0.0.1
AIRFLOW_MCP_HTTP_PORT | No | The port for the HTTP server. | 8765
AIRFLOW_MCP_INSTANCES_FILE | Yes | Path to registry YAML listing available Airflow instances. Required for startup. | (none)
AIRFLOW_MCP_TIMEOUT_SECONDS | No | Request timeout in seconds for communicating with Airflow instances. | 30
AIRFLOW_MCP_DEFAULT_INSTANCE | No | The default instance key to use from the instances YAML. | (none)
AIRFLOW_MCP_HTTP_BLOCK_GET_ON_MCP | No | Whether to block GET requests on the MCP endpoint. | true
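As an illustration of how the variables and defaults above fit together, here is a minimal Python sketch that reads them from an environment mapping. The `ServerSettings` class, the `load_settings` function, and the set of strings accepted as a boolean are assumptions made for this example; they are not taken from the server's code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerSettings:
    """Hypothetical container for the documented environment variables."""
    instances_file: str
    default_instance: Optional[str]
    http_host: str
    http_port: int
    timeout_seconds: float
    log_file: Optional[str]
    block_get_on_mcp: bool


def load_settings(env: Mapping[str, str] = os.environ) -> ServerSettings:
    """Read settings from env, applying the defaults listed in the table."""
    instances_file = env.get("AIRFLOW_MCP_INSTANCES_FILE")
    if not instances_file:
        # The only variable documented as required for startup.
        raise RuntimeError("AIRFLOW_MCP_INSTANCES_FILE is required for startup")
    # Assumption: these spellings count as "true"; the real parser may differ.
    truthy = {"1", "true", "yes", "on"}
    block_get = env.get("AIRFLOW_MCP_HTTP_BLOCK_GET_ON_MCP", "true")
    return ServerSettings(
        instances_file=instances_file,
        default_instance=env.get("AIRFLOW_MCP_DEFAULT_INSTANCE") or None,
        http_host=env.get("AIRFLOW_MCP_HTTP_HOST", "127.0.0.1"),
        http_port=int(env.get("AIRFLOW_MCP_HTTP_PORT", "8765")),
        timeout_seconds=float(env.get("AIRFLOW_MCP_TIMEOUT_SECONDS", "30")),
        log_file=env.get("AIRFLOW_MCP_LOG_FILE") or None,
        block_get_on_mcp=block_get.strip().lower() in truthy,
    )
```

Passing a plain dict instead of `os.environ` makes it easy to check a configuration before starting the server.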

Capabilities

Features and capabilities supported by this server

Capability | Details
tools | { "listChanged": true }
prompts | { "listChanged": false }
resources | { "subscribe": false, "listChanged": false }
experimental | {}

Tools

Functions exposed to the LLM to take actions

airflow_list_instances

List configured Airflow instance keys.

Returns

  • Response dict: { "instances": [str], "default_instance": str | null, "request_id": str }

  • Raises: ToolError with compact JSON payload (code, message, request_id, optional context)
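Every tool reports failures as a ToolError carrying a compact JSON payload with the fields code, message, request_id, and an optional context. A client could decode such a payload along these lines. This is a sketch only: the `ToolErrorPayload` class, the `parse_tool_error` function, and the sample values are hypothetical and not part of the server.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ToolErrorPayload:
    """Hypothetical client-side view of the documented error fields."""
    code: str
    message: str
    request_id: str
    context: Dict[str, Any] = field(default_factory=dict)


def parse_tool_error(raw: str) -> ToolErrorPayload:
    """Decode the compact JSON payload carried by a ToolError."""
    data = json.loads(raw)
    return ToolErrorPayload(
        code=str(data["code"]),
        message=str(data["message"]),
        request_id=str(data["request_id"]),
        # context is optional, so fall back to an empty dict.
        context=data.get("context") or {},
    )


# Sample payload with made-up values, for illustration only.
sample = '{"code":"EXAMPLE_CODE","message":"example message","request_id":"req-123"}'
error = parse_tool_error(sample)
```

Keeping the request_id is useful because, per the tool descriptions, it correlates with the server logs.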

airflow_describe_instance

Describe a configured Airflow instance (host + metadata, never secrets).

Parameters

  • instance: Instance key (e.g., "data-stg")

Returns

  • Response dict: { "instance", "host", "api_version", "verify_ssl", "auth_type", "request_id": str }

  • Raises: ToolError with compact JSON payload (code, message, request_id, optional context)

airflow_resolve_url

Parse an Airflow UI URL, resolve instance and identifiers.

Parameters

  • url: Airflow UI URL (http/https)

Returns

  • Response dict: { "instance", "dag_id"?, "dag_run_id"?, "task_id"?, "try_number"?, "route", "request_id" }

  • Raises: ToolError with compact JSON payload (code, message, request_id, optional context)
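To make the idea of URL resolution concrete, the following sketch parses one common Airflow UI URL shape (`/dags/<dag_id>/<view>?dag_run_id=...&task_id=...`) and maps the host to an instance key. The `KNOWN_HOSTS` registry, the `resolve_ui_url` function, the accepted URL shape, and the values used for "route" are all assumptions for illustration; the server's actual parser may recognize other routes and behave differently.

```python
from typing import Dict, Mapping, Union
from urllib.parse import parse_qs, unquote, urlparse

# Hypothetical host-to-instance registry; real keys come from the instances YAML.
KNOWN_HOSTS = {"airflow.stg.example.com": "data-stg"}


def resolve_ui_url(
    url: str, hosts: Mapping[str, str] = KNOWN_HOSTS
) -> Dict[str, Union[str, int]]:
    """Resolve an Airflow UI URL into an instance key and identifiers."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        raise ValueError("url must use http or https")
    instance = hosts.get(parsed.hostname or "")
    if instance is None:
        raise ValueError(f"host {parsed.hostname!r} does not match a configured instance")

    result: Dict[str, Union[str, int]] = {"instance": instance, "route": "unknown"}
    segments = [unquote(part) for part in parsed.path.split("/") if part]
    # Assumed path shape: /dags/<dag_id>/<view>
    if len(segments) >= 2 and segments[0] == "dags":
        result["dag_id"] = segments[1]
        result["route"] = segments[2] if len(segments) > 2 else "dag"

    query = parse_qs(parsed.query)
    for key in ("dag_run_id", "task_id"):
        if key in query:
            result[key] = query[key][0]
    if "try_number" in query:
        result["try_number"] = int(query["try_number"][0])
    return result
```

Rejecting unknown hosts up front mirrors the documented rule that a ui_url must match a configured host.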

airflow_list_dags

List DAGs (pause state + UI link) for the target instance.

Parameters

  • instance: Instance key (optional; mutually exclusive with ui_url)

  • ui_url: Airflow UI URL to resolve instance (optional; takes precedence and must match a configured host)

  • limit: Max results (default 100; accepts int/float/str, coerced to non-negative int, fractional values truncated)

  • offset: Offset for pagination (default 0; accepts int/float/str, coerced to non-negative int, fractional values truncated)

Returns

  • Response dict: { "dags": [{ "dag_id", "is_paused", "ui_url" }], "count": int, "request_id": str }

  • Raises: ToolError with compact JSON payload (code, message, request_id, optional context)
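The limit and offset parameters accept int, float, or str input and are coerced to a non-negative int with fractional values truncated. One way that coercion might look is sketched below. The function name `coerce_non_negative_int` is hypothetical, and clamping negative input to 0 (rather than rejecting it) is an assumption, since the description does not say which of the two the server does.

```python
import math
from typing import Optional, Union

Number = Union[int, float, str]


def coerce_non_negative_int(value: Optional[Number], default: int) -> int:
    """Coerce int/float/str input to a non-negative int, truncating fractions."""
    if value is None:
        return default
    if isinstance(value, bool):
        # bool is a subclass of int; treat it as a caller mistake.
        raise TypeError("boolean is not a valid numeric value")
    if isinstance(value, str):
        value = float(value.strip())  # raises ValueError for non-numeric text
    if isinstance(value, float) and not math.isfinite(value):
        raise ValueError("value must be finite")
    # int() truncates toward zero; negatives are clamped to 0 (assumption).
    return max(0, int(value))
```

For example, `coerce_non_negative_int("7.9", 100)` yields 7 and a missing value falls back to the supplied default.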

airflow_get_dag

Get DAG details and a UI link.

Parameters

  • instance | ui_url: Provide one; ui_url auto-resolves/validates the host.

  • dag_id: Required when only instance is supplied.

Returns

  • Response dict: { "dag": object, "ui_url": str, "request_id": str }

  • Raises: ToolError with compact JSON payload (code, message, request_id, optional context)

airflow_list_dag_runs

List DAG runs (defaults to execution_date DESC) with per-run UI URLs.

Parameters

  • instance: Instance key (optional)

  • ui_url: Airflow UI URL to resolve instance/dag_id (optional)

  • dag_id: DAG identifier (required if ui_url not provided)

  • limit: Max results (default 100; accepts int/float/str, coerced to non-negative int, fractional values truncated)

  • offset: Offset for pagination (default 0; accepts int/float/str, coerced to non-negative int, fractional values truncated)

  • state: List of states to filter by (optional)

  • order_by: Optional "start_date", "end_date", or "execution_date" (omit to use execution_date)

  • descending: Sort direction (default True). Ignored when order_by is omitted, in which case results are always sorted by execution_date descending.

Returns

  • Response dict: { "dag_runs": [{ "dag_run_id", "state", "start_date", "end_date", "ui_url" }], "count": int, "request_id": str }

  • Raises: ToolError with compact JSON payload (code, message, request_id, optional context)
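The interaction between order_by and descending can be summarized in a few lines. The sketch below turns the two parameters into a single sort expression using a leading "-" for descending order, which is the convention of the Airflow stable REST API's order_by query parameter. Whether the server builds its requests this way is an assumption, and `build_order_by` is a hypothetical name.

```python
from typing import Optional

ALLOWED_ORDER_FIELDS = {"start_date", "end_date", "execution_date"}


def build_order_by(order_by: Optional[str] = None, descending: bool = True) -> str:
    """Return a sort expression; a leading '-' means descending."""
    if order_by is None:
        # Documented default: execution_date descending, 'descending' ignored.
        return "-execution_date"
    if order_by not in ALLOWED_ORDER_FIELDS:
        raise ValueError(f"order_by must be one of {sorted(ALLOWED_ORDER_FIELDS)}")
    return f"-{order_by}" if descending else order_by
```

Note that `build_order_by(None, descending=False)` still returns "-execution_date", matching the rule that descending is ignored when order_by is omitted.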

airflow_get_dag_run

Get a single DAG run and a UI link.

Parameters

  • instance: Instance key (optional)

  • ui_url: Airflow UI URL to resolve instance/dag/dag_run (optional)

  • dag_id: DAG identifier

  • dag_run_id: DAG run identifier

Returns

  • Response dict: { "dag_run": object, "ui_url": str, "request_id": str }

airflow_list_task_instances

List task instances for a DAG run (state, try_number, per-attempt log URL).

Parameters

  • instance: Instance key (optional)

  • ui_url: Airflow UI URL to resolve instance/dag/dag_run (optional)

  • dag_id: DAG identifier

  • dag_run_id: DAG run identifier

  • limit: Max results (default 100; accepts int/float/str, coerced to non-negative int, fractional values truncated)

  • offset: Offset for pagination (default 0; accepts int/float/str, coerced to non-negative int, fractional values truncated)

  • state: Optional list of task states (case-insensitive). When provided, only matching states are returned.

  • task_ids: Optional list of task identifiers to include.

Returns

  • Response dict: { "task_instances": [{ "task_id", "state", "try_number", "ui_url" }], "count": int, "total_entries"?: int, "filters"?: { "state": [...], "task_ids": [...] }, "request_id": str }

  • Raises: ToolError with compact JSON payload (code, message, request_id, optional context)
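The state and task_ids filters can be pictured as a simple predicate over the task instance entries. The sketch below applies them to a list of dicts shaped like the documented response items. It is illustrative only: `filter_task_instances` is a hypothetical name, and whether the server filters client-side or delegates to Airflow is not stated in the description.

```python
from typing import Any, Dict, Iterable, List, Optional


def filter_task_instances(
    task_instances: Iterable[Dict[str, Any]],
    state: Optional[List[str]] = None,
    task_ids: Optional[List[str]] = None,
) -> List[Dict[str, Any]]:
    """Keep entries matching the given states (case-insensitive) and task ids."""
    wanted_states = {s.lower() for s in state} if state else None
    wanted_ids = set(task_ids) if task_ids else None
    kept = []
    for ti in task_instances:
        ti_state = (ti.get("state") or "").lower()
        if wanted_states is not None and ti_state not in wanted_states:
            continue
        if wanted_ids is not None and ti.get("task_id") not in wanted_ids:
            continue
        kept.append(ti)
    return kept
```

Omitting both filters returns every entry unchanged, which matches the parameters being optional.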

airflow_get_task_instance

Return task metadata, config, attempt summary, optional rendered fields, and UI URLs.

Parameters

  • instance | ui_url: Target selection (ui_url takes precedence)

  • dag_id, dag_run_id, task_id: Required identifiers (unless resolved from ui_url)

  • include_rendered: When true, include rendered template fields (truncated using max_rendered_bytes)

  • max_rendered_bytes: Byte cap for rendered fields payload (default 100KB; accepts int/float/str, coerced to positive int, fractional values truncated)

Returns

  • Response dict: { "task_instance": {...}, "task_config": {...}, "attempts": {...}, "ui_url": {...}, "request_id": str, "rendered_fields"?: {...} }

Notes

  • attempts.try_number is the authoritative input for airflow_get_task_instance_logs.

  • Rendered fields include bytes_returned and truncated metadata.

  • Sensors increment try_number on every reschedule, so treat it as an attempt index; the derived retries counters are heuristic.
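Because attempts.try_number is the authoritative input for airflow_get_task_instance_logs, a client would typically read it from the airflow_get_task_instance response and pass it on. The sketch below only builds the argument dict for the follow-up call; `build_log_arguments` is a hypothetical helper, and the sample response contains made-up values.

```python
from typing import Any, Dict


def build_log_arguments(
    task_instance_response: Dict[str, Any],
    *,
    instance: str,
    dag_id: str,
    dag_run_id: str,
    task_id: str,
    **filters: Any,
) -> Dict[str, Any]:
    """Build arguments for airflow_get_task_instance_logs from a prior response."""
    # attempts.try_number is documented as the authoritative attempt index.
    try_number = int(task_instance_response["attempts"]["try_number"])
    arguments: Dict[str, Any] = {
        "instance": instance,
        "dag_id": dag_id,
        "dag_run_id": dag_run_id,
        "task_id": task_id,
        "try_number": try_number,
    }
    # Optional filters such as filter_level or tail_lines; skip unset ones.
    arguments.update({k: v for k, v in filters.items() if v is not None})
    return arguments


# Made-up response fragment for illustration.
response = {"attempts": {"try_number": 3}, "request_id": "req-1"}
log_args = build_log_arguments(
    response,
    instance="data-stg",
    dag_id="etl_daily",
    dag_run_id="manual__1",
    task_id="load",
    filter_level="error",
    tail_lines=None,
)
```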

airflow_get_task_instance_logs

Fetch task instance logs with optional filtering and truncation.

Large log handling: Logs >100MB are automatically tailed to the last 10,000 lines (sets auto_tailed=true).

Host-segmented responses are flattened into a single string using headers of the form --- [worker] ---, so that agents can reason about multi-host output.

The tool requires an explicit try_number; callers should first retrieve it via airflow_get_task_instance.
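The flattening of host-segmented responses can be sketched as follows. The input shape (a list of (host, text) pairs) and the function name `flatten_host_segments` are assumptions for illustration; only the header format --- [worker] --- comes from the description. The sketch always emits a header, whereas the server may insert headers only when needed.

```python
from typing import Iterable, Tuple


def flatten_host_segments(segments: Iterable[Tuple[str, str]]) -> str:
    """Join per-host log chunks into one string with '--- [host] ---' headers."""
    parts = []
    for host, text in segments:
        parts.append(f"--- [{host}] ---")
        # Strip trailing newlines so chunks join with exactly one line break.
        parts.append(text.rstrip("\n"))
    return "\n".join(parts)


flattened = flatten_host_segments(
    [("worker-1", "starting task\nrunning\n"), ("worker-2", "finished")]
)
```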

Filter order of operations:

  1. Auto-tail: If log >100MB, take last 10,000 lines

  2. tail_lines: Extract last N lines from log

  3. filter_level: Find matching lines by level (content filter)

  4. context_lines: Add surrounding lines around matches (symmetric: N before + N after)

  5. max_bytes: Hard cap on total output (UTF-8 safe truncation)

Parameters

  • instance: Instance key (optional, mutually exclusive with ui_url)

  • ui_url: Airflow UI URL to resolve identifiers (optional)

  • dag_id, dag_run_id, task_id, try_number: Task instance identifiers (required)

  • filter_level: "error" | "warning" | "info" (optional) - Show only lines matching level

    • "error": ERROR, CRITICAL, FATAL, Exception, Traceback

    • "warning": WARN, WARNING + error patterns

    • "info": INFO + warning + error patterns

  • context_lines: N lines before/after each match (optional, clamped to [0, 1000]; accepts int/float/str, coerced to non-negative int, fractional values truncated)

  • tail_lines: Extract last N lines before filtering (optional, clamped to [0, 100000]; accepts int/float/str, coerced to non-negative int, fractional values truncated)

  • max_bytes: Maximum response size in bytes (default: 100KB ≈ 25K tokens; clamped to a reasonable limit)

Returns

  • Response dict with fields:

    • log: Normalized/filtered log text (host headers inserted when needed)

    • truncated: true if output exceeded max_bytes

    • auto_tailed: true if original log >100MB triggered auto-tail

    • bytes_returned: Actual byte size of returned log

    • original_lines: Line count before any filtering

    • returned_lines: Line count after all filtering/truncation

    • match_count: Number of lines matching filter_level (before context expansion)

    • meta.try_number: Attempt number for this task instance

    • meta.filters: Echo of effective filters applied (shows clamped values)

    • ui_url: Direct link to log view in Airflow UI

    • request_id: Correlates with server logs

  • Raises: ToolError with compact JSON payload (code, message, request_id, optional context)
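The filter order of operations for this tool (tail, level filter, context expansion, byte cap) can be illustrated with a small pipeline. This is a sketch under stated assumptions, not the server's implementation: the function names and regular expressions are hypothetical, the auto-tail step for logs over 100MB is omitted, and match_count is reported as 0 when no filter_level is given.

```python
import re
from typing import Any, Dict, Optional, Tuple

_ERROR = r"ERROR|CRITICAL|FATAL|Exception|Traceback"
_WARNING = r"WARN|WARNING|" + _ERROR
_INFO = r"INFO|" + _WARNING
LEVEL_PATTERNS = {
    "error": re.compile(_ERROR),
    "warning": re.compile(_WARNING),
    "info": re.compile(_INFO),
}


def truncate_utf8(text: str, max_bytes: int) -> Tuple[str, bool]:
    """Cut text to max_bytes without leaving a partial UTF-8 character."""
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text, False
    # errors="ignore" drops an incomplete trailing byte sequence.
    return data[:max_bytes].decode("utf-8", errors="ignore"), True


def filter_log(
    log: str,
    *,
    tail_lines: Optional[int] = None,
    filter_level: Optional[str] = None,
    context_lines: int = 0,
    max_bytes: int = 100 * 1024,
) -> Dict[str, Any]:
    lines = log.splitlines()
    original_lines = len(lines)
    # Step 2: keep only the last N lines (0 means keep nothing).
    if tail_lines is not None:
        lines = lines[-tail_lines:] if tail_lines > 0 else []
    # Steps 3 and 4: find matching lines, then add symmetric context.
    match_count = 0
    if filter_level is not None:
        pattern = LEVEL_PATTERNS[filter_level]
        hits = [i for i, line in enumerate(lines) if pattern.search(line)]
        match_count = len(hits)
        keep = set()
        for i in hits:
            keep.update(range(max(0, i - context_lines),
                              min(len(lines), i + context_lines + 1)))
        lines = [lines[i] for i in sorted(keep)]
    # Step 5: hard cap on output size.
    text, truncated = truncate_utf8("\n".join(lines), max_bytes)
    return {
        "log": text,
        "truncated": truncated,
        "bytes_returned": len(text.encode("utf-8")),
        "original_lines": original_lines,
        "returned_lines": len(text.splitlines()),
        "match_count": match_count,
    }
```

Because tailing happens before filtering, a small tail_lines value can hide older errors; widening the tail or dropping it is the way to search the whole log.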

airflow_dataset_events

List dataset events.

Parameters

  • instance: Instance key (optional)

  • ui_url: Airflow UI URL to resolve instance (optional)

  • dataset_uri: Dataset URI (required)

  • limit: Max results (default 50; accepts int/float/str, coerced to non-negative int, fractional values truncated)

Returns

  • Response dict: { "events": [object], "count": int, "request_id": str }

airflow_trigger_dag

Trigger a DAG run with optional configuration.

Parameters

  • instance: Instance key (optional; mutually exclusive with ui_url)

  • ui_url: Airflow UI URL to resolve instance (optional; takes precedence)

  • dag_id: DAG identifier (required if ui_url not provided)

  • dag_run_id: Custom run id (optional)

  • logical_date: Logical date/time for run (optional; ISO8601)

  • conf: Configuration object as dict or JSON string (optional)

  • note: Run note/comment (optional)

Returns

  • Response dict: { "dag_run_id": str, "ui_url": str }

  • Raises: ToolError with compact JSON payload (code, message, request_id, optional context)
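Since conf may be supplied either as a dict or as a JSON string, and logical_date is expected in ISO8601 form, a client might normalize both before calling the tool. The sketch below is illustrative: `normalize_conf` and `build_trigger_arguments` are hypothetical helpers, and validating the date with `datetime.fromisoformat` is an assumption (on Python versions before 3.11 it does not accept a trailing "Z", so the example uses an explicit offset).

```python
import json
from datetime import datetime
from typing import Any, Dict, Optional, Union


def normalize_conf(conf: Union[None, str, Dict[str, Any]]) -> Dict[str, Any]:
    """Accept a dict or a JSON string and always return a dict."""
    if conf is None:
        return {}
    if isinstance(conf, str):
        conf = json.loads(conf)
    if not isinstance(conf, dict):
        raise ValueError("conf must be a JSON object")
    return conf


def build_trigger_arguments(
    dag_id: str,
    *,
    instance: str,
    dag_run_id: Optional[str] = None,
    logical_date: Optional[str] = None,
    conf: Union[None, str, Dict[str, Any]] = None,
    note: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble arguments for airflow_trigger_dag, omitting unset options."""
    arguments: Dict[str, Any] = {
        "instance": instance,
        "dag_id": dag_id,
        "conf": normalize_conf(conf),
    }
    if logical_date is not None:
        datetime.fromisoformat(logical_date)  # raises ValueError if malformed
        arguments["logical_date"] = logical_date
    if dag_run_id is not None:
        arguments["dag_run_id"] = dag_run_id
    if note is not None:
        arguments["note"] = note
    return arguments
```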

airflow_clear_task_instances

Clear task instances for a DAG across one or more runs using Airflow's native filter set (destructive).

Parameters

  • instance: Instance key (optional; mutually exclusive with ui_url)

  • ui_url: Airflow UI URL to resolve instance (optional; takes precedence)

  • dag_id: DAG identifier (required if ui_url not provided)

  • task_ids: List of task IDs to clear (optional)

  • start_date: ISO8601 start date filter (optional)

  • end_date: ISO8601 end date filter (optional)

  • include_subdags: Include subDAGs (optional)

  • include_parentdag: Include parent DAG (optional)

  • include_upstream: Include upstream tasks (optional)

  • include_downstream: Include downstream tasks (optional)

  • include_future: Include future runs (optional)

  • include_past: Include past runs (optional)

  • dry_run: If true, perform a dry-run only (optional)

  • reset_dag_runs: Reset DagRun state (optional)

Returns

  • Response dict: { "dag_id": str, "cleared": object, "request_id": str }

  • Raises: ToolError with compact JSON payload (code, message, request_id, optional context)
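Because clearing is destructive and the tool offers a dry_run flag, one cautious calling pattern is to preview first and apply second. The sketch below builds the two argument dicts. `clear_arguments` is a hypothetical helper, and defaulting dry_run to True is a design choice of this example, not the tool's documented default.

```python
from typing import Any, Dict, List, Optional

BOOLEAN_FLAGS = {
    "include_subdags", "include_parentdag", "include_upstream",
    "include_downstream", "include_future", "include_past", "reset_dag_runs",
}


def clear_arguments(
    instance: str,
    dag_id: str,
    *,
    task_ids: Optional[List[str]] = None,
    start_date: Optional[str] = None,
    end_date: Optional[str] = None,
    dry_run: bool = True,  # assumption: preview by default in this sketch
    **flags: bool,
) -> Dict[str, Any]:
    """Build arguments for airflow_clear_task_instances."""
    unknown = set(flags) - BOOLEAN_FLAGS
    if unknown:
        raise TypeError(f"unknown flags: {sorted(unknown)}")
    arguments: Dict[str, Any] = {
        "instance": instance,
        "dag_id": dag_id,
        "dry_run": dry_run,
    }
    optional = {"task_ids": task_ids, "start_date": start_date,
                "end_date": end_date, **flags}
    # Leave out anything the caller did not set.
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return arguments


preview = clear_arguments("data-stg", "etl_daily",
                          task_ids=["load"], include_downstream=True)
# After inspecting the dry-run result, reuse the same filters for the real call.
apply_args = {**preview, "dry_run": False}
```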

airflow_clear_dag_run

Clear all task instances in a specific DAG run (destructive).

Parameters

  • instance: Instance key (optional; mutually exclusive with ui_url)

  • ui_url: Airflow UI URL to resolve instance (optional; takes precedence)

  • dag_id: DAG identifier (required if ui_url not provided)

  • dag_run_id: DAG run identifier (required if ui_url not provided)

  • include_subdags: Include subDAGs (optional)

  • include_parentdag: Include parent DAG (optional)

  • include_upstream: Include upstream tasks (optional)

  • include_downstream: Include downstream tasks (optional)

  • dry_run: If true, perform a dry-run only (optional)

  • reset_dag_runs: Reset DagRun state (optional)

Returns

  • Response dict: { "dag_id": str, "dag_run_id": str, "cleared": object, "request_id": str }

  • Raises: ToolError with compact JSON payload (code, message, request_id, optional context)

airflow_pause_dag

Pause DAG scheduling (sets is_paused=True and returns UI link).

Parameters

  • instance: Instance key (optional; mutually exclusive with ui_url)

  • ui_url: Airflow UI URL to resolve instance (optional; takes precedence)

  • dag_id: DAG identifier (required if ui_url not provided)

Returns

  • Response dict: { "dag_id": str, "is_paused": true, "ui_url": str, "request_id": str }

  • Raises: ToolError with compact JSON payload (code, message, request_id, optional context)

airflow_unpause_dag

Resume DAG scheduling (sets is_paused=False and returns UI link).

Parameters

  • instance: Instance key (optional; mutually exclusive with ui_url)

  • ui_url: Airflow UI URL to resolve instance (optional; takes precedence)

  • dag_id: DAG identifier (required if ui_url not provided)

Returns

  • Response dict: { "dag_id": str, "is_paused": false, "ui_url": str, "request_id": str }

  • Raises: ToolError with compact JSON payload (code, message, request_id, optional context)

Prompts

Interactive templates invoked by user choice

No prompts

Resources

Contextual data attached and managed by the client

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/madamak/apache-airflow-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.