Apache Airflow MCP Server

AGENTS.md•4.78 KiB

# AGENTS.md

Guidance for AI Agents working on the Airflow MCP Server. This file orients you at the package level (what belongs where, how to extend safely). For module-level guidance, see `src/airflow_mcp/AGENTS.md`. For tests, see `tests/AGENTS.md`.

## Project Layout

- `src/airflow_mcp/`
  - `server.py`: FastMCP entrypoint. Registers tools and HTTP routes. Thin wrappers only.
  - `tools.py`: Business logic for MCP tools (structured dict outputs; FastMCP serializes). No FastMCP or web concerns.
  - `registry.py`: Instance registry loader (YAML with `${VAR}` env substitution) and cached accessor.
  - `client_factory.py`: Builds cached `apache-airflow-client` clients per instance; applies auth, SSL, timeouts.
  - `url_utils.py`: URL resolver/builder (instance detection, dag/task/run extraction, UI URL construction).
  - `validation.py`: Input validation (safe identifier patterns; SSRF guard via host check in resolver).
  - `observability.py`: Structured logging with request_id; common operation logger.
  - `errors.py`: Error types and `handle_errors` decorator (raises MCP `ToolError` with compact JSON payload).
  - `config.py`: Pydantic settings (`AIRFLOW_MCP_*`).
- `tests/`: Unit and integration tests (mocked client; no real network).
- `examples/instances.yaml`: Example instance registry used by tests.

## Development Workflow

- Install & sync deps with `uv sync`.
- Run locally:
  - `uv run airflow-mcp --transport stdio`
  - `uv run airflow-mcp --transport http --host 127.0.0.1 --port 8765`
- Test & lint before pushing:
  - `uv run pytest`
  - `uv run ruff check .`
  - Optional format pass: `uv run ruff format .`
- Commit hygiene: run tests + lint locally prior to opening a PR.

## Core Design Principles

- Structured JSON tools: return dict payloads (FastMCP serializes). Failures raise `ToolError` with payload `{ "code": "...", "message": "...", "request_id": "...", "context"?: {...} }`.
- Stateless: each call resolves target instance (via `instance` or `ui_url`). No cross-call state.
- Thin server: wrappers forward to `tools.py` and apply `@handle_errors` plus annotations.
- Multi-instance: registry-backed; unknown instances/hosts are rejected early.
- Security first: validate identifiers, never log secrets, guard SSRF by exact hostname match.
- Keep the tool surface compact: `airflow_list_task_instances` exposes the filters needed for failed-task discovery (`state`, `task_ids`), so new helpers should build on it rather than duplicating the workflow.
- Client capability detection: the Airflow SDK evolves over time, so `list_task_instances` inspects the bound `TaskInstanceApi.get_task_instances` signature to decide which filter kwargs (`state`, `task_ids`) are supported. When bumping SDK versions, update the detection logic and the read-only tests together so we never pass unsupported kwargs.

## Tooling Contracts and Annotations

- Read-only tools: annotated with `{"readOnlyHint": true, "destructiveHint": false, "idempotentHint": true}`.
- Write tools: annotated with `{"destructiveHint": true, "idempotentHint": false}` so MCP clients prompt users.
- URL precedence: when both `instance` and `ui_url` are provided, hosts must match or the call fails (`INSTANCE_MISMATCH`).
- Always include a `request_id` in JSON responses (injected by `observability.OperationLogger`).

## Configuration (env)

- Required for runtime: `AIRFLOW_MCP_INSTANCES_FILE` → path to instance registry YAML.
- Optional: `AIRFLOW_MCP_DEFAULT_INSTANCE`, `AIRFLOW_MCP_HTTP_HOST`, `AIRFLOW_MCP_HTTP_PORT`, `AIRFLOW_MCP_TIMEOUT_SECONDS`, `AIRFLOW_MCP_LOG_FILE`, `AIRFLOW_MCP_HTTP_BLOCK_GET_ON_MCP`.
- Per-instance credentials are referenced within the YAML and resolved from environment (see `examples/instances.yaml`).

## Observability and Logging

- Use `observability.OperationLogger` around tool executions to emit:
  - `tool_start`, `tool_success/tool_error` with `duration_ms`, `response_bytes`, and context (instance, dag_id, etc.).
- `request_id` is included in both logs and JSON payloads for correlation.

## Error Handling

- Business logic raises `AirflowToolError` for user-facing validation errors.
- Wrappers apply `@handle_errors` to log via `OperationLogger` and raise `ToolError` with a compact JSON payload; unexpected exceptions are masked as `INTERNAL_ERROR`.
- Never include credentials in logs or error messages. Registry exposure is redacted to `auth_type`.

## Adding or Changing Tools (high level)

1. Implement logic in `tools.py` (validate → resolve instance/URL → call client → shape JSON → include UI URLs when relevant).
2. Register the tool in `server.py` with appropriate annotations and `@handle_errors`.
3. Add tests in `tests/` (read-only and error paths; write tools behind mocks and/or dry-run).
4. Update `README.md` and docs if the tool contract changes.

## References

- Release plan: `docs/RELEASE_PLAN.md`.
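
## Example: Client Capability Detection (illustrative)

The capability detection described under Core Design Principles can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper names `supported_filter_kwargs` and `call_with_supported_filters`, and the two stand-in API classes, are hypothetical. It assumes detection is done with the standard-library `inspect.signature`, and it conservatively treats a bare `**kwargs` catch-all as not supporting any specific filter.

```python
import inspect
from typing import Any, Callable, Dict, Iterable, Set


def supported_filter_kwargs(
    method: Callable[..., Any],
    candidates: Iterable[str] = ("state", "task_ids"),
) -> Set[str]:
    """Return the candidate kwargs that `method` declares by name."""
    # For a bound method, inspect.signature() already omits `self`.
    params = inspect.signature(method).parameters
    # Assumption of this sketch: only explicitly named parameters count,
    # so a generic **kwargs parameter is not treated as filter support.
    return {name for name in candidates if name in params}


def call_with_supported_filters(
    method: Callable[..., Any],
    required: Dict[str, Any],
    filters: Dict[str, Any],
) -> Any:
    """Call `method`, forwarding only filters it declares and that are set."""
    allowed = supported_filter_kwargs(method, filters.keys())
    kwargs = {k: v for k, v in filters.items() if k in allowed and v is not None}
    return method(**required, **kwargs)


# Hypothetical stand-ins for two SDK versions of TaskInstanceApi.
class OldTaskInstanceApi:
    def get_task_instances(self, dag_id, dag_run_id, limit=100):
        return {"dag_id": dag_id, "dag_run_id": dag_run_id, "limit": limit}


class NewTaskInstanceApi:
    def get_task_instances(
        self, dag_id, dag_run_id, limit=100, state=None, task_ids=None
    ):
        return {
            "dag_id": dag_id,
            "dag_run_id": dag_run_id,
            "limit": limit,
            "state": state,
            "task_ids": task_ids,
        }


if __name__ == "__main__":
    required = {"dag_id": "example_dag", "dag_run_id": "run_1"}
    filters = {"state": ["failed"], "task_ids": None}
    for api in (OldTaskInstanceApi(), NewTaskInstanceApi()):
        method = api.get_task_instances
        print(type(api).__name__, sorted(supported_filter_kwargs(method)))
        print(call_with_supported_filters(method, required, filters))
```

With the older stand-in, the `state` filter is silently dropped rather than raising a `TypeError`. A real implementation may prefer to report that a requested filter is unsupported, which is one reason to update the detection logic and the read-only tests together when bumping SDK versions.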
