# get_datasets
Retrieve and list datasets from Apache Airflow deployments with filtering options for DAGs, URI patterns, and pagination controls.
## Instructions
List datasets
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of datasets to return | |
| offset | No | Number of datasets to skip before collecting results (pagination) | |
| order_by | No | Field to order results by; prefix with `-` for descending order | |
| uri_pattern | No | Return only datasets whose URI matches this pattern | |
| dag_ids | No | Comma-separated list of DAG IDs; return only datasets referenced by these DAGs | |
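All parameters are optional: only those the caller supplies are forwarded to the Airflow API, so server-side defaults apply to the rest. A minimal sketch of that filtering (`build_query` is a hypothetical helper, not part of the source):

```python
from typing import Any, Dict, Optional

def build_query(
    limit: Optional[int] = None,
    offset: Optional[int] = None,
    order_by: Optional[str] = None,
    uri_pattern: Optional[str] = None,
    dag_ids: Optional[str] = None,
) -> Dict[str, Any]:
    # Forward only the parameters the caller actually set, so the
    # Airflow API falls back to its own defaults for the rest.
    supplied = {
        "limit": limit,
        "offset": offset,
        "order_by": order_by,
        "uri_pattern": uri_pattern,
        "dag_ids": dag_ids,
    }
    return {k: v for k, v in supplied.items() if v is not None}

# e.g. first 10 datasets referenced by two DAGs, newest first
query = build_query(limit=10, order_by="-id", dag_ids="etl_daily,etl_hourly")
print(query)
```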
## Implementation Reference
- **src/airflow/dataset.py:42-64 (handler)** — the main handler function for the `get_datasets` tool. It accepts optional parameters for filtering datasets, builds a kwargs dict from those that were supplied, calls the underlying Airflow `DatasetApi.get_datasets`, and returns the response as a `TextContent` object.

  ```python
  async def get_datasets(
      limit: Optional[int] = None,
      offset: Optional[int] = None,
      order_by: Optional[str] = None,
      uri_pattern: Optional[str] = None,
      dag_ids: Optional[str] = None,
  ) -> List[Union[types.TextContent, types.ImageContent, types.EmbeddedResource]]:
      # Build parameters dictionary
      kwargs: Dict[str, Any] = {}
      if limit is not None:
          kwargs["limit"] = limit
      if offset is not None:
          kwargs["offset"] = offset
      if order_by is not None:
          kwargs["order_by"] = order_by
      if uri_pattern is not None:
          kwargs["uri_pattern"] = uri_pattern
      if dag_ids is not None:
          kwargs["dag_ids"] = dag_ids

      response = dataset_api.get_datasets(**kwargs)
      return [types.TextContent(type="text", text=str(response.to_dict()))]
  ```
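The handler's contract (forward only supplied filters, return the API response as text) can be exercised against a stand-in API object. `FakeDatasetApi` and the local `TextContent` below are hypothetical stand-ins, not the real Airflow client or `mcp.types` classes:

```python
import asyncio
from typing import Any, Dict, List, Optional

class TextContent:
    # Stand-in for mcp.types.TextContent (hypothetical).
    def __init__(self, type: str, text: str):
        self.type, self.text = type, text

class FakeResponse:
    def to_dict(self) -> Dict[str, Any]:
        return {"datasets": [], "total_entries": 0}

class FakeDatasetApi:
    # Stand-in for the Airflow client's DatasetApi (hypothetical).
    def get_datasets(self, **kwargs: Any) -> FakeResponse:
        self.last_kwargs = kwargs  # record what the handler forwarded
        return FakeResponse()

dataset_api = FakeDatasetApi()

async def get_datasets(
    limit: Optional[int] = None, uri_pattern: Optional[str] = None
) -> List[TextContent]:
    # Same kwargs-filtering pattern as the real handler, trimmed to
    # two parameters for brevity.
    kwargs: Dict[str, Any] = {}
    if limit is not None:
        kwargs["limit"] = limit
    if uri_pattern is not None:
        kwargs["uri_pattern"] = uri_pattern
    response = dataset_api.get_datasets(**kwargs)
    return [TextContent(type="text", text=str(response.to_dict()))]

result = asyncio.run(get_datasets(limit=5))
print(dataset_api.last_kwargs)  # only the supplied filter is forwarded
print(result[0].text)
```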
- **src/airflow/dataset.py:11-39 (registration)** — `get_all_functions()` in dataset.py defines and returns the list of dataset-related tools for registration, including the tuple for `get_datasets`.

  ```python
  def get_all_functions() -> list[tuple[Callable, str, str, bool]]:
      """Return list of (function, name, description, is_read_only) tuples for registration."""
      return [
          (get_datasets, "get_datasets", "List datasets", True),
          (get_dataset, "get_dataset", "Get a dataset by URI", True),
          (get_dataset_events, "get_dataset_events", "Get dataset events", True),
          (create_dataset_event, "create_dataset_event", "Create dataset event", False),
          (get_dag_dataset_queued_event, "get_dag_dataset_queued_event", "Get a queued Dataset event for a DAG", True),
          (get_dag_dataset_queued_events, "get_dag_dataset_queued_events", "Get queued Dataset events for a DAG", True),
          (
              delete_dag_dataset_queued_event,
              "delete_dag_dataset_queued_event",
              "Delete a queued Dataset event for a DAG",
              False,
          ),
          (
              delete_dag_dataset_queued_events,
              "delete_dag_dataset_queued_events",
              "Delete queued Dataset events for a DAG",
              False,
          ),
          (get_dataset_queued_events, "get_dataset_queued_events", "Get queued Dataset events for a Dataset", True),
          (
              delete_dataset_queued_events,
              "delete_dataset_queued_events",
              "Delete queued Dataset events for a Dataset",
              False,
          ),
      ]
  ```
- **src/main.py:78-92 (registration)** — the main registration loop in main.py imports `get_all_functions` from dataset.py (via `get_dataset_functions`) and calls `app.add_tool` for each tool, including `get_datasets`, making it available in the MCP server.

  ```python
  for api in apis:
      logging.debug(f"Adding API: {api}")
      get_function = APITYPE_TO_FUNCTIONS[APIType(api)]
      try:
          functions = get_function()
      except NotImplementedError:
          continue

      # Filter functions for read-only mode if requested
      if read_only:
          functions = filter_functions_for_read_only(functions)

      for func, name, description, *_ in functions:
          app.add_tool(func, name=name, description=description)
  ```
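`filter_functions_for_read_only` is referenced above but not shown. Given the `(function, name, description, is_read_only)` tuple shape from `get_all_functions()`, a plausible sketch (an assumption, not the actual implementation) keeps only tools whose flag is `True`:

```python
from typing import Callable, List, Tuple

# (function, name, description, is_read_only)
ToolSpec = Tuple[Callable, str, str, bool]

def filter_functions_for_read_only(functions: List[ToolSpec]) -> List[ToolSpec]:
    # Keep only tools flagged as read-only; mutating tools are dropped.
    return [spec for spec in functions if spec[3]]

def noop() -> None:
    pass

tools: List[ToolSpec] = [
    (noop, "get_datasets", "List datasets", True),
    (noop, "create_dataset_event", "Create dataset event", False),
]
print([name for _, name, _, _ in filter_functions_for_read_only(tools)])
```

In read-only mode this leaves `get_datasets` registered while excluding `create_dataset_event` and the delete tools.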