
list_datasets

Retrieve all datasets from the Airflow v1 API to monitor data dependencies and pipeline connections in your workflow system.

Instructions

[Tool Role]: Lists all datasets in the Airflow system (v1 API only - v2 uses Assets).

Input Schema

Name         Required  Description                                        Default
limit        No        Maximum number of datasets to return               20
offset       No        Number of records to skip (pagination)             0
uri_pattern  No        Substring filter applied to dataset URIs           (none)

(Descriptions and defaults are taken from the handler signature below.)

Output Schema

No output fields are documented; the handler returns the raw JSON response from the Airflow API.
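Since the output schema is empty, it may help to see the shape of what the handler returns. The sketch below shows an illustrative Airflow v1 `GET /datasets` response; field names follow the Airflow 2.x stable REST API's DatasetCollection, and the values are made up:

```python
# Illustrative payload; the handler returns this JSON as-is via resp.json().
sample_response = {
    "datasets": [
        {
            "id": 1,
            "uri": "s3://bucket/raw/orders",  # dataset URI, filterable via uri_pattern
            "extra": {},
            "created_at": "2024-01-01T00:00:00+00:00",
            "updated_at": "2024-01-02T00:00:00+00:00",
        }
    ],
    # Total matching datasets, independent of limit/offset pagination.
    "total_entries": 1,
}

uris = [d["uri"] for d in sample_response["datasets"]]
print(uris)  # → ['s3://bucket/raw/orders']
```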

Implementation Reference

  • The main handler function for the 'list_datasets' tool. It is decorated with @mcp.tool(), which handles registration. The function lists datasets via Airflow API v1, supports pagination and URI filtering, and returns a v2 compatibility note when the Dataset API is unavailable.
    # Assumes module-level context: `mcp` (the MCP server instance),
    # `airflow_request` (async HTTP helper), and these typing imports:
    from typing import Any, Dict, Optional

    @mcp.tool()
    async def list_datasets(limit: int = 20, offset: int = 0, uri_pattern: Optional[str] = None) -> Dict[str, Any]:
        """[Tool Role]: Lists all datasets in the Airflow system (v1 API only - v2 uses Assets)."""
        from ..functions import get_api_version

        api_version = get_api_version()
        if api_version == "v2":
            return {
                "error": "Dataset API is not available in Airflow 3.x (API v2)",
                "available_in": "v1 only",
                "v2_alternative": "Use list_assets() for Airflow 3.x data-aware scheduling"
            }

        # Build the query string; `limit` is always present, so `params`
        # is never empty. Note: `uri_pattern` is not URL-encoded here.
        params = [f"limit={limit}"]
        if offset > 0:
            params.append(f"offset={offset}")
        if uri_pattern:
            params.append(f"uri_pattern={uri_pattern}")

        endpoint = f"/datasets?{'&'.join(params)}"

        resp = await airflow_request("GET", endpoint)
        resp.raise_for_status()  # surface HTTP errors (e.g. 401/403) to the caller
        return resp.json()
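The query-string construction above can be exercised in isolation. The sketch below reimplements it as a standalone helper (hypothetical, not part of the server), using urlencode so that uri_pattern values containing special characters are escaped:

```python
from typing import Optional
from urllib.parse import urlencode

def build_datasets_endpoint(limit: int = 20, offset: int = 0,
                            uri_pattern: Optional[str] = None) -> str:
    """Mirror of the handler's query-string logic; urlencode additionally
    escapes uri_pattern, which the handler interpolates verbatim."""
    params = {"limit": limit}
    if offset > 0:
        params["offset"] = offset
    if uri_pattern:
        params["uri_pattern"] = uri_pattern
    return f"/datasets?{urlencode(params)}"

print(build_datasets_endpoint())
# → /datasets?limit=20
print(build_datasets_endpoint(offset=40, uri_pattern="s3://"))
# → /datasets?limit=20&offset=40&uri_pattern=s3%3A%2F%2F
```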
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. It mentions the API version constraint, which is useful context, but doesn't describe pagination behavior (implied by limit/offset params), authentication needs, rate limits, error conditions, or what 'lists all datasets' entails operationally. For a list tool with 3 parameters and no annotation coverage, this leaves significant behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose and adds the API version constraint as crucial context. Every word earns its place with zero redundancy or wasted space.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (list operation with 3 parameters), no annotations, but presence of an output schema, the description provides basic purpose and API version context. However, it lacks parameter explanations, behavioral details, and sibling differentiation that would make it complete. The output schema existence reduces the need to describe return values, but other gaps remain significant.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the schema provides no parameter documentation. The description mentions no parameters at all, failing to explain what 'limit', 'offset', or 'uri_pattern' do or how they affect the listing. While 0 parameters would warrant a baseline of 4, this tool has 3 undocumented parameters that the description completely ignores.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('Lists') and resource ('all datasets in the Airflow system'), providing specific functionality. It distinguishes from siblings by specifying 'v1 API only - v2 uses Assets', which helps differentiate from potential v2 dataset tools. However, it doesn't explicitly contrast with other dataset-related siblings like 'get_dataset' or 'list_dataset_events'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context through 'v1 API only - v2 uses Assets', suggesting this tool is for v1 systems. However, it doesn't provide explicit guidance on when to use this vs. alternatives like 'get_dataset' (for single dataset) or 'list_dataset_events', nor does it mention prerequisites or exclusions beyond the API version note.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

