
Treasure Data MCP Server

by knishioka

td_list_projects

List workflow projects to discover data pipelines, scheduled jobs, and processing workflows. Browse project names and IDs for navigation or detailed exploration.

Instructions

List workflow projects to find data pipelines and scheduled jobs.

Shows all workflow projects containing Digdag workflows, SQL queries, and
Python scripts. Returns names/IDs for navigation or verbose=True for details.

Common scenarios:
- Discover available data processing workflows
- Find specific project by browsing names
- Get project IDs for detailed exploration
- Audit workflow projects in the account
- List user projects (exclude system with include_system=False)

Projects contain .dig files defining scheduled data pipelines.
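To make the scenarios above concrete, here are hypothetical argument sets an MCP client might pass to this tool (parameter names match the input schema below; the values themselves are illustrative, not from the source):

```python
# Hypothetical argument dicts for td_list_projects calls (values are illustrative).
discover_all = {"all_results": True}                   # browse every user project
audit_page = {"limit": 50, "offset": 50,
              "include_system": True}                  # second page, including system projects
detailed = {"verbose": True, "limit": 10}              # full metadata for a few projects

# All parameters are optional; an empty dict uses the defaults
# (verbose=False, offset=0, all_results=False, include_system=False).
default_call = {}
```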

Input Schema

Name            Required  Description  Default
verbose         No        (none)       False
limit           No        (none)       DEFAULT_LIMIT
offset          No        (none)       0
all_results     No        (none)       False
include_system  No        (none)       False

Implementation Reference

  • The core handler function for the 'td_list_projects' MCP tool. It creates a TreasureDataClient instance with workflow support, fetches projects via client.get_projects() with pagination parameters, filters out system projects by default (those with 'sys' metadata), and returns either project names/IDs (default) or full details (verbose=True). Handles API errors gracefully.
    @mcp.tool()
    async def td_list_projects(
        verbose: bool = False,
        limit: int = DEFAULT_LIMIT,
        offset: int = 0,
        all_results: bool = False,
        include_system: bool = False,
    ) -> dict[str, Any]:
        """List workflow projects to find data pipelines and scheduled jobs.
    
        Shows all workflow projects containing Digdag workflows, SQL queries, and
        Python scripts. Returns names/IDs for navigation or verbose=True for details.
    
        Common scenarios:
        - Discover available data processing workflows
        - Find specific project by browsing names
        - Get project IDs for detailed exploration
        - Audit workflow projects in the account
        - List user projects (exclude system with include_system=False)
    
        Projects contain .dig files defining scheduled data pipelines.
        """
        client = _create_client(include_workflow=True)
        if isinstance(client, dict):
            return client
    
        try:
            projects = client.get_projects(
                limit=limit, offset=offset, all_results=all_results
            )
    
            # Filter out system projects (those with "sys" metadata)
            if not include_system:
                projects = [
                    p for p in projects if not any(meta.key == "sys" for meta in p.metadata)
                ]
    
            if verbose:
                # Return full project details
                return {"projects": [project.model_dump() for project in projects]}
            else:
                # Return only project names and ids
                return {
                    "projects": [
                        {"id": project.id, "name": project.name} for project in projects
                    ]
                }
        except (ValueError, requests.RequestException) as e:
            return _format_error_response(f"Failed to retrieve projects: {str(e)}")
        except Exception as e:
            return _format_error_response(
                f"Unexpected error while retrieving projects: {str(e)}"
            )
  • Pydantic BaseModel defining the structure of a Treasure Data workflow Project, used for input/output validation and serialization in the td_list_projects tool's response (via project.model_dump()). Includes fields like id, name, revision, timestamps, archive info, and metadata.
    class Project(BaseModel):
        """
        Model representing a Treasure Data workflow project.
    
        In Treasure Data, a workflow project is a container for workflow definitions,
        which typically include SQL queries and Digdag files (.dig) that define
        the workflow execution steps and dependencies. These workflows are used
        for data processing, analytics pipelines, and scheduled jobs.
        """
    
        id: str
        name: str
        revision: str
        created_at: str = Field(..., alias="createdAt")
        updated_at: str = Field(..., alias="updatedAt")
        deleted_at: str | None = Field(None, alias="deletedAt")
        archive_type: str = Field(..., alias="archiveType")
        archive_md5: str = Field(..., alias="archiveMd5")
        metadata: list[Metadata] = []
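The `alias` mapping above lets Pydantic accept the API's camelCase keys (`createdAt`, `archiveType`) while exposing snake_case attributes. A minimal stdlib-only sketch of the same key conversion, with a hypothetical sample payload:

```python
import re

def camel_to_snake(name: str) -> str:
    """Convert an API-style camelCase key to snake_case."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()

# Hypothetical API payload fragment with camelCase keys.
api_project = {"id": "1", "name": "demo", "createdAt": "2024-01-01", "archiveType": "db"}
normalized = {camel_to_snake(k): v for k, v in api_project.items()}
# normalized keys: id, name, created_at, archive_type
```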
  • Supporting method in TreasureDataClient that performs the actual API request to fetch workflow projects from Treasure Data's workflow API, handles pagination (mapping limit/offset to 'count' param, client-side slicing), parses JSON response into Project Pydantic models, and returns the list used by the tool handler.
    def get_projects(
        self,
        limit: int = 30,
        offset: int = 0,
        all_results: bool = False,
    ) -> list[Project]:
        """
        Retrieve a list of workflow projects with pagination support.
    
        Workflow projects in Treasure Data contain workflow definitions used for
        data processing and analytics. Each project typically includes SQL queries
        and Digdag (.dig) files that define workflow execution steps and dependencies.
        These workflows are executed on the Treasure Data platform for scheduled
        data pipelines, ETL processes, and other automation tasks.
    
        Note: The API uses 'count' parameter for limiting results, but this method
        provides limit/offset interface for consistency with other methods.
    
        Args:
            limit: Maximum number of projects to retrieve (defaults to 30)
            offset: Index to start retrieving from (defaults to 0)
            all_results: If True, retrieves all projects ignoring limit and offset
    
        Returns:
            A list of Project objects representing workflow projects
    
        Raises:
            requests.HTTPError: If the API returns an error response
        """
        # The projects API uses 'count' parameter, not limit/offset
        # Request more data if offset is specified
        # Increased to 200 to cover all projects (currently ~135)
        count = 200 if all_results else min(offset + limit, 200)
    
        params = {"count": count}
        response = self._make_request(
            "GET", "projects", base_url=self.workflow_base_url, params=params
        )
        all_projects = [Project(**project) for project in response.get("projects", [])]
    
        if all_results:
            return all_projects
        else:
            # Apply offset and limit on the client side
            end_index = min(offset + limit, len(all_projects))
            return all_projects[offset:end_index]
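The limit/offset-to-count mapping and client-side slicing in get_projects can be sketched in isolation. The 200 cap mirrors the code above; the project list stands in for the API response:

```python
def paginate(all_projects, limit=30, offset=0, all_results=False, cap=200):
    """Mirror get_projects' pagination: one 'count' request, then client-side slicing."""
    count = cap if all_results else min(offset + limit, cap)
    fetched = all_projects[:count]  # stands in for the API call with params={"count": count}
    if all_results:
        return fetched
    return fetched[offset : min(offset + limit, len(fetched))]

projects = [f"proj_{i}" for i in range(135)]
page = paginate(projects, limit=30, offset=30)  # second page: proj_30 .. proj_59
```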
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses that the tool returns 'names/IDs for navigation or verbose=True for details', indicating output behavior, and mentions project contents ('.dig files'). However, it lacks details on permissions, rate limits, pagination (beyond parameters), or error handling, which are important for a list operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded with the core purpose. The 'Common scenarios' section adds value without redundancy, though it could be slightly more streamlined by integrating scenarios into the main flow. Every sentence contributes to understanding the tool's use.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 5 parameters with 0% schema coverage and no output schema or annotations, the description is incomplete. It covers the tool's purpose and some parameter semantics but lacks details on return format, error cases, or full parameter explanations. For a list tool with multiple parameters, this leaves gaps in operational context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It explains 'verbose=True for details' and 'exclude system with include_system=False', adding meaning to two parameters. It does not cover 'limit', 'offset', or 'all_results', but the context of listing and discovery implies their use for pagination, partially compensating for the low coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('List') and resource ('workflow projects'), specifying they contain 'Digdag workflows, SQL queries, and Python scripts'. It distinguishes from siblings like td_get_project (detailed view) and td_find_project (search-based), making the scope explicit for discovery vs. retrieval.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It provides clear context for when to use this tool ('Discover available data processing workflows', 'Find specific project by browsing names', 'Get project IDs for detailed exploration'), and hints at exclusions ('exclude system with include_system=False'). However, it does not explicitly name alternatives like td_find_project for targeted searches, leaving some ambiguity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
