Dataproc MCP Server

by warrenzhu25

list_batch_jobs

Retrieve and display Dataproc batch jobs from Google Cloud by specifying project ID and region. Supports pagination for managing large job lists.

Instructions

List Dataproc batch jobs.

Args:
    project_id: Google Cloud project ID
    region: Dataproc region
    page_size: Number of results per page

Input Schema

Name        Required  Description                 Default
project_id  Yes       Google Cloud project ID
region      Yes       Dataproc region
page_size   No        Number of results per page  100
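
As a rough illustration of how a client might supply these arguments, the sketch below builds an MCP `tools/call` request for this tool. The helper name `build_list_batch_jobs_call` and the example project and region values are hypothetical; the `page_size` default of 100 mirrors the handler signature quoted on this page.

```python
from typing import Any


def build_list_batch_jobs_call(
    project_id: str, region: str, page_size: int = 100, request_id: int = 1
) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request for list_batch_jobs.

    Hypothetical helper for illustration; it is not part of the server.
    """
    # project_id and region are the two required inputs in the schema.
    if not project_id or not region:
        raise ValueError("project_id and region are required")
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "list_batch_jobs",
            "arguments": {
                "project_id": project_id,
                "region": region,
                "page_size": page_size,
            },
        },
    }


# Placeholder values; substitute a real project ID and Dataproc region.
request = build_list_batch_jobs_call("my-project", "us-central1")
```

How the request is transported (stdio, HTTP, or another channel) depends on the MCP client in use and is not shown here.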

Implementation Reference

  • MCP tool handler for list_batch_jobs. Decorated with @mcp.tool(), it accepts the input parameters, instantiates DataprocBatchClient, calls its list_batch_jobs method, and returns either the stringified result or an error message.
    @mcp.tool()
    async def list_batch_jobs(project_id: str, region: str, page_size: int = 100) -> str:
        """List Dataproc batch jobs.
    
        Args:
            project_id: Google Cloud project ID
            region: Dataproc region
            page_size: Number of results per page
        """
        batch_client = DataprocBatchClient()
        try:
            result = await batch_client.list_batch_jobs(project_id, region, page_size)
            return str(result)
        except Exception as e:
            logger.error("Failed to list batch jobs", error=str(e))
            return f"Error: {str(e)}"
  • Core implementation of list_batch_jobs in DataprocBatchClient. It uses the Google Cloud Dataproc BatchControllerClient to list batches, extracts batch_id, state, create_time, job_type, and operation from each batch, and returns a structured dict.
    async def list_batch_jobs(
        self, project_id: str, region: str, page_size: int = 100
    ) -> dict[str, Any]:
        """List batch jobs."""
        try:
            loop = asyncio.get_running_loop()
            client = self._get_batch_client(region)
    
            request = types.ListBatchesRequest(
                parent=f"projects/{project_id}/locations/{region}", page_size=page_size
            )
    
            response = await loop.run_in_executor(None, client.list_batches, request)
    
            batches = []
            for batch in response:
                batches.append(
                    {
                        "batch_id": batch.name.split("/")[-1],
                        "state": batch.state.name,
                        "create_time": batch.create_time.isoformat()
                        if batch.create_time
                        else None,
                        "job_type": self._get_batch_job_type(batch),
                        "operation": batch.operation if batch.operation else None,
                    }
                )
    
            return {
                "batches": batches,
                "total_count": len(batches),
                "project_id": project_id,
                "region": region,
            }
    
        except Exception as e:
            logger.error("Failed to list batch jobs", error=str(e))
            raise
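
The handler above returns `str(result)`, which is the Python repr of the dict rather than JSON. A caller that wants structured data back could parse it roughly as follows. This is a minimal sketch: `parse_tool_output` and the sample values are hypothetical, and it assumes every value in the dict is a plain literal (strings, integers, None), which holds for the fields shown only if `_get_batch_job_type` returns a string.

```python
import ast
from typing import Any

ERROR_PREFIX = "Error: "


def parse_tool_output(text: str) -> dict[str, Any]:
    """Turn the handler's string output back into a dict.

    Raises RuntimeError if the handler reported a failure.
    """
    if text.startswith(ERROR_PREFIX):
        raise RuntimeError(text[len(ERROR_PREFIX):])
    # literal_eval accepts only Python literals, so it does not execute code.
    result = ast.literal_eval(text)
    if not isinstance(result, dict):
        raise ValueError("expected a dict literal")
    return result


# Hand-written sample shaped like the dict built by the client method.
# The batch ID, timestamp, and job_type values are placeholders.
sample = {
    "batches": [
        {
            "batch_id": "example-batch",
            "state": "SUCCEEDED",
            "create_time": "2024-01-01T00:00:00+00:00",
            "job_type": "pyspark",
            "operation": None,
        }
    ],
    "total_count": 1,
    "project_id": "my-project",
    "region": "us-central1",
}

parsed = parse_tool_output(str(sample))
states = [batch["state"] for batch in parsed["batches"]]
```

Checking for the `Error: ` prefix first matters because the handler reports failures as an ordinary string instead of raising.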

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/warrenzhu25/dataproc-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.
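
The curl request above can also be issued from Python using only the standard library. In this sketch the function names are hypothetical, and the `owner/name` URL pattern is an assumption generalized from the single URL shown; the response body is returned as text because its format is not described here.

```python
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory API URL for one server (assumed owner/name pattern)."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server_info(owner: str, name: str, timeout: float = 10.0) -> str:
    """Send a GET request for the server record and return the body as text."""
    with urllib.request.urlopen(server_url(owner, name), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Calling `fetch_server_info("warrenzhu25", "dataproc-mcp")` requests the same URL as the curl command.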