Databricks MCP Server

by samhavens

list_jobs

Retrieve Databricks job listings with pagination and filtering options to manage and monitor scheduled workflows efficiently.

Instructions

List Databricks jobs with pagination and filtering.

Args:
    limit: Number of jobs to return (default: 25, keeps response under token limits)
    offset: Starting position for pagination (default: 0, use pagination_info.next_offset for next page)
    created_by: Filter by creator email (e.g. 'user@company.com'), case-insensitive, optional
    include_run_status: Include latest run status and duration (default: true, set false for faster response)

Returns:
    JSON with jobs array and pagination_info. Each job includes latest_run with state, duration_minutes, etc.
    Use pagination_info.next_offset for next page. Total jobs shown in pagination_info.total_jobs.
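
A sketch of consuming this return value; the payload below is illustrative only, shaped to match the description above (all field values are made up):

```python
import json

# Illustrative payload matching the documented shape (not real tool output)
raw = json.dumps({
    "jobs": [
        {"job_id": 101,
         "settings": {"name": "nightly_etl"},
         "latest_run": {"state": "TERMINATED",
                        "result_state": "SUCCESS",
                        "duration_minutes": 12}},
    ],
    "pagination_info": {"total_jobs": 60, "returned": 1, "limit": 25,
                        "offset": 0, "has_more": True, "next_offset": 25},
})

result = json.loads(raw)
for job in result["jobs"]:
    print(job["job_id"], job["latest_run"].get("result_state"))

# Pass pagination_info.next_offset as the next call's offset (None when done)
next_offset = result["pagination_info"]["next_offset"]
```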

Input Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| limit | No | | |
| offset | No | | |
| created_by | No | | |
| include_run_status | No | | |

Implementation Reference

  • The primary MCP tool handler for 'list_jobs'. Registers the tool via @mcp.tool(), implements client-side pagination, creator filtering, and enriches each job with latest run status by calling jobs.list_runs(job_id, limit=1). Returns formatted JSON.
    @mcp.tool()
    async def list_jobs(
        limit: int = 25, 
        offset: int = 0, 
        created_by: Optional[str] = None,
        include_run_status: bool = True
    ) -> str:
        """List Databricks jobs with pagination and filtering.
        
        Args:
            limit: Number of jobs to return (default: 25, keeps response under token limits)
            offset: Starting position for pagination (default: 0, use pagination_info.next_offset for next page)
            created_by: Filter by creator email (e.g. 'user@company.com'), case-insensitive, optional
            include_run_status: Include latest run status and duration (default: true, set false for faster response)
        
        Returns:
            JSON with jobs array and pagination_info. Each job includes latest_run with state, duration_minutes, etc.
            Use pagination_info.next_offset for next page. Total jobs shown in pagination_info.total_jobs.
        """
        logger.info(f"Listing jobs (limit={limit}, offset={offset}, created_by={created_by})")
        try:
            # Fetch all jobs from API
            result = await jobs.list_jobs()
            
            if "jobs" in result:
                all_jobs = result["jobs"]
                
                # Filter by creator if specified
                if created_by:
                    all_jobs = [job for job in all_jobs 
                               if job.get("creator_user_name", "").lower() == created_by.lower()]
                
                total_jobs = len(all_jobs)
                
                # Apply client-side pagination
                start_idx = offset
                end_idx = offset + limit
                paginated_jobs = all_jobs[start_idx:end_idx]
                
                # Enhance jobs with run status if requested
                if include_run_status and paginated_jobs:
                    enhanced_jobs = []
                    for job in paginated_jobs:
                        enhanced_job = job.copy()
                        
                        # Get most recent run for this job
                        try:
                            runs_result = await jobs.list_runs(job_id=job["job_id"], limit=1)
                            if "runs" in runs_result and runs_result["runs"]:
                                latest_run = runs_result["runs"][0]
                                
                                # Add run status info
                                enhanced_job["latest_run"] = {
                                    "run_id": latest_run.get("run_id"),
                                    "state": latest_run.get("state", {}).get("life_cycle_state"),
                                    "result_state": latest_run.get("state", {}).get("result_state"),
                                    "start_time": latest_run.get("start_time"),
                                    "end_time": latest_run.get("end_time"),
                                }
                                
                                # Calculate duration if both times available
                                start_time = latest_run.get("start_time")
                                end_time = latest_run.get("end_time")
                                if start_time and end_time:
                                    duration_ms = end_time - start_time
                                    enhanced_job["latest_run"]["duration_seconds"] = duration_ms // 1000
                                    enhanced_job["latest_run"]["duration_minutes"] = duration_ms // 60000
                            else:
                                enhanced_job["latest_run"] = {"status": "no_runs"}
                                
                        except Exception as e:
                            enhanced_job["latest_run"] = {"error": f"Failed to get run info: {str(e)}"}
                        
                        enhanced_jobs.append(enhanced_job)
                    
                    paginated_jobs = enhanced_jobs
                
                # Create paginated response
                paginated_result = {
                    "jobs": paginated_jobs,
                    "pagination_info": {
                        "total_jobs": total_jobs,
                        "returned": len(paginated_jobs),
                        "limit": limit,
                        "offset": offset,
                        "has_more": end_idx < total_jobs,
                        "next_offset": end_idx if end_idx < total_jobs else None,
                        "filtered_by": {"created_by": created_by} if created_by else None
                    }
                }
                
                return json.dumps(paginated_result)
            else:
                return json.dumps(result)
                
        except Exception as e:
            logger.error(f"Error listing jobs: {str(e)}")
            return json.dumps({"error": str(e)})
  • Low-level helper function that makes the actual Databricks API call to /api/2.0/jobs/list. Called by the MCP handler.
    async def list_jobs(limit: Optional[int] = None, page_token: Optional[str] = None) -> Dict[str, Any]:
        """
        List jobs with optional pagination.
        
        Args:
            limit: Maximum number of jobs to return (1-100, default: 20)
            page_token: Token for pagination (from previous response's next_page_token)
        
        Returns:
            Response containing a list of jobs and optional next_page_token
            
        Raises:
            DatabricksAPIError: If the API request fails
        """
        params = {}
        if limit is not None:
            # Databricks API limits: 1-100 for jobs list
            if limit < 1:
                limit = 1
            elif limit > 100:
                limit = 100
            params["limit"] = limit
        if page_token is not None:
            params["page_token"] = page_token
            
        logger.info(f"Listing jobs (limit={limit}, page_token={'***' if page_token else None})")
        return make_api_request("GET", "/api/2.0/jobs/list", params=params if params else None)
  • Supporting helper for listing job runs (/api/2.0/jobs/runs/list), used by the handler to fetch latest run status for each job.
    async def list_runs(job_id: Optional[int] = None, limit: Optional[int] = None) -> Dict[str, Any]:
        """
        List job runs, optionally filtered by job_id.
        
        Args:
            job_id: ID of the job to list runs for (optional)
            limit: Maximum number of runs to return (optional)
            
        Returns:
            Response containing a list of job runs
            
        Raises:
            DatabricksAPIError: If the API request fails
        """
        params = {}
        if job_id is not None:
            params["job_id"] = job_id
        if limit is not None:
            params["limit"] = limit
            
        logger.info(f"Listing runs (job_id={job_id}, limit={limit})")
        return make_api_request("GET", "/api/2.0/jobs/runs/list", params=params if params else None) 
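
Because the handler paginates client-side via `offset`, a caller can walk every page by following `pagination_info.next_offset`. The sketch below simulates the tool with a hypothetical `call_list_jobs` stub serving a fixed 60-job catalog; a real invocation would go through an MCP client instead:

```python
import json

def call_list_jobs(limit=25, offset=0):
    """Hypothetical stand-in for invoking the list_jobs tool.

    Returns a JSON string shaped like the documented response, backed by
    a fixed in-memory catalog of 60 jobs.
    """
    all_jobs = [{"job_id": i} for i in range(60)]
    page = all_jobs[offset:offset + limit]
    end = offset + limit
    return json.dumps({
        "jobs": page,
        "pagination_info": {
            "total_jobs": len(all_jobs),
            "returned": len(page),
            "limit": limit,
            "offset": offset,
            "has_more": end < len(all_jobs),
            "next_offset": end if end < len(all_jobs) else None,
        },
    })

# Walk every page by following pagination_info.next_offset
offset, collected = 0, []
while offset is not None:
    result = json.loads(call_list_jobs(limit=25, offset=offset))
    collected.extend(result["jobs"])
    offset = result["pagination_info"]["next_offset"]

print(len(collected))  # 60
```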
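
The per-job enrichment step in the handler boils down to reading the first entry of a `jobs/runs/list`-style response and deriving durations from epoch-millisecond timestamps. A minimal standalone sketch with made-up sample data:

```python
# Sample runs/list-style payload (illustrative values; Databricks
# start_time/end_time are epoch milliseconds)
runs_result = {"runs": [{
    "run_id": 7,
    "state": {"life_cycle_state": "TERMINATED", "result_state": "SUCCESS"},
    "start_time": 1_700_000_000_000,
    "end_time": 1_700_000_330_000,
}]}

# Take the most recent run and summarize it, mirroring the handler's logic
latest = runs_result["runs"][0]
duration_ms = latest["end_time"] - latest["start_time"]
summary = {
    "run_id": latest["run_id"],
    "state": latest["state"]["life_cycle_state"],
    "result_state": latest["state"]["result_state"],
    "duration_seconds": duration_ms // 1000,
    "duration_minutes": duration_ms // 60000,
}
print(summary)
```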
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and does well by disclosing key behavioral traits: it mentions pagination mechanics, token limit considerations ('keeps response under token limits'), performance trade-offs ('set false for faster response'), and case-insensitive filtering. It also details the return structure, including pagination info and job data with latest run details. However, it lacks information on rate limits, authentication requirements, or error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and well-structured: it starts with a clear purpose statement, then details arguments with explanations, and concludes with return value information. Every sentence adds value—no fluff or repetition. The use of sections (Args, Returns) enhances readability without unnecessary verbosity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (4 parameters, no annotations, no output schema), the description is largely complete: it covers purpose, parameters, return format, and pagination behavior. However, it lacks context on authentication, error cases, or rate limits, which are important for a listing tool in a cloud service like Databricks. The absence of an output schema is mitigated by the detailed return description.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 0%, so the description must compensate fully, which it does excellently. It adds meaningful semantics for all 4 parameters: explains 'limit' default and token limit rationale, describes 'offset' usage with pagination guidance, specifies 'created_by' format and case-insensitivity, and clarifies 'include_run_status' impact on performance. This goes well beyond the basic schema to provide practical usage context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'List Databricks jobs with pagination and filtering.' It specifies the verb ('List') and resource ('Databricks jobs'), and distinguishes it from siblings like 'list_job_runs' by focusing on jobs rather than runs. However, it doesn't explicitly contrast with other listing tools like 'list_clusters' or 'list_notebooks' beyond the resource type.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for listing jobs with optional filtering and pagination, but doesn't explicitly state when to use this tool versus alternatives. For example, it doesn't clarify whether this should be preferred over 'list_job_runs' for job metadata, or when filtering by creator is appropriate. The guidance is limited to functional parameters rather than contextual decision-making.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
