# Usage Examples
This document provides practical examples of common workflows and use cases with the Databricks MCP Server.
## Table of Contents
- [Cluster Management](#cluster-management)
- [Notebook Operations](#notebook-operations)
- [Job Orchestration](#job-orchestration)
- [Data Management](#data-management)
- [User & Access Management](#user--access-management)
- [Cost Optimization](#cost-optimization)
- [Advanced Workflows](#advanced-workflows)
## Cluster Management
### Example 1: Create a Development Cluster
**User Query:**
> "Create a small development cluster with autoscaling for my data science work"
**Expected AI Response:**
The AI will use the `create_cluster` tool with these parameters:
- Cluster name: "data-science-dev"
- Spark version: Latest LTS
- Node type: Small instance (e.g., "i3.xlarge")
- Autoscaling: 1-4 workers
- Auto-termination: 30 minutes
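A hedged sketch of what that call might look like. The field names below (`cluster_name`, `spark_version`, `node_type_id`, `autoscale`, `autotermination_minutes`) follow the Databricks Clusters API, and the Spark version string is only an example; the server's actual tool schema may differ:
```json
{
  "cluster_name": "data-science-dev",
  "spark_version": "13.3.x-scala2.12",
  "node_type_id": "i3.xlarge",
  "autoscale": {
    "min_workers": 1,
    "max_workers": 4
  },
  "autotermination_minutes": 30
}
```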
**Result:**
```json
{
"message": "Cluster creation initiated",
"cluster_id": "0123-456789-abc123",
"cluster_name": "data-science-dev"
}
```
### Example 2: Find and Terminate Idle Clusters
**User Query:**
> "Show me all running clusters and terminate any that have been idle for more than 2 hours"
**Workflow:**
1. AI calls `list_clusters`
2. AI calls `get_cluster_details` for each running cluster
3. AI identifies idle clusters based on `last_activity_time`
4. AI calls `terminate_cluster` for idle clusters
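Step 4, expressed as a tool call, could be as simple as the following (assuming `terminate_cluster` accepts a `cluster_id` field; one call per idle cluster):
```json
{
  "cluster_id": "0123-456789-abc123"
}
```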
**Sample Output:**
> "Found 3 running clusters. Cluster 'old-analysis' has been idle for 5 hours and cluster 'test-cluster' for 3 hours. I've terminated both, saving approximately $15/hour. The 'production-etl' cluster is active and will remain running."
### Example 3: Monitor Cluster Health
**User Query:**
> "Check the status of cluster abc123 and show me recent events"
**Workflow:**
1. AI calls `get_cluster_details` with cluster_id
2. AI calls `get_cluster_events` for recent activity
**Response:**
> "Cluster 'ml-training' is currently RUNNING. It was started 3 hours ago and has 8 workers. Recent events show it successfully scaled from 4 to 8 workers 45 minutes ago to handle increased load."
## Notebook Operations
### Example 4: Organize Notebooks
**User Query:**
> "Create a new project folder and set up notebooks for ETL, analysis, and reporting"
**Workflow:**
1. AI calls `create_directory` for `/Users/me/sales-analytics`
2. AI calls `create_notebook` three times for:
- `/Users/me/sales-analytics/01-ETL`
- `/Users/me/sales-analytics/02-Analysis`
- `/Users/me/sales-analytics/03-Reporting`
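One of the `create_notebook` calls might look like this; the `path` and `language` field names mirror the Databricks Workspace API and are assumptions about this server's schema:
```json
{
  "path": "/Users/me/sales-analytics/01-ETL",
  "language": "PYTHON"
}
```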
**Response:**
> "Created project structure:
> - sales-analytics/
> - 01-ETL (Python)
> - 02-Analysis (Python)
> - 03-Reporting (SQL)"
### Example 5: Execute Notebook Pipeline
**User Query:**
> "Run the data-processing notebook with parameters start_date='2024-01-01' and end_date='2024-01-31'"
**Workflow:**
1. AI calls `run_notebook` with:
- notebook_path: "/Users/me/data-processing"
- cluster_id: (existing cluster)
- base_parameters: {"start_date": "2024-01-01", "end_date": "2024-01-31"}
2. AI polls for completion
3. AI returns results
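Step 1, written out as a tool call (field names are taken from the workflow above; the cluster ID is a placeholder):
```json
{
  "notebook_path": "/Users/me/data-processing",
  "cluster_id": "0123-456789-abc123",
  "base_parameters": {
    "start_date": "2024-01-01",
    "end_date": "2024-01-31"
  }
}
```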
**Response:**
> "Notebook executed successfully in 5 minutes 23 seconds. Processed 1.2M records. Output: 'ETL completed, loaded 1,245,678 rows to sales_summary table.'"
### Example 6: Backup Notebooks
**User Query:**
> "Export all my notebooks to a backup folder"
**Workflow:**
1. AI calls `list_notebooks` for user's directory
2. AI calls `read_notebook` for each notebook
3. AI returns the exported notebook content, along with instructions for storing it as a backup
## Job Orchestration
### Example 7: Create Scheduled ETL Job
**User Query:**
> "Create a job that runs my ETL notebook every day at 2 AM and sends me an email on failure"
**Workflow:**
AI calls `create_job` with:
```json
{
"name": "Daily ETL Job",
"cluster_id": "existing-cluster-id",
"notebook_path": "/Users/me/etl-notebook",
"schedule_cron": "0 0 2 * * ?",
"schedule_timezone": "America/Los_Angeles",
"email_on_failure": ["user@company.com"],
"max_retries": 2
}
```
**Response:**
> "Created 'Daily ETL Job' (ID: 123) scheduled to run daily at 2:00 AM Pacific Time. You'll receive email notifications on failure. The job will retry up to 2 times on failure."
### Example 8: Monitor Job Execution
**User Query:**
> "What's the status of my recent job runs?"
**Workflow:**
1. AI calls `list_jobs` to get user's jobs
2. AI calls `list_job_runs` for recent runs
3. AI summarizes status
**Response:**
> "Your jobs overview:
> - Daily ETL Job: Last run succeeded (2 hours ago), duration 15 min
> - Weekly Report: Running now, started 5 minutes ago
> - Data Quality Check: Failed 1 hour ago with error 'Table not found'
>
> Would you like me to investigate the failed job?"
### Example 9: Build Multi-Task Job
**User Query:**
> "Create a job with 3 tasks: extract data, transform it, then load to warehouse. Each task depends on the previous one."
**Workflow:**
AI creates a job with task dependencies:
1. Task 1: Extract (no dependencies)
2. Task 2: Transform (depends on Extract)
3. Task 3: Load (depends on Transform)
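A hedged sketch of the payload. The `tasks`/`task_key`/`depends_on` shape mirrors the Databricks Jobs API and may differ from this server's `create_job` schema; the notebook paths and cluster ID are placeholders:
```json
{
  "name": "ETL Pipeline",
  "cluster_id": "existing-cluster-id",
  "tasks": [
    {
      "task_key": "extract",
      "notebook_path": "/Users/me/pipeline/extract"
    },
    {
      "task_key": "transform",
      "notebook_path": "/Users/me/pipeline/transform",
      "depends_on": [{"task_key": "extract"}]
    },
    {
      "task_key": "load",
      "notebook_path": "/Users/me/pipeline/load",
      "depends_on": [{"task_key": "transform"}]
    }
  ]
}
```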
## Data Management
### Example 10: Explore Unity Catalog
**User Query:**
> "Show me what's in my Unity Catalog"
**Workflow:**
1. AI calls `list_catalogs`
2. For each catalog, AI calls `list_schemas`
3. For key schemas, AI calls `list_tables`
**Response:**
> "Your Unity Catalog contains:
>
> **production** catalog:
> - sales schema: 12 tables (customers, orders, products, ...)
> - marketing schema: 8 tables
> - finance schema: 15 tables
>
> **development** catalog:
> - sandbox schema: 23 tables
> - testing schema: 5 tables
>
> Total: 63 tables across 2 catalogs"
### Example 11: Query Data
**User Query:**
> "Show me the top 10 customers by revenue from the sales.customers table"
**Workflow:**
AI calls `query_table` with:
```json
{
"warehouse_id": "sql-warehouse-id",
"query": "SELECT customer_name, SUM(revenue) as total_revenue FROM production.sales.customers GROUP BY customer_name ORDER BY total_revenue DESC LIMIT 10"
}
```
**Response:**
Returns the formatted query results listing the top 10 customers by revenue.
### Example 12: Table Metadata
**User Query:**
> "What's the schema of the production.sales.orders table?"
**Workflow:**
AI calls `get_table_details` for full metadata
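The call itself only needs the table's full three-level name (assuming a `full_name` field, matching Unity Catalog's catalog.schema.table namespace):
```json
{
  "full_name": "production.sales.orders"
}
```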
**Response:**
> "The production.sales.orders table has:
> - Type: MANAGED (Delta)
> - Columns: order_id (bigint), customer_id (bigint), order_date (date), amount (decimal), status (string)
> - Owner: data-engineering-team
> - Size: 2.5 TB
> - Last modified: 2 hours ago
> - Partitioned by: order_date"
## User & Access Management
### Example 13: Onboard New Team Member
**User Query:**
> "Add jane.smith@company.com to the workspace and add her to the data-analysts group"
**Workflow:**
1. AI calls `create_user` with user details
2. AI calls `list_groups` to find "data-analysts"
3. AI calls `add_user_to_group`
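Step 3 might look like this, assuming `add_user_to_group` takes the user's email and the group name; both field names are illustrative, and the real schema may use IDs instead:
```json
{
  "user_name": "jane.smith@company.com",
  "group_name": "data-analysts"
}
```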
**Response:**
> "Successfully onboarded Jane Smith:
> - Created user account (ID: user-123)
> - Added to 'data-analysts' group
> - She now has access to: development catalog, analyst clusters, shared notebooks"
### Example 14: Audit User Permissions
**User Query:**
> "What permissions does john.doe@company.com have on the production catalog?"
**Workflow:**
1. AI calls `list_users` to find user ID
2. AI calls `get_effective_permissions` for production catalog
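Step 2 might address the catalog the same way the grant/revoke tools do; these field names follow Unity Catalog's securable model and are assumptions about this tool's schema:
```json
{
  "securable_type": "catalog",
  "securable_name": "production",
  "principal": "john.doe@company.com"
}
```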
**Response:**
> "John Doe has these permissions on production catalog:
> - SELECT on all schemas
> - USAGE on catalog
> - Inherited from: data-engineering-team group
>
> He can read all tables but cannot create, modify, or delete objects."
### Example 15: Grant Permissions
**User Query:**
> "Give the analytics-team group SELECT access to the production.sales schema"
**Workflow:**
AI calls `grant_unity_catalog_permissions`:
```json
{
"securable_type": "schema",
"securable_name": "production.sales",
"principal": "analytics-team",
"privileges": ["USAGE", "SELECT"]
}
```
## Cost Optimization
### Example 16: Monthly Cost Analysis
**User Query:**
> "How much did we spend on Databricks last month?"
**Workflow:**
AI calls `get_cluster_costs` with last month's date range
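For a January report, the call might look like this (the date field names are assumptions):
```json
{
  "start_date": "2024-01-01",
  "end_date": "2024-01-31"
}
```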
**Response:**
> "Databricks spending for January 2024:
> - Total Cost: $12,450
> - Total DBUs: 83,000
> - Clusters: 15 active
>
> Top 5 most expensive clusters:
> 1. production-etl: $4,200 (34% of total)
> 2. ml-training: $3,100 (25%)
> 3. analytics-cluster: $1,800 (14%)
> 4. dev-cluster-1: $950 (8%)
> 5. reporting: $720 (6%)"
### Example 17: Spending Trends
**User Query:**
> "Show me our weekly spending trend for the last 3 months"
**Workflow:**
AI calls `analyze_spending_trends` with 90-day range and weekly granularity
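A minimal sketch of that call, assuming the tool takes a lookback window in days and a granularity string:
```json
{
  "days": 90,
  "granularity": "weekly"
}
```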
**Response:**
> "Spending trends (last 12 weeks):
> - Average weekly cost: $2,850
> - Highest week: Week of Jan 15 ($4,200) - ML model training
> - Lowest week: Week of Dec 25 ($1,100) - holiday period
> - Trend: Increasing 8% month-over-month
> - Forecast: If trend continues, expect $13,500 next month"
### Example 18: Cost Optimization Recommendations
**User Query:**
> "How can we reduce our Databricks costs?"
**Workflow:**
AI calls `get_optimization_recommendations`
**Response:**
> "Found 8 cost optimization opportunities:
>
> **HIGH PRIORITY** (Est. savings: $2,400/month)
> 1. Idle Cluster: 'old-ml-cluster' has been running idle for 72 hours
> → Action: Terminate and enable auto-termination
> → Savings: $1,200/month
>
> 2. Oversized Cluster: 'analytics-cluster' configured for 16 workers but avg usage is 4
> → Action: Enable autoscaling with max 8 workers
> → Savings: $800/month
>
> **MEDIUM PRIORITY** (Est. savings: $1,100/month)
> 3. Missing Autoscaling: 5 clusters with fixed worker counts
> → Action: Enable autoscaling
> → Savings: $600/month
>
> 4. Long Auto-termination: 3 clusters set to never terminate
> → Action: Set 60-minute auto-termination
> → Savings: $500/month
>
> **Total Potential Savings: $3,500/month (28% reduction)**"
### Example 19: Cost by Team
**User Query:**
> "Break down costs by team using our 'team' tags"
**Workflow:**
AI calls `analyze_cost_by_tag` with tag_key="team"
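The `tag_key` value comes from the query above; the date-range fields are assumptions added to scope the report to a single month:
```json
{
  "tag_key": "team",
  "start_date": "2024-01-01",
  "end_date": "2024-01-31"
}
```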
**Response:**
> "Cost allocation by team (January 2024):
>
> 1. Data Engineering: $5,200 (42%)
> 2. Data Science: $3,800 (31%)
> 3. Analytics: $2,100 (17%)
> 4. Untagged: $1,350 (10%)
>
> Recommendation: Tag all clusters to improve cost tracking. 10% of spend is currently unallocated."
### Example 20: Cost Forecasting
**User Query:**
> "What will our costs look like next quarter?"
**Workflow:**
AI calls `forecast_costs` with 90-day forecast
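A minimal sketch, assuming the tool takes the forecast horizon in days:
```json
{
  "forecast_days": 90
}
```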
**Response:**
> "90-day cost forecast (Apr-Jun 2024):
>
> Based on the last 30 days:
> - April: $13,200 (±$2,640)
> - May: $13,800 (±$2,760)
> - June: $14,400 (±$2,880)
> - **Total Q2: ~$41,400**
>
> Trend: 5% monthly growth
>
> Confidence: Medium (forecast is based on recent usage patterns)
>
> Notes:
> - Forecast assumes current usage patterns continue
> - Seasonal variations may affect accuracy
> - Consider optimization recommendations to reduce projected costs"
## Advanced Workflows
### Example 21: Automated Data Pipeline Setup
**User Query:**
> "Set up a complete data pipeline: create a cluster, create notebooks for bronze/silver/gold layers, and schedule a daily job"
**Workflow:**
1. AI calls `create_cluster` for ETL cluster
2. AI calls `create_directory` for pipeline folder
3. AI calls `create_notebook` 3 times (bronze, silver, gold)
4. AI calls `create_job` with multi-task workflow
5. AI calls `set_permissions` to grant team access
**Response:**
> "Complete data pipeline created:
>
> **Infrastructure:**
> - ETL Cluster: 'pipeline-cluster' (2-8 workers, auto-scaling)
> - Notebooks: /pipelines/medallion/bronze, silver, gold
>
> **Job:** 'Daily Medallion Pipeline'
> - Task 1: Bronze layer (raw data ingestion)
> - Task 2: Silver layer (cleaning & transformation)
> - Task 3: Gold layer (business aggregations)
> - Schedule: Daily at 1:00 AM
> - Notifications: Sent to data-eng-team@company.com
>
> **Permissions:**
> - data-engineering: CAN_MANAGE
> - data-analysts: CAN_VIEW
>
> Pipeline is ready to run!"
### Example 22: Compliance Audit
**User Query:**
> "Audit who has access to our sensitive PII table and ensure only authorized users have access"
**Workflow:**
1. AI calls `get_table_details` for PII table
2. AI calls `get_permissions` for table
3. AI calls `get_effective_permissions` for each principal
4. AI compares against authorized list
5. AI calls `revoke_unity_catalog_permissions` for unauthorized users
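Step 5 can reuse the same fields as the grant call in Example 15; the table name here is a hypothetical placeholder for the PII table:
```json
{
  "securable_type": "table",
  "securable_name": "production.sales.customers_pii",
  "principal": "analytics-team",
  "privileges": ["SELECT"]
}
```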
**Response:**
> "PII Table Access Audit:
>
> **Authorized Access:**
> - compliance-team: SELECT (correct)
> - data-governance: ALL_PRIVILEGES (correct)
>
> **Issues Found:**
> - analytics-team: Had SELECT access (REMOVED)
> - john.doe@company.com: Had MODIFY access (REMOVED)
>
> **Actions Taken:**
> - Revoked 2 unauthorized permissions
> - PII table now compliant with access policy
> - Audit log entry created"
### Example 23: Disaster Recovery Test
**User Query:**
> "Back up all critical job configurations so we can restore them if needed"
**Workflow:**
1. AI calls `list_jobs`
2. For each job, AI calls `get_job_details`
3. AI formats as JSON backup
4. AI provides restore instructions
**Response:**
> "Backed up 15 job configurations:
> - Production jobs: 8
> - Development jobs: 7
>
> Backup includes:
> - Job settings and schedules
> - Task configurations
> - Cluster specifications
> - Notification settings
>
> To restore a job:
> 1. Use `create_job` tool with the backed-up configuration
> 2. Update cluster IDs if clusters were recreated
> 3. Verify schedules and notifications"
### Example 24: Resource Right-Sizing
**User Query:**
> "Analyze my cluster usage and recommend optimal configurations"
**Workflow:**
1. AI calls `list_clusters`
2. AI calls `get_cluster_events` for each cluster
3. AI analyzes uptime, idle time, and patterns
4. AI calls `get_optimization_recommendations`
**Response:**
> "Cluster right-sizing analysis:
>
> **analytics-cluster:**
> - Current: 16 fixed workers
> - Usage pattern: Peaks at 8 workers during business hours, idle nights/weekends
> - Recommendation: Autoscale 2-8 workers + auto-termination
> - Impact: 60% cost reduction ($1,200/month savings)
>
> **ml-training:**
> - Current: 32 fixed workers, runs 24/7
> - Usage pattern: Training jobs 3 times per week, 4 hours each
> - Recommendation: Use job clusters instead of always-on cluster
> - Impact: 90% cost reduction ($3,600/month savings)
>
> **data-engineering:**
> - Current: 4-16 workers autoscaling
> - Usage pattern: Consistent usage, well-configured
> - Recommendation: No changes needed
> - Impact: Already optimized
>
> Total potential savings: $4,800/month"
### Example 25: Security Hardening
**User Query:**
> "Review and harden security across my workspace"
**Workflow:**
1. AI calls `list_users` and `list_groups`
2. AI calls `list_clusters` and checks configurations
3. AI calls `get_permissions` for sensitive resources
4. AI provides security recommendations
**Response:**
> "Security audit completed:
>
> **User Access:**
> - 45 active users, 8 groups
> - 3 users with admin privileges (review recommended)
> - 2 inactive users with active tokens (remove recommended)
>
> **Cluster Security:**
> - 12 clusters without encryption enabled (enable recommended)
> - 5 clusters allowing unrestricted network access (restrict recommended)
>
> **Data Access:**
> - 3 tables with public access (review required)
> - Unity Catalog enabled (good)
> - Row/column filtering not used (consider for PII)
>
> **Recommendations:**
> 1. Enable encryption on all clusters
> 2. Implement network isolation
> 3. Review admin privileges
> 4. Remove inactive user tokens
> 5. Implement row-level security on PII tables
>
> Would you like me to implement these security improvements?"
## Tips for Effective Use
### Be Specific
Instead of: "Show me clusters"
Better: "Show me all running clusters and their costs for the last 7 days"
### Provide Context
Instead of: "Create a job"
Better: "Create a daily job that runs my ETL notebook at 2 AM and emails me if it fails"
### Chain Operations
You can ask for complex workflows:
"Create a development environment: set up a cluster, create project folders, and grant access to my team"
### Ask for Analysis
Don't just get data, ask for insights:
"Analyze my costs and tell me where I can save money"
### Follow Up
The AI maintains context, so you can refine:
1. "Show me my clusters"
2. "Which ones are idle?"
3. "Terminate the idle ones"
## Common Patterns
### Pattern 1: Create → Configure → Test
```
1. Create a cluster
2. Configure with specific settings
3. Test by running a notebook
```
### Pattern 2: List → Filter → Act
```
1. List all resources
2. Filter by criteria (idle, expensive, etc.)
3. Take action (terminate, modify, etc.)
```
### Pattern 3: Monitor → Analyze → Optimize
```
1. Get current metrics
2. Analyze trends and patterns
3. Apply optimizations
```
### Pattern 4: Backup → Modify → Verify
```
1. Get current configuration
2. Make changes
3. Verify changes took effect
```
These examples demonstrate the power and flexibility of the Databricks MCP Server. The AI can chain multiple tools together to accomplish complex workflows automatically!