# Databricks MCP Server - Streamlit Demo
This is a demo Streamlit application that showcases the Databricks MCP Server integration with Claude AI.
## Features
- 💬 **Interactive Chat Interface**: Natural language interaction with your Databricks workspace
- 🛠️ **Tool Integration**: Real-time execution of Databricks operations
- 📊 **Visual Feedback**: See what tools are being called and their results
- 💡 **Example Queries**: Pre-built example queries to get started quickly
- 🎨 **Beautiful UI**: Clean, modern interface built with Streamlit
## Prerequisites
- Python 3.8 or higher
- Node.js 18 or higher (for MCP server)
- Anthropic API key
- Databricks workspace and access token
## Installation
### 1. Install the Databricks MCP Server
From the parent directory:
```bash
npm install
npm run build
```
### 2. Install Python Dependencies
```bash
cd demo
pip install -r requirements.txt
```
### 3. Configure Environment Variables
Create a `.env` file in the demo directory:
```env
# Anthropic API Key
ANTHROPIC_API_KEY=your-anthropic-api-key
# Databricks Configuration
DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
DATABRICKS_TOKEN=your-databricks-token
```
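In `app.py`, these values can be validated at startup. The `load_settings` helper below is a hypothetical sketch (the real app may use python-dotenv to populate `os.environ` from the `.env` file first); the variable names match the `.env` file above:

```python
import os

def load_settings(env=os.environ):
    """Collect the required settings, raising a clear error if any are missing."""
    required = ["ANTHROPIC_API_KEY", "DATABRICKS_HOST", "DATABRICKS_TOKEN"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in required}
```

Failing fast here produces a clearer message than letting the first API call fail deep inside a chat turn.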
## Running the Demo
### Start the Streamlit App
```bash
streamlit run app.py
```
The app will open in your browser at `http://localhost:8501`
## Usage
### Example Queries
Try these example queries:
**Cluster Management:**
- "List all running clusters"
- "Create a new cluster with 4 workers"
- "Show me idle clusters and terminate them"
**Cost Analysis:**
- "What were my costs last month?"
- "Show spending trends for the last quarter"
- "Give me cost optimization recommendations"
**Notebook Operations:**
- "List all notebooks in my workspace"
- "Create a new Python notebook for data analysis"
- "Run the ETL notebook with yesterday's date"
**Job Management:**
- "Show all scheduled jobs"
- "What's the status of job 12345?"
- "Create a daily job to run my pipeline"
**Data Operations:**
- "List all catalogs in Unity Catalog"
- "Show me tables in the sales schema"
- "Query the top 10 customers by revenue"
### Understanding the Interface
**Sidebar:**
- **Configuration**: Manage API keys and settings
- **Example Queries**: Click to try pre-built queries
- **Settings**: Adjust AI temperature and token limits
**Main Chat:**
- Type your questions in natural language
- See the AI's responses with structured data
- Expand "Tool Calls" to see what operations were performed
**Tool Calls:**
When you expand the tool calls section, you'll see:
- Which Databricks APIs were called
- What parameters were used
- The raw results returned
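The expander contents can be produced by a small formatting helper. `format_tool_call` is a hypothetical name, not necessarily what `app.py` uses:

```python
import json

def format_tool_call(name, params, result):
    """Render one tool invocation (name, input parameters, raw result)
    as the text shown in the expander."""
    return "\n".join([
        f"Tool: {name}",
        f"Parameters: {json.dumps(params, indent=2, sort_keys=True)}",
        f"Result: {json.dumps(result, indent=2)}",
    ])
```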
## Architecture
```
┌─────────────────┐
│  Streamlit UI   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│  Anthropic API  │  (Claude AI)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   MCP Server    │  (Model Context Protocol)
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Databricks API  │
└─────────────────┘
```
1. **User** enters a query in Streamlit
2. **Streamlit** sends the query to Claude via Anthropic API
3. **Claude** uses the MCP server to execute Databricks operations
4. **MCP Server** calls Databricks APIs
5. **Results** flow back through the stack to the UI
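The loop behind steps 2–5 can be sketched independently of any network client. Here `call_model` and `execute_tool` are hypothetical stand-ins for the Anthropic API and the MCP server; the real app uses the Anthropic SDK's tool-use message format rather than this simplified shape:

```python
def chat_turn(user_query, call_model, execute_tool, max_rounds=5):
    """Drive one user turn: ask the model, execute any tools it requests,
    feed the results back, and return the final text answer."""
    messages = [{"role": "user", "content": user_query}]
    for _ in range(max_rounds):
        reply = call_model(messages)  # -> {"text": ..., "tool_calls": [...]}
        messages.append({"role": "assistant", "content": reply["text"]})
        if not reply.get("tool_calls"):
            return reply["text"], messages  # no more tools: we have an answer
        for call in reply["tool_calls"]:
            result = execute_tool(call["name"], call["params"])  # MCP -> Databricks
            messages.append({"role": "tool", "content": str(result)})
    return messages[-1]["content"], messages
```

The `max_rounds` cap keeps a confused model from looping on tool calls forever.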
## Features in Detail
### Natural Language Processing
The AI understands context and can:
- Interpret vague queries and ask for clarification
- Chain multiple operations together
- Provide insights and recommendations
- Format results in human-readable ways
### Tool Visibility
The app shows:
- Which tools are being called
- Input parameters for each tool
- Raw API responses
- Processing steps
### Error Handling
The app gracefully handles:
- API errors
- Invalid queries
- Missing permissions
- Network issues
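Transient failures (network blips, rate limits) are worth retrying before surfacing an error to the chat. A hypothetical wrapper sketching that policy:

```python
import time

def with_retries(fn, attempts=3, delay=0.1,
                 retriable=(ConnectionError, TimeoutError)):
    """Retry transient failures with linear backoff; let other
    errors (bad input, missing permissions) surface immediately."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except retriable:
            if attempt == attempts:
                raise
            time.sleep(delay * attempt)
```

Permission and validation errors are deliberately not retried: repeating them wastes time and hides the real fix from the user.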
## Customization
### Adding Custom Queries
Edit the `examples` list in `app.py`:
```python
examples = [
    "🔍 Your custom query here",
# ... more examples
]
```
### Styling
Modify the CSS in the `st.markdown()` section at the top of `app.py`:
```python
st.markdown("""
<style>
.your-custom-class {
/* your styles */
}
</style>
""", unsafe_allow_html=True)
```
### Changing AI Parameters
Adjust in the sidebar:
- **Temperature**: Controls randomness (0 = deterministic, 1 = creative)
- **Max Tokens**: Maximum response length
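The slider values map directly onto the request parameters sent to the model. A hypothetical helper that clamps them to the ranges described above before each API call:

```python
def clamp_params(temperature, max_tokens):
    """Keep slider values inside safe ranges:
    temperature in [0, 1], max_tokens at least 1."""
    return {
        "temperature": min(max(float(temperature), 0.0), 1.0),
        "max_tokens": max(1, int(max_tokens)),
    }
```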
## Troubleshooting
### "API Key not configured"
Make sure your `.env` file contains:
```env
ANTHROPIC_API_KEY=your-key-here
```
### "Databricks token not configured"
Add to `.env`:
```env
DATABRICKS_HOST=https://your-workspace.cloud.databricks.com
DATABRICKS_TOKEN=your-token
```
### "MCP Server not responding"
1. Make sure the MCP server is installed:
```bash
npm install
npm run build
```
2. Check that environment variables are set correctly
3. Verify network connectivity to Databricks
### Import Errors
Make sure all dependencies are installed:
```bash
pip install -r requirements.txt
```
## Advanced Usage
### Running with Docker
Create a `Dockerfile` in the repository root, so both the MCP server and the demo are in the build context (the original layout has the demo in a subdirectory and the Node.js server at the root):
```dockerfile
FROM python:3.11-slim

# Install Node.js for the MCP server
RUN apt-get update && apt-get install -y --no-install-recommends nodejs npm \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app

# Install and build the MCP server (package files live at the repo root)
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build

# Install the demo's Python dependencies
RUN pip install --no-cache-dir -r demo/requirements.txt

WORKDIR /app/demo
EXPOSE 8501
CMD ["streamlit", "run", "app.py"]
```
Build and run from the repository root:
```bash
docker build -t databricks-chat .
docker run -p 8501:8501 --env-file demo/.env databricks-chat
```
### Deployment
To deploy to Streamlit Cloud:
1. Push your code to GitHub
2. Go to [share.streamlit.io](https://share.streamlit.io)
3. Connect your repository
4. Add secrets in the Streamlit Cloud dashboard:
- `ANTHROPIC_API_KEY`
- `DATABRICKS_HOST`
- `DATABRICKS_TOKEN`
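Locally the app reads these names from the environment; on Streamlit Cloud the same names appear in `st.secrets`, which behaves like a dictionary. A small hypothetical lookup that works in both places (in `app.py` you would pass `st.secrets` and `os.environ`):

```python
from typing import Optional

def get_secret(name: str, secrets: dict, env: dict) -> Optional[str]:
    """Prefer Streamlit Cloud secrets, then fall back to
    environment variables; return None if the name is set nowhere."""
    return secrets.get(name) or env.get(name) or None
```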
## Security Notes
- Never commit `.env` files to version control
- Use environment variables for all secrets
- Rotate API tokens regularly
- Use read-only tokens when possible
- Implement rate limiting for production use
## Contributing
Contributions are welcome! Please:
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Submit a pull request
## License
MIT License - see LICENSE file for details
## Support
For issues or questions:
- GitHub Issues: [Report an issue](https://github.com/yourusername/databricks-mcp-server/issues)
- Documentation: [Main README](../README.md)
## Additional Resources
- [Streamlit Documentation](https://docs.streamlit.io)
- [Anthropic API Docs](https://docs.anthropic.com)
- [Model Context Protocol](https://modelcontextprotocol.io)
- [Databricks API Reference](https://docs.databricks.com/api)
## What's Next?
Try these advanced scenarios:
1. **Build a Dashboard**: Create visualizations of cluster costs
2. **Automate Workflows**: Set up scheduled operations
3. **Create Alerts**: Monitor costs and send notifications
4. **Integrate with Slack**: Send Databricks info to Slack
5. **Generate Reports**: Create automated reports on usage
Have fun exploring your Databricks workspace with AI! 🚀