README.md•24.2 kB
# DP-MCP Server
A comprehensive Data Platform MCP (Model Context Protocol) server that provides seamless integration between PostgreSQL databases and MinIO object storage. Built with FastMCP 2.0 for modern AI applications.
## 🚀 Features
### PostgreSQL Operations
- **Query Execution**: Execute SQL queries with results formatting
- **Schema Introspection**: List tables, describe table structures
- **Data Export**: Export table data to CSV format
- **Connection Management**: Robust connection handling with SSL support
### MinIO Object Storage
- **Bucket Management**: Create, list, and manage buckets
- **Object Operations**: Upload, download, list, and delete objects
- **Flexible Configuration**: Support for secure and insecure connections
### Advanced Capabilities
- **Database Backup**: Automated PostgreSQL table backup to MinIO
- **Combined Operations**: Seamless data pipeline between DB and storage
- **FastMCP Integration**: Modern MCP server with HTTP transport
- **Docker Support**: Complete development environment
### 🤖 AI-Powered Features (NEW!)
- **Natural Language Queries**: Ask questions in plain English, get SQL results
- **Intelligent Query Analysis**: AI-powered insights and pattern detection
- **Data Privacy Protection**: Multiple security levels for sensitive data
- **Multi-Model Support**: Claude, GPT, local Ollama models, or mock models
- **Secure by Design**: Data never leaves your environment with local models
## 📋 Table of Contents
- [Quick Start](#quick-start)
- [Installation](#installation)
- [Configuration](#configuration)
- [Available Tools](#available-tools)
- [AI Features](#ai-features)
- [Usage Examples](#usage-examples)
- [CLI Tool](#cli-tool)
- [MCP Client](#mcp-client)
- [API Reference](#api-reference)
- [Development](#development)
- [Deployment](#deployment)
- [Troubleshooting](#troubleshooting)
- [Contributing](#contributing)
## 🏃 Quick Start
### 1. Setup the Project
```bash
# Clone and setup (if not already done)
git clone <repository-url>
cd da.dp-mcp
chmod +x setup.sh
./setup.sh
```
The setup script will:
- Install `uv` package manager if needed
- Install all Python dependencies
- Start Docker services automatically
- Test your configuration
### 2. Configure Environment
```bash
# Copy and edit environment configuration
cp .env.sample .env
# Edit .env with your database and MinIO credentials if needed
```
**Note**: The default configuration works with the included Docker services.
### 3. Start Services
```bash
# Start PostgreSQL and MinIO with Docker
docker-compose down
docker-compose up -d
# Verify services are healthy
docker-compose ps
```
You should see both services as "healthy":
```
NAME STATUS
dp-mcp-postgres Up (healthy)
dp-mcp-minio Up (healthy)
```
### 4. Launch the MCP Server
```bash
# Start with debug logging (recommended for first run)
uv run python src/dp_mcp/server.py --debug
# Or for quiet mode
uv run python src/dp_mcp/server.py
```
**Successful startup looks like:**
```
INFO:__main__:✓ PostgreSQL connection successful
INFO:__main__:✓ MinIO connection successful
INFO:__main__:Starting DP-MCP Server on port 8888
╭─ FastMCP 2.0 ──────────────────────────────────────────────────────────────╮
│ 🖥️ Server name: dp-mcp │
│ 📦 Transport: Streamable-HTTP │
│ 🔗 Server URL: http://127.0.0.1:8888/mcp/ │
╰────────────────────────────────────────────────────────────────────────────╯
```
### 5. Verify Installation
**✅ Server Status**: The server will be available at: `http://127.0.0.1:8888/mcp/`
**✅ Test Connection**:
```bash
# Quick health check (will show "Not Acceptable" - this is expected)
curl -v http://localhost:8888/mcp/
```
**✅ Access Development Services**:
- **MinIO Console**: http://localhost:9001 (minioadmin/minioadmin123)
- **PostgreSQL**: localhost:5432 (dp_mcp_user/dp_mcp_password)
### 🎯 Alternative Running Methods
**Run in Background**:
```bash
# Start server in background
uv run python src/dp_mcp/server.py > server.log 2>&1 &
# Check if running
ps aux | grep dp_mcp
# Stop background server
pkill -f "python src/dp_mcp/server.py"
```
**Custom Host/Port**:
```bash
# Run on different host/port
uv run python src/dp_mcp/server.py --host 0.0.0.0 --port 9000 --debug
```
**Using Entry Point** (if installed):
```bash
# After running setup.sh, you can also use:
dp-mcp --help
dp-mcp --debug
```
### 🔧 Troubleshooting Quick Start
**Services Won't Start**:
```bash
# Check Docker status
docker --version
docker-compose --version
# Restart services
docker-compose restart
# Check service logs
docker-compose logs postgres
docker-compose logs minio
```
**Server Won't Start**:
```bash
# Verify configuration
uv run python src/dp_mcp/utils/config.py
# Check dependencies
uv sync
# Ensure services are running first
docker-compose ps
```
**Connection Issues**:
```bash
# Test database connection
psql -h localhost -U dp_mcp_user -d dp_mcp_dev
# Test MinIO connection
curl http://localhost:9000/minio/health/live
```
### 🎉 You're Ready!
Your DP-MCP server is now running and ready to:
- ✅ Execute PostgreSQL queries and manage database schemas
- ✅ Upload, download, and manage MinIO objects and buckets
- ✅ Perform automated database backups to object storage
- ✅ Serve AI agents through the MCP protocol with 17+ specialized tools
- ✅ **NEW**: Answer natural language questions about your data with AI
- ✅ **NEW**: Generate intelligent insights and analysis reports
### 🤖 Test AI Features
To test the new AI capabilities:
```bash
# Test AI integration (works without API keys)
python test_ai_integration.py
# Start server with AI demo mode
uv run python src/dp_mcp/server.py demo --debug
# Test AI status via MCP client
uv run python mcp_client.py --call get_ai_system_status
```
The server runs until you stop it with `Ctrl+C`. For production deployment, see the [Deployment Guide](docs/DEPLOYMENT_GUIDE.md).
## 📦 Installation
### Prerequisites
- Python 3.10 or higher
- Docker and Docker Compose
- uv package manager (installed automatically by setup.sh)
### Manual Installation
```bash
# Install uv if not available
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install dependencies
uv sync
# Optional: Install AI dependencies for cloud models
uv sync --extra ai
# Install in editable mode
uv pip install -e .
```
### Docker Services
The included `docker-compose.yml` provides:
- PostgreSQL 15 with sample data
- MinIO object storage with web console
- Automatic health checks and initialization
## ⚙️ Configuration
### Environment Variables
Create a `.env` file with the following configuration:
```bash
# PostgreSQL Configuration
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_USER=dp_mcp_user
POSTGRES_PASSWORD=dp_mcp_password
POSTGRES_DATABASE=dp_mcp_dev
POSTGRES_SSL_MODE=prefer
# MinIO Configuration
MINIO_ENDPOINT=localhost:9000
MINIO_ACCESS_KEY=minioadmin
MINIO_SECRET_KEY=minioadmin123
MINIO_SECURE=false
MINIO_DEFAULT_BUCKET=default-bucket
# MCP Server Configuration
MCP_SERVER_HOST=127.0.0.1
MCP_SERVER_PORT=8888
DEBUG=false
```
### Configuration Validation
```bash
# Test configuration
uv run python src/dp_mcp/utils/config.py
```
## 🛠️ Available Tools
The server provides 17+ MCP tools for database, object storage, and AI-powered operations:
### PostgreSQL Tools
| Tool | Description | Parameters |
|------|-------------|------------|
| `execute_sql_query` | Execute SQL queries with formatting | `query` (string), `limit` (int, default: 1000) |
| `list_db_tables` | List all tables in a schema | `schema` (string, default: "public") |
| `describe_db_table` | Get table structure details | `table_name` (string), `schema` (string, default: "public") |
| `export_table_csv` | Export table data as CSV | `table_name` (string), `limit` (int, default: 10000), `where_clause` (string, optional) |
### MinIO Object Storage Tools
| Tool | Description | Parameters |
|------|-------------|------------|
| `list_minio_buckets` | List all available buckets | None |
| `list_bucket_objects` | List objects in a bucket | `bucket_name` (string), `prefix` (string, optional), `max_keys` (int, default: 1000) |
| `upload_to_minio` | Upload data to object store | `bucket_name` (string), `object_name` (string), `data` (string), `content_type` (string, default: "text/plain") |
| `download_from_minio` | Download object from storage | `bucket_name` (string), `object_name` (string) |
| `create_minio_bucket` | Create a new bucket | `bucket_name` (string), `region` (string, optional) |
| `delete_minio_object` | Delete an object | `bucket_name` (string), `object_name` (string) |
### Combined Operations
| Tool | Description | Parameters |
|------|-------------|------------|
| `backup_table_to_minio` | Backup PostgreSQL table to MinIO | `table_name` (string), `bucket_name` (string, default: "backups"), `schema` (string, default: "public"), `limit` (int, default: 10000), `where_clause` (string, optional) |
### 🤖 AI-Enhanced Tools
| Tool | Description | Parameters |
|------|-------------|------------|
| `ask_natural_language_query` | Convert natural language to SQL and execute with AI analysis | `question` (string), `schema` (string, default: "public"), `model_name` (string, optional) |
| `explain_query_with_ai` | Execute SQL and get AI-powered explanation | `sql_query` (string), `limit` (int, default: 100), `model_name` (string, optional) |
| `get_ai_data_insights` | Generate AI suggestions for database analysis | `schema` (string, default: "public"), `model_name` (string, optional) |
| `analyze_table_patterns` | AI analysis of data patterns and quality | `table_name` (string), `schema` (string, default: "public"), `sample_size` (int, default: 1000), `model_name` (string, optional) |
| `generate_ai_data_report` | Comprehensive AI-powered data reports | `title` (string), `tables` (string, comma-separated), `schema` (string, default: "public"), `model_name` (string, optional) |
| `get_ai_system_status` | Get AI system status and configuration | None |
## 🤖 AI Features
### Secure AI Integration
The DP-MCP server now includes comprehensive AI capabilities designed with security and privacy as top priorities:
**🔒 Privacy Levels**:
- **None**: No filtering (public data only)
- **Basic**: Remove obvious PII (emails, phones)
- **Moderate**: Mask patterns, limit data size
- **Strict**: Schema-only mode, heavy sanitization
- **Paranoid**: No actual data sent to AI
**🤖 Supported Models**:
- **Claude (Anthropic)**: claude-3-sonnet, claude-3-haiku
- **OpenAI**: gpt-4, gpt-3.5-turbo
- **Local Models**: Ollama (llama2, codellama, mistral)
- **Mock Models**: Safe testing without API calls
**🛡️ Security Features**:
- Data sanitization and pattern masking
- SQL injection detection
- Audit logging for all AI interactions
- Local-only processing with Ollama
- Environment-based configurations
### Quick AI Setup
**🎯 Smart Auto-Configuration** - No command-line parameters needed! The server automatically detects and uses whatever AI models you have available.
1. **Demo Mode** (Works immediately):
```bash
# Just start the server - it works out of the box with mock AI
uv run python src/dp_mcp/server.py --debug
```
2. **Local Models** (Recommended for security):
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
ollama serve
# Install models (choose based on your needs):
ollama pull phi3 # 2.2 GB - Fast, efficient
ollama pull mistral # 4.1 GB - Balanced performance
# Start server - automatically detects and uses Ollama models
uv run python src/dp_mcp/server.py --debug
```
3. **Cloud Models** (Optional - add API keys for premium models):
```bash
# Configure your API keys in .env.ai:
# Replace CLAUDE_API_KEY=sk-ant-api03-XXXX... with your real key
# Replace OPENAI_API_KEY=sk-XXXX... with your real key
# Start server - automatically detects and uses all available models
uv run python src/dp_mcp/server.py --debug
```
**✨ Benefits of Auto-Configuration:**
- **No complex parameters** - Just start with `uv run python src/dp_mcp/server.py`
- **Smart detection** - Uses Ollama models, cloud APIs, or demo mode automatically
- **User-friendly** - Configure once in `.env.ai`, works everywhere
- **Flexible** - Mix local and cloud models as needed
### AI Usage Examples
```python
# Natural language query
result = await client.call_tool("ask_natural_language_query", {
"question": "How many users signed up last month?"
})
# AI query explanation
result = await client.call_tool("explain_query_with_ai", {
"sql_query": "SELECT COUNT(*) FROM users WHERE created_at >= '2024-01-01'"
})
# Get AI insights
result = await client.call_tool("get_ai_data_insights", {
"schema": "public"
})
# Analyze table patterns
result = await client.call_tool("analyze_table_patterns", {
"table_name": "users",
"sample_size": 1000
})
# Generate comprehensive report
result = await client.call_tool("generate_ai_data_report", {
"title": "Monthly User Analysis",
"tables": "users,user_activity,subscriptions"
})
```
## 💡 Usage Examples
### Basic Database Operations
```python
# Connect to the MCP server and execute queries
import mcp
# List all tables
result = await client.call_tool("list_db_tables", {"schema": "public"})
# Describe a specific table
result = await client.call_tool("describe_db_table", {
"table_name": "users",
"schema": "public"
})
# Execute a custom query
result = await client.call_tool("execute_sql_query", {
"query": "SELECT * FROM users WHERE is_active = true",
"limit": 50
})
# Export table data
result = await client.call_tool("export_table_csv", {
"table_name": "orders",
"where_clause": "order_date >= '2025-01-01'",
"limit": 1000
})
```
### Object Storage Operations
```python
# List all buckets
result = await client.call_tool("list_minio_buckets", {})
# Create a new bucket
result = await client.call_tool("create_minio_bucket", {
"bucket_name": "my-data-lake",
"region": "us-east-1"
})
# Upload data
result = await client.call_tool("upload_to_minio", {
"bucket_name": "my-data-lake",
"object_name": "data/sample.json",
"data": '{"key": "value"}',
"content_type": "application/json"
})
# List objects in a bucket
result = await client.call_tool("list_bucket_objects", {
"bucket_name": "my-data-lake",
"prefix": "data/",
"max_keys": 100
})
# Download an object
result = await client.call_tool("download_from_minio", {
"bucket_name": "my-data-lake",
"object_name": "data/sample.json"
})
```
### Advanced: Database Backup to Object Storage
```python
# Backup entire table to MinIO
result = await client.call_tool("backup_table_to_minio", {
"table_name": "users",
"bucket_name": "backups",
"schema": "public",
"limit": 50000
})
# Backup with filtering
result = await client.call_tool("backup_table_to_minio", {
"table_name": "orders",
"bucket_name": "data-lake",
"where_clause": "created_at >= '2025-01-01'",
"limit": 100000
})
```
## 🖥️ CLI Tool
The DP-MCP CLI tool provides direct command-line access to all PostgreSQL and MinIO operations without requiring the MCP protocol overhead.
### Quick CLI Usage
```bash
# List all database tables
./dp-cli list-tables
# Describe a table structure
./dp-cli describe-table users
# Execute SQL queries
./dp-cli query "SELECT * FROM users LIMIT 5"
# Export table to CSV
./dp-cli export-csv users --limit 100 --output users.csv
# List MinIO buckets
./dp-cli list-buckets
# Upload data to MinIO
./dp-cli upload default-bucket hello.txt "Hello, World!"
# Download from MinIO
./dp-cli download default-bucket hello.txt --output downloaded.txt
# Backup database table to MinIO
./dp-cli backup-table users --bucket backups --limit 1000
```
### CLI Features
✅ **11 Commands**: Direct access to all server tools
✅ **Colored Output**: Beautiful terminal interface
✅ **File Operations**: Upload/download from files
✅ **Batch Processing**: Perfect for scripting
✅ **Error Handling**: Comprehensive error messages
**📖 Full CLI Documentation**: [CLI Reference Guide](docs/CLI_REFERENCE.md)
## 🔌 MCP Client
The DP-MCP client enables MCP protocol communication for integration with AI agents.
> **Note**: The MCP client is currently in development for full FastMCP compatibility. For immediate testing and usage, we recommend using the [CLI Tool](#cli-tool) which provides direct access to all functionality.
### Basic MCP Client Usage
```bash
# Test server connectivity
uv run python mcp_client.py --ping
# For full functionality, use the CLI tool instead:
./dp-cli list-tables
./dp-cli describe-table users
```
### AI Agent Integration
For AI agents and applications that need MCP protocol access, the server is available at:
- **Server URL**: `http://127.0.0.1:8888/mcp/`
- **Protocol**: JSON-RPC 2.0 over HTTP with Server-Sent Events
- **Transport**: FastMCP Streamable HTTP
### MCP Client Status
🚧 **In Development**: Full MCP protocol client
✅ **Session Management**: Basic connectivity established
✅ **Server Discovery**: Endpoint detection working
⏳ **Tool Calling**: FastMCP protocol compatibility in progress
**Recommended**: Use the [CLI Tool](#cli-tool) for immediate access to all 11 tools and full functionality.
**📖 Full MCP Client Documentation**: [MCP Client Guide](docs/MCP_CLIENT_GUIDE.md)
## 📚 API Reference
### Server Endpoints
The FastMCP server provides the following endpoints:
- **Base URL**: `http://127.0.0.1:8888/mcp/`
- **Transport**: Streamable HTTP (Server-Sent Events)
- **Protocol**: MCP (Model Context Protocol) v1.0
### Connection Information
```bash
# Server status and available tools
GET http://127.0.0.1:8888/mcp/
# Tool execution (via MCP client)
POST http://127.0.0.1:8888/mcp/
```
### Error Handling
All tools implement comprehensive error handling:
- **Database Errors**: Connection failures, SQL syntax errors, permission issues
- **Storage Errors**: Bucket not found, object not found, access denied
- **Validation Errors**: Invalid parameters, missing required fields
- **System Errors**: Network issues, timeout errors
### Response Format
All responses follow the MCP protocol format:
```json
{
"jsonrpc": "2.0",
"id": "request-id",
"result": {
"content": [
{
"type": "text",
"text": "Tool execution result"
}
]
}
}
```
## 🔧 Development
### Project Structure
```
da.dp-mcp/
├── src/dp_mcp/
│ ├── __init__.py
│ ├── server.py # Main FastMCP server
│ ├── tools/
│ │ ├── __init__.py
│ │ ├── postgres_tools.py # PostgreSQL operations
│ │ └── minio_tools.py # MinIO operations
│ └── utils/
│ ├── __init__.py
│ └── config.py # Configuration management
├── tests/
│ ├── __init__.py
│ └── test_server.py # Test suite
├── docker-compose.yml # Development services
├── init-db.sql # Sample database schema
├── setup.sh # Project setup script
├── pyproject.toml # Python project configuration
├── .env.sample # Environment template
└── README.md # This file
```
### Running Tests
```bash
# Install development dependencies
uv sync --dev
# Run tests
uv run pytest
# Run with coverage
uv run pytest --cov=src/dp_mcp
# Run specific test
uv run pytest tests/test_server.py::test_postgres_connection
```
### Code Quality
```bash
# Format code
uv run black src/ tests/
# Sort imports
uv run isort src/ tests/
# Type checking
uv run mypy src/
# Linting
uv run flake8 src/ tests/
```
### Development Services
Access the development environment:
- **PostgreSQL**: `localhost:5432`
- Database: `dp_mcp_dev`
- Username: `dp_mcp_user`
- Password: `dp_mcp_password`
- **MinIO Console**: `http://localhost:9001`
- Username: `minioadmin`
- Password: `minioadmin123`
- **MinIO API**: `http://localhost:9000`
## 🚀 Deployment
### Production Configuration
For production deployment, create a secure `.env` file:
```bash
# Production PostgreSQL
POSTGRES_HOST=your-db-host.com
POSTGRES_PORT=5432
POSTGRES_USER=production_user
POSTGRES_PASSWORD=secure_password
POSTGRES_DATABASE=production_db
POSTGRES_SSL_MODE=require
# Production MinIO
MINIO_ENDPOINT=your-minio-host.com:9000
MINIO_ACCESS_KEY=production_access_key
MINIO_SECRET_KEY=secure_secret_key
MINIO_SECURE=true
# Production Server
MCP_SERVER_HOST=0.0.0.0
MCP_SERVER_PORT=8888
DEBUG=false
```
### Docker Production Build
```bash
# Build production image
docker build -t dp-mcp-server .
# Run in production
docker run -d \
--name dp-mcp-server \
--env-file .env \
-p 8888:8888 \
dp-mcp-server
```
### Systemd Service
Create `/etc/systemd/system/dp-mcp.service`:
```ini
[Unit]
Description=DP-MCP Server
After=network.target
[Service]
Type=simple
User=mcp
WorkingDirectory=/opt/dp-mcp
Environment=PATH=/opt/dp-mcp/.venv/bin
ExecStart=/opt/dp-mcp/.venv/bin/python src/dp_mcp/server.py
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
```
### Health Checks
```bash
# Check server health
curl -H "Accept: text/event-stream" http://localhost:8888/mcp/
# Monitor logs
journalctl -u dp-mcp -f
```
## 🔍 Troubleshooting
### Common Issues
**Connection Refused Errors**
```bash
# Check if services are running
docker-compose ps
# Restart services
docker-compose restart
# Check logs
docker-compose logs postgres
docker-compose logs minio
```
**Authentication Errors**
```bash
# Verify environment variables
uv run python src/dp_mcp/utils/config.py
# Test database connection
psql -h localhost -U dp_mcp_user -d dp_mcp_dev
# Test MinIO connection
mc alias set local http://localhost:9000 minioadmin minioadmin123
```
**Server Won't Start**
```bash
# Check Python environment
uv run python --version
# Verify dependencies
uv sync
# Run with debug logging
uv run python src/dp_mcp/server.py --debug
```
### Debug Mode
Enable comprehensive logging:
```bash
# Start with full debug output
uv run python src/dp_mcp/server.py --debug --host 0.0.0.0 --port 8888
```
### Performance Tuning
For high-throughput scenarios:
```bash
# Increase connection pool size
export POSTGRES_MAX_CONNECTIONS=20
# Adjust query limits
export DEFAULT_QUERY_LIMIT=5000
# Enable connection pooling
export POSTGRES_POOL_SIZE=10
```
## 🤝 Contributing
### Development Setup
1. Fork the repository
2. Create a feature branch: `git checkout -b feature/amazing-feature`
3. Make your changes and add tests
4. Run the test suite: `uv run pytest`
5. Format your code: `uv run black src/ tests/`
6. Submit a pull request
### Code Standards
- Follow PEP 8 style guidelines
- Add type hints for all functions
- Write comprehensive docstrings
- Include unit tests for new features
- Update documentation for API changes
### Reporting Issues
Please include:
- Python version and operating system
- Complete error messages and stack traces
- Steps to reproduce the issue
- Configuration details (without sensitive data)
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🙏 Acknowledgments
- [FastMCP](https://gofastmcp.com) - Modern MCP server framework
- [Model Context Protocol](https://spec.modelcontextprotocol.io/) - Standard protocol for AI tool integration
- [PostgreSQL](https://postgresql.org) - Advanced open source database
- [MinIO](https://min.io) - High performance object storage
---
For more information, visit the [project documentation](https://github.com/your-org/da.dp-mcp) or contact the development team.