# Enterprise Deployment Guide
This guide covers deploying and configuring the GCP MCP Server for enterprise environments with high availability, security, and scalability requirements.
## Architecture Overview
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  AI Assistant   │    │  Load Balancer  │    │ GCP MCP Server  │
│  (Claude Code)  │◄──►│   (Optional)    │◄──►│    Cluster      │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                                       │
                                                       ▼
                                              ┌─────────────────┐
                                              │  Google Cloud   │
                                              │    Platform     │
                                              └─────────────────┘
```
## Prerequisites
### System Requirements
**Minimum Requirements:**
- CPU: 2 vCPUs
- Memory: 4 GB RAM
- Storage: 20 GB SSD
- Network: 1 Gbps
- OS: Ubuntu 20.04+ / RHEL 8+ / CentOS 8+
**Recommended for Production:**
- CPU: 4-8 vCPUs
- Memory: 8-16 GB RAM
- Storage: 100 GB SSD
- Network: 10 Gbps
- OS: Ubuntu 22.04 LTS
### Software Requirements
- Python 3.8+
- Docker (optional)
- Kubernetes (for container deployment)
- Nginx/HAProxy (for load balancing)
### GCP Requirements
1. **Service Account** with appropriate permissions
2. **APIs Enabled:**
   - Cloud Logging API
   - Cloud Monitoring API
   - Error Reporting API
   - Cloud Resource Manager API
   - IAM Service Account Credentials API
3. **IAM Roles:**
   - `roles/logging.viewer`
   - `roles/monitoring.viewer`
   - `roles/errorreporting.viewer`
   - `roles/resourcemanager.projectViewer`
## Installation Methods
### Method 1: Direct Installation
```bash
# Create dedicated service user (no sudo membership; the steps below do not need elevated privileges)
sudo useradd -m -s /bin/bash gcpmcp
# Switch to gcpmcp user
sudo su - gcpmcp
# Clone repository
git clone https://github.com/your-org/gcp-mcp.git
cd gcp-mcp
# Create virtual environment
python3 -m venv venv
source venv/bin/activate
# Install dependencies
pip install -r requirements.txt
pip install -e .
# Install development dependencies (optional)
pip install -e ".[dev]"
```
### Method 2: Docker Deployment
```dockerfile
# Dockerfile
FROM python:3.11-slim
WORKDIR /app
# Install system dependencies
RUN apt-get update && apt-get install -y \
gcc \
&& rm -rf /var/lib/apt/lists/*
# Copy requirements first for better caching
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# Copy application code
COPY . .
RUN pip install -e .
# Create non-root user
RUN useradd -m -u 1000 gcpmcp
USER gcpmcp
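# Health check (standard library only; urlopen raises on connection errors
# and on HTTP 4xx/5xx responses, so the check exits non-zero when unhealthy)
HEALTHCHECK --interval=30s --timeout=10s --start-period=5s --retries=3 \
    CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8080/health', timeout=5)"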
EXPOSE 8080
CMD ["python", "-m", "gcp_mcp.cli"]
```
```bash
# Build and run
docker build -t gcp-mcp:latest .
docker run -d \
--name gcp-mcp \
-p 8080:8080 \
-e GCP_PROJECT=your-project-id \
-e GOOGLE_APPLICATION_CREDENTIALS=/app/credentials.json \
-v /path/to/credentials.json:/app/credentials.json:ro \
-v /path/to/config.json:/app/config.json:ro \
gcp-mcp:latest
```
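The `HEALTHCHECK` above, and the Kubernetes probes in Method 3, assume the server answers `GET /health` and `GET /ready` on port 8080. This guide does not show those handlers, so the following is only a minimal sketch of what they might look like using the standard library. The names `probe_response` and `ProbeHandler`, the `ready` flag, and the JSON bodies are assumptions for illustration, not the server's actual implementation.

```python
import json
from http.server import BaseHTTPRequestHandler
from typing import Tuple


def probe_response(path: str, ready: bool) -> Tuple[int, bytes]:
    """Map a probe path to an HTTP status code and a JSON body."""
    if path == "/health":
        # Liveness: the process is up and able to answer requests.
        return 200, json.dumps({"status": "ok"}).encode("utf-8")
    if path == "/ready":
        # Readiness: 503 until the application reports it can serve traffic.
        status = 200 if ready else 503
        return status, json.dumps({"ready": ready}).encode("utf-8")
    return 404, json.dumps({"error": "not found"}).encode("utf-8")


class ProbeHandler(BaseHTTPRequestHandler):
    # Hypothetical flag; the application would set it to True after startup.
    ready = False

    def do_GET(self):
        status, body = probe_response(self.path, type(self).ready)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        # Keep frequent probe requests out of the access log.
        pass


# Example wiring (not executed here):
#   from http.server import ThreadingHTTPServer
#   ProbeHandler.ready = True
#   ThreadingHTTPServer(("0.0.0.0", 8080), ProbeHandler).serve_forever()
```

Keeping the status logic in a plain function makes it testable without opening a socket.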
### Method 3: Kubernetes Deployment
```yaml
# k8s-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gcp-mcp
  labels:
    app: gcp-mcp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: gcp-mcp
  template:
    metadata:
      labels:
        app: gcp-mcp
    spec:
      serviceAccountName: gcp-mcp-sa
      containers:
        - name: gcp-mcp
          image: gcp-mcp:latest
          ports:
            - containerPort: 8080
          env:
            - name: GCP_PROJECT
              value: "your-project-id"
            - name: LOG_LEVEL
              value: "INFO"
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "2Gi"
              cpu: "1000m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
          volumeMounts:
            - name: config
              mountPath: /app/config.json
              subPath: config.json
            - name: gcp-credentials
              mountPath: /app/credentials.json
              subPath: credentials.json
      volumes:
        - name: config
          configMap:
            name: gcp-mcp-config
        - name: gcp-credentials
          secret:
            secretName: gcp-mcp-credentials
---
apiVersion: v1
kind: Service
metadata:
  name: gcp-mcp-service
spec:
  selector:
    app: gcp-mcp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 8080
  type: ClusterIP
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: gcp-mcp-sa
  annotations:
    iam.gke.io/gcp-service-account: gcp-mcp@your-project.iam.gserviceaccount.com
```
## Configuration
### Enterprise Configuration File
```json
{
  "default_project": "your-project-id",
  "log_retention_days": 90,
  "max_results": 10000,
  "excluded_log_names": [
    "projects/your-project/logs/cloudaudit.googleapis.com%2Fdata_access"
  ],
  "authentication": {
    "method": "service_account",
    "service_account_path": "/path/to/service-account.json",
    "project_id": "your-project-id"
  },
  "logging": {
    "level": "INFO",
    "format": "json",
    "file": "/var/log/gcp-mcp/server.log",
    "max_size_mb": 100,
    "backup_count": 5
  },
  "cache": {
    "enabled": true,
    "ttl_seconds": 300,
    "max_size_mb": 512,
    "redis_url": "redis://localhost:6379/0"
  },
  "rate_limiting": {
    "enabled": true,
    "queries_per_minute": 1000,
    "queries_per_hour": 10000,
    "projects_per_hour": 100,
    "concurrent_requests": 50
  },
  "security": {
    "enable_audit_logging": true,
    "max_query_time_hours": 168,
    "allowed_resource_types": [
      "gce_instance",
      "k8s_container",
      "cloud_function",
      "gae_app"
    ],
    "blocked_log_names": [
      "projects/*/logs/cloudaudit.googleapis.com%2Fdata_access"
    ]
  },
  "monitoring": {
    "enabled": true,
    "metrics_port": 9090,
    "health_check_port": 8080,
    "export_interval_seconds": 60
  },
  "enterprise": {
    "multi_project_enabled": true,
    "max_projects_per_query": 20,
    "enable_compliance_mode": true,
    "data_retention_policy": "90d",
    "encryption_at_rest": true,
    "audit_all_queries": true
  }
}
```
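A configuration file of this size is easy to break with a typo, so it can help to check it before starting the server. The sketch below validates a few properties of the file shown above. The function names and the specific rules (required sections, positive limits, distinct ports, query window no longer than log retention) are assumptions made for illustration; the server's own validation, if any, may differ.

```python
import json
from typing import Any, Dict, List

REQUIRED_SECTIONS = (
    "authentication", "logging", "cache",
    "rate_limiting", "security", "monitoring",
)


def _positive_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool) and value > 0


def validate_config(cfg: Dict[str, Any]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = [
        f"missing or invalid section: {name}"
        for name in REQUIRED_SECTIONS
        if not isinstance(cfg.get(name), dict)
    ]
    if problems:
        return problems  # the checks below assume every section is a dict

    for key in ("queries_per_minute", "queries_per_hour", "concurrent_requests"):
        if not _positive_int(cfg["rate_limiting"].get(key)):
            problems.append(f"rate_limiting.{key} must be a positive integer")

    if not _positive_int(cfg["cache"].get("ttl_seconds")):
        problems.append("cache.ttl_seconds must be a positive integer")

    monitoring = cfg["monitoring"]
    metrics_port = monitoring.get("metrics_port")
    health_port = monitoring.get("health_check_port")
    if metrics_port is None or metrics_port == health_port:
        problems.append(
            "monitoring.metrics_port and monitoring.health_check_port "
            "must be set and differ"
        )

    # Assumed rule: a query window longer than log retention cannot be served.
    retention_days = cfg.get("log_retention_days")
    window_hours = cfg["security"].get("max_query_time_hours")
    if (_positive_int(retention_days) and _positive_int(window_hours)
            and window_hours > retention_days * 24):
        problems.append("security.max_query_time_hours exceeds log retention")
    return problems


def load_and_validate(path: str) -> List[str]:
    """Read a JSON config file and validate it."""
    with open(path, encoding="utf-8") as handle:
        return validate_config(json.load(handle))
```

With the sample file above, `validate_config` returns an empty list under these assumed rules.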
### Environment Variables
```bash
# Core Configuration
export GCP_PROJECT="your-project-id"
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account.json"
export GCP_MCP_CONFIG="/path/to/config.json"
# Logging
export LOG_LEVEL="INFO"
export LOG_FORMAT="json"
# Cache Configuration
export CACHE_ENABLED="true"
export CACHE_TTL_SECONDS="300"
export CACHE_MAX_SIZE_MB="512"
export REDIS_URL="redis://localhost:6379/0"
# Rate Limiting
export RATE_LIMIT_ENABLED="true"
export RATE_LIMIT_QPM="1000"
export RATE_LIMIT_QPH="10000"
# Security
export ENABLE_AUDIT_LOGGING="true"
export MAX_QUERY_TIME_HOURS="168"
# Monitoring
export METRICS_ENABLED="true"
export METRICS_PORT="9090"
export HEALTH_CHECK_PORT="8080"
```
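The guide lists both a JSON file and environment variables but does not state which wins when both are set. A common convention is that environment variables override file values; the sketch below assumes that convention. The mapping from variable names to configuration keys is inferred from the names in this guide and is an assumption, as are the helper names.

```python
import copy
from typing import Any, Callable, Dict, Mapping, Tuple


def parse_bool(raw: str) -> bool:
    value = raw.strip().lower()
    if value in ("true", "1", "yes"):
        return True
    if value in ("false", "0", "no"):
        return False
    raise ValueError(f"not a boolean: {raw!r}")


# Assumed mapping: environment variable -> (path in config.json, converter).
ENV_TO_CONFIG: Dict[str, Tuple[Tuple[str, ...], Callable[[str], Any]]] = {
    "GCP_PROJECT": (("default_project",), str),
    "LOG_LEVEL": (("logging", "level"), str),
    "LOG_FORMAT": (("logging", "format"), str),
    "CACHE_ENABLED": (("cache", "enabled"), parse_bool),
    "CACHE_TTL_SECONDS": (("cache", "ttl_seconds"), int),
    "CACHE_MAX_SIZE_MB": (("cache", "max_size_mb"), int),
    "REDIS_URL": (("cache", "redis_url"), str),
    "RATE_LIMIT_ENABLED": (("rate_limiting", "enabled"), parse_bool),
    "RATE_LIMIT_QPM": (("rate_limiting", "queries_per_minute"), int),
    "RATE_LIMIT_QPH": (("rate_limiting", "queries_per_hour"), int),
    "ENABLE_AUDIT_LOGGING": (("security", "enable_audit_logging"), parse_bool),
    "MAX_QUERY_TIME_HOURS": (("security", "max_query_time_hours"), int),
    "METRICS_ENABLED": (("monitoring", "enabled"), parse_bool),
    "METRICS_PORT": (("monitoring", "metrics_port"), int),
    "HEALTH_CHECK_PORT": (("monitoring", "health_check_port"), int),
}


def apply_env_overrides(cfg: Dict[str, Any], environ: Mapping[str, str]) -> Dict[str, Any]:
    """Return a copy of cfg with any set environment variables applied."""
    merged = copy.deepcopy(cfg)
    for name, (path, convert) in ENV_TO_CONFIG.items():
        if name not in environ:
            continue
        node = merged
        for key in path[:-1]:
            node = node.setdefault(key, {})  # create missing sections
        node[path[-1]] = convert(environ[name])
    return merged
```

In practice this would be called as `apply_env_overrides(config, os.environ)` after loading the file. Converting values explicitly avoids treating the string `"false"` as truthy.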
## High Availability Setup
### Load Balancer Configuration (Nginx)
```nginx
# /etc/nginx/sites-available/gcp-mcp

# limit_req_zone is only valid in the http context. This file is included
# inside http {}, so the zone is declared at the top level rather than
# inside the server block.
limit_req_zone $binary_remote_addr zone=api:10m rate=10r/s;

upstream gcp_mcp_backend {
    least_conn;
    server 10.0.1.10:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
    server 10.0.1.12:8080 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    listen 443 ssl http2;
    server_name gcp-mcp.yourdomain.com;

    # SSL Configuration
    ssl_certificate /etc/ssl/certs/gcp-mcp.crt;
    ssl_certificate_key /etc/ssl/private/gcp-mcp.key;
    ssl_protocols TLSv1.2 TLSv1.3;
    # AES256-GCM suites are paired with SHA384 in OpenSSL cipher names
    ssl_ciphers ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES256-GCM-SHA384;

    # Security Headers
    add_header X-Frame-Options DENY;
    add_header X-Content-Type-Options nosniff;
    add_header X-XSS-Protection "1; mode=block";
    add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload";

    # Rate Limiting (uses the zone declared above)
    limit_req zone=api burst=20 nodelay;

    location / {
        proxy_pass http://gcp_mcp_backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;

        # Timeouts
        proxy_connect_timeout 30s;
        proxy_send_timeout 300s;
        proxy_read_timeout 300s;

        # Buffer settings
        proxy_buffering on;
        proxy_buffer_size 128k;
        proxy_buffers 4 256k;
        proxy_busy_buffers_size 256k;
    }

    location /health {
        proxy_pass http://gcp_mcp_backend/health;
        access_log off;
    }

    location /metrics {
        proxy_pass http://gcp_mcp_backend/metrics;
        # Restrict access to monitoring systems
        allow 10.0.0.0/8;
        deny all;
    }
}
```
### Redis Configuration for Caching
```redis
# /etc/redis/redis.conf
bind 127.0.0.1 10.0.1.20
port 6379
protected-mode yes
requirepass your-secure-password
# Memory management
maxmemory 2gb
maxmemory-policy allkeys-lru
# Persistence
save 900 1
save 300 10
save 60 10000
# Security
rename-command FLUSHDB ""
rename-command FLUSHALL ""
rename-command DEBUG ""
rename-command CONFIG "CONFIG_9a8b7c6d"
# Logging
loglevel notice
logfile /var/log/redis/redis-server.log
# Performance
tcp-keepalive 300
timeout 0
```
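How the server keys and expires cached entries is not described in this guide. As a rough illustration of the `ttl_seconds` setting, the sketch below derives a stable key from a query and stores results with an expiry, using an in-memory dictionary as a stand-in for Redis. The key format, `cache_key`, and `TTLCache` are assumptions for illustration only.

```python
import hashlib
import json
import time
from typing import Any, Callable, Dict, Optional, Tuple


def cache_key(project: str, query: Dict[str, Any]) -> str:
    """Derive a stable key: identical queries map to the same key."""
    canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"gcp-mcp:{project}:{digest}"  # hypothetical key prefix


class TTLCache:
    """In-memory stand-in that mimics set-with-expiry and get."""

    def __init__(self, ttl_seconds: float = 300,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._items: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        self._items[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: behave as a cache miss
            return None
        return value
```

Sorting the query keys before hashing means that two requests with the same parameters in a different order share one cache entry. Note that with `requirepass` set as above, a real Redis client also needs the password, for example in a URL of the form `redis://:password@host:6379/0`.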
## Security Hardening
### Service Account Security
```bash
# Create minimal service account
gcloud iam service-accounts create gcp-mcp-reader \
--description="GCP MCP Server Read-only Service Account" \
--display-name="GCP MCP Reader"
# Assign minimal required roles
gcloud projects add-iam-policy-binding your-project-id \
--member="serviceAccount:gcp-mcp-reader@your-project-id.iam.gserviceaccount.com" \
--role="roles/logging.viewer"
gcloud projects add-iam-policy-binding your-project-id \
--member="serviceAccount:gcp-mcp-reader@your-project-id.iam.gserviceaccount.com" \
--role="roles/monitoring.viewer"
# Create and download key
gcloud iam service-accounts keys create gcp-mcp-credentials.json \
--iam-account=gcp-mcp-reader@your-project-id.iam.gserviceaccount.com
```
### Custom IAM Role (Recommended)
```yaml
# custom-role.yaml
title: "GCP MCP Server Role"
description: "Minimal permissions for GCP MCP Server"
stage: "GA"
includedPermissions:
  - logging.logEntries.list
  - logging.logMetrics.list
  - logging.logs.list
  - logging.sinks.list
  - monitoring.metricDescriptors.list
  - monitoring.timeSeries.list
  - resourcemanager.projects.get
  - errorreporting.errorEvents.list
```
```bash
# Create custom role
gcloud iam roles create gcpMcpServerRole \
--project=your-project-id \
--file=custom-role.yaml
# Assign custom role
gcloud projects add-iam-policy-binding your-project-id \
--member="serviceAccount:gcp-mcp-reader@your-project-id.iam.gserviceaccount.com" \
--role="projects/your-project-id/roles/gcpMcpServerRole"
```
### Network Security
```bash
# Firewall rules
gcloud compute firewall-rules create gcp-mcp-allow-internal \
--allow tcp:8080 \
--source-ranges 10.0.0.0/8,172.16.0.0/12,192.168.0.0/16 \
--target-tags gcp-mcp-server
gcloud compute firewall-rules create gcp-mcp-allow-lb \
--allow tcp:8080 \
--source-ranges 130.211.0.0/22,35.191.0.0/16 \
--target-tags gcp-mcp-server
```
## Monitoring and Observability
### Prometheus Metrics
```yaml
# prometheus-config.yml
global:
  scrape_interval: 15s
scrape_configs:
  - job_name: 'gcp-mcp'
    static_configs:
      - targets: ['gcp-mcp-1:9090', 'gcp-mcp-2:9090', 'gcp-mcp-3:9090']
    metrics_path: /metrics
    scrape_interval: 30s
    scrape_timeout: 10s
```
### Grafana Dashboard
```json
{
  "dashboard": {
    "title": "GCP MCP Server",
    "panels": [
      {
        "title": "Request Rate",
        "type": "graph",
        "targets": [
          { "expr": "rate(gcp_mcp_requests_total[5m])" }
        ]
      },
      {
        "title": "Error Rate",
        "type": "graph",
        "targets": [
          { "expr": "rate(gcp_mcp_errors_total[5m])" }
        ]
      },
      {
        "title": "Cache Hit Rate",
        "type": "stat",
        "targets": [
          { "expr": "gcp_mcp_cache_hits / (gcp_mcp_cache_hits + gcp_mcp_cache_misses) * 100" }
        ]
      },
      {
        "title": "Active Connections",
        "type": "stat",
        "targets": [
          { "expr": "gcp_mcp_active_connections" }
        ]
      }
    ]
  }
}
```
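The dashboard queries metrics such as `gcp_mcp_requests_total` and `gcp_mcp_cache_hits`, which the server is expected to expose on its `/metrics` endpoint. As a rough sketch of what that output might contain, the code below renders counters and gauges in the Prometheus text exposition format and reproduces the cache hit rate arithmetic from the panel. The function names are hypothetical, and a real deployment would more likely use a Prometheus client library.

```python
from typing import Dict


def render_metrics(counters: Dict[str, float], gauges: Dict[str, float]) -> str:
    """Render metrics as Prometheus text exposition lines."""
    lines = []
    for name, value in sorted(counters.items()):
        lines.append(f"# TYPE {name} counter")
        lines.append(f"{name} {value}")
    for name, value in sorted(gauges.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def cache_hit_rate(hits: float, misses: float) -> float:
    """Same arithmetic as the dashboard panel: hits / (hits + misses) * 100."""
    total = hits + misses
    if total == 0:
        return 0.0  # avoid dividing by zero before any lookups have happened
    return hits / total * 100


text = render_metrics(
    counters={"gcp_mcp_requests_total": 120, "gcp_mcp_errors_total": 3},
    gauges={"gcp_mcp_active_connections": 7},
)
```

For example, 75 hits and 25 misses give a hit rate of 75 percent. The PromQL expression in the panel has no such guard, so the panel may show no value until the cache has been used.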
### Log Aggregation (ELK Stack)
```yaml
# logstash-gcp-mcp.conf
input {
  file {
    path => "/var/log/gcp-mcp/*.log"
    type => "gcp-mcp"
    codec => json
  }
}
filter {
  if [type] == "gcp-mcp" {
    date {
      match => [ "timestamp", "ISO8601" ]
    }
    if [level] == "ERROR" {
      mutate {
        add_tag => [ "alert" ]
      }
    }
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
    index => "gcp-mcp-%{+YYYY.MM.dd}"
  }
}
```
## Backup and Disaster Recovery
### Configuration Backup
```bash
#!/bin/bash
# backup-config.sh
set -euo pipefail  # stop on the first failed command or unset variable
BACKUP_DIR="/opt/backups/gcp-mcp"
DATE=$(date +%Y%m%d_%H%M%S)
# Backups include service account keys, so keep the directory private
mkdir -p "$BACKUP_DIR"
chmod 700 "$BACKUP_DIR"
# Backup configuration
tar -czf "$BACKUP_DIR/config_$DATE.tar.gz" \
    /etc/gcp-mcp/ \
    /opt/gcp-mcp/config.json
# Backup service account keys (readable by the owner only)
install -m 600 /opt/gcp-mcp/credentials.json "$BACKUP_DIR/credentials_$DATE.json"
# Cleanup old backups (keep 30 days)
find "$BACKUP_DIR" -name "*.tar.gz" -mtime +30 -delete
find "$BACKUP_DIR" -name "credentials_*.json" -mtime +30 -delete
```
### Service Recovery
```bash
#!/bin/bash
# recovery.sh
set -euo pipefail  # abort if any step fails, e.g. when no backup exists
# Stop service
systemctl stop gcp-mcp
# Restore from backup (archive paths are relative to /)
LATEST_BACKUP=$(ls -t /opt/backups/gcp-mcp/config_*.tar.gz | head -1)
tar -xzf "$LATEST_BACKUP" -C /
# Restore credentials
LATEST_CREDS=$(ls -t /opt/backups/gcp-mcp/credentials_*.json | head -1)
cp "$LATEST_CREDS" /opt/gcp-mcp/credentials.json
# Start service
systemctl start gcp-mcp
systemctl status gcp-mcp --no-pager
```
## Performance Tuning
### System Optimization
```bash
# /etc/sysctl.d/99-gcp-mcp.conf
# Network optimization
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
net.ipv4.tcp_congestion_control = bbr
# File descriptor limits
fs.file-max = 1000000
# Memory management
vm.swappiness = 10
vm.dirty_ratio = 15
vm.dirty_background_ratio = 5
```
```bash
# /etc/security/limits.d/gcp-mcp.conf
gcpmcp soft nofile 65536
gcpmcp hard nofile 65536
gcpmcp soft nproc 32768
gcpmcp hard nproc 32768
```
### Application Tuning
```python
# performance_config.py
PERFORMANCE_CONFIG = {
"worker_processes": "auto", # Based on CPU cores
"worker_connections": 4096,
"keepalive_timeout": 65,
"client_max_body_size": "10m",
"proxy_cache_size": "512m",
"proxy_cache_inactive": "60m",
"gzip_compression": True,
"brotli_compression": True
}
```
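The `"auto"` value for `worker_processes` is described only as being based on CPU cores. One plausible reading is "one worker per available core"; the sketch below resolves the setting that way. The function name and the fallback to a single worker are assumptions, not documented server behavior.

```python
import os
from typing import Optional, Union


def resolve_workers(setting: Union[str, int], cpu_count: Optional[int] = None) -> int:
    """Turn a worker_processes setting into a concrete worker count."""
    if setting == "auto":
        # os.cpu_count() can return None, so fall back to a single worker.
        cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
        return max(1, cores)
    workers = int(setting)
    if workers < 1:
        raise ValueError("worker_processes must be at least 1")
    return workers
```

For instance, `resolve_workers("auto", cpu_count=4)` yields 4, while an explicit `"8"` is used as given.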
## Troubleshooting
### Common Issues
1. **Authentication Failures**
   ```bash
   # Check service account permissions
   gcloud auth activate-service-account --key-file=credentials.json
   gcloud projects get-iam-policy your-project-id
   # Test API access
   gcloud logging logs list --limit=5
   ```
2. **High Memory Usage**
   ```bash
   # Monitor memory usage
   ps aux | grep gcp-mcp
   free -h
   # If the cache is the cause, reduce cache.max_size_mb in config.json
   ```
3. **Rate Limiting Issues**
   ```bash
   # Check rate limit stats
   curl http://localhost:8080/stats | jq '.rate_limiter'
   # If legitimate traffic is rejected, raise the rate_limiting values in config.json
   ```
### Log Analysis
```bash
# Error analysis
grep "ERROR" /var/log/gcp-mcp/server.log | tail -20
# Performance analysis
grep "slow_query" /var/log/gcp-mcp/server.log
# Authentication issues
grep "authentication" /var/log/gcp-mcp/server.log
```
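Because the server is configured to log in JSON format, the same analysis can be done structurally instead of with `grep`, which also matches the word "ERROR" inside message text. The sketch below counts log records per level. It assumes one JSON object per line with a `level` field, as the Logstash filter in this guide implies; the function name is hypothetical.

```python
import json
from collections import Counter
from typing import Iterable


def summarize_levels(lines: Iterable[str]) -> Counter:
    """Count log records per level; unparseable lines are counted separately."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            counts["UNPARSEABLE"] += 1
            continue
        if isinstance(record, dict):
            counts[str(record.get("level", "UNKNOWN")).upper()] += 1
        else:
            counts["UNPARSEABLE"] += 1
    return counts


# Example usage (not executed here):
#   with open("/var/log/gcp-mcp/server.log", encoding="utf-8") as handle:
#       print(summarize_levels(handle).most_common())
```

A sudden rise in the `ERROR` count, or any `UNPARSEABLE` lines, is a reasonable starting point for a closer look at the raw log.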
## Maintenance
### Regular Maintenance Tasks
```bash
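#!/bin/bash
# maintenance.sh
# Rotate logs
logrotate /etc/logrotate.d/gcp-mcp
# Clean cache
# Note: the hardened redis.conf in this guide disables FLUSHDB
# (rename-command FLUSHDB ""), so this call fails unless the command is
# re-enabled or renamed there. With requirepass set, redis-cli also needs
# the password, e.g. via the REDISCLI_AUTH environment variable.
redis-cli FLUSHDB
# Update dependencies
pip install --upgrade -r requirements.txt
# Restart services
systemctl restart gcp-mcp
systemctl restart nginx
systemctl restart redis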
```
### Health Checks
```bash
#!/bin/bash
# health-check.sh
# Service status
systemctl is-active gcp-mcp
# API health
curl -f http://localhost:8080/health
# Cache health (with requirepass set, supply the password, e.g. via REDISCLI_AUTH)
redis-cli ping
# Disk space
df -h /var/log/gcp-mcp
# Memory usage
free -m
```
This enterprise deployment guide provides comprehensive coverage of production deployment considerations, security, monitoring, and maintenance for the GCP MCP Server.