# DevOps, CI/CD & Infrastructure 2025

**Updated**: 2025-11-23 | **Stack**: Docker, Kubernetes, GitHub Actions, Terraform

---

## Docker Containerization

```dockerfile
# Multi-stage Dockerfile (Node.js app)

# Stage 1: Build
FROM node:20-alpine AS builder
WORKDIR /app

# Copy package files
COPY package*.json ./

# Install all dependencies (dev dependencies are needed for the build step)
RUN npm ci

# Copy source code
COPY . .

# Build app (if using TypeScript)
RUN npm run build

# Drop dev dependencies so only production modules are copied to the runtime image
RUN npm prune --omit=dev

# Stage 2: Production
FROM node:20-alpine
WORKDIR /app

# Copy only necessary files from builder
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/package.json ./

# Create non-root user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001

USER nodejs

EXPOSE 3000

CMD ["node", "dist/index.js"]

# Build: docker build -t myapp:1.0 .
# Run:   docker run -p 3000:3000 myapp:1.0

---

# Docker Compose (Multi-container)
version: '3.8'

services:
  # Web app
  web:
    build: .
    ports:
      - "3000:3000"
    environment:
      - DATABASE_URL=postgresql://postgres:password@db:5432/myapp
      - REDIS_URL=redis://redis:6379
    depends_on:
      - db
      - redis
    restart: unless-stopped

  # PostgreSQL database
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: password
    volumes:
      - postgres_data:/var/lib/postgresql/data
    ports:
      - "5432:5432"
    restart: unless-stopped

  # Redis cache
  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    restart: unless-stopped

volumes:
  postgres_data:

# Run:  docker-compose up -d
# Stop: docker-compose down
# Logs: docker-compose logs -f web

---

# Best Practices

MULTI-STAGE BUILDS:
- Separate build and runtime
- Smaller final image (exclude build tools)
- Example: 1.2GB → 150MB

.dockerignore:
node_modules
.git
.env
*.log
dist
coverage

SECURITY:
- Use official images (node:20-alpine)
- Non-root user (RUN adduser + USER)
- Scan for vulnerabilities (docker scout cves myapp:1.0; the older `docker scan` command has been removed)
- Pin versions (node:20.5.0, not node:latest)

LAYERS:
- Order matters (least changing → most changing)
- Copy package.json first (cache dependencies)
- Copy source last (changes frequently)
```

---

## Kubernetes Orchestration

```yaml
# Deployment (app.yaml)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  labels:
    app: myapp
spec:
  replicas: 3  # 3 pods for high availability
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myrepo/myapp:1.0.0
          ports:
            - containerPort: 3000
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: db-secret
                  key: url
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "200m"
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 5
---
# Service (expose pods)
apiVersion: v1
kind: Service
metadata:
  name: myapp-service
spec:
  selector:
    app: myapp
  ports:
    - protocol: TCP
      port: 80
      targetPort: 3000
  type: LoadBalancer  # or ClusterIP, NodePort
---
# Horizontal Pod Autoscaler
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70

# Commands
# Apply:    kubectl apply -f app.yaml
# Get pods: kubectl get pods
# Logs:     kubectl logs -f <pod-name>
# Exec:     kubectl exec -it <pod-name> -- /bin/sh
# Describe: kubectl describe pod <pod-name>
# Delete:   kubectl delete -f app.yaml
---
# ConfigMap (non-sensitive config)
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  API_URL: "https://api.example.com"
  LOG_LEVEL: "info"
---
# Secret (sensitive data)
apiVersion: v1
kind: Secret
metadata:
  name: db-secret
type: Opaque
data:
  url: cG9zdGdyZXNxbDovLy4uLg==  # base64 encoded

# Create: kubectl create secret generic db-secret --from-literal=url='postgresql://...'
```
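The Deployment above reads `DATABASE_URL` from `db-secret`, but nothing consumes `app-config`. A minimal sketch of wiring that ConfigMap into the same container with `envFrom` (container and ConfigMap names taken from the manifests above):

```yaml
# Fragment of the Deployment's container spec: every key in app-config
# (API_URL, LOG_LEVEL) becomes an environment variable in the pod.
containers:
  - name: myapp
    envFrom:
      - configMapRef:
          name: app-config
# Imperative equivalent: kubectl set env deployment/myapp --from=configmap/app-config
```

Individual keys can also be pulled with `configMapKeyRef` (mirroring the `secretKeyRef` example above) or mounted as files through a volume.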
---

## CI/CD Pipeline

```yaml
# GitHub Actions (.github/workflows/deploy.yml)
name: CI/CD Pipeline

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  # Job 1: Test
  test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run linter
        run: npm run lint

      - name: Run tests
        run: npm test

      - name: Upload coverage
        uses: codecov/codecov-action@v3
        with:
          files: ./coverage/lcov.info

  # Job 2: Build & Push Docker image
  build:
    needs: test
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    permissions:
      contents: read
      packages: write
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=sha,prefix={{branch}}-,format=long
            type=semver,pattern={{version}}
            type=raw,value=latest,enable={{is_default_branch}}

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

  # Job 3: Deploy to Kubernetes
  deploy:
    needs: build
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup kubectl
        uses: azure/setup-kubectl@v3

      - name: Configure Kubernetes
        run: |
          echo "${{ secrets.KUBE_CONFIG }}" | base64 -d > kubeconfig.yaml
          # Persist KUBECONFIG for later steps (a plain `export` is lost when this step ends)
          echo "KUBECONFIG=$PWD/kubeconfig.yaml" >> "$GITHUB_ENV"

      - name: Deploy to Kubernetes
        run: |
          # format=long in the metadata step makes the pushed tag match the full github.sha used here
          kubectl set image deployment/myapp \
            myapp=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:main-${{ github.sha }}
          kubectl rollout status deployment/myapp

---
# GitLab CI (.gitlab-ci.yml)
stages:
  - test
  - build
  - deploy

variables:
  DOCKER_DRIVER: overlay2
  DOCKER_TLS_CERTDIR: "/certs"

test:
  stage: test
  image: node:20
  script:
    - npm ci
    - npm run lint
    - npm test
  coverage: '/All files[^|]*\|[^|]*\s+([\d\.]+)/'
  artifacts:
    reports:
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml

build:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  before_script:
    - docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
  script:
    - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA .
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
  only:
    - main

deploy:
  stage: deploy
  image: bitnami/kubectl:latest
  script:
    - kubectl config use-context $KUBE_CONTEXT
    - kubectl set image deployment/myapp myapp=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
    - kubectl rollout status deployment/myapp
  only:
    - main
  when: manual  # Require manual approval
```
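The GitLab `deploy` job is gated with `when: manual`, while the GitHub Actions workflow above deploys on every push to `main`. A rough equivalent on the GitHub side is a deployment environment with required reviewers; this is only a sketch and assumes a `production` environment has been configured under the repository's Settings → Environments:

```yaml
# Sketch: the deploy job pauses for reviewer approval before its steps run.
deploy:
  needs: build
  runs-on: ubuntu-latest
  environment: production  # approval rules come from the environment's protection settings
  steps:
    # ...same checkout/kubectl steps as the deploy job above
    - run: echo "deploy steps run only after approval"
```

Either way, `kubectl rollout undo deployment/myapp` reverts a bad release to the previous ReplicaSet.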
---

## Infrastructure as Code (Terraform)

```hcl
# main.tf (AWS EKS cluster)
terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }

  backend "s3" {
    bucket = "my-terraform-state"
    key    = "eks/terraform.tfstate"
    region = "us-east-1"
  }
}

provider "aws" {
  region = var.aws_region
}

# VPC
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "~> 5.0"

  name = "${var.cluster_name}-vpc"
  cidr = "10.0.0.0/16"

  azs             = ["us-east-1a", "us-east-1b", "us-east-1c"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24", "10.0.103.0/24"]

  enable_nat_gateway   = true
  single_nat_gateway   = false
  enable_dns_hostnames = true

  tags = {
    Environment = var.environment
    ManagedBy   = "Terraform"
  }
}

# EKS Cluster
module "eks" {
  source  = "terraform-aws-modules/eks/aws"
  version = "~> 19.0"

  cluster_name    = var.cluster_name
  cluster_version = "1.28"

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.private_subnets

  # Managed node groups
  eks_managed_node_groups = {
    general = {
      min_size     = 2
      max_size     = 10
      desired_size = 3

      instance_types = ["t3.medium"]
      capacity_type  = "ON_DEMAND"

      labels = {
        role = "general"
      }
    }
  }

  tags = {
    Environment = var.environment
  }
}

# RDS Database
# (aws_security_group.db, aws_db_subnet_group.db and the db_* variables are referenced here
#  but not declared in this snippet; see the sketch after this block)
resource "aws_db_instance" "postgres" {
  identifier = "${var.cluster_name}-db"

  engine         = "postgres"
  engine_version = "16.1"
  instance_class = "db.t3.micro"

  allocated_storage     = 20
  max_allocated_storage = 100
  storage_encrypted     = true

  db_name  = var.db_name
  username = var.db_username
  password = var.db_password

  vpc_security_group_ids = [aws_security_group.db.id]
  db_subnet_group_name   = aws_db_subnet_group.db.name

  backup_retention_period   = 7
  skip_final_snapshot       = false
  final_snapshot_identifier = "${var.cluster_name}-db-final"

  tags = {
    Environment = var.environment
  }
}

# Variables (variables.tf)
variable "aws_region" {
  default = "us-east-1"
}

variable "cluster_name" {
  default = "my-eks-cluster"
}

variable "environment" {
  default = "production"
}

# Outputs (outputs.tf)
output "cluster_endpoint" {
  value = module.eks.cluster_endpoint
}

output "cluster_name" {
  value = module.eks.cluster_name
}

output "db_endpoint" {
  value = aws_db_instance.postgres.endpoint
}

# Commands:
# terraform init      # Initialize
# terraform plan      # Preview changes
# terraform apply     # Apply changes
# terraform destroy   # Destroy infrastructure
# terraform fmt       # Format code
# terraform validate  # Validate syntax
```
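The RDS resource above references a DB subnet group, a security group, and three `db_*` variables that the snippet does not declare. A minimal sketch of those pieces, with names chosen to match the references (the ingress rule assumes Postgres should only be reachable from inside the VPC):

```hcl
# Sketch of the pieces referenced by aws_db_instance.postgres above.
# Resource names match the references; the CIDR/rule choices are assumptions.

variable "db_name"     { default = "myapp" }
variable "db_username" { default = "postgres" }
variable "db_password" {
  sensitive = true  # supply via TF_VAR_db_password or a tfvars file kept out of version control
}

# Subnet group so RDS lives in the private subnets created by the VPC module
resource "aws_db_subnet_group" "db" {
  name       = "${var.cluster_name}-db-subnets"
  subnet_ids = module.vpc.private_subnets
}

# Security group allowing Postgres traffic from inside the VPC only
resource "aws_security_group" "db" {
  name   = "${var.cluster_name}-db-sg"
  vpc_id = module.vpc.vpc_id

  ingress {
    from_port   = 5432
    to_port     = 5432
    protocol    = "tcp"
    cidr_blocks = [module.vpc.vpc_cidr_block]
  }
}
```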
"targets": [ { "expr": "rate(http_requests_total[5m])" } ] }, { "title": "Error Rate", "targets": [ { "expr": "rate(http_requests_total{status=~\"5..\"}[5m])" } ] }, { "title": "Latency (p95)", "targets": [ { "expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))" } ] } ] } } --- # Application Instrumentation (Node.js) const express = require('express'); const client = require('prom-client'); const app = express(); // Create metrics const httpRequestsTotal = new client.Counter({ name: 'http_requests_total', help: 'Total number of HTTP requests', labelNames: ['method', 'route', 'status'] }); const httpRequestDuration = new client.Histogram({ name: 'http_request_duration_seconds', help: 'Duration of HTTP requests in seconds', labelNames: ['method', 'route', 'status'] }); // Middleware to collect metrics app.use((req, res, next) => { const start = Date.now(); res.on('finish', () => { const duration = (Date.now() - start) / 1000; httpRequestsTotal.inc({ method: req.method, route: req.route?.path || req.path, status: res.statusCode }); httpRequestDuration.observe({ method: req.method, route: req.route?.path || req.path, status: res.statusCode }, duration); }); next(); }); // Expose metrics endpoint app.get('/metrics', async (req, res) => { res.set('Content-Type', client.register.contentType); res.end(await client.register.metrics()); }); app.listen(3000); ``` --- ## Key Takeaways 1. **Immutable infrastructure** - Treat servers as cattle, not pets (replace, don't patch) 2. **Everything as code** - Infrastructure, config, pipelines (version control all) 3. **Automate everything** - Manual = error-prone (CI/CD, auto-scaling) 4. **Observability** - You can't fix what you can't see (metrics, logs, traces) 5. **Security first** - Secrets management, least privilege, vulnerability scanning --- ## References - "The Phoenix Project" - Gene Kim - "Site Reliability Engineering" - Google - Kubernetes Documentation **Related**: `kubernetes-advanced.md`, `terraform-best-practices.md`, `monitoring-observability.md`
