---
description: "DevOps and infrastructure: Docker, Kubernetes, Terraform, CI/CD pipelines, and cloud deployment patterns"
globs: ["**/Dockerfile*", "**/*.tf", "**/docker-compose*.yml", "**/docker-compose*.yaml", "**/k8s/**", "**/.github/workflows/**", "**/.gitlab-ci.yml", "**/Jenkinsfile", "**/cloudbuild.yaml", "**/terraform/**", "**/helm/**"]
alwaysApply: false
---

# DevOps & Infrastructure Patterns

Containerization, orchestration, infrastructure as code, and CI/CD best practices.

## CRITICAL: Agentic-First DevOps

### Pre-Development Verification (MANDATORY)

Before writing ANY DevOps configuration:

```
1. CHECK TOOL AVAILABILITY
   → run_terminal_cmd("docker --version")
   → run_terminal_cmd("kubectl version --client")
   → run_terminal_cmd("terraform --version")
   → run_terminal_cmd("helm version")

2. VERIFY CURRENT VERSIONS (use web_search)
   → web_search("Terraform latest version December 2024")
   → web_search("Kubernetes latest stable version 2024")
   → web_search("Docker best practices 2024")

3. CHECK EXISTING INFRASTRUCTURE
   → Read existing Dockerfile, docker-compose.yml, *.tf files
   → Understand current state before modifying
   → Check terraform.tfstate or remote state

4. VALIDATE CONFIGURATIONS BEFORE APPLYING
   → terraform validate
   → docker-compose config
   → kubectl apply --dry-run=client
```

### CLI-First DevOps Workflow

**ALWAYS use CLI for validation:**

```bash
# Docker
docker build -t test:latest .
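# Optional sanity check (sketch): confirm the image was actually produced
docker image inspect test:latest --format '{{.Id}}'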
 
docker-compose config                                  # Validate compose file
docker-compose up --dry-run                            # Test without running

# Kubernetes
kubectl apply --dry-run=client -f manifest.yaml
kubectl diff -f manifest.yaml                          # See changes before applying
kubeval manifest.yaml                                  # Validate against schema

# Terraform
terraform init
terraform fmt -recursive
terraform validate
terraform plan -out=tfplan                             # ALWAYS plan before apply

# Helm
helm lint ./my-chart
helm template ./my-chart                               # Render templates locally
helm install --dry-run --debug my-release ./my-chart
```

### Post-Edit Verification

After ANY infrastructure code changes:

```bash
# Docker
docker build --no-cache -t test:latest .
docker run --rm test:latest echo "Build verified"

# Terraform
terraform fmt -check -recursive
terraform validate
terraform plan

# Kubernetes
kubectl apply --dry-run=server -f .              # Server-side validation
kubectl get events --sort-by='.lastTimestamp'    # Check for issues
```

### Common DevOps Syntax Traps (Avoid These!)

```yaml
# WRONG: YAML indentation with tabs
services:
  app:              # Tab character - YAML error!
    image: nginx

# CORRECT: Always use spaces (2 spaces standard)
services:
  app:
    image: nginx

# WRONG: Missing quotes for special values
environment:
  - VERSION=1.0     # Might be parsed as number
  - ENABLED=true    # Might be parsed as boolean

# CORRECT: Quote string values
environment:
  - VERSION="1.0"
  - ENABLED="true"

# WRONG: Hardcoded secrets in config
env:
  - name: DB_PASSWORD
    value: "supersecret123"    # NEVER do this!
# CORRECT: Use secrets
env:
  - name: DB_PASSWORD
    valueFrom:
      secretKeyRef:
        name: db-secrets
        key: password
```

### Infrastructure Version Pinning

Always pin versions explicitly:

```dockerfile
# WRONG
FROM node:latest
FROM python

# CORRECT - Pin major.minor at minimum
FROM node:20-alpine
FROM python:3.12-slim
```

```hcl
# WRONG
terraform {
  required_providers {
    aws = {
      source = "hashicorp/aws"
    }
  }
}

# CORRECT - Pin provider versions
terraform {
  required_version = ">= 1.6.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}
```

---

## Docker

### Dockerfile Best Practices

```dockerfile
# Use specific version tags
FROM node:20-alpine AS builder

# Set working directory
WORKDIR /app

# Copy dependency files first (better caching)
COPY package*.json ./

# Install all dependencies (dev dependencies are needed for the build step)
RUN npm ci

# Copy source code
COPY . .

# Build application
RUN npm run build

# Drop dev dependencies so only runtime packages reach the final image
RUN npm prune --omit=dev

# Production stage
FROM node:20-alpine AS production

WORKDIR /app

# Create non-root user
RUN addgroup -g 1001 -S nodejs && \
    adduser -S nodejs -u 1001

# Copy built assets from builder
COPY --from=builder --chown=nodejs:nodejs /app/dist ./dist
COPY --from=builder --chown=nodejs:nodejs /app/node_modules ./node_modules

# Switch to non-root user
USER nodejs

# Expose port
EXPOSE 3000

# Health check
HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
    CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1

# Run application
CMD ["node", "dist/main.js"]
```

### Multi-Stage Builds

```dockerfile
# Build stage
FROM golang:1.21-alpine AS builder

WORKDIR /app

COPY go.* ./
RUN go mod download

COPY . .
RUN CGO_ENABLED=0 GOOS=linux go build -o /app/server ./cmd/server

# Final stage
FROM alpine:3.18

RUN apk --no-cache add ca-certificates

WORKDIR /app
COPY --from=builder /app/server .

EXPOSE 8080
ENTRYPOINT ["./server"]
```

### Docker Compose

```yaml
version: '3.8'

services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
      target: development
    ports:
      - "3000:3000"
    volumes:
      - .:/app
      - /app/node_modules
    environment:
      - NODE_ENV=development
      - DATABASE_URL=postgres://user:pass@db:5432/mydb
    depends_on:
      db:
        condition: service_healthy
    networks:
      - backend

  db:
    image: postgres:15-alpine
    environment:
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
      POSTGRES_DB: mydb
    volumes:
      - postgres_data:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U user -d mydb"]
      interval: 5s
      timeout: 5s
      retries: 5
    networks:
      - backend

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"
    networks:
      - backend

volumes:
  postgres_data:

networks:
  backend:
    driver: bridge
```

---

## Kubernetes

### Deployment

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  labels:
    app: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myregistry/myapp:v1.0.0
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "128Mi"
              cpu: "100m"
            limits:
              memory: "256Mi"
              cpu: "500m"
          livenessProbe:
            httpGet:
              path: /health
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /ready
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: myapp-secrets
                  key: database-url
          volumeMounts:
            - name: config
              mountPath: /app/config
      volumes:
        - name: config
          configMap:
            name: myapp-config
```

### Service

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 8080
  type: ClusterIP
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-ingress
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: nginx    # replaces the deprecated kubernetes.io/ingress.class annotation
  tls:
    - hosts:
        - myapp.example.com
      secretName: myapp-tls
  rules:
    - host: myapp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: myapp
                port:
                  number: 80
```

### ConfigMap and Secrets

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: myapp-config
data:
  APP_ENV: production
  LOG_LEVEL: info
  config.yaml: |
    server:
      port: 8080
    features:
      cache: true
---
apiVersion: v1
kind: Secret
metadata:
  name: myapp-secrets
type: Opaque
stringData:
  database-url: postgres://user:pass@db:5432/mydb
  api-key: your-secret-api-key
```

### Horizontal Pod Autoscaler

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```

---

## Terraform

### Provider Configuration

```hcl
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }

  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-west-2"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

provider "aws" {
  region = var.aws_region

  default_tags {
    tags = {
      Environment = var.environment
      Project     = var.project_name
      ManagedBy   = "terraform"
    }
  }
}
```

### Variables and Outputs

```hcl
# variables.tf
variable "environment" {
  description = "Deployment environment"
  type        = string

  validation {
    condition     = contains(["dev", "staging", "prod"], var.environment)
    error_message = "Environment must be dev, staging, or prod."
  }
}

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"
}

variable "db_config" {
  description = "Database configuration"
  type = object({
    instance_class = string
    storage_gb     = number
    multi_az       = bool
  })
  default = {
    instance_class = "db.t3.micro"
    storage_gb     = 20
    multi_az       = false
  }
}

# outputs.tf
output "api_endpoint" {
  description = "API Gateway endpoint URL"
  value       = aws_apigatewayv2_api.main.api_endpoint
}

output "database_endpoint" {
  description = "RDS endpoint"
  value       = aws_db_instance.main.endpoint
  sensitive   = true
}
```

### Modules

```hcl
# modules/vpc/main.tf
resource "aws_vpc" "main" {
  cidr_block           = var.cidr_block
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = "${var.project}-vpc"
  }
}

resource "aws_subnet" "public" {
  count = length(var.availability_zones)

  vpc_id                  = aws_vpc.main.id
  cidr_block              = cidrsubnet(var.cidr_block, 4, count.index)
  availability_zone       = var.availability_zones[count.index]
  map_public_ip_on_launch = true

  tags = {
    Name = "${var.project}-public-${count.index + 1}"
    Type = "public"
  }
}

# Usage in root module
module "vpc" {
  source = "./modules/vpc"

  project            = var.project_name
  cidr_block         = "10.0.0.0/16"
  availability_zones = ["us-west-2a", "us-west-2b"]
}

module "ecs" {
  source = "./modules/ecs"

  vpc_id     = module.vpc.vpc_id
  subnet_ids = module.vpc.public_subnet_ids

  depends_on = [module.vpc]
}
```

---

## GitHub Actions

### CI Pipeline

```yaml
name: CI

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Lint
        run: npm run lint

      - name: Test
        run: npm test -- --coverage

      - name: Upload coverage
        uses: codecov/codecov-action@v3

  build:
    needs: test
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4

      - name: Log in to Container Registry
        uses: docker/login-action@v3
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
          tags: |
            type=sha,prefix=
            type=ref,event=branch
            type=semver,pattern={{version}}

      - name: Build and push
        uses: docker/build-push-action@v5
        with:
          context: .
          push: ${{ github.event_name != 'pull_request' }}
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
```

### CD Pipeline

```yaml
name: Deploy

on:
  push:
    tags:
      - 'v*'

jobs:
  deploy-staging:
    runs-on: ubuntu-latest
    environment: staging
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-west-2

      - name: Deploy to ECS
        run: |
          aws ecs update-service \
            --cluster staging-cluster \
            --service myapp \
            --force-new-deployment

  deploy-production:
    needs: deploy-staging
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-west-2

      - name: Deploy to ECS
        run: |
          aws ecs update-service \
            --cluster production-cluster \
            --service myapp \
            --force-new-deployment
```

---

## Monitoring & Observability

### Prometheus Metrics

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
data:
  prometheus.yml: |
    global:
      scrape_interval: 15s
    scrape_configs:
      - job_name: 'kubernetes-pods'
        kubernetes_sd_configs:
          - role: pod
        relabel_configs:
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
            action: keep
            regex: true
          - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
            action: replace
            target_label: __metrics_path__
            regex: (.+)
```

### Application Logging

```yaml
# Fluent Bit configuration
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluent-bit-config
data:
  fluent-bit.conf: |
    [SERVICE]
        Flush             1
        Log_Level         info
        Daemon            off
        Parsers_File      parsers.conf

    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        Parser            docker
        Tag               kube.*
        Refresh_Interval  5

    [FILTER]
        Name              kubernetes
        Match             kube.*
        Kube_URL          https://kubernetes.default.svc:443
        Kube_Tag_Prefix   kube.var.log.containers.

    [OUTPUT]
        Name              es
        Match             *
        Host              elasticsearch
        Port              9200
        Index             logs
```

---

## Security Best Practices

### Container Security

```dockerfile
# Use distroless or minimal base images
FROM gcr.io/distroless/nodejs20-debian12

# Never run as root
USER nonroot:nonroot

# Use specific versions, not latest
FROM node:20.10.0-alpine3.18

# Scan images in CI
# - trivy image myapp:latest
# - grype myapp:latest
```

### Secrets Management

```yaml
# External Secrets Operator
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: myapp-secrets
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: ClusterSecretStore
    name: aws-secrets-manager
  target:
    name: myapp-secrets
    creationPolicy: Owner
  data:
    - secretKey: database-url
      remoteRef:
        key: prod/myapp/database
        property: url
```

### Network Policies

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: myapp-network-policy
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: nginx-ingress
      ports:
        - protocol: TCP
          port: 8080
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: postgres
      ports:
        - protocol: TCP
          port: 5432
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
```

---

## Common Commands

### Docker

```bash
# Build and run
docker build -t myapp .
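# Sketch: tag with the current git commit instead of relying on an implicit latest
docker build -t "myapp:$(git rev-parse --short HEAD)" .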
docker run -p 3000:3000 myapp

# Debug container
docker exec -it <container_id> sh
docker logs -f <container_id>

# Clean up
docker system prune -af
```

### Kubernetes

```bash
# Apply resources
kubectl apply -f k8s/

# Debug pod
kubectl logs -f <pod_name>
kubectl exec -it <pod_name> -- sh
kubectl describe pod <pod_name>

# Rollout management
kubectl rollout status deployment/myapp
kubectl rollout undo deployment/myapp
```

### Terraform

```bash
# Initialize and plan
terraform init
terraform plan -out=tfplan

# Apply changes
terraform apply tfplan

# Destroy resources
terraform destroy

# Format and validate
terraform fmt -recursive
terraform validate
```
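The per-tool commands above can be chained into a single non-destructive pre-flight gate. A minimal sketch, assuming docker, terraform, and kubectl are on `PATH`; the script name `validate-infra.sh`, the `preflight:local` tag, and the `k8s/` directory layout are assumptions to adapt to your repo:

```shell
#!/usr/bin/env bash
# validate-infra.sh - run the cheap, non-destructive checks from this guide
# before any apply/deploy. Each tool is only exercised when its files exist
# in the current directory, so the script is safe in mixed repos.
set -euo pipefail

if [ -f Dockerfile ]; then
  docker build -t preflight:local .        # fails fast on Dockerfile errors
fi

if compgen -G '*.tf' > /dev/null; then
  terraform fmt -check -recursive          # formatting drift fails the gate
  terraform validate
fi

if [ -d k8s ]; then
  kubectl apply --dry-run=client -f k8s/   # client-side schema check only
fi

echo "All pre-flight checks passed"
```

Because nothing here applies or deploys, the script can run on every commit or as the first step of a CI job without touching shared infrastructure.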
