DevOps AI Toolkit

deployment.md•18.1 KiB

# AI Engine Deployment

**Deploy the DevOps AI Toolkit Engine to Kubernetes using the Helm chart for a production-ready deployment.**

> **For the easiest setup**, we recommend installing the complete dot-ai stack which includes all components pre-configured. See the [Stack Installation Guide](https://devopstoolkit.ai/docs/stack).
>
> Continue below if you want to install components individually (for granular control over configuration).

## Overview

The DevOps AI Toolkit Engine provides:

1. **Kubernetes Deployment Recommendations**: AI-powered application deployment assistance with enhanced semantic understanding
2. **Cluster Query**: Natural language interface for querying cluster resources, status, and health
3. **Capability Management**: Discover and store semantic resource capabilities for intelligent recommendation matching
4. **Pattern Management**: Organizational deployment patterns that enhance AI recommendations
5. **Policy Management**: Governance policies that guide users toward compliant configurations with optional Kyverno enforcement
6. **Kubernetes Issue Remediation**: AI-powered root cause analysis and automated remediation
7. **Shared Prompts Library**: Centralized prompt sharing via native slash commands
8. **REST API Gateway**: HTTP endpoints for all toolkit capabilities

Access these tools through [MCP clients](/docs/mcp) or the [CLI](https://devopstoolkit.ai/docs/cli).

## What You Get

- **Production Kubernetes Deployment**: Scalable deployment with proper resource management
- **Integrated Qdrant Database**: Vector database for capability and pattern management
- **External Access**: Ingress configuration for team collaboration
- **Resource Management**: Proper CPU/memory limits and requests
- **Security**: RBAC and ServiceAccount configuration

## Prerequisites

- Kubernetes cluster (1.19+) with kubectl access
- Helm 3.x installed
- AI model API key (default: Anthropic). See [AI Model Configuration](#ai-model-configuration) for available model options.
- OpenAI API key (required for vector embeddings)
- Ingress controller (any standard controller)

## Quick Start (5 Minutes)

### Step 1: Set Environment Variables

Export your API keys and auth token:

```bash
# Required
export ANTHROPIC_API_KEY="sk-ant-api03-..."
export OPENAI_API_KEY="sk-proj-..."
export DOT_AI_AUTH_TOKEN=$(openssl rand -base64 32)

# Ingress class - change to match your ingress controller (traefik, haproxy, etc.)
export INGRESS_CLASS_NAME="nginx"
```

### Step 2: Install the Controller

Install the dot-ai-controller to enable autonomous cluster operations:

```bash
# Set the controller version from https://github.com/vfarcic/dot-ai-controller/pkgs/container/dot-ai-controller%2Fcharts%2Fdot-ai-controller
export DOT_AI_CONTROLLER_VERSION="..."

# Install controller (includes CRDs for Solution and RemediationPolicy)
helm install dot-ai-controller \
  oci://ghcr.io/vfarcic/dot-ai-controller/charts/dot-ai-controller:$DOT_AI_CONTROLLER_VERSION \
  --namespace dot-ai \
  --create-namespace \
  --wait
```

The controller provides CRDs for autonomous cluster operations. Create Custom Resources like CapabilityScanConfig, Solution, RemediationPolicy, or ResourceSyncConfig to enable features such as capability scanning, solution tracking, and more. See the [Controller Setup Guide](https://devopstoolkit.ai/docs/controller/setup-guide) for complete details.

### Step 3: Install the Server

Install the server using the published Helm chart:

```bash
# Set the version from https://github.com/vfarcic/dot-ai/pkgs/container/dot-ai%2Fcharts%2Fdot-ai
export DOT_AI_VERSION="..."

helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION \
  --set secrets.anthropic.apiKey="$ANTHROPIC_API_KEY" \
  --set secrets.openai.apiKey="$OPENAI_API_KEY" \
  --set secrets.auth.token="$DOT_AI_AUTH_TOKEN" \
  --set ingress.enabled=true \
  --set ingress.className="$INGRESS_CLASS_NAME" \
  --set ingress.host="dot-ai.127.0.0.1.nip.io" \
  --set controller.enabled=true \
  --namespace dot-ai \
  --wait
```

**Notes**:

- Replace `dot-ai.127.0.0.1.nip.io` with your desired hostname for external access.
- For enhanced security, create a secret named `dot-ai-secrets` with keys `anthropic-api-key`, `openai-api-key`, and `auth-token` instead of using `--set` arguments.
- For all available configuration options, see the [Helm values file](https://github.com/vfarcic/dot-ai/blob/main/charts/values.yaml).
- **Global annotations**: Add annotations to all Kubernetes resources using `annotations` in your values file (e.g., for [Reloader](https://github.com/stakater/Reloader) integration: `reloader.stakater.com/auto: "true"`).
- **Custom endpoints** (OpenRouter, self-hosted): See [Custom Endpoint Configuration](#custom-endpoint-configuration) for environment variables, then use `--set` or values file with `ai.customEndpoint.enabled=true` and `ai.customEndpoint.baseURL`.
- **Observability/Tracing**: Add tracing environment variables via `extraEnv` in your values file. See [Observability Guide](../operations/observability.md) for complete configuration.
- **User-Defined Prompts**: Load custom prompts from your git repository via `extraEnv`. See [User-Defined Prompts](../tools/prompts.md#user-defined-prompts) for configuration.
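
The note above about a pre-created `dot-ai-secrets` secret can be sketched as follows. This is a minimal illustration, not the chart's documented procedure: it assumes the secret belongs in the `dot-ai` namespace used by the install commands, and the local file name `dot-ai-secrets.yaml` and the `replace-me` placeholders are hypothetical. Whether the chart needs extra values to pick up an existing secret is not stated here, so check the Helm values file before relying on it.

```shell
#!/bin/sh
set -eu

# Render a Secret manifest from the variables exported in Step 1.
# The "replace-me" defaults are placeholders so the sketch runs even
# when the variables are unset; never apply them to a real cluster.
render_secret() {
  cat <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: dot-ai-secrets
  namespace: dot-ai   # assumption: same namespace as the Helm release
type: Opaque
stringData:
  anthropic-api-key: "${ANTHROPIC_API_KEY:-replace-me}"
  openai-api-key: "${OPENAI_API_KEY:-replace-me}"
  auth-token: "${DOT_AI_AUTH_TOKEN:-replace-me}"
EOF
}

# Write the manifest with owner-only permissions.
( umask 077; render_secret > dot-ai-secrets.yaml )

# Apply it before running `helm install`, then delete the local copy:
#   kubectl apply -f dot-ai-secrets.yaml && rm dot-ai-secrets.yaml
```

Using `stringData` avoids base64-encoding the values by hand; Kubernetes encodes them when the object is stored.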
### Step 4: Connect a Client

With the server running, connect using your preferred access method:

- **[MCP Client Setup](/docs/mcp)**: Connect via MCP protocol from Claude Code, Cursor, or other MCP clients
- **[CLI](https://devopstoolkit.ai/docs/cli)**: Use the command-line interface for terminal and CI/CD pipelines

## Capability Scanning for AI Recommendations

Many MCP tools depend on **capability data** to function:

- **recommend**: Uses capabilities to find resources matching your deployment intent
- **manageOrgData** (patterns): References capabilities when applying organizational patterns
- **manageOrgData** (policies): Validates resources against stored capability metadata

Without capability data, these tools may not work or will produce poor results.

### Enabling Capability Scanning

Create a `CapabilityScanConfig` CR to enable autonomous capability discovery. The controller watches for CRD changes and automatically scans new resources.

See the [Capability Scan Guide](https://devopstoolkit.ai/docs/controller/capability-scan-guide) for setup instructions.

## AI Model Configuration

The DevOps AI Toolkit supports multiple AI models. Choose your model by setting the `AI_PROVIDER` environment variable.

### Model Requirements

All AI models must meet these minimum requirements:

- **Context window**: 200K+ tokens (some tools like capability scanning use large context)
- **Output tokens**: 8K+ tokens (for YAML generation and policy creation)
- **Function calling**: Required for MCP tool interactions

### Available Models

| Provider | Model | AI_PROVIDER | API Key Required | Recommended |
|----------|-------|-------------|------------------|-------------|
| **Anthropic** | Claude Haiku 4.5 | `anthropic_haiku` | `ANTHROPIC_API_KEY` | Yes |
| **Anthropic** | Claude Opus 4.6 | `anthropic_opus` | `ANTHROPIC_API_KEY` | Yes |
| **Anthropic** | Claude Sonnet 4.6 | `anthropic` | `ANTHROPIC_API_KEY` | Yes |
| **AWS** | Amazon Bedrock | `amazon_bedrock` | AWS credentials ([see setup](https://docs.aws.amazon.com/cli/latest/userguide/cli-configure-files.html)) | Yes |
| **Google** | Gemini 3.1 Pro | `google` | `GOOGLE_GENERATIVE_AI_API_KEY` | Yes (might be slow) |
| **Google** | Gemini 3 Flash | `google_flash` | `GOOGLE_GENERATIVE_AI_API_KEY` | Yes (preview) |
| **Host** | Host Environment LLM | `host` | None (uses host's AI) | Yes (if supported) |
| **Moonshot AI** | Kimi K2 | `kimi` | `MOONSHOT_API_KEY` | Yes |
| **Moonshot AI** | Kimi K2 Thinking | `kimi_thinking` | `MOONSHOT_API_KEY` | Yes (might be slow) |
| **OpenAI** | GPT-5.1 Codex | `openai` | `OPENAI_API_KEY` | No * |
| **xAI** | Grok-4 | `xai` | `XAI_API_KEY` | No * |

\* **Note**: These models may not perform as well as other providers for complex DevOps reasoning tasks.
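
As a sketch of how the table above might be used before installing, the following helper maps an `AI_PROVIDER` value to the environment variable listed in the "API Key Required" column. The function names `required_key_for` and `check_provider_key` are hypothetical and not part of the toolkit; `amazon_bedrock` and `host` are treated as needing no single key variable, which is an interpretation of the table rather than documented behavior.

```shell
#!/bin/sh
set -eu

# Print the API key variable for a provider, per the Available Models table.
# Prints an empty line for providers without a single API key variable.
required_key_for() {
  case "$1" in
    anthropic|anthropic_haiku|anthropic_opus) echo "ANTHROPIC_API_KEY" ;;
    google|google_flash)                      echo "GOOGLE_GENERATIVE_AI_API_KEY" ;;
    kimi|kimi_thinking)                       echo "MOONSHOT_API_KEY" ;;
    openai)                                   echo "OPENAI_API_KEY" ;;
    xai)                                      echo "XAI_API_KEY" ;;
    amazon_bedrock|host)                      echo "" ;;
    *) echo "unknown provider: $1" >&2; return 1 ;;
  esac
}

# Hypothetical preflight: fail early if the provider's key variable is unset.
check_provider_key() {
  key="$(required_key_for "$1")"
  [ -n "$key" ] || return 0
  # $key comes from the fixed mapping above, so this eval is safe.
  eval "value=\${$key:-}"
  if [ -z "$value" ]; then
    echo "Set $key before installing with ai.provider=$1" >&2
    return 1
  fi
}
```

Calling `check_provider_key anthropic_haiku` before `helm install` returns a non-zero status when `ANTHROPIC_API_KEY` is not exported.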
### Models Not Supported

| Provider | Model | Reason |
|----------|-------|--------|
| **DeepSeek** | DeepSeek V3.2 (`deepseek-chat`) | 128K context limit insufficient for heavy workflows |
| **DeepSeek** | DeepSeek R1 (`deepseek-reasoner`) | 64K context limit insufficient for most workflows |

**Why DeepSeek is not supported**: Integration testing revealed that DeepSeek's context window limitations (128K for V3.2, 64K for R1) cause failures in context-heavy operations like Kyverno policy generation, which can exceed 130K tokens. The toolkit requires 200K+ context for reliable operation across all features.

### Helm Configuration

Set AI provider in your Helm values:

```yaml
ai:
  provider: anthropic_haiku  # or anthropic, anthropic_opus, google, etc.
secrets:
  anthropic:
    apiKey: "your-api-key"
```

Or via `--set`:

```bash
helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION \
  --set ai.provider=anthropic_haiku \
  --set secrets.anthropic.apiKey="$ANTHROPIC_API_KEY" \
  # ... other settings
```

**AI Keys Are Optional**: The MCP server starts successfully without AI API keys. Tools like **Shared Prompts Library** and **REST API Gateway** work without AI. AI-powered tools (deployment recommendations, remediation, pattern/policy management, capability scanning) require AI keys (unless using the `host` provider) and will show helpful error messages when accessed without configuration.

## Embedding Provider Configuration

The DevOps AI Toolkit supports multiple embedding providers for semantic search capabilities in pattern management, capability discovery, and policy matching.

### Available Embedding Providers

| Provider | EMBEDDINGS_PROVIDER | Model | Dimensions | API Key Required |
|----------|-------------------|-------|------------|------------------|
| **Amazon Bedrock** | `amazon_bedrock` | `amazon.titan-embed-text-v2:0` | 1024 | AWS credentials |
| **Google** | `google` | `text-embedding-004` (deprecated) | 768 | `GOOGLE_API_KEY` |
| **Google** | `google` | `gemini-embedding-001` | 768 | `GOOGLE_API_KEY` |
| **OpenAI** | `openai` (default) | `text-embedding-3-small` | 1536 | `OPENAI_API_KEY` |

### Helm Configuration

Set embedding provider via `extraEnv` in your values file:

```yaml
extraEnv:
  - name: EMBEDDINGS_PROVIDER
    value: "google"
  - name: GOOGLE_API_KEY
    valueFrom:
      secretKeyRef:
        name: dot-ai-secrets
        key: google-api-key
```

**Notes:**

- **Same Provider**: If using the same provider for both AI models and embeddings (e.g., `AI_PROVIDER=google` and `EMBEDDINGS_PROVIDER=google`), you only need to set one API key
- **Mixed Providers**: You can use different providers for AI models and embeddings (e.g., `AI_PROVIDER=anthropic` with `EMBEDDINGS_PROVIDER=google`)
- **Embedding Support**: Not all AI model providers support embeddings. Anthropic does not provide embeddings; use OpenAI, Google, or Amazon Bedrock for embeddings
- **Google Deprecation**: `text-embedding-004` will be discontinued on January 14, 2026. Use `gemini-embedding-001` for new deployments. When switching models, you must delete and recreate all embeddings (patterns, capabilities, policies) as vectors from different models are not compatible

## Custom Endpoint Configuration

You can configure custom OpenAI-compatible endpoints for AI models. This enables using alternative providers like OpenRouter, self-hosted models, or air-gapped deployments.

### In-Cluster Ollama Example

Deploy with a self-hosted Ollama service running in the same Kubernetes cluster:

**Create a `values.yaml` file:**

```yaml
ai:
  provider: openai
  model: "llama3.3:70b"  # Your self-hosted model
  customEndpoint:
    enabled: true
    baseURL: "http://ollama-service.default.svc.cluster.local:11434/v1"
secrets:
  customLlm:
    apiKey: "ollama"  # Ollama doesn't require authentication
  openai:
    apiKey: "your-openai-key"  # Still needed for vector embeddings
```

**Install with custom values:**

```bash
helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION \
  --values values.yaml \
  --create-namespace \
  --namespace dot-ai \
  --wait
```

### Other Self-Hosted Options

**vLLM (Self-Hosted):**

```yaml
ai:
  provider: openai
  model: "meta-llama/Llama-3.1-70B-Instruct"
  customEndpoint:
    enabled: true
    baseURL: "http://vllm-service:8000/v1"
secrets:
  customLlm:
    apiKey: "dummy"  # vLLM may not require authentication
  openai:
    apiKey: "your-openai-key"
```

**LocalAI (Self-Hosted):**

```yaml
ai:
  provider: openai
  model: "your-model-name"
  customEndpoint:
    enabled: true
    baseURL: "http://localai-service:8080/v1"
secrets:
  customLlm:
    apiKey: "dummy"
  openai:
    apiKey: "your-openai-key"
```

### OpenRouter Example

OpenRouter provides access to 100+ LLM models from multiple providers:

```yaml
ai:
  provider: openai
  model: "anthropic/claude-3.5-sonnet"
  customEndpoint:
    enabled: true
    baseURL: "https://openrouter.ai/api/v1"
secrets:
  customLlm:
    apiKey: "sk-or-v1-your-key-here"
  openai:
    apiKey: "your-openai-key"  # Still needed for embeddings
```

**Note**: OpenRouter does not support embedding models. Use OpenAI, Google, or Amazon Bedrock for embeddings.

Get your OpenRouter API key at [https://openrouter.ai/](https://openrouter.ai/)

### Important Notes

- **Context window**: 200K+ tokens recommended
- **Output tokens**: 8K+ tokens minimum
- **Function calling**: Must support OpenAI-compatible function calling

**Testing Status:**

- Validated with OpenRouter (alternative SaaS provider)
- Not yet tested with self-hosted Ollama, vLLM, or LocalAI
- We need your help testing! Report results in [issue #193](https://github.com/vfarcic/dot-ai/issues/193)

**Notes:**

- OpenAI API key is still required for vector embeddings (Qdrant operations)
- If model requirements are too high for your setup, please open an issue
- Configuration examples are based on common patterns but not yet validated

## TLS Configuration

To enable HTTPS, add these values (requires [cert-manager](https://cert-manager.io/) with a ClusterIssuer):

```yaml
ingress:
  tls:
    enabled: true
    clusterIssuer: letsencrypt  # Your ClusterIssuer name
```

Then update your `.mcp.json` URL to use `https://`.

## Web UI Visualization

Enable rich visualizations of query results by connecting to a [DevOps AI Web UI](https://github.com/vfarcic/dot-ai-ui) instance. When configured, the query tool includes a `visualizationUrl` field in responses that opens interactive visualizations (resource topology, relationships, health status) in your browser.

### Configuration

Add the Web UI base URL to your Helm values:

```yaml
webUI:
  baseUrl: "https://dot-ai-ui.example.com"  # Your Web UI instance URL
```

Or via `--set`:

```bash
helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION \
  --set webUI.baseUrl="https://dot-ai-ui.example.com" \
  # ... other settings
```

### Feature Toggle Behavior

- **Not configured** (default): Query responses contain only text summaries. No `visualizationUrl` field is included.
- **Configured**: Query responses include a `visualizationUrl` field (format: `{baseUrl}/v/{sessionId}`) that opens the visualization in the Web UI.
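
The `{baseUrl}/v/{sessionId}` format described above can be illustrated with a small helper. This is only a sketch of the string format, not the server's implementation: the function name `visualization_url` is hypothetical, and stripping a trailing slash from the base URL is an assumption made to avoid a doubled `/`.

```shell
#!/bin/sh
set -eu

# Join a Web UI base URL and a session ID as {baseUrl}/v/{sessionId}.
visualization_url() {
  base="${1%/}"   # assumption: drop one trailing slash, if present
  printf '%s/v/%s\n' "$base" "$2"
}

visualization_url "https://dot-ai-ui.example.com" "abc123-session-id"
# prints: https://dot-ai-ui.example.com/v/abc123-session-id
```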
### Example Query Response

When `webUI.baseUrl` is configured, query responses include:

```text
**View visualization**: https://dot-ai-ui.example.com/v/abc123-session-id
```

This URL opens an interactive visualization of the query results in the Web UI.

## Gateway API (Alternative to Ingress)

For Kubernetes 1.26+, you can use **Gateway API v1** for advanced traffic management with role-oriented design (platform teams manage Gateways, app teams create routes).

### When to Use

**Use Gateway API when:**

- Running Kubernetes 1.26+ with Gateway API support
- Need advanced routing (weighted traffic, header-based routing)
- Prefer separation of infrastructure and application concerns

**Use Ingress when:**

- Running Kubernetes < 1.26
- Simpler requirements met by Ingress features

### Prerequisites

- Kubernetes 1.26+ cluster
- Gateway API CRDs installed: `kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.4.1/standard-install.yaml`
- Gateway controller running (Istio, Envoy Gateway, Kong, etc.)
- Existing Gateway resource created by platform team (reference pattern)

### Quick Start (Reference Pattern - RECOMMENDED)

Reference an existing platform-managed Gateway:

```bash
helm install dot-ai-mcp oci://ghcr.io/vfarcic/dot-ai/charts/dot-ai:$DOT_AI_VERSION \
  --set secrets.anthropic.apiKey="$ANTHROPIC_API_KEY" \
  --set secrets.openai.apiKey="$OPENAI_API_KEY" \
  --set secrets.auth.token="$DOT_AI_AUTH_TOKEN" \
  --set ingress.enabled=false \
  --set gateway.name="cluster-gateway" \
  --set gateway.namespace="gateway-system" \
  --namespace dot-ai \
  --wait
```

### Configuration Reference

```yaml
# Reference pattern (RECOMMENDED)
gateway:
  name: "cluster-gateway"       # Existing Gateway name
  namespace: "gateway-system"   # Gateway namespace (optional)
  timeouts:
    request: "3600s"            # SSE streaming timeout
    backendRequest: "3600s"

# Creation pattern (development/testing only)
gateway:
  create: true                  # Create Gateway (NOT for production)
  className: "istio"            # GatewayClass name
```

### Complete Guide

See **[Gateway API Deployment Guide](gateway-api.md)** for:

- Platform team Gateway setup (HTTP and HTTPS)
- Application team deployment steps
- Cross-namespace access (ReferenceGrant)
- Development/testing creation pattern
- Troubleshooting and verification
- Migration from Ingress

## Next Steps

Once the server is running:

### 1. Explore Tools

- **[Tools Overview](../tools/overview.md)**: Complete guide to all available tools, how they work together, and recommended usage flow

### 2. Enable Observability (Optional)

- **[Observability Guide](../operations/observability.md)**: Distributed tracing with OpenTelemetry for debugging workflows, measuring AI performance, and monitoring Kubernetes operations

### 3. Production Considerations

- Consider backup strategies for vector database content (organizational patterns and capabilities)
- Review [TLS Configuration](#tls-configuration) for HTTPS

## Support

- **Bug Reports**: [GitHub Issues](https://github.com/vfarcic/dot-ai/issues)
