Kubernetes Deployment¶
Deploy AKKO on any Kubernetes cluster using the official Helm umbrella chart. Kubernetes is the primary and recommended deployment method for AKKO -- both for local development (via k3d) and production environments with high availability, horizontal scaling, and automated operations.
Why Kubernetes?¶
| Capability | Kubernetes |
|---|---|
| High availability | Multi-node, pod anti-affinity |
| Horizontal scaling | HPA, replica sets |
| Rolling updates | Zero-downtime rollouts |
| Secret management | K8s Secrets, Vault, Sealed Secrets |
| TLS certificates | cert-manager (Let's Encrypt) |
| Storage | CSI, persistent volumes |
| Monitoring | kube-prometheus-stack, native metrics |
| Multi-tenancy | Namespaces, RBAC, quotas |
| GPU scheduling | Device plugin, node selectors |
Prerequisites¶
Before deploying AKKO on Kubernetes, ensure you have:
| Tool | Version | Purpose |
|---|---|---|
kubectl |
1.28+ | Cluster management |
helm |
3.12+ | Chart deployment |
| Kubernetes cluster | 1.28+ | Any CNCF conformant distribution |
| cert-manager | 1.13+ | TLS certificate automation (optional) |
| Storage class | Any | PVC provisioning for stateful services |
Cluster sizing
For the minimal profile (core services only): 8 CPU, 16 GB RAM. For the standard profile (all non-governance): 12 CPU, 24 GB RAM. For the governance profile (includes OpenMetadata): 16 CPU, 32 GB RAM.
Quick Start with Helm¶
1. Add Helm Repositories¶
helm repo add traefik https://traefik.github.io/charts
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add minio https://charts.min.io/
helm repo add trino https://trinodb.github.io/charts
helm repo add apache-airflow https://airflow.apache.org
helm repo add superset https://apache.github.io/superset
helm repo add jupyterhub https://hub.jupyter.org/helm-chart/
helm repo add prometheus https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo add openmetadata https://open-metadata.github.io/openmetadata-helm-charts
helm repo update
2. Prepare Your Values File¶
Start from one of the provided examples:
Single replicas, small PVCs, self-signed TLS. Suitable for laptops and CI.
HA replicas, production PVC sizes, cert-manager TLS, node affinity.
Pre-configured for OVHcloud Managed Kubernetes with csi-cinder-high-speed
storage class and OVH LoadBalancer annotations.
Edit my-values.yaml to set at minimum:
global.domain-- your DNS domain (e.g.,akko.example.com)- Database passwords (PostgreSQL, object storage, Keycloak)
- TLS issuer name (if using cert-manager)
3. Install¶
4. Verify¶
# Watch pods come up
kubectl -n akko get pods -w
# Check all services have endpoints
kubectl -n akko get svc
# Open the cockpit (once DNS is configured)
open https://akko.example.com
All pods should reach Running status within 5--10 minutes on first install
(container images are pulled from public registries).
Choosing a Deployment Profile¶
AKKO supports three deployment profiles, controlled by enabling or disabling
sub-charts in your values.yaml:
Minimal¶
Core analytics stack only. Best for local development, CI/CD testing, or resource-constrained environments.
Enabled: PostgreSQL, object storage, Polaris, Trino, Spark, Airflow, Superset, JupyterHub, Keycloak, Cockpit, Docs, Prometheus, Dashboards
Disabled: OpenMetadata, OpenSearch, logs layer, Ollama
Resources: ~8 GB RAM, ~4 CPU cores
Standard (default)¶
Full platform with monitoring and AI capabilities. Recommended starting point for teams evaluating AKKO.
Additionally enabled: Ollama (local LLM), logs layer (log aggregation), log shipper, Alertmanager
Resources: ~20 GB RAM, ~12 CPU cores
Governance¶
Everything, including the data catalog (OpenMetadata) and search engine (OpenSearch). For organizations that need data lineage, quality testing, glossary management, and data product governance.
Additionally enabled: OpenMetadata Server, OpenMetadata Ingestion, OpenSearch
Resources: ~28 GB RAM, ~16 CPU cores
Governance profile is resource-intensive
OpenMetadata + OpenSearch require an additional ~4 GB of RAM. On small clusters, this can cause OOM kills. Dedicate specific nodes or increase cluster capacity before enabling.
Platform-Specific Guides¶
On-Premises¶
k3s¶
k3s ships with Traefik as the default ingress controller. Disable the chart's Traefik and use the built-in one:
k3s also includes a local-path provisioner for storage. For production, consider Longhorn or Rook-Ceph.
kubeadm¶
Standard deployment. Install an ingress controller (Traefik or nginx-ingress) and a CSI storage driver before deploying AKKO.
RKE2¶
RKE2 includes nginx-ingress by default:
OVHcloud Managed Kubernetes¶
OVHcloud provides a managed Kubernetes service with:
- Automatic node scaling
csi-cinder-high-speedstorage class (SSD)- Public Load Balancer integration
- GDPR-compliant European data centers
See the full example: helm/examples/values-ovhcloud.yaml
OpenShift¶
OpenShift uses Routes instead of Ingress and enforces Security Context Constraints (SCC). The OpenShift values file handles these differences:
# Grant required SCCs
oc adm policy add-scc-to-user nonroot-v2 -z akko-postgresql -n akko
oc adm policy add-scc-to-user nonroot-v2 -z akko-minio -n akko
# Deploy
helm install akko ./helm/akko -n akko \
-f helm/examples/values-openshift.yaml
See the full example: helm/examples/values-openshift.yaml
Storage¶
Every stateful service uses a PersistentVolumeClaim. Key volumes:
| Service | Purpose | Dev Size | Prod Size |
|---|---|---|---|
| PostgreSQL | Shared relational database | 5 Gi | 50 Gi |
| object storage | S3 data lake (Iceberg tables) | 5 Gi | 100 Gi |
| Prometheus | Metrics history | 2 Gi | 50 Gi |
| logs layer | Log aggregation | 2 Gi | 50 Gi |
| Ollama | LLM model files | 5 Gi | 20 Gi |
| Airflow | DAG logs | 1 Gi | 10 Gi |
| OpenSearch | Governance catalog index | 2 Gi | 30 Gi |
Use an SSD-backed storage class for PostgreSQL, object storage, and OpenSearch. HDD storage is acceptable for logs (logs layer, Airflow) and Prometheus.
Monitoring and Observability¶
AKKO deploys a full monitoring stack via the kube-prometheus-stack chart:
- Prometheus -- scrapes metrics from all AKKO services
- Dashboards -- pre-provisioned dashboards for every service
- Alertmanager -- routes alerts to Slack, email, or PagerDuty
- logs layer + log shipper -- centralized log aggregation
- OpenLineage -- data pipeline lineage (console transport by default)
Access Dashboards at https://grafana.<your-domain>.
Pre-Built Dashboards¶
AKKO ships with Dashboards for:
- Cluster overview (CPU, memory, network)
- Trino query performance
- Spark job metrics
- Airflow DAG execution
- PostgreSQL connections and queries
- object storage utilization
- Keycloak authentication events
Troubleshooting¶
Pods stuck in Pending¶
Common causes:
- Insufficient CPU/memory on nodes (scale up or reduce resource requests)
- No matching storage class (check kubectl get storageclass)
- Node selector mismatch (verify node labels)
Pods in CrashLoopBackOff¶
Common causes: - Database connection failures (PostgreSQL not ready yet -- check depends_on) - Incorrect passwords (mismatch between generated secrets and PV data) - OOM kills (increase memory limits)
TLS certificate issues¶
# Check cert-manager certificate status
kubectl -n akko get certificate
kubectl -n akko describe certificate <name>
# Check cert-manager logs
kubectl -n cert-manager logs deploy/cert-manager
Reset a service¶
# Delete the deployment and PVC, then let Helm recreate
kubectl -n akko delete deploy <service-name>
kubectl -n akko delete pvc <pvc-name>
helm upgrade akko ./helm/akko -n akko -f my-values.yaml
Further Reading¶
- Full Helm chart documentation -- complete reference with architecture diagram, scaling guide, and upgrade procedures
- Architecture overview -- how services connect
- Configuration reference -- environment variables and tuning
- Troubleshooting guide -- common issues and fixes