Client install — AKKO on your own Kubernetes cluster
This page is the canonical install guide for customers and partners running AKKO on their own Kubernetes cluster. It targets the install-akko-client.sh one-command path : a single script turns a freshly provisioned cluster into a fully functional AKKO platform under one wildcard domain.
Audience : platform engineers, SREs and admins who own the target cluster. Time budget : 30 minutes hands-on, plus 10 to 25 minutes of automated wait time for pods to come online.
Five-command summary
git clone https://github.com/AKKO-p/AKKO && cd AKKO
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.5/cert-manager.yaml
kubectl wait --for=condition=Available --timeout=180s deployment --all -n cert-manager
# Option A — production : customer brings own AD/LDAP
bash helm/scripts/install-akko-client.sh --domain=mycorp.example \
--customer-values=values-mycorp.yaml
# Option B — demo / PoC : bundled LLDAP with 50 personas
bash helm/scripts/install-akko-client.sh --domain=mycorp.example --with-demo-ldap
kubectl -n akko get pods
1. Prerequisites
The install script enforces preflight checks (Phase 1) and aborts with a clear message if any of the items below is missing. Provision them first.
1.1 Cluster
| Requirement | Minimum | Notes |
|---|---|---|
| Kubernetes API | 1.28+ | Tested on 1.28 to 1.30. EKS / GKE / AKS / OVHcloud Managed / OpenShift 4.14+ / vanilla kubeadm / k3s / k3d all supported. |
| Worker nodes | 2 nodes | 1 node works for dev but leaves no spare capacity. |
| CPU per worker | 8 vCPU | 4 vCPU minimum, but Spark / Trino / OpenMetadata will queue. |
| Memory per worker | 32 GB | 24 GB minimum. OpenMetadata + OpenSearch alone need 2.5 GB. |
| Container runtime | containerd 1.7+ or CRI-O 1.28+ | Docker shim no longer supported by upstream Kubernetes. |
| Architecture | amd64 or arm64 | Apple Silicon clusters supported. All custom images are multi-arch. |
1.2 Storage
| Requirement | Detail |
|---|---|
| Default StorageClass | One must be marked storageclass.kubernetes.io/is-default-class=true. The preflight warns if missing. |
| Access mode | ReadWriteOnce is sufficient : every PVC is bound to a single pod. |
| Volume expansion | Recommended (allowVolumeExpansion: true) so you can grow Postgres without a snapshot. |
| Capacity headroom | 200 GB across the cluster for a small demo, 1 TB+ for production. |
A managed CSI driver (gp3 on EKS, pd-balanced on GKE, premium-LRS on AKS, Longhorn on bare-metal) works out of the box.
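To confirm the default StorageClass before launching the install, a check along these lines may help. It is a minimal sketch, not the script's own preflight : the helper name default_storage_classes is hypothetical, and it assumes the usual kubectl get storageclass table layout, where the default class carries a (default) marker right after its name.

```shell
# default_storage_classes: reads `kubectl get storageclass --no-headers`
# output on stdin and prints the name of every class flagged "(default)".
# Exits non-zero when no class is flagged.
default_storage_classes() {
  awk '$2 == "(default)" { print $1; found = 1 }
       END { exit (found ? 0 : 1) }'
}

# Usage against a live cluster (not executed here):
#   kubectl get storageclass --no-headers | default_storage_classes
```

No output combined with a non-zero exit status means no class is marked as default.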
1.3 Networking
| Requirement | Detail |
|---|---|
| Ingress controller | Traefik 2 (deployed by the chart) or your own controller. If you bring your own, disable the bundled Traefik via traefik.enabled=false and point a wildcard *.<domain> at your LoadBalancer. |
| Wildcard DNS | One *.<domain> A or CNAME record pointing to the cluster ingress. |
| Outbound HTTPS | The cluster needs to reach the chosen container registry (Harbor mirror or Docker Hub / GHCR). Air-gapped installs are documented separately in the admin guide. |
| TLS | cert-manager 1.14+ with a working ClusterIssuer. The preflight warns when cert-manager is missing. |
1.4 CLI tooling on the install host
| Tool | Minimum | Install |
|---|---|---|
| helm | 3.14 | brew install helm or your package manager |
| kubectl | 1.28 | brew install kubectl |
| git | any | apt install git / brew install git |
| openssl | 1.1 or 3.x | preinstalled on Linux and macOS |
| bash | 4+ | macOS ships with 3.2 — install GNU bash via Homebrew when running from a Mac |
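Verify the installed versions, for example with helm version, kubectl version --client and bash --version.
Expected output : helm reports v3.14.x or later, kubectl reports v1.28.x or later, bash reports 4.x or 5.x.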
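The comparison against the minimums can also be scripted. The sketch below is an illustration rather than part of the install script : version_ge and report are hypothetical helper names, the minimums come from the table above, and each tool is probed only if it is found on PATH.

```shell
# version_ge A B: succeeds when dotted version A is >= B.
# Compares the first three numeric fields; missing fields count as 0.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    split(a, x, "."); split(b, y, ".")
    for (i = 1; i <= 3; i++) {
      if (x[i] + 0 > y[i] + 0) exit 0
      if (x[i] + 0 < y[i] + 0) exit 1
    }
    exit 0
  }'
}

# report <tool> <minimum> <detected>: prints one status line per tool.
report() {
  if version_ge "$3" "$2"; then
    echo "OK   $1 $3 (minimum $2)"
  else
    echo "FAIL $1 $3 (minimum $2)"
  fi
}

# Probe only the tools that are installed; strip the leading "v".
if command -v helm >/dev/null 2>&1; then
  report helm 3.14 "$(helm version --short | sed 's/^v//; s/+.*//')"
fi
if command -v kubectl >/dev/null 2>&1; then
  report kubectl 1.28 "$(kubectl version --client -o json \
    | sed -n 's/.*"gitVersion": *"v\([^"]*\)".*/\1/p' | head -n 1)"
fi
if command -v bash >/dev/null 2>&1; then
  report bash 4.0 "$(bash -c 'echo "${BASH_VERSINFO[0]}.${BASH_VERSINFO[1]}"')"
fi
```

Any FAIL line points at a tool to upgrade before running the installer.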
2. Identity provider configuration
Sprint 61.1 (ADR-045) split AKKO into three perimeters. Knowing which is which determines how you configure user login.
+----------------------------+---------------------------+--------------------------+
| Perimeter | Namespace | Who owns it |
+----------------------------+---------------------------+--------------------------+
| AKKO core | akko | AKKO ships it |
| | | (this umbrella chart) |
| Identity (LDAP / AD) | <customer choice> | CUSTOMER brings own AD |
| Data sources (DB / lake) | <customer choice> | CUSTOMER brings own data |
+----------------------------+---------------------------+--------------------------+
The AKKO core chart NEVER ships an LDAP server. Production installs federate Keycloak against the customer's existing AD / OpenLDAP / 389DS directory, or broker Microsoft Entra ID as an OIDC identity provider (see the note at the end of section 2.1). Two installation modes are supported.
2.1 Option A — Customer brings their own AD / LDAP (production)
Use this path when you run AKKO on top of an existing corporate directory. Pre-requisites :
| Item | Detail |
|---|---|
| Reachable LDAP endpoint | ldap:// (port 389) or ldaps:// (port 636) reachable from the akko namespace. Open the relevant EgressNetworkPolicy or firewall hole. |
| Service account | A bind DN with read-only access to the user subtree. Never reuse a human admin DN. |
| Read-only access | Keycloak operates the federation in READ_ONLY mode by default. AKKO never writes back to the customer directory. |
| Group → role mapping | The customer publishes the AD/LDAP groups that should grant access to each AKKO platform role. |
Step 1 — Stage the LDAP bind password. The install script has two options ; pick one.
Option 1.a — --ldap-bind-pwd-file (recommended, Sprint 119). Drop the bind password into a single-line file with mode 0600 and pass its path to the install script. The script creates the akko-keycloak-bind Secret in the akko namespace before helm install runs, and the pre-flight check verifies the file is non-empty.
# 1. Write the bind password to a secured file (chmod 0600).
umask 077
printf '%s' '<your-LDAP-bind-password>' > /tmp/akko-ldap-bind.txt
# 2. Pass the file via the new flag (see Step 3 below).
# The script will stage Secret akko/akko-keycloak-bind for you.
Option 1.b — pre-create the Secret manually. Useful when you wire credentials through Sealed Secrets, External Secrets Operator, Vault, etc.
kubectl create namespace akko --dry-run=client -o yaml | kubectl apply -f -
kubectl -n akko create secret generic akko-keycloak-bind \
--from-literal=bind-password='<your-LDAP-bind-password>'
Either path works ; the chart only needs the Secret to exist by the time the post-install user-federation hook runs. The Sprint 119 pre-flight check ABORTS the install with a clear remediation message if userFederation.enabled: true is set in --customer-values but neither option above was applied.
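For Option 1.a, the file can be sanity-checked before it is handed to the installer. The following is a minimal sketch of such a check, not the script's actual pre-flight code : check_bind_pwd_file is a hypothetical helper, and accepting mode 400 in addition to 600 is an assumption.

```shell
# check_bind_pwd_file <path>: succeeds when the file exists, is non-empty,
# contains at most one newline character, and has mode 600 or 400.
check_bind_pwd_file() {
  f=$1
  [ -s "$f" ] || { echo "bind password file missing or empty: $f" >&2; return 1; }
  # A file written with printf '%s' has no newline; one trailing newline is tolerated.
  lines=$(wc -l < "$f")
  [ "$((lines))" -le 1 ] || { echo "expected a single line in $f" >&2; return 1; }
  # GNU stat takes -c, BSD/macOS stat takes -f.
  mode=$(stat -c '%a' "$f" 2>/dev/null || stat -f '%Lp' "$f")
  case $mode in
    600|400) return 0 ;;
    *) echo "mode $mode on $f is too open, expected 600" >&2; return 1 ;;
  esac
}

# Usage (not executed here):
#   check_bind_pwd_file /tmp/akko-ldap-bind.txt && echo "bind password file looks fine"
```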
Step 2 — Copy the customer values template and uncomment / fill in the akko-keycloak.userFederation block :
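One possible way to do the copy, assuming you run from the repository root. The template path is the one named in the flag reference (section 4) and values-mycorp.yaml is the file name used in the install commands of this guide ; stage_customer_values is a hypothetical helper that refuses to overwrite a file you already filled in.

```shell
# stage_customer_values <template> <target>: copies the template unless the
# target already exists, so a filled-in values file is never overwritten.
stage_customer_values() {
  if [ -e "$2" ]; then
    echo "keeping existing $2"
  else
    cp "$1" "$2" && echo "created $2 from $1"
  fi
}

# Usage from the repository root (not executed here):
#   stage_customer_values helm/examples/values-customer-template.yaml values-mycorp.yaml
```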
Active Directory example :
akko-keycloak:
userFederation:
enabled: true
provider: ldap
vendor: ad
displayName: corp-ad
url: "ldaps://dc01.corp.local:636"
bindDn: "CN=svc-keycloak,OU=Service Accounts,DC=corp,DC=local"
bindCredentialSecret:
name: akko-keycloak-bind
key: bind-password
baseDn: "DC=corp,DC=local"
usersDn: "CN=Users,DC=corp,DC=local"
customUserSearchFilter: "(&(objectclass=user)(!(objectclass=computer)))"
uuidLDAPAttribute: "objectGUID"
usernameLDAPAttribute: "sAMAccountName"
userObjectClasses: "user"
editMode: READ_ONLY
syncRegistrations: false
importEnabled: true
fullSyncPeriod: 86400
changedSyncPeriod: 300
compositeMappings:
- { childRole: "DataPlatformAdmins", parentRole: "akko-admin" }
- { childRole: "DataEngineers", parentRole: "akko-engineer" }
- { childRole: "BusinessAnalysts", parentRole: "akko-analyst" }
- { childRole: "DataStewards", parentRole: "akko-steward" }
- { childRole: "ReadOnlyConsumers", parentRole: "akko-viewer" }
OpenLDAP / 389DS example :
akko-keycloak:
userFederation:
enabled: true
vendor: other
url: "ldap://ldap.corp.internal:389"
bindDn: "cn=keycloak,ou=services,dc=corp,dc=internal"
bindCredentialSecret:
name: akko-keycloak-bind
key: bind-password
baseDn: "dc=corp,dc=internal"
usersDn: "ou=people,dc=corp,dc=internal"
customUserSearchFilter: "(objectclass=inetOrgPerson)"
uuidLDAPAttribute: "entryUUID"
usernameLDAPAttribute: "uid"
userObjectClasses: "inetOrgPerson"
editMode: READ_ONLY
Microsoft Entra ID (Azure AD) is not wired through User Federation — it is an OIDC IdP. Wire it via Keycloak Identity Providers (Sprint 61.4 Entra path, see docs/admin/external-idp.md). The values surface is akko-keycloak.identityProviders[], not userFederation.
Step 3 — Launch the install. If you used Option 1.a (--ldap-bind-pwd-file), pass it on the command line so the script creates the Secret in Phase 3c :
bash helm/scripts/install-akko-client.sh \
--domain=mycorp.example \
--customer-values=values-mycorp.yaml \
--ldap-bind-pwd-file=/tmp/akko-ldap-bind.txt
If you used Option 1.b (Secret already created), omit --ldap-bind-pwd-file ; the pre-flight detects the existing akko/akko-keycloak-bind Secret and proceeds :
bash helm/scripts/install-akko-client.sh \
--domain=mycorp.example \
--customer-values=values-mycorp.yaml
The pre-flight detects the userFederation block and skips the no-IdP warning. Smoke verification at the end exercises the cockpit login redirect to Keycloak.
2.2 Option B — Bundled demo LLDAP (try AKKO without an AD)
Use this path for PoC, demo days, on-boarding workshops, internal evaluation. The standalone helm/akko-demo-ad chart provisions LLDAP in its own namespace (akko-demo-ad) with 50 personas across 4 organisational units and 5 AKKO platform roles.
The script :
- Installs helm/akko-demo-ad in namespace akko-demo-ad with random admin + user passwords.
- Mirrors the demo admin password into akko/akko-demo-ad-federation so Keycloak User Federation binds successfully.
- Installs the AKKO umbrella with the demo userFederation overlay pre-baked in values-netcup.yaml (or whatever overlay you stack on top).
Reference personas (suffix matches the AKKO platform role) :
| Username | First | Last | AKKO role |
|---|---|---|---|
| alice_admin | Alice | Martin | akko-admin |
| bob_engineer | Bob | Lefevre | akko-engineer |
| carol_analyst | Carol | Petit | akko-analyst |
| dave_steward | Dave | Bernard | akko-steward |
| eve_viewer | Eve | Dubois | akko-viewer |
Login passwords are random per install. Read them from the demo-ad Secret :
kubectl -n akko-demo-ad get secret akko-demo-ad \
-o jsonpath='{.data.default-user-password}' | base64 -d
Or — if you used the per-user password mechanism in helm/akko-demo-ad/values.yaml bootstrap.userPasswords — use the password you set there (the canonical demo values file ships alice / alice123, bob / bob123, etc., for live walkthrough readability).
Never use --with-demo-ldap in production. Anyone with the cluster's external URL plus the (auto-generated but discoverable) seed-user password can sign in as alice_admin and gain akko-admin.
3. Five-command install
The repository ships a single entry point at helm/scripts/install-akko-client.sh. It runs nine phases : preflight, domain values, secrets generation, optional cluster cleanup, helm install --atomic, pod wait, init job verification, smoke test, deployment report.
# Step 1 — Clone the repo
git clone https://github.com/AKKO-p/AKKO
cd AKKO
# Step 2 — Install cert-manager (skip if already present)
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.5/cert-manager.yaml
kubectl wait --for=condition=Available --timeout=180s deployment --all -n cert-manager
# Step 3 — Point the wildcard DNS at the ingress IP
# (use your DNS provider — example here with a fictional CNAME)
# *.mycorp.example CNAME k8s-ingress.mycorp.example.
# Step 4 — Launch the install (15 to 25 minutes walltime)
# Production : wire to the customer's existing AD (see section 2.1)
bash helm/scripts/install-akko-client.sh --domain=mycorp.example \
--customer-values=values-mycorp.yaml
# Demo / PoC : bundled LLDAP with 50 personas (see section 2.2)
bash helm/scripts/install-akko-client.sh --domain=mycorp.example --with-demo-ldap
# Step 5 — Verify
kubectl -n akko get pods
Expected final output of step 4 :
[install-akko] ===== Phase 9/9 — Summary =====
[install-akko] AKKO platform installed on https://*.mycorp.example/
[install-akko] Cockpit : https://cockpit.mycorp.example/
[install-akko] Identity : https://identity.mycorp.example/
[install-akko] Report : /root/akko-host-config/deployment-report-20260529-150201.md
[install-akko] Secrets file : helm/examples/values-dev-secrets.yaml
The generated deployment report under ~/akko-host-config/ carries every URL, every admin password and the troubleshooting playbook. Keep it under restrictive permissions (the script chmods it 0600).
4. CLI flag reference
The script accepts a stable set of flags. None of them is positional ; only --domain is mandatory.
| Flag | Type | Default | Effect |
|---|---|---|---|
| --domain=<fqdn> | required | — | Wildcard domain. Every service URL is derived from it (cockpit.<fqdn>, identity.<fqdn>, etc.). |
| --namespace=<ns> | optional | akko | Kubernetes namespace. Created with --create-namespace. |
| --release=<name> | optional | akko | Helm release name. Useful when running multiple isolated installs on the same cluster. |
| --chart=<path> | optional | helm/akko | Local path to the chart. Ignored when --from-harbor is set. |
| --clean | flag | off | Run helm uninstall and kubectl delete ns BEFORE the install. Use with care — destroys data. |
| --dry-run | flag | off | Run helm template only, no apply. Useful for change review. |
| --from-harbor | flag | off | Pull the OCI chart from Harbor instead of using the local filesystem. |
| --harbor-url=<url> | optional | oci://harbor.akko-ai.com/akko-charts/akko | Override the OCI registry. |
| --chart-version=<v> | optional | latest | Pin a specific chart version when pulling from Harbor. |
| --no-rollback | flag | rollback on | Disable automatic rollback when the smoke test fails. Use during debugging to keep the broken state on the cluster. |
| --skip-preflight | flag | preflight on | Skip the Phase 1 preflight checks. Not recommended outside CI. |
| --skip-smoke | flag | smoke on | Skip the five-endpoint smoke test. |
| --timeout=<dur> | optional | 30m | helm install --timeout. Increase for cold-start clusters that need to pull all images sequentially. |
| --secrets-file=<f> | optional | helm/examples/values-dev-secrets.yaml | Reuse an existing secrets values file (production : keep this file in Vault / SOPS / SealedSecrets, never in git). |
| --customer-values=<f> | optional | — | Layer this customer values file AFTER the standard stack (highest precedence). Typical use : a filled-in copy of helm/examples/values-customer-template.yaml carrying akko-keycloak.userFederation and customer OIDC issuer. See section 2.1. |
| --ldap-bind-pwd-file=<f> | optional | — | Path to a single-line, mode 0600 file holding the LDAP bind password. The script creates the akko/akko-keycloak-bind Secret from it before helm install runs. See section 2.1. |
| --with-demo-ldap | flag | off | DEMO / PoC mode : installs the standalone helm/akko-demo-ad chart in its own namespace BEFORE the main release and auto-creates akko/akko-demo-ad-federation Secret. NEVER use in production. See section 2.2. |
| -h, --help | flag | — | Print the usage block embedded in the script. |
4.1 Exit codes
The script exits non-zero on any failure. Wire these into your CI signal.
| Code | Meaning |
|---|---|
| 0 | Install succeeded, all smoke endpoints OK. |
| 1 | Preflight failed (cluster unreachable, no StorageClass, etc.). |
| 2 | helm install failed. No auto-rollback because nothing was deployed. |
| 3 | Install succeeded but smoke test failed. Auto-rolled back unless --no-rollback. |
| 4 | Init Jobs (Polaris bootstrap, Keycloak realm import, catalog seed) did not complete. |
| 5 | Invalid CLI arguments. |
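In a CI job these codes can be turned into a readable signal. A minimal sketch, assuming the meanings listed above ; describe_exit_code is a hypothetical helper, not something the script ships.

```shell
# describe_exit_code <code>: maps an installer exit code to the meaning
# listed in the exit code table.
describe_exit_code() {
  case $1 in
    0) echo "install succeeded, smoke endpoints OK" ;;
    1) echo "preflight failed" ;;
    2) echo "helm install failed" ;;
    3) echo "smoke test failed" ;;
    4) echo "init Jobs did not complete" ;;
    5) echo "invalid CLI arguments" ;;
    *) echo "unknown exit code $1" ;;
  esac
}

# Usage in a CI step (not executed here):
#   rc=0
#   bash helm/scripts/install-akko-client.sh --domain=mycorp.example || rc=$?
#   echo "install-akko-client.sh exited $rc: $(describe_exit_code "$rc")"
#   exit "$rc"
```

Capturing the status with `|| rc=$?` keeps the step alive under `set -e`, so the message is printed before the job fails.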
4.2 Common invocations
# Production install on customer domain, pulling chart from Harbor
bash helm/scripts/install-akko-client.sh \
--domain=acme.cloud \
--from-harbor \
--chart-version=2026.05.1 \
--timeout=45m \
--secrets-file=/etc/akko/secrets.yaml
# Dry-run for change review (no apply)
bash helm/scripts/install-akko-client.sh \
--domain=staging.acme.cloud \
--dry-run | tee install-plan.txt
# Wipe and reinstall on a dev cluster
bash helm/scripts/install-akko-client.sh \
--domain=akko.dev \
--clean \
--no-rollback
5. Customization
The chart is layered : every customization passes through Helm values files, never through code edits.
5.1 Layer values file pattern
helm/examples/
├── values-domain.yaml # auto-generated by generate-domain-values.sh
├── values-dev-secrets.yaml # auto-generated by install-akko-client.sh
├── values-netcup.yaml # reference production overlay
└── realm-domain.json # auto-generated Keycloak realm
To layer your own production overlay, drop a file next to the others and pass it explicitly :
bash helm/scripts/install-akko-client.sh \
--domain=acme.cloud \
--secrets-file=helm/examples/my-secrets.yaml
helm upgrade akko helm/akko -n akko \
-f helm/examples/values-netcup.yaml \
-f helm/examples/values-domain.yaml \
-f helm/examples/my-secrets.yaml \
-f helm/examples/my-overrides.yaml \
--set-file akko-keycloak.realm.data=helm/examples/realm-domain.json
5.2 Common settings
| Need | Values knob |
|---|---|
| Override default StorageClass per service | <service>.persistence.storageClass: longhorn |
| Pin image registry (mirror / air-gap) | global.image.registry: registry.internal/akko |
| Disable a layer (e.g. drop AI / ML) | ollama.enabled: false, litellm.enabled: false, akko-aden.enabled: false |
| Bring your own ingress controller | traefik.enabled: false and create your own Ingress resources |
| Bring your own LLM gateway | litellm.enabled: false and set global.llmGateway.url: https://your-gateway/v1 |
| Enable Linkerd mesh injection per service | global.serviceMesh.linkerdInject.<svc>: true |
| Resource overrides | <service>.resources.requests.memory: 4Gi, etc. |
| External Postgres for stateful services | <service>.externalDatabase.host: pg.internal, with credentials in a Secret |
5.3 Bring your own secrets
In production never let the install script generate secrets. Instead :
- Provision the 31 passwords through Vault / SOPS / SealedSecrets / External Secrets Operator.
- Render them into a values file matching the schema documented at the top of the auto-generated file.
- Pass that file via --secrets-file=....
The script only generates secrets when the target file is absent. An existing file is reused unchanged.
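The reuse rule can be pictured as follows. This is a sketch of the generate-only-when-absent pattern, not the install script's code : ensure_secrets_file is a hypothetical name and exampleService.password is a placeholder key, not the real secrets schema.

```shell
# ensure_secrets_file <path>: writes a freshly generated values file only when
# <path> does not exist yet; an existing file is left unchanged.
ensure_secrets_file() {
  if [ -e "$1" ]; then
    echo "reusing $1"
    return 0
  fi
  (
    umask 077   # the new file is created with mode 0600
    # 16 random bytes rendered as 32 hex characters.
    pw=$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')
    # "exampleService.password" is a hypothetical key used for illustration.
    printf 'exampleService:\n  password: "%s"\n' "$pw" > "$1"
  )
  echo "generated $1"
}

# Usage (not executed here):
#   ensure_secrets_file /etc/akko/secrets.yaml
```

Calling the helper twice on the same path generates the file once and reuses it afterwards, which is the behaviour to rely on when the file comes from Vault or SOPS.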
6. Verification
The Phase 8 smoke test (eighth in the phase list of section 3) is intentionally minimal (five HTTP calls). Run the full verification cascade below before you call the install done.
6.1 Pods Ready
kubectl -n akko get pods -o wide
kubectl -n akko get pods --no-headers | awk '{print $3}' | sort | uniq -c
Expected : every pod in the akko namespace reports Running or Completed. Zero CrashLoopBackOff, zero ImagePullBackOff, zero Pending after 10 minutes. The reference demo cluster reports 55 Running and 12 Completed.
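To turn that expectation into a pass/fail check, something like the following could be used. It is a minimal sketch : unhealthy_pods is a hypothetical helper, and it assumes plain kubectl get pods --no-headers output (without -o wide), where STATUS is the third column, as in the awk one-liner above.

```shell
# unhealthy_pods: reads `kubectl get pods --no-headers` output on stdin and
# prints every pod whose STATUS is neither Running nor Completed.
# Exits non-zero when at least one such pod is found.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 " " $3; bad = 1 }
       END { exit (bad ? 1 : 0) }'
}

# Usage against a live cluster (not executed here):
#   kubectl -n akko get pods --no-headers | unhealthy_pods && echo "all pods healthy"
```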
6.2 Init Jobs
List the Jobs with kubectl -n akko get jobs. Expected : every Job shows 1/1 under COMPLETIONS. Jobs typically include Keycloak realm import, Polaris catalog bootstrap, OpenMetadata seed, catalog ingestion, demo data load.
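The COMPLETIONS check can be automated along these lines. This is a sketch under assumptions : incomplete_jobs is a hypothetical helper, and because the column position can differ between kubectl versions it looks for the first field shaped like n/m instead of a fixed column.

```shell
# incomplete_jobs: reads `kubectl get jobs --no-headers` output on stdin and
# prints every Job whose COMPLETIONS value (n/m) has n different from m.
# Exits non-zero when at least one such Job is found.
incomplete_jobs() {
  awk '{
    for (i = 2; i <= NF; i++) {
      if ($i ~ /^[0-9]+\/[0-9]+$/) {
        split($i, c, "/")
        if (c[1] != c[2]) { print $1 " " $i; bad = 1 }
        break
      }
    }
  } END { exit (bad ? 1 : 0) }'
}

# Usage against a live cluster (not executed here):
#   kubectl -n akko get jobs --no-headers | incomplete_jobs && echo "all Jobs complete"
```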
6.3 Public endpoints
The deployment report lists the full URL set. Verify the five critical surfaces :
DOMAIN=mycorp.example
for host in cockpit identity federation catalog bi; do
printf '%-12s ' "${host}"
curl -sk -o /dev/null -w 'HTTP %{http_code}\n' "https://${host}.${DOMAIN}/"
done
Expected output :
cockpit HTTP 200
identity HTTP 200
federation HTTP 401 # Trino requires Bearer token — 401 is healthy
catalog HTTP 200
bi HTTP 302 # Superset redirects unauthenticated traffic
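Since two of the healthy answers are not HTTP 200, a scripted comparison has to know the expected code per surface. A minimal sketch, assuming the five hosts and codes shown above ; expected_code and check_endpoint are hypothetical helper names.

```shell
# expected_code <host>: the healthy HTTP status for each critical surface.
expected_code() {
  case $1 in
    cockpit|identity|catalog) echo 200 ;;
    federation) echo 401 ;;   # Trino requires a Bearer token
    bi)         echo 302 ;;   # Superset redirects unauthenticated traffic
    *) echo "unknown"; return 1 ;;
  esac
}

# check_endpoint <host> <actual-code>: prints OK or FAIL for one surface
# and returns non-zero on a mismatch.
check_endpoint() {
  want=$(expected_code "$1")
  if [ "$2" = "$want" ]; then
    echo "OK   $1 HTTP $2"
  else
    echo "FAIL $1 HTTP $2 (expected $want)"
    return 1
  fi
}

# Usage against a live install (not executed here):
#   DOMAIN=mycorp.example
#   for host in cockpit identity federation catalog bi; do
#     check_endpoint "$host" \
#       "$(curl -sk -o /dev/null -w '%{http_code}' "https://${host}.${DOMAIN}/")"
#   done
```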
6.4 Persona login
The bundled Keycloak realm provisions five personas (alice, bob, carol, dave, erin) mapped to the five platform roles (akko-admin, akko-engineer, akko-analyst, akko-viewer, akko-steward). Look up the bootstrap password in the deployment report under Admin credentials, then sign into the cockpit at https://cockpit.<your-domain>/.
You should land on the home page, see the eight service tiles all green, and have the role chip in the top bar reflect your persona.
6.5 ADEN test query
Click the Ask AKKO button (cockpit top bar), or open https://aden.<your-domain>/, then submit a natural-language question that has a numeric answer.
Expected behaviour : ADEN translates the question to SQL against the bundled Trino federation, returns a numeric answer, exposes the executed SQL + the data lineage, and lets you pin the result to a BI dashboard. End-to-end response under 12 seconds on the reference cluster.
7. Troubleshooting
The following matrix collects the most common failure classes. Each has a deterministic fix recipe.
| # | Symptom | Likely cause | Fix |
|---|---|---|---|
| 1 | helm install --atomic hangs, then rolls back with Pod akko-cockpit-backend CreateContainerConfigError, Secret akko-cockpit-backend-creds not found | A post-upgrade hook creates a Secret needed at boot. Fresh installs never reach the hook because --atomic aborts first. | The umbrella templates/secrets.yaml ships a lookup+stub pattern for these Secrets. If you patched the chart, restore the stub. Otherwise rerun the install — the second run finds the existing stub and proceeds. |
| 2 | One service is unreachable through ingress with HTTP 502, but the pod is 1/1 Running and responds in-cluster | NetworkPolicy on the service is missing the Traefik ingress allow-rule. | Confirm with kubectl -n akko get netpol <svc> -o yaml. If the rule is missing, the chart you forked drifted — re-add the akko.networkpolicy.traefikIngress helper or temporarily delete the policy : kubectl -n akko delete netpol <svc>. |
| 3 | Cockpit KPI shows Federation Down, even though Trino pod is 1/1 Running | Trino refuses requests carrying X-Forwarded-For because http-server.process-forwarded defaults to false. | Set trino.config.processForwarded: true in your overlay and helm upgrade. Already enabled by default in values-netcup.yaml. |
| 4 | A pod runs an old image despite a fresh docker push and kubectl rollout restart | containerd cache pins the digest under docker.io/library/<image>:<tag> because the pod spec has no registry prefix, while the mutable tag was updated only in the local mirror. | Always reference images by fully qualified registry name. Set image.pullPolicy: Always on every mutable tag. Wipe the cache with crictl rmi <image>:<tag> on the affected node. |
| 5 | helm upgrade succeeds but silently reverts the running image | Mutable tag + pullPolicy: IfNotPresent + containerd cache combine to keep the first-pulled digest. | Same fix as item 4 — set pullPolicy: Always for every custom akko-* image. Already enforced in values-netcup.yaml. |
| 6 | helm template writes upstreams like *.default.svc.cluster.local and nginx fails with host not found in upstream | helm template defaults to namespace default regardless of the active kubectl context. | Always pass -n akko (or your release namespace) to helm template. The install script already does this. |
| 7 | Superset Init Job fails with ImportError: jose.exceptions or joserfc not found | Bitnami Superset image ships with joserfc 0.9 but OIDC code path imports jose.exceptions. | Override the image to apache/superset:3.1.1 or ensure the superset.bootstrapScript carries the documented compatibility shim (see helm/akko/charts/akko-init/templates/init-superset-job.yaml). |
| 8 | Trino logs EventListener could not be created and the pod restarts | The OpenLineage event listener jar is missing or the OpenMetadata target is unreachable at boot. | Boot order matters : OpenMetadata must be Ready before Trino. Either tag Trino with an init container that waits, or set trino.eventListener.enabled: false and re-enable after the catalog is up. |
| 9 | cert-manager issues 0 certificates, ingresses stay on the default self-signed Traefik cert | The cert-manager ClusterIssuer is missing or the DNS / HTTP01 challenge does not reach the cluster. | Check kubectl describe clusterissuer letsencrypt-prod. For HTTP01, confirm port 80 reaches Traefik. For DNS01, verify the credentials Secret in the cert-manager namespace. |
| 10 | Keycloak realm import fails with Realm akko already exists, import skipped after a --clean reinstall | Postgres PVC was retained while Keycloak DB rows persist. | --clean deletes the namespace but does not always reclaim PVs with Retain policy. Delete leftover PVs : kubectl get pv \| grep akko then kubectl delete pv <name>. |
| 11 | Users cannot log in / Keycloak shows an empty user list / cockpit redirects to Keycloak then errors with "Invalid user" | No Identity Provider configured. Keycloak admin can sign in but no end-user SSO works. | Verify Identity Provider configuration in values-customer.yaml (uncommented akko-keycloak.userFederation block + akko-keycloak-bind Secret created), or pass --with-demo-ldap to install the bundled demo AD. See section 2. |
| 12 | --with-demo-ldap install succeeds but Keycloak logs LDAPAuthenticationException: Could not authenticate | The akko/akko-demo-ad-federation Secret carries the wrong password (re-run not idempotent across multiple --with-demo-ldap calls with --clean). | Re-run the script with --clean so the demo-ad Secret is regenerated and re-mirrored. Alternative : kubectl -n akko-demo-ad get secret akko-demo-ad -o jsonpath='{.data.admin-password}' \| base64 -d then kubectl -n akko create secret generic akko-demo-ad-federation --from-literal=bind-password=<value> --dry-run=client -o yaml \| kubectl apply -f -. |
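Items 4 and 5 both come down to image references that lack a registry prefix. A quick way to spot them is sketched below, assuming the usual Docker reference convention (the first path component counts as a registry host only if it contains a dot or a colon, or equals localhost) ; has_registry_prefix is a hypothetical helper.

```shell
# has_registry_prefix <image-ref>: succeeds when the reference names its
# registry explicitly, e.g. registry.internal/akko/akko-cockpit:1.0.
has_registry_prefix() {
  case $1 in
    */*) first=${1%%/*} ;;   # text before the first slash
    *)   return 1 ;;         # no slash at all, e.g. akko-cockpit:1.0
  esac
  case $first in
    localhost|*.*|*:*) return 0 ;;
    *) return 1 ;;           # e.g. akko/akko-cockpit:1.0 resolves to docker.io
  esac
}

# Usage against a live cluster (not executed here):
#   kubectl -n akko get pods -o jsonpath='{.items[*].spec.containers[*].image}' \
#     | tr -s '[:space:]' '\n' | sort -u \
#     | while read -r img; do
#         has_registry_prefix "$img" || echo "unqualified image: $img"
#       done
```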
7.1 First-line diagnostic commands
Print this card on the wall :
NS=akko
kubectl -n $NS get pods -o wide | grep -v Running
kubectl -n $NS get events --sort-by=.lastTimestamp | tail -20
kubectl -n $NS describe pod <failing-pod>
kubectl -n $NS logs <failing-pod> --previous
helm history akko -n $NS
helm get values akko -n $NS > /tmp/akko-values.yaml
8. Day-2 operations
8.1 Helm upgrade
bash helm/scripts/install-akko-client.sh \
--domain=mycorp.example \
--from-harbor \
--chart-version=2026.05.2 \
--secrets-file=/etc/akko/secrets.yaml \
--no-rollback
The script uses helm upgrade --install, so the same entry point handles upgrades. The --no-rollback flag is recommended in production : a manual decision on rollback is safer than an auto-rollback that destroys the half-converged state during a debugging session.
Before any upgrade :
- helm history akko -n akko — confirm the current revision and that the previous revision is healthy.
- velero backup create akko-pre-upgrade-$(date +%s) --include-namespaces akko — snapshot.
- Run with --dry-run first, diff against your overlay.
- Run the real upgrade in a maintenance window.
8.2 Secret rotation
# Rotate Keycloak admin password
kubectl -n akko create secret generic akko-keycloak-new \
--from-literal=admin-password="$(openssl rand -base64 32)"
# Patch the deployment to pick up the new Secret
# (quotes keep the shell from word-splitting or globbing the password)
kubectl -n akko set env deploy/akko-keycloak \
KEYCLOAK_ADMIN_PASSWORD="$(kubectl -n akko get secret akko-keycloak-new \
-o jsonpath='{.data.admin-password}' | base64 -d)"
# Verify
kubectl -n akko rollout status deploy/akko-keycloak
For service-to-service Secrets (OIDC client secrets, Postgres passwords, Polaris root secret) follow the documented rotation runbook in docs/admin/security.md. The chart supports helm upgrade --reuse-values --set <path>=<new-value> for any of the global.auth.* keys.
8.3 Scale
The chart exposes replica counts for stateless services :
| Service | Knob | Default |
|---|---|---|
| Cockpit | akko-cockpit.replicaCount | 1 |
| ADEN | akko-aden.replicaCount | 1 |
| Superset | superset.supersetNode.replicaCount | 1 |
| Trino worker | trino.server.workers | 2 |
| Spark worker | akko-spark.worker.replicas | 2 |
| OAuth2 Proxy | oauth2-proxy.replicaCount | 2 |
Stateful services (Postgres, OpenSearch, Polaris, Keycloak) stay singleton in the default chart. For HA topologies see docs/admin/governance-architecture.md.
8.4 Backup and restore
# Postgres backup
kubectl -n akko exec sts/akko-postgresql -- \
pg_dumpall -U postgres > backup-$(date +%F).sql
# Object storage snapshot (SeaweedFS / S3 gateway)
kubectl -n akko exec deploy/akko-seaweedfs-filer -- \
weed backup -dir=/data -dest=/backup
For point-in-time disaster recovery wire Velero to a remote S3 target and snapshot the akko namespace daily. The DR playbook in docs/admin/dr-playbook.md covers the full DORA Article 11 scenario.
8.5 Monitoring
The chart ships an observability layer (Prometheus + Perses + VictoriaLogs + Fluent-Bit + Alertmanager). Access points :
| Surface | URL |
|---|---|
| Dashboards | https://metrics.<domain>/ (Perses) |
| Logs | https://logs.<domain>/ (VictoriaLogs UI) |
| Alerts | https://alerts.<domain>/ (Alertmanager) |
Default alerts trigger on PodNotReady, CertExpiresSoon, PVCFull, TrinoQueryLatencyP95High. Customize the rules under helm/akko/charts/akko-observability/configs/alerts/.
9. Uninstall
9.1 Graceful uninstall (keep data)
The PersistentVolumes are kept thanks to the default reclaim policy. Reinstalling under the same release name reattaches them. Use this path when you migrate between chart versions or when you need to test fresh credentials without losing the catalog.
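A graceful uninstall could look like the sketch below. It assumes the release and namespace are both named akko (the defaults from section 4) and that removing the Helm release while leaving the namespace in place is what keeps the volume claims ; run and graceful_uninstall are hypothetical helpers, and the commands are only printed unless DRY_RUN=0 is set.

```shell
# run: prints the command when DRY_RUN is 1 (the default here),
# executes it otherwise.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else "$@"; fi
}

# graceful_uninstall <release> <namespace>: removes the Helm release but
# leaves the namespace, then lists the claims that are still present.
graceful_uninstall() {
  run helm uninstall "$1" -n "$2" --wait --timeout 5m
  run kubectl -n "$2" get pvc
}

graceful_uninstall akko akko
```

Review the printed commands first, then export DRY_RUN=0 and call the function again to apply them.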
9.2 Complete uninstall (wipe data)
helm uninstall akko -n akko --wait --timeout 5m
kubectl delete ns akko --wait --timeout 5m
kubectl get pv -o json \
| jq -r '.items[] | select(.spec.claimRef.namespace=="akko") | .metadata.name' \
| xargs -r kubectl delete pv
kubectl delete clusterrole,clusterrolebinding -l app.kubernetes.io/part-of=akko --ignore-not-found
After this sequence the cluster is back to its pre-install state. Verify with kubectl get all -A | grep akko returning empty.
9.3 Uninstall cert-manager (optional)
If cert-manager was installed solely for AKKO and is not used by another workload :
kubectl delete -f https://github.com/cert-manager/cert-manager/releases/download/v1.14.5/cert-manager.yaml
Appendix A — Boot ladder
The chart enforces a four-level boot order. Knowing this helps interpret transient failures during the install :
- L0 — Foundation : Postgres, Keycloak, SeaweedFS, Polaris reach Ready.
- L1 — Bootstrap : umbrella secrets.yaml creates the placeholder Secrets needed by L2 pods at first boot.
- L2 — Platform : Cockpit, Trino, Airflow, Superset, ADEN, OpenMetadata, JupyterHub start in parallel.
- L3 — Post-install hooks : real OIDC client secrets replace the placeholders, redirect URIs are patched, the catalog is ingested, demo data is seeded.
If the install fails between L1 and L3, rerun the script. The umbrella chart is idempotent and the second pass finishes the hooks.
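The rerun can be wrapped in a small retry loop. This is a generic sketch, not a feature of the install script : retry is a hypothetical helper, and blind retries only make sense for failures after L1 (for example exit code 4, init Jobs not complete), not for preflight or argument errors.

```shell
# retry <attempts> <delay-seconds> <command...>: reruns the command until it
# succeeds or the attempt budget is spent; returns the last exit status.
retry() {
  attempts=$1; delay=$2; shift 2
  n=1
  while :; do
    "$@" && return 0
    rc=$?
    [ "$n" -ge "$attempts" ] && return "$rc"
    echo "attempt $n failed (exit $rc), retrying in ${delay}s" >&2
    sleep "$delay"
    n=$((n + 1))
  done
}

# Usage (not executed here): at most two passes, 60 seconds apart.
#   retry 2 60 bash helm/scripts/install-akko-client.sh \
#     --domain=mycorp.example --customer-values=values-mycorp.yaml
```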
Appendix B — Where things land
| Artefact | Location |
|---|---|
| Deployment report | ~/akko-host-config/deployment-report-<timestamp>.md (mode 0600) |
| Generated secrets | helm/examples/values-dev-secrets.yaml (mode 0600, gitignored) |
| Generated domain values | helm/examples/values-domain.yaml |
| Generated Keycloak realm | helm/examples/realm-domain.json |
| Helm release | helm list -n akko |
| Workloads | kubectl -n akko get all |
Appendix C — Related documentation
- docs/admin/customer-onboarding.md — onboarding checklist for a new customer.
- docs/admin/external-idp.md — wire your enterprise IDP into Keycloak.
- docs/admin/air-gapped.md — air-gapped deployment playbook.
- docs/admin/dr-playbook.md — disaster recovery procedure (DORA Article 11).
- docs/admin/security.md — image signing, SBOM, mTLS, encryption at rest.
- docs/admin/runbooks/index.md — operational runbooks for production incidents.