Skip to content

08 — Troubleshooting

Time : reference  ·  Persona : any  ·  Path : A or B

A pragmatic catalogue of the issues you are most likely to hit during onboarding, with the proven fix.

Pre-requisites

  • You have at least one platform handy (https://demo.akko-ai.com or your local https://demo.akko.local).

Quick health check

Run this first whenever something feels off :

# Live demo
curl -sk https://demo.akko-ai.com/api/health/all | jq

# Local k3d
curl -sk https://demo.akko.local/api/health/all | jq

Expected : every layer returns { "status": "ok" }.

If any layer is degraded, the rest of this page tells you which knob to turn.


1. "Service shows down" in the cockpit

Cause Fix
Sub-chart still rolling kubectl get pods -n akko and wait for Running.
Sub-chart crash loop kubectl logs -n akko <pod> → look for CrashLoopBackOff root cause.
Network policy blocks the health probe kubectl describe pod -n akko <pod> → "egress denied". Update the NP.
Health endpoint moved in a new release kubectl describe svc -n akko <svc> then update the cockpit health route map.
External dependency down If your catalog source is offline, the source health goes red, not the AKKO layer.

First reflex. Look at the cockpit Logs page filtered by the failing service before diving into kubectl.


2. SSO redirect loop

You sign in, you bounce back to the login screen, you sign in again, you bounce again…

Cause Fix
Third-party cookies blocked Allow cookies for *.akko-ai.com (or your local domain).
Safari ITP cookie partitioning Open the cockpit in a non-private window; ensure "Prevent cross-site tracking" doesn't strip the SSO cookie.
Clock skew between browser and identity provider Sync your laptop clock.
Cookies left over from a previous deploy Clear cookies for the AKKO domain, sign in again.
Identity provider client redirect-uri mismatch Run bash helm/scripts/keycloak-rest-patch-redirect-uris.sh (local) or wait for the redirect-uri sync hook (live).

3. ADEN result is empty

Cause Fix
Catalog just reseeded Wait 30 sec, click Generate again.
Scope excludes your tables Switch scope chip to one your persona owns.
Row filter masked all rows Try a less restrictive question (e.g. drop the region filter).
Semantic-layer match score very low Reword the question with explicit table names.
Engine warmup Second run is < 3 sec — retry.

4. Playwright tests skip

You ran pytest tests/e2e/playwright/ and most tests reported SKIPPED.

Cause Fix
Identity provider FQDN drift (e.g. auth.<domain> instead of identity.<domain>) The suite auto-resolves identity / keycloak / auth. If all three fail, check the frontal routing.
Cluster not reachable curl -sk https://demo.akko-ai.com/ should return 302. If 000, check DNS / VPN.
Persona not seeded bash helm/scripts/seed-keycloak-personas.sh.
Wrong domain env AKKO_DOMAIN=demo.akko-ai.com pytest ….
Cold start, identity provider returning 504 Wait 30 sec then re-run.

Skip-cleanly is by design : the suite refuses to assert green on a half-up platform. Read the skip reason in the pytest output.


5. Helm upgrade failed

Cause Fix
You used --reset-values NEVER use --reset-values. Always --reuse-values -f helm/examples/values-netcup.yaml.
Sub-chart image cached as docker.io/library/akko-X:tag Containerd cache shadow. Re-tag with the local registry prefix.
Realm file not passed via --set-file Add --set-file akko-keycloak.realm.data=helm/examples/realm-akko-k3d.json.
Half-migration : _required("AKKO_FOO") in code without the matching helm env: name: AKKO_FOO Update both code and helm in the same PR.
Pod waits for a Secret that the chart didn't create Check kubectl describe pod for FailedMount.

Golden rule. Helm upgrades should never need a CLI mutation. If you reached for kubectl set image, capture the gap as a chart fix and reissue the upgrade.


6. The cockpit shows the previous build

Cause Fix
Image tag reused (e.g. akko-cockpit:2026.05) Run bash scripts/deploy-cockpit.sh — it timestamps the tag.
Browser cache Hard refresh (Cmd-Shift-R).
Service worker holding the old bundle DevTools → Application → Service Workers → Unregister, refresh.
Pod rollout pending kubectl rollout status deploy/akko-cockpit -n akko.

7. Local k3d : pods stay Pending

Cause Fix
Out of memory on Docker Desktop Raise to 16 GB in Docker Desktop preferences.
Out of disk docker system prune -f.
imagePullBackOff on a custom image Run bash helm/scripts/build-images.sh to push to the k3d local registry.
PVC unbound k3d local-path provisioner missing; re-run bash helm/scripts/k3d-create.sh.


When in doubt

  • Check the cockpit Alerts page — it bubbles up actionable alerts before anyone has to ask.
  • Open DEMO_HOST/docs/admin/troubleshooting/ for the full troubleshooting handbook.
  • File an issue on github.com/AKKO-p/AKKO with the curl health output attached.

Next : 09 — Next steps.