
AKKO platform vs simulated customer environment — the 3 perimeters

TL;DR

The AKKO deployment on demo.akko-ai.com (Netcup) is a development and demonstration cluster. To prove customer value during a real prospect visit, we keep three perimeters strictly separated:

| Perimeter | What it represents | Who owns it at a real customer |
|---|---|---|
| 1. AKKO core | The platform: cockpit, Keycloak (internal client store), OPA, ADEN, Trino, OpenMetadata, observability, state Postgres | AKKO ships and operates this |
| 2. Identity provider | LLDAP populated like a real enterprise (50+ users in OUs Engineering / Finance / Compliance / Sales), separated from AKKO | The customer's existing AD / LDAP / Okta / Azure AD |
| 3. Data sources | Hive on a kerberized mini-Cloudera, plus Postgres / Oracle XE / MS SQL Express, separated from AKKO | The customer's existing data warehouse, lakehouse, OLTP databases |

The three perimeters map to four Kubernetes namespaces on the demo cluster: akko for AKKO core, plus akko-demo-ad, akko-demo-cloudera, and akko-demo-sources for the simulated customer side. AKKO core treats the latter three as external endpoints — exactly as it would at a real customer site.

Why this matters

A prospect visiting demo.akko-ai.com should never wonder "but does this work with MY AD?" — because we can show them, live, that AKKO connects to a separate LLDAP populated as a realistic enterprise. The login happens against the demo AD; AKKO did not provision those users. The federation is configured, not seeded.

Same for data: the prospect sees ADEN querying a hive.warehouse.transactions table that lives on a kerberized HDFS+Hive cluster running in its own namespace. AKKO did not seed Hadoop; it discovered and federated it. This is the same one-shot configuration a customer admin would run on their existing Cloudera CDP installation.

What "AKKO core" really is

The Helm umbrella chart akko ships these sub-charts as the platform:

  • cockpit — single-page admin app, the only thing end-users browse
  • akko-keycloak — local IdP for the AKKO realm (clients, service accounts); user federation is configured to point at the customer's AD
  • akko-opa — fine-grained authorization decisions for Trino, ADEN, ai-service
  • akko-aden — natural-language → SQL → dashboard pipeline, scope-first OPA + multi-tier cache + vector semantic catalog (ADR-041/042/043)
  • akko-ai-service — LLM RBAC layer
  • trino + akko-polaris — federation + Iceberg query engine
  • openmetadata — semantic catalog
  • akko-postgres — platform state (audit log, ADEN cache, dashboards metadata)
  • akko-storage (SeaweedFS) — object storage layer
  • akko-observability (Perses + VictoriaLogs + Prometheus + alertmanager)

A real customer install ships only these charts. No demo data. No seeded users. No demo databases. The customer connects their existing sources after helm install finishes, via:

  • Keycloak User Federation pointing at their AD / LDAP
  • The cockpit "Add catalog" page registering Trino catalogs against their Hive / Postgres / Oracle / SQL Server endpoints
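Behind the "Add catalog" page, each registration ultimately becomes a Trino catalog definition. A minimal sketch of what such a catalog could look like for a customer Postgres endpoint — hostname, database name, and environment variable names are illustrative, not the actual chart output:

```properties
# Hypothetical Trino catalog file, e.g. etc/catalog/customer_pg.properties
connector.name=postgresql
connection-url=jdbc:postgresql://pg.customer.example:5432/warehouse
# Trino's secrets support resolves ${ENV:...} from the coordinator environment,
# so no credential is ever written to the catalog file itself
connection-user=${ENV:CUSTOMER_PG_USER}
connection-password=${ENV:CUSTOMER_PG_PASSWORD}
```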

What lives in the simulated customer perimeters

These are demo-only charts that ship in the AKKO repo for showing the product, but are never installed at a real customer site:

  • akko-demo-ad — LLDAP + 50-user seed + 4 OUs + a small role-mapping CSV
  • akko-demo-cloudera — namenode + datanode + Hive Metastore + MIT Kerberos KDC + a warehouse schema with 10k seed rows
  • akko-demo-sources — Postgres climascore-db (real-shape data) + Oracle XE 21c + MS SQL Express, each with realistic seed schemas

A customer who deploys the umbrella would explicitly set akko-demo-ad.enabled=false, akko-demo-cloudera.enabled=false, akko-demo-sources.enabled=false. The chart defaults already disable them (see Sprint 56 D6 zero-hardcoding pass).
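In values terms, a customer overlay simply pins the demo sub-charts off (key names assumed to match the umbrella chart's sub-chart aliases):

```yaml
# customer-values.yaml — hypothetical overlay for a real customer install
akko-demo-ad:
  enabled: false
akko-demo-cloudera:
  enabled: false
akko-demo-sources:
  enabled: false
```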

The user story for a prospect demo

  1. Founder opens https://identity.demo.akko-ai.com (the LLDAP web UI of the simulated customer AD) and shows 50 users grouped in Engineering, Finance, Compliance, Sales. Picks one — say j.dupont@acme.demo.
  2. Founder opens https://demo.akko-ai.com, logs in as j.dupont (whose credentials live in the demo AD, not in AKKO). The cockpit shows the right role badge derived from the AD group Finance → AKKO role akko-analyst.
  3. Founder opens the "Catalogs" page: 5 baseline catalogs (Postgres state, tpch, tpcds, jmx, iceberg), which are AKKO platform internals, plus hive, climascore, oracle_demo, mssql_demo, which are the simulated customer sources. Each card shows its tech badge.
  4. Founder asks ADEN: "Show the top 10 spending merchants this quarter from the warehouse". ADEN's reasoning panel shows it queried Milvus, filtered the candidate tables via OPA allowed_tables for j.dupont, asked the LLM to generate the SQL against hive.warehouse.transactions, ran it via Trino with j.dupont's identity, and returned 10 rows.
  5. Founder asks a cross-source query: "Match those merchants with the ones in our climascore database that have a high climate risk score". ADEN generates a JOIN between hive.warehouse.transactions and climascore.public.scores and executes it via Trino.
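The cross-source JOIN in step 5 could look roughly like the following Trino SQL. Column names (merchant_name, amount, risk_score) and the 0.8 threshold are invented for illustration; ADEN generates the real statement from the semantic catalog:

```sql
-- Hypothetical ADEN-generated query; columns and threshold are illustrative
SELECT t.merchant_name,
       SUM(t.amount) AS total_spend,
       s.risk_score
FROM   hive.warehouse.transactions AS t
JOIN   climascore.public.scores    AS s
       ON s.merchant_name = t.merchant_name
WHERE  s.risk_score > 0.8
GROUP BY t.merchant_name, s.risk_score
ORDER BY total_spend DESC
LIMIT  10;
```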

Throughout, the prospect sees that AKKO did not seed the users, did not seed the data, did not install agents on Hadoop. AKKO is the federation + governance + AI layer; their data and their AD stay where they are.

Sprint plan

The split is being delivered in three sprints — see akko-technical-map/sprints/sprint-61-3-perimeters-vision.md (private repo) for the full breakdown:

  • Sprint 61.1 — extract LLDAP into the akko-demo-ad namespace, populate realistic users, configure Keycloak User Federation via Helm values.
  • Sprint 61.2 — akko-demo-cloudera chart: HDFS + Hive Metastore + KDC + seed data + Trino federation via the Hive connector.
  • Sprint 61.3 — akko-demo-sources: Postgres / Oracle XE / MS SQL Express, each registered through the cockpit Add Catalog UI.

Installing perimeter 2 — akko-demo-ad (Sprint 61.1)

The standalone chart lives at helm/akko-demo-ad/. It is separate from the umbrella akko chart — installed as its own Helm release in its own namespace.

Quick install (demo cluster)

# 1. Generate two passwords out-of-band, NEVER commit them
ADMIN_PW=$(openssl rand -hex 16)
USER_PW=$(openssl rand -hex 16)

# 2. Create the namespace + install the chart
kubectl create ns akko-demo-ad
helm install akko-demo-ad helm/akko-demo-ad/ -n akko-demo-ad \
  -f helm/examples/values-demo-ad.yaml \
  --set adminPassword="$ADMIN_PW" \
  --set bootstrap.defaultUserPassword="$USER_PW"

# 3. Wait for the bootstrap Job to finish (50 users, 9 groups)
kubectl -n akko-demo-ad wait --for=condition=complete job \
  -l app.kubernetes.io/component=bootstrap --timeout=5m

# 4. Verify: the bootstrap Job should report that every seeded user
#    authenticates against the Service DNS (not via the Ingress —
#    DNS-only egress for the chart).
kubectl -n akko-demo-ad logs -l app.kubernetes.io/component=bootstrap \
  | grep "auth verification"
# expected: [+] auth verification: 50 pass / 0 fail (out of 50)

Wiring AKKO core to the demo AD

In your AKKO umbrella values overlay (e.g. values-netcup.yaml), point Keycloak User Federation at the in-cluster Service of the demo AD:

akko-keycloak:
  userFederation:
    enabled: true
    provider: ldap
    url: "ldap://akko-demo-ad-akko-demo-ad.akko-demo-ad.svc.cluster.local:3890"
    bindDn: "uid=admin,ou=people,dc=akko,dc=local"
    baseDn: "dc=akko,dc=local"
    usersDn: "ou=people,dc=akko,dc=local"
    bindCredentialSecret:
      name: akko-demo-ad-federation
      key:  bind-password

Copy the admin password from the akko-demo-ad-akko-demo-ad secret in the akko-demo-ad namespace into an akko-demo-ad-federation secret in the akko namespace, so Keycloak can bind read-only:

ADMIN_PW=$(kubectl -n akko-demo-ad get secret akko-demo-ad-akko-demo-ad \
  -o jsonpath='{.data.admin-password}' | base64 -d)
kubectl -n akko create secret generic akko-demo-ad-federation \
  --from-literal=bind-password="$ADMIN_PW"

Real customer install (BYO-IdP)

Skip the akko-demo-ad install entirely. Point the userFederation URL at the customer's existing directory, store the bind credentials in a secret you create out-of-band, and the same AKKO release works unchanged.
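For example, the same federation block pointed at a hypothetical customer Active Directory — the URL, DNs, and secret name below are placeholders, not defaults shipped by the chart:

```yaml
akko-keycloak:
  userFederation:
    enabled: true
    provider: ldap
    url: "ldaps://ad.customer.example:636"
    bindDn: "CN=akko-svc,OU=Service Accounts,DC=customer,DC=example"
    baseDn: "DC=customer,DC=example"
    usersDn: "OU=Users,DC=customer,DC=example"
    bindCredentialSecret:
      name: customer-ad-federation   # created out-of-band by the customer
      key: bind-password
```

Only the values change; the AKKO release itself is identical to the demo deployment.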

Bootstrap data shape

The chart ships 50 users distributed across 4 OUs and 5 platform-role groups:

| OU group | Users | Platform-role distribution |
|---|---|---|
| org-engineering | 20 | mix of admins / engineers / analysts / stewards / viewers |
| org-finance | 12 | mostly analysts + viewers + 2 stewards |
| org-compliance | 10 | mostly stewards + viewers |
| org-sales | 8 | viewers + 2 analysts |

The 5 platform-role groups (akko-admins, akko-engineers, akko-analysts, akko-stewards, akko-viewers) map 1:1 to AKKO realm roles via the global.auth.ldap.roleMapping setting on the AKKO umbrella release.
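The exact shape of global.auth.ldap.roleMapping depends on the umbrella chart's values schema; a plausible sketch of the 1:1 group-to-role mapping (role names inferred from the akko-analyst example above, the rest assumed) would be:

```yaml
global:
  auth:
    ldap:
      roleMapping:
        akko-admins: akko-admin
        akko-engineers: akko-engineer
        akko-analysts: akko-analyst
        akko-stewards: akko-steward
        akko-viewers: akko-viewer
```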

Anti-patterns we refuse

  • Hardcoding: no role name, hostname, password, or user account in code. Everything via Helm values + secrets; the chart defaults leave them empty (audit-hardcoded-identities.py returns 0 hits).
  • Band-aid fixes ("bricolage"): every fix must be permanent. No verifyJwt: false workaround hiding a real issuer mismatch; no undocumented kubectl set env that evaporates on the next helm upgrade.
  • Vendor naming: pages, env vars, and dashboards are layer-named (e.g. dashboards, metrics, logs), never vendor-named (Grafana, Loki, MinIO). The customer doesn't care which engine is under the hood.
  • Half-migrations (Sprint 39.5 pattern): when a swap happens, sweep every consumer-side reference in the same PR and add a regression gate.

See also

  • sprint-61-3-perimeters-vision.md (private) — full sprint breakdown
  • customer-onboarding.md — how a real customer installs AKKO core
  • ADR-039 — no hardcoded identities
  • ADR-041 / 042 / 043 — ADEN scope-first OPA + multi-tier cache + vector catalog