ADR-040: Cockpit backend with Keycloak service account (no user-bearer pass-through)
- Status: Accepted (2026-04-27)
- Stewards: platform-security, cockpit-team
- Supersedes: (none; companion to ADR-039)
- Related: ADR-038 (sub-based OIDC matching), gotcha_keycloak_realm_drift, feedback_no_bricolage_research_first
Context
The cockpit Gouvernance page (and its tabs Data Access, LLM RBAC, audit
trail builder) needs to read and write Keycloak data: list groups,
search users, attach role mappings, persist attributes['data-policies']
on groups, etc. Today the JS calls /api/keycloak/admin/{groups,users,...}
which is reverse-proxied by the akko-cockpit nginx pod with the user's
own access token forwarded as Bearer to Keycloak Admin REST.
Two problems with that design:
- It does not work out of the box. The access token issued for the `oauth2-proxy` client does not include `realm-management` in its audience claim. Keycloak Admin REST returns 401, and the cockpit surfaces `ERROR: Keycloak admin call denied`. We patched it with a protocol mapper (`oidc-audience-mapper` for `realm-management`); see the post-upgrade hook in `akko-keycloak/templates/post-upgrade-audience-mapper-hook.yaml` and the realm template patch in `helm/examples/realm-akko-k3d.json`. This fix unblocks the demo but does not address the design issue.
- It violates the principle of least privilege. Forwarding a user's bearer token to the IdP admin plane means every cockpit tab carries `realm-admin` privileges by chain: any XSS on cockpit, any compromised user with `akko-admin`, becomes a Keycloak realm admin. The blast radius is the entire identity provider for AKKO.
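The missing audience described in the first problem can be confirmed on a captured token. The following is a minimal diagnostic sketch, assuming the access token is a standard three-part JWT. The helper names `decode_claims_unverified` and `has_audience` are hypothetical, and the payload is decoded without signature verification, so this is for debugging only and not for authorization decisions.

```python
import base64
import json


def decode_claims_unverified(token: str) -> dict:
    """Decode the JWT payload WITHOUT verifying the signature (debugging only)."""
    payload_b64 = token.split(".")[1]
    # base64url payloads omit padding; restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def has_audience(claims: dict, audience: str) -> bool:
    """Return True if `audience` appears in the `aud` claim.

    Per RFC 7519, `aud` may be a single string or a list of strings.
    """
    aud = claims.get("aud", [])
    if isinstance(aud, str):
        aud = [aud]
    return audience in aud
```

Calling `has_audience(decode_claims_unverified(token), "realm-management")` on a token issued before the audience mapper was added would be expected to return `False`, which matches the rejection described above.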
Decision
We will introduce a dedicated akko-cockpit-backend sub-chart (FastAPI
or Go) that holds the only credentials authorised to call Keycloak
Admin REST. The browser will only call our own backend, never Keycloak
directly.
High-level shape
Browser
│
│ Authorization: Bearer <user-access-token>
▼
oauth2-proxy ─(ForwardAuth, sets X-Auth-Request-User/Email/Groups)──┐
│ │
│ /api/governance/groups, /api/governance/policies, … │
▼ │
akko-cockpit-backend ◄── reads X-Auth-Request-* + verifies JWT ────┘
│ (validates `akko-admin` group; rejects 403 otherwise)
│
│ Authorization: Bearer <service-account-token> (client_credentials grant)
▼
Keycloak Admin REST /admin/realms/akko/{users,groups,…}
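The last hop in the diagram relies on the OAuth 2.0 `client_credentials` grant. A minimal sketch of how the backend might build that token request is shown here, assuming the standard Keycloak token endpoint under `/realms/<realm>/protocol/openid-connect/token`. The function name `build_token_request` and the example host are hypothetical, and the request is only constructed, not sent.

```python
import urllib.parse
import urllib.request


def build_token_request(
    base_url: str, realm: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    """Build (but do not send) a client_credentials token request."""
    url = f"{base_url.rstrip('/')}/realms/{realm}/protocol/openid-connect/token"
    form = urllib.parse.urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
        }
    ).encode("ascii")
    return urllib.request.Request(
        url,
        data=form,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` and reading `access_token` from the JSON response would yield the service-account bearer used against Admin REST. In practice the secret would come from the mounted Secret rather than a literal.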
Authorization layers
- oauth2-proxy validates the user session cookie and injects `X-Auth-Request-{User,Email,Groups,Access-Token}` headers.
- akko-cockpit-backend validates the JWT signature against `https://identity.<domain>/realms/akko/protocol/openid-connect/certs`, checks the `groups` claim contains `akko-admin` (or whatever role the endpoint requires), and writes an audit row `(actor_sub, actor_email, action, target, ts)` to PostgreSQL before performing the admin call.
- The service account (Keycloak client `akko-cockpit-backend` with `serviceAccountsEnabled: true`, granted `realm-admin`) is the only identity that ever talks to Keycloak Admin REST.
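The ordering in the second layer (authorize, then audit, then call) can be sketched as follows. This is an illustration under stated assumptions, not the backend's actual code: `claims` is assumed to be the already signature-verified JWT payload, the `groups` claim is assumed to be a list of plain group names, and `guarded_admin_call`, `AuditRow`, `Forbidden`, `write_audit` and `admin_call` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable


class Forbidden(Exception):
    """Caller lacks the required group; a handler would map this to HTTP 403."""


@dataclass(frozen=True)
class AuditRow:
    actor_sub: str
    actor_email: str
    action: str
    target: str
    ts: datetime


def guarded_admin_call(
    claims: dict,
    required_group: str,
    action: str,
    target: str,
    write_audit: Callable[[AuditRow], None],
    admin_call: Callable[[], Any],
) -> Any:
    """Authorize, write the audit row, and only then run the admin call."""
    if required_group not in claims.get("groups", []):
        raise Forbidden(f"{required_group} membership required")
    row = AuditRow(
        actor_sub=claims["sub"],
        actor_email=claims.get("email", ""),
        action=action,
        target=target,
        ts=datetime.now(timezone.utc),
    )
    # The audit row is persisted before any privileged call is attempted.
    write_audit(row)
    return admin_call()
```

Passing the audit writer and the admin call as callables keeps the ordering testable without a database or a live Keycloak; a denied request never reaches either of them.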
Why this is the enterprise norm
- Apache Ranger / Knox: the Ranger Admin UI calls the Ranger REST API, which holds Hadoop service-impersonation credentials. The browser never carries Hadoop Kerberos tickets.
- OpenShift Console: the console pod uses an OAuth client + ServiceAccount token to call kube-apiserver. The browser cookie talks to the console only.
- HashiCorp Boundary admin UI: same split. UI ↔ Boundary controller (with privileged credentials) ↔ targets.
- Snowflake / Dremio / DataHub: same shape. Frontend calls a backend that holds the privileged credentials for the identity store.
What we explicitly avoid
- Forwarding user bearers to the IdP admin plane — even with the audience mapper in place, this creates a privilege escalation path through cockpit JS.
- A "thin" cockpit backend that just relays calls — the backend must enforce its own RBAC + write its own audit log. Otherwise the delta from "nginx proxy with bearer" is zero.
Consequences
Positive
- Single point of audit (cockpit-backend → Postgres `akko_audit.cockpit_events`).
- Service account credentials stored in Vault/k8s Secret only, rotated via existing `helm/scripts/generate-dev-secrets.sh` flow.
- Frontend code shrinks: no more `kcFetch` wrapper, no more 401/403 handling spread across `branding/cockpit/app.js`.
- Keycloak `oauth2-proxy` client can drop the `realm-management` audience mapper once cockpit no longer calls Admin REST directly.
- ADR-039 alignment: per-tenant cockpit-backend with the customer's IdP client credentials, never AKKO-side hardcoded admin keys.
Negative
- New sub-chart to maintain (`akko-cockpit-backend`). Sprint 57.5 estimate: 5 days (3 dev + 2 test + audit log schema).
- Each governance feature in cockpit JS needs a backend handler; current `kcFetch(...)` call sites become `fetch('/api/governance/…')`. Total ~25 endpoints (groups, users, role-mappings, scope-mappings, client-scopes, attributes).
- Migration is staged: we keep the audience-mapper hook + nginx proxy active until the backend covers 100% of cockpit governance JS calls.
Implementation roadmap
| Sprint | Deliverable |
|---|---|
| Sprint 56 (today) | Audience mapper short-term fix + this ADR + xfail E2E test + memory entry. Demo unblocked. |
| Sprint 57.5 D1-2 | Scaffold akko-cockpit-backend sub-chart: FastAPI app, NetworkPolicy, Service, Deployment, Ingress (/api/governance/*). Service account akko-cockpit-backend in realm with realm-admin clientRole. Vault-stored client secret. |
| Sprint 57.5 D3 | First three endpoints migrated: GET /groups, POST /groups/{id}/role-mappings/realm, PUT /groups/{id} (attributes). Each writes audit row + checks akko-admin group. Backend pytest test suite (12 cases). |
| Sprint 57.5 D4 | Remaining 22 endpoints migrated. Cockpit JS removes kcFetch. nginx /api/keycloak/admin/ location deleted from akko-cockpit/templates/configmap.yaml. |
| Sprint 57.5 D5 | Audience mapper hook + realm patch removed. E2E test test_persona_alice_governance.py flips from xfail to required. PR + ADR-040 closeout. |
Open questions
- Audit retention: 90 days hot in Postgres, 7 years cold in Iceberg (DPIA Art. 30 alignment). To resolve in Sprint 57.5 D2.
- Backend language: FastAPI is the AKKO default (akko-rag, ADEN, catalog-manager). No reason to deviate.
- Per-tenant credential isolation: cockpit-backend must not hardcode the realm name "akko"; it is templated via env var `AKKO_KEYCLOAK_REALM`, derived from `global.auth.realm` (default: `akko`, customer override via Helm value).
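One way to honour the no-hardcoded-realm constraint in the backend is to fail fast when the variable is absent rather than fall back to a literal. A minimal sketch, assuming the chart always injects `AKKO_KEYCLOAK_REALM`; the function name `admin_base_url` and the example host are hypothetical:

```python
import os
from typing import Mapping


def admin_base_url(keycloak_url: str, environ: Mapping[str, str] = os.environ) -> str:
    """Build the Admin REST base URL from the realm supplied by the environment."""
    realm = environ.get("AKKO_KEYCLOAK_REALM", "").strip()
    if not realm:
        # No in-code default: the realm must come from the Helm-provided env var.
        raise RuntimeError("AKKO_KEYCLOAK_REALM is not set")
    return f"{keycloak_url.rstrip('/')}/admin/realms/{realm}"
```

With this shape the `akko` default lives only in the Helm values, so a customer override changes the realm without touching backend code.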
References
- gotcha_keycloak_realm_drift (why Keycloak realm imports are not idempotent, and why we run post-upgrade Helm hooks)
- ADR-038 (`sub` claim is the stable identifier; same matching applies to the cockpit-backend audit log)
- ADR-039 (no hardcoded identities; applies to the service account credentials, which must come from the customer's secret store, not the umbrella chart's defaults)
- feedback_no_bricolage_research_first (this ADR is the research-first rationale documented as required by the rule)
- Apache Ranger Admin Architecture (Ranger Admin Server + REST API + Postgres) — public docs at https://ranger.apache.org/
- OpenShift Console source (`pkg/server/middleware.go`, service account bearer pattern)