Security Flow
AKKO enforces a four-layer defence that carries a user's identity from the browser through to the data plane:
- Authentication — Keycloak SSO (single source of truth, 13 OAuth clients, 5 roles).
- Authorization — OPA (fine-grained, data-aware policies synced from Keycloak groups).
- Query enforcement — Trino row-filters + column-masks via the OPA plugin.
- Audit — structured logs (Loki) + Keycloak events + Trino query log + OPA decision log.
End-to-End Auth Flow
sequenceDiagram
participant B as Browser
participant T as Traefik (TLS)
participant O2P as oauth2-proxy
participant S as Service (Superset / JupyterHub)
participant KC as Keycloak
participant OPA as OPA
participant TR as Trino
participant OM as OpenMetadata
B->>T: https://bi.akko.local
T->>O2P: ForwardAuth check (services w/o native OIDC)
O2P->>KC: redirect to /auth (OIDC)
B->>KC: login (user + password / MFA)
KC-->>B: ID token + access token (with roles claim)
B->>S: request with bearer token
S->>KC: validate JWT (JWKS)
KC-->>S: ok
S->>TR: SQL with X-Trino-User + token
TR->>OPA: POST /v1/data/trino/allow {user, roles, query, table, columns}
OPA->>KC: pull group memberships (synced cache)
OPA-->>TR: allow + row_filters + column_masks
TR->>OM: resolve PII tags (cached)
TR->>TR: rewrite query (apply filters + masks)
TR-->>S: masked rows
S-->>B: chart / table
note over OPA,KC: OPA polls every 60s for Keycloak group sync
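The authorization call from Trino to OPA in the diagram above can be sketched as follows. This is a minimal illustration, assuming the simplified field names shown in the diagram (`user`, `roles`, `query`, `table`, `columns`); the helper name `build_opa_request` and the OPA address are hypothetical, and the input schema sent by the real plugin may be richer. The one fixed point is that OPA's Data API reads the request document from a top-level `input` key.

```python
import json

# Hypothetical OPA address; the path mirrors the diagram above.
OPA_URL = "http://opa:8181/v1/data/trino/allow"


def build_opa_request(user, roles, query, table, columns):
    """Return the JSON body for an OPA Data API query.

    OPA exposes the request document to policies as `input`,
    so everything is wrapped under a top-level "input" key.
    """
    return json.dumps(
        {
            "input": {
                "user": user,
                "roles": sorted(roles),
                "query": query,
                "table": table,
                "columns": list(columns),
            }
        }
    )


body = build_opa_request(
    user="alice",
    roles={"akko-user"},
    query="SELECT email, amount FROM iceberg.banking.transactions",
    table="iceberg.banking.transactions",
    columns=["email", "amount"],
)
print(body)
```

The body would be sent with `POST` to `OPA_URL`; the network call is left out so the sketch runs without a live OPA instance.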
Roles and Groups
flowchart LR
subgraph KC[Keycloak realm akko]
R1[akko-admin]
R2[akko-engineer]
R3[akko-analyst]
R4[akko-user]
R5[akko-viewer]
G1[group: data-team]
G2[group: finance]
G3[group: svc-accounts]
end
subgraph LLDAP[LLDAP optional]
L1[corporate users]
L2[corporate groups]
end
LLDAP -.federated.-> KC
R1 -->|maps| OPA
R2 -->|maps| OPA
R3 -->|maps| OPA
R4 -->|maps| OPA
R5 -->|maps| OPA
G1 -->|project access| OPA
G2 -->|project access| OPA
G3 -->|machine tokens| OPA
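To make the role and group mapping concrete, the sketch below reads roles and groups out of a token's claims and shapes them into an identity record a policy could consume. It assumes realm roles sit under `realm_access.roles` and that a group-membership mapper emits a `groups` claim; both depend on how the realm's client mappers are configured. The function names are hypothetical, and the decoder deliberately skips signature verification, so it is for inspection only.

```python
import base64
import json


def decode_claims(jwt):
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    payload = jwt.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(payload))


def identity_from_claims(claims):
    """Map token claims to the identity fields a policy would read."""
    return {
        "user": claims.get("preferred_username"),
        # Assumed claim layout: realm roles under realm_access.roles.
        "roles": claims.get("realm_access", {}).get("roles", []),
        # Assumed: a group-membership mapper emits a "groups" claim.
        "groups": claims.get("groups", []),
    }


def _b64(obj):
    """Encode a dict as an unpadded base64url JWT segment."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Unsigned demo token, built here only so the sketch is self-contained.
demo_token = ".".join(
    [
        _b64({"alg": "none"}),
        _b64(
            {
                "preferred_username": "alice",
                "realm_access": {"roles": ["akko-analyst"]},
                "groups": ["finance"],
            }
        ),
        "",
    ]
)

identity = identity_from_claims(decode_claims(demo_token))
print(identity)
```

In a real service the token would be validated against the Keycloak JWKS before any claim is trusted.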
Trino Row-Filter + Column-Mask
For every query, the Trino OPA plugin fetches the applicable row filters and column masks from OPA, and Trino applies them before execution.
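# package trino (excerpt)
# `roles` is defined elsewhere in the package.
# Helper rule: true when the caller holds a role allowed to see PII unmasked.
can_see_pii {
    unmasked := {"akko-admin", "akko-engineer", "akko-analyst"}
    unmasked[roles[_]]
}
# Columns to mask for callers without such a role.
mask_email[columns] {
    columns = {"email", "phone", "ssn", "dob"}
    not can_see_pii
}
# Members of the finance group only see EU and APAC rows.
row_filter[{"filter": f}] {
    input.context.identity.groups[_] == "finance"
    f := "region IN ('EU', 'APAC')"
}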
Example: a user with role akko-user who belongs to the finance group issues:
SELECT email, amount
FROM iceberg.banking.transactions;
Trino rewrites this to:
SELECT '***masked***' AS email, amount
FROM iceberg.banking.transactions
WHERE region IN ('EU', 'APAC');
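The effect of the rewrite can be imitated with a few lines of string handling. This is only a toy model, assuming a flat column list and a single filter expression; Trino applies masks and filters inside its query analyzer rather than by concatenating strings, and `rewrite_select` is a hypothetical name.

```python
def rewrite_select(columns, table, masked, row_filter=None):
    """Build a SELECT where masked columns become a literal and an
    optional row filter is appended as a WHERE clause."""
    select_list = [
        f"'***masked***' AS {col}" if col in masked else col
        for col in columns
    ]
    sql = f"SELECT {', '.join(select_list)}\nFROM {table}"
    if row_filter:
        sql += f"\nWHERE {row_filter}"
    return sql + ";"


rewritten = rewrite_select(
    columns=["email", "amount"],
    table="iceberg.banking.transactions",
    masked={"email", "phone", "ssn", "dob"},
    row_filter="region IN ('EU', 'APAC')",
)
print(rewritten)
```

With these inputs the function reproduces the rewritten statement shown above.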
PII Tag Sync: OpenMetadata -> OPA
flowchart LR
OM[OpenMetadata<br/>column tags: PII, GDPR.Personal] -->|nightly ingest| REG[tag cache]
REG -->|/policies/pii| OPA
OPA -->|column_mask| TR[Trino]
The opa-sync sidecar polls OpenMetadata every 5 minutes and pushes PII column tags as OPA data. New columns tagged PII are masked automatically — no policy redeploy required.
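One polling cycle of such a sidecar might look like the sketch below. It assumes table metadata arrives as dictionaries whose columns carry a `fullyQualifiedName` and a list of tags with a `tagFQN` field, and it pushes the result with a `PUT` to OPA's Data API. The data path `pii/columns`, the function names, and the OPA address are hypothetical; the actual opa-sync implementation is not shown in this document.

```python
import json
import urllib.request


def pii_columns(tables):
    """Collect fully qualified names of columns carrying a PII-like tag."""
    prefixes = ("PII", "GDPR.Personal")
    found = set()
    for table in tables:
        for column in table.get("columns", []):
            tags = [t.get("tagFQN", "") for t in column.get("tags", [])]
            if any(tag.startswith(prefixes) for tag in tags):
                found.add(column["fullyQualifiedName"])
    return sorted(found)


def build_push_request(opa_base, columns):
    """Prepare (but do not send) a PUT that overwrites an OPA data document."""
    return urllib.request.Request(
        url=f"{opa_base}/v1/data/pii/columns",  # hypothetical data path
        data=json.dumps(columns).encode(),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Sample metadata in the assumed shape.
tables = [
    {
        "columns": [
            {
                "fullyQualifiedName": "iceberg.banking.customers.email",
                "tags": [{"tagFQN": "PII.Sensitive"}],
            },
            {
                "fullyQualifiedName": "iceberg.banking.customers.segment",
                "tags": [],
            },
        ]
    }
]

request = build_push_request("http://opa:8181", pii_columns(tables))
print(request.get_method(), request.full_url, request.data.decode())
```

Passing `request` to `urllib.request.urlopen` would perform the push. Because policies read the column list from data rather than hard-coding it, a newly tagged column takes effect without redeploying policy.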
oauth2-proxy ForwardAuth
Services without native OIDC (MLflow, Grafana legacy mode, Harbor UI in some modes) are gated via the Traefik ForwardAuth middleware pointing at oauth2-proxy.
sequenceDiagram
participant B as Browser
participant TR as Traefik
participant O2 as oauth2-proxy
participant MLF as MLflow
B->>TR: https://experiments.akko.local
TR->>O2: forward-auth
alt no valid cookie
O2-->>B: redirect to Keycloak
B->>O2: code + state
O2-->>B: set cookie
B->>TR: retry
end
TR->>MLF: pass-through with X-Auth-User header
MLF-->>B: UI
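The decision at the heart of this flow is small: a 2xx answer from the auth endpoint lets Traefik pass the request through, and any other answer is returned to the browser. The sketch below models that decision with an in-memory session table. The function name, the session store, and the login URL are hypothetical; `_oauth2_proxy` is assumed as the cookie name, and `X-Auth-User` follows the diagram above.

```python
def forward_auth(cookies, sessions, login_url):
    """Return (status, headers) the way a ForwardAuth endpoint might answer.

    A 2xx status tells the reverse proxy to forward the original request;
    any other status is relayed to the browser as-is.
    """
    user = sessions.get(cookies.get("_oauth2_proxy", ""))
    if user is None:
        # No valid session: send the browser to the identity provider.
        return 302, {"Location": login_url}
    # Valid session: expose the user to the upstream service.
    return 202, {"X-Auth-User": user}


sessions = {"abc123": "alice"}  # hypothetical session store
login = "https://keycloak.example/auth"  # hypothetical login URL

print(forward_auth({}, sessions, login))
print(forward_auth({"_oauth2_proxy": "abc123"}, sessions, login))
```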
Audit Trail
Four correlated log streams land in Grafana for forensic review:
| Source | Format | Examples |
|---|---|---|
| Keycloak events | JSON via keycloak-event-listener-http | LOGIN, LOGOUT, UPDATE_PASSWORD, TOKEN_REFRESH |
| OPA decision log | JSON via console decision log | {user, input, result, policy} |
| Trino query log | JSON via event-listener | {queryId, user, sql, bytes_scanned, duration} |
| ADEN audit_log | JSON to Loki | aden_query_received, aden_opa_denied, aden_pii_masked |
See Audit Playbook and the Kubernetes API audit for the full retention and correlation setup.
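As a rough illustration of correlation, the sketch below merges JSON log lines from several streams into a per-user timeline. It assumes every event carries a `user` field, as in the table above, plus an ISO-8601 `ts` timestamp; the `ts` field name and the function name are hypothetical, and the playbook's real correlation keys may differ.

```python
import json
from collections import defaultdict


def correlate_by_user(*streams):
    """Group JSON log lines from several (source, lines) streams by user.

    Each user's events are sorted by the assumed "ts" field, which orders
    correctly as a string when all timestamps are ISO-8601 in one timezone.
    """
    timeline = defaultdict(list)
    for source, lines in streams:
        for line in lines:
            event = json.loads(line)
            user = event.get("user")
            if user is not None:
                timeline[user].append((event.get("ts", ""), source, event))
    for events in timeline.values():
        events.sort(key=lambda item: item[0])
    return dict(timeline)


trino_lines = [
    '{"ts": "2024-01-01T10:00:02Z", "queryId": "q1", "user": "alice", "sql": "SELECT 1"}'
]
opa_lines = ['{"ts": "2024-01-01T10:00:01Z", "user": "alice", "result": true}']

timeline = correlate_by_user(("trino", trino_lines), ("opa", opa_lines))
for ts, source, event in timeline["alice"]:
    print(ts, source)
```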
NetworkPolicies
All namespaces enforce NetworkPolicies:
- Egress to the public Internet is denied by default.
- Only ollama-init gets an allowlist for the Ollama model registry at install time (disabled in air-gapped mode).
- Inter-service traffic is allowed only between explicit service pairs (Trino <-> Polaris, Airflow <-> Trino, etc.).
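A default-deny egress rule of the kind described above can be expressed as a NetworkPolicy that selects every pod in a namespace and lists no egress rules. The sketch below builds such a manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts. The policy name, the namespace, and the helper name are hypothetical; the manifests AKKO actually ships are not reproduced here.

```python
import json


def default_deny_egress(namespace):
    """Build a NetworkPolicy that blocks all egress for pods in a namespace.

    An empty podSelector matches every pod; declaring the Egress policy
    type with no egress rules leaves nothing allowed.
    """
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-egress", "namespace": namespace},
        "spec": {
            "podSelector": {},
            "policyTypes": ["Egress"],
        },
    }


policy = default_deny_egress("akko")  # hypothetical namespace name
print(json.dumps(policy, indent=2))
```

A policy like this also blocks DNS and in-cluster traffic, so each permitted service pair needs its own allow rule layered on top.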