Security Flow

AKKO enforces a four-layer defence that carries a user's identity from the browser all the way to the data plane:

  1. Authentication — Keycloak SSO (single source of truth, 13 OAuth clients, 5 roles).
  2. Authorization — OPA (fine-grained, data-aware policies synced from Keycloak groups).
  3. Query enforcement — Trino row-filters + column-masks via the OPA plugin.
  4. Audit — structured logs (Loki) + Keycloak events + Trino query log + OPA decision log.

End-to-End Auth Flow

sequenceDiagram
    participant B as Browser
    participant T as Traefik (TLS)
    participant O2P as oauth2-proxy
    participant S as Service (Superset / JupyterHub)
    participant KC as Keycloak
    participant OPA as OPA
    participant TR as Trino
    participant OM as OpenMetadata

    B->>T: https://bi.akko.local
    T->>O2P: ForwardAuth check (services w/o native OIDC)
    O2P->>KC: redirect to /auth (OIDC)
    B->>KC: login (user + password / MFA)
    KC-->>B: ID token + access token (with roles claim)
    B->>S: request with bearer token
    S->>KC: validate JWT (JWKS)
    KC-->>S: ok
    S->>TR: SQL with X-Trino-User + token
    TR->>OPA: POST /v1/data/trino/allow {user, roles, query, table, columns}
    OPA->>KC: pull group memberships (synced cache)
    OPA-->>TR: allow + row_filters + column_masks
    TR->>OM: resolve PII tags (cached)
    TR->>TR: rewrite query (apply filters + masks)
    TR-->>S: masked rows
    S-->>B: chart / table
    note over OPA,KC: OPA polls Keycloak every 60s for group sync
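
The Trino-to-OPA step in the diagram can be sketched as follows. This is a minimal illustration, not the plugin's actual wire format: the host `opa:8181`, the flat input fields, and the assumption that the result is an object carrying `allow`, `row_filters` and `column_masks` are taken from the diagram labels or invented for the example. Only the `input` request wrapper and the `result` response wrapper are standard OPA Data API behaviour.

```python
import json
from urllib import request

# Assumed host and port; the path comes from the diagram above.
OPA_URL = "http://opa:8181/v1/data/trino/allow"


def build_opa_request(user, roles, query, table, columns):
    """Build (but do not send) the policy query that would be POSTed to OPA."""
    # OPA's Data API expects the document under a top-level "input" key.
    body = {"input": {"user": user, "roles": roles, "query": query,
                      "table": table, "columns": columns}}
    return request.Request(
        OPA_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_decision(raw):
    """Read allow / row_filters / column_masks from an OPA response body."""
    # OPA omits "result" when the rule is undefined; treat that as a deny.
    result = json.loads(raw).get("result", {})
    if not isinstance(result, dict):
        # A plain boolean result carries no filters or masks.
        result = {"allow": bool(result)}
    return {
        "allow": bool(result.get("allow", False)),
        "row_filters": result.get("row_filters", []),
        "column_masks": result.get("column_masks", {}),
    }
```

Treating a missing `result` as a deny keeps the sketch fail-closed, which matches the intent of putting OPA in front of every query.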

Roles and Groups

flowchart LR
    subgraph KC[Keycloak realm akko]
        R1[akko-admin]
        R2[akko-engineer]
        R3[akko-analyst]
        R4[akko-user]
        R5[akko-viewer]
        G1[group: data-team]
        G2[group: finance]
        G3[group: svc-accounts]
    end
    subgraph LLDAP[LLDAP optional]
        L1[corporate users]
        L2[corporate groups]
    end
    LLDAP -.federated.-> KC
    R1 -->|maps| OPA
    R2 -->|maps| OPA
    R3 -->|maps| OPA
    R4 -->|maps| OPA
    R5 -->|maps| OPA
    G1 -->|project access| OPA
    G2 -->|project access| OPA
    G3 -->|machine tokens| OPA
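
To make the role and group mapping concrete, here is a hedged sketch of turning Keycloak token claims into the identity that a policy would see. Keycloak places realm roles under `realm_access.roles`; the `groups` claim is an assumption, since it only appears when a group-membership mapper is configured on the client. The function names are hypothetical, and the decoder deliberately skips signature verification, so it is suitable for inspection only.

```python
import base64
import json

# The five realm roles listed in the diagram above.
KNOWN_ROLES = {"akko-admin", "akko-engineer", "akko-analyst",
               "akko-user", "akko-viewer"}


def decode_claims(jwt):
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    payload = jwt.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore stripped base64url padding
    return json.loads(base64.urlsafe_b64decode(payload))


def identity_from_claims(claims):
    """Keep only AKKO roles and normalise group paths such as '/finance'."""
    roles = set(claims.get("realm_access", {}).get("roles", [])) & KNOWN_ROLES
    groups = [g.lstrip("/") for g in claims.get("groups", [])]
    return {
        "user": claims.get("preferred_username"),
        "roles": sorted(roles),
        "groups": groups,
    }
```

In a real service the token would first be validated against the realm's JWKS, as the end-to-end flow shows, before any claim is trusted.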

Trino Row-Filter + Column-Mask

Before executing a query, Trino applies the row filters and column masks returned by the OPA plugin.

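# package trino (excerpt)
import future.keywords.in

# True when the caller holds at least one role cleared to see raw PII.
# `roles` is assumed to be defined elsewhere in the package.
has_pii_role {
    roles[_] in {"akko-admin", "akko-engineer", "akko-analyst"}
}

# Mask the PII columns for every caller without such a role.
mask_email[columns] {
    columns := {"email", "phone", "ssn", "dob"}
    not has_pii_role
}

# Members of the finance group only see EU and APAC rows.
row_filter[{"filter": f}] {
    input.context.identity.groups[_] == "finance"
    f := "region IN ('EU', 'APAC')"
}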

Example: a user with role akko-user who belongs to the finance group issues:

SELECT email, amount FROM iceberg.banking.transactions;

Trino rewrites this to:

SELECT '***masked***' AS email, amount
FROM iceberg.banking.transactions
WHERE region IN ('EU', 'APAC');
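
The effect of the rewrite can be imitated at string level with a short sketch. Trino applies masks and filters internally during query planning, so the helper below (`rewrite_select`, a hypothetical name) is only an illustration of the outcome for a simple single-table SELECT, not how the engine works.

```python
MASK_LITERAL = "'***masked***'"


def rewrite_select(columns, table, masked, row_filters):
    """Return a SELECT with masked columns replaced and row filters appended."""
    # Swap each masked column for a constant, keeping its name as the alias.
    select_list = [f"{MASK_LITERAL} AS {c}" if c in masked else c
                   for c in columns]
    sql = f"SELECT {', '.join(select_list)}\nFROM {table}"
    if row_filters:
        # Parenthesise only when several filters must be AND-ed together.
        if len(row_filters) > 1:
            where = " AND ".join(f"({f})" for f in row_filters)
        else:
            where = row_filters[0]
        sql += "\nWHERE " + where
    return sql + ";"


print(rewrite_select(
    ["email", "amount"],
    "iceberg.banking.transactions",
    {"email", "phone", "ssn", "dob"},
    ["region IN ('EU', 'APAC')"],
))
```

With these inputs the sketch prints the same rewritten statement as the example above.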

PII Tag Sync — OpenMetadata -> OPA

flowchart LR
    OM[OpenMetadata<br/>column tags: PII, GDPR.Personal] -->|nightly ingest| REG[tag cache]
    REG -->|/policies/pii| OPA
    OPA -->|column_mask| TR[Trino]

The opa-sync sidecar polls OpenMetadata every 5 minutes and pushes PII column tags as OPA data. New columns tagged PII are masked automatically — no policy redeploy required.
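
One pass of such a sync could look roughly like the sketch below. The input row shape (`table`, `column`, `tags`), the function names, and the OPA base URL are all assumptions made for illustration; how the sidecar actually reads tags from OpenMetadata is not shown. Pushing a document with `PUT /v1/data/<path>` is standard OPA Data API usage, and the `policies/pii` path is borrowed from the diagram.

```python
import json
from urllib import request

# Tag names taken from the diagram above.
PII_TAGS = {"PII", "GDPR.Personal"}


def pii_document(tagged_columns):
    """Map each table name to the sorted list of its PII-tagged columns."""
    doc = {}
    for row in tagged_columns:
        if PII_TAGS & set(row["tags"]):
            doc.setdefault(row["table"], []).append(row["column"])
    return {table: sorted(cols) for table, cols in doc.items()}


def opa_put_request(doc, base_url="http://opa:8181"):
    """Build (but do not send) the request that overwrites the PII document."""
    return request.Request(
        f"{base_url}/v1/data/policies/pii",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
```

Because the columns live in OPA data rather than in the policy source, a newly tagged column changes masking on the next push without touching any Rego.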

oauth2-proxy ForwardAuth

Services without native OIDC (MLflow, Grafana legacy mode, Harbor UI in some modes) are gated via the Traefik ForwardAuth middleware pointing at oauth2-proxy.

sequenceDiagram
    participant B as Browser
    participant TR as Traefik
    participant O2 as oauth2-proxy
    participant MLF as MLflow

    B->>TR: https://experiments.akko.local
    TR->>O2: forward-auth
    alt no valid cookie
        O2-->>B: redirect to Keycloak
        B->>O2: code + state
        O2-->>B: set cookie
        B->>TR: retry
    end
    TR->>MLF: pass-through with X-Auth-User header
    MLF-->>B: UI
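
The decision that the ForwardAuth endpoint makes can be reduced to a few lines. This is a simplified sketch, not oauth2-proxy's implementation: the in-memory `sessions` mapping and the `forward_auth` function are hypothetical, while `_oauth2_proxy` is the proxy's default cookie name. Traefik forwards the original request when the auth endpoint answers with a 2xx status and otherwise returns the auth endpoint's response to the browser.

```python
from dataclasses import dataclass, field


@dataclass
class AuthResult:
    status: int
    headers: dict = field(default_factory=dict)


def forward_auth(cookies, sessions, login_url):
    """Decide whether Traefik may pass the request through to the service."""
    # Look up the session cookie in a hypothetical session store.
    user = sessions.get(cookies.get("_oauth2_proxy", ""))
    if user is None:
        # No valid session: send the browser to the Keycloak login flow.
        return AuthResult(302, {"Location": login_url})
    # Valid session: a 2xx lets Traefik forward and copy the identity header.
    return AuthResult(200, {"X-Auth-User": user})
```

The upstream service (MLflow in the diagram) then only needs to trust the `X-Auth-User` header, provided it is reachable solely through Traefik.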

Audit Trail

Four correlated log streams land in Grafana for forensic review:

| Source | Format | Examples |
|---|---|---|
| Keycloak events | JSON via keycloak-event-listener-http | LOGIN, LOGOUT, UPDATE_PASSWORD, TOKEN_REFRESH |
| OPA decision log | JSON via console decision log | {user, input, result, policy} |
| Trino query log | JSON via event-listener | {queryId, user, sql, bytes_scanned, duration} |
| ADEN audit_log | JSON to Loki | aden_query_received, aden_opa_denied, aden_pii_masked |
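
As a rough illustration of what "correlated" can mean here, the sketch below groups OPA decisions and Trino queries by their shared `user` field. The field names come from the examples listed for each source; the function itself is hypothetical, and a real setup would more likely join on timestamps or query IDs inside Grafana.

```python
from collections import defaultdict


def correlate_by_user(opa_decisions, trino_queries):
    """Group OPA decision records and Trino query records per user."""
    timeline = defaultdict(lambda: {"decisions": [], "queries": []})
    for decision in opa_decisions:
        timeline[decision["user"]]["decisions"].append(decision)
    for query in trino_queries:
        timeline[query["user"]]["queries"].append(query)
    return dict(timeline)
```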

See Audit Playbook and the Kubernetes API audit for the full retention and correlation setup.

NetworkPolicies

All namespaces enforce NetworkPolicies:

  • Egress to the public Internet is denied by default.
  • Only ollama-init gets an allowlist for the Ollama model registry at install time (disabled in air-gapped mode).
  • Inter-service traffic is allowed only between explicit service pairs (Trino <-> Polaris, Airflow <-> Trino, etc.).
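
The default-deny and explicit-pair rules above correspond to two kinds of NetworkPolicy manifests, sketched here as Python dictionaries. The namespace, the `app` pod labels, the policy names, and the port are assumptions for illustration; the manifest structure follows the `networking.k8s.io/v1` API.

```python
import json


def default_deny_egress(namespace):
    """Select every pod in the namespace and allow no egress at all."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-egress", "namespace": namespace},
        # An empty podSelector matches all pods; no egress rules means deny.
        "spec": {"podSelector": {}, "policyTypes": ["Egress"]},
    }


def allow_pair(namespace, src_app, dst_app, port):
    """Allow egress from src pods to dst pods on one TCP port."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": f"allow-{src_app}-to-{dst_app}",
                     "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app": src_app}},
            "policyTypes": ["Egress"],
            "egress": [{
                "to": [{"podSelector": {"matchLabels": {"app": dst_app}}}],
                "ports": [{"protocol": "TCP", "port": port}],
            }],
        },
    }


# Hypothetical namespace and port, for illustration only.
print(json.dumps(allow_pair("akko", "trino", "polaris", 8181), indent=2))
```

Note that a blanket egress deny also blocks DNS lookups, so clusters using this pattern usually add a separate rule permitting traffic to the cluster DNS service.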