
akko-policy-sync — PII auto-propagation

akko-policy-sync closes the loop between data classification and access enforcement. A data steward classifies a column as PII.Sensitive in Catalog. The daemon picks up the new tag on its next sync cycle, computes the matching column masks for every non-admin subject whose data scope covers the table, and PATCHes the Policy engine bundle ConfigMap. Query engine then reloads the bundle and starts redacting the column for analyst queries. With the default syncIntervalSeconds of 1800, that happens within 30 minutes of the classification, or in under 5 seconds when the steward triggers a manual sync from the cockpit.

The daemon is off by default. Flip enabled: true only after provisioning a Catalog bot JWT.

Sprint 67 PR-D / ADR-055 — V1 timer + manual trigger

V1 ships the periodic scheduler (30 min), the manual trigger, the cockpit panel and OCSF 3007 audit. V2 will add the OM EventListener webhook once customer feedback shows the 30-min window is too long. See ADR-055.
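
The two V1 triggers can share one loop. The sketch below assumes the timer runs in-process on asyncio (the architecture diagram labels it a CronJob, so the real deployment may differ); `run_scheduler`, `sync_once`, `trigger` and `stop` are hypothetical names used only for illustration.

```python
import asyncio


async def run_scheduler(sync_once, interval_seconds, trigger, stop):
    """Run sync_once every interval_seconds, or sooner when trigger is set."""
    while not stop.is_set():
        # Clear before running so a trigger raised mid-run is not lost.
        trigger.clear()
        await sync_once()
        try:
            await asyncio.wait_for(trigger.wait(), timeout=interval_seconds)
        except asyncio.TimeoutError:
            pass  # interval elapsed: fall through to the periodic run
```

A manual trigger handler would only need to call `trigger.set()`; the loop wakes, runs one sync, and goes back to waiting for the next interval.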

When to use it

Turn it on when all four are true:

  1. Catalog is the source of truth for column-level PII tagging in your deployment.
  2. The Policy engine bundle holds your row + column masking policies (default AKKO topology — Sprint 67 PR-C).
  3. You want classification work to drive enforcement automatically, without manual policy edits.
  4. You can create a Catalog (OpenMetadata, OM) bot user with read access to /api/v1/tables and /api/v1/classifications.

If any of those is false, leave enabled: false. Stewards keep editing column masks manually through the Data Access cockpit page (Sprint 67 PR-C wizard) — that loop already works on its own.

Architecture

flowchart LR
  subgraph Triggers
    T1[CronJob 30 min]
    T2[POST /api/v1/sync<br/>cockpit panel]
  end
  T1 --> SVC
  T2 --> SVC
  SVC[akko-policy-sync<br/>FastAPI + asyncio]
  SVC -->|GET tags<br/>Bearer JWT| OM[OpenMetadata<br/>:8585]
  SVC -->|PATCH ConfigMap<br/>If-Match resourceVersion| OPA[OPA bundle<br/>ConfigMap]
  SVC -->|HTTP /insert/jsonline<br/>OCSF 3007| VL[VictoriaLogs<br/>:9428]
  OPA --> Trino[Trino enforcement<br/>SHA-256 redaction]
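
The PATCH edge in the diagram carries a resourceVersion precondition, so a write built on a stale read is rejected instead of overwriting another writer's change. The following is a minimal in-memory sketch of that compare-and-swap behaviour. `FakeConfigMapStore`, `ConflictError` and `publish_with_retry` are hypothetical names; this is neither the daemon's code nor the Kubernetes client API.

```python
from dataclasses import dataclass, field


class ConflictError(Exception):
    """Stands in for the HTTP 409 returned on a stale resourceVersion."""


@dataclass
class FakeConfigMapStore:
    """In-memory stand-in for the bundle ConfigMap (hypothetical)."""
    data: dict = field(default_factory=dict)
    resource_version: int = 1

    def get(self):
        # Return a copy so callers cannot mutate the stored state directly.
        return dict(self.data), self.resource_version

    def patch(self, new_data, expected_version):
        # Reject the write if the object changed since the caller read it.
        if expected_version != self.resource_version:
            raise ConflictError("resourceVersion mismatch")
        self.data = dict(new_data)
        self.resource_version += 1


def publish_with_retry(store, mutate, max_attempts=3):
    """Read, modify and write back; re-read and retry on conflict."""
    for _ in range(max_attempts):
        data, version = store.get()
        try:
            store.patch(mutate(data), expected_version=version)
            return True
        except ConflictError:
            continue
    return False
```

Retrying from a fresh read, rather than forcing the write, is what keeps a concurrent Data Access edit from being silently lost.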

Configuration

The umbrella values.yaml exposes akko-policy-sync. The minimum viable overlay is:

akko-policy-sync:
  enabled: true
  openmetadata:
    baseUrl: http://akko-openmetadata-server:8585
    jwtSecret:
      name: akko-openmetadata-bot-jwt
      key: token
  piiTagToMask:
    PII.Sensitive: sha256
    PII.NonSensitive: identity
  adminSubjects:
    - akko-admin
    - AD_admin
  syncIntervalSeconds: 1800

akko-cockpit-backend:
  policySync:
    url: http://akko-akko-policy-sync:8080

Replace akko with your release name if it differs.

Bot JWT — minting a Catalog token

# 1. Open the OM admin UI as a user with `Settings` / `Bots` access.
# 2. Create a new bot, e.g. `akko-policy-sync`.
# 3. Generate a JWT with role `DataStewardRole` (read tables/tags) +
#    a 1-year expiry.
# 4. Persist it in the namespace AKKO runs in:
kubectl -n akko create secret generic akko-openmetadata-bot-jwt \
  --from-literal=token="<paste-jwt-here>"

When the token rotates, update the Secret. The daemon re-reads the token file on every sync cycle, so no restart is required.
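
Re-reading on every cycle is what makes rotation restart-free. Here is a minimal sketch, assuming the Secret is mounted into the pod as a file; `read_bot_token`, `auth_headers` and the path argument are hypothetical, not the daemon's actual code.

```python
from pathlib import Path


def read_bot_token(token_path):
    """Read the bot JWT from disk; meant to be called once per sync cycle."""
    token = Path(token_path).read_text(encoding="utf-8").strip()
    if not token:
        raise ValueError(f"empty token file: {token_path}")
    return token


def auth_headers(token_path):
    # Build the header fresh each cycle so a rotated Secret is picked up.
    return {"Authorization": f"Bearer {read_bot_token(token_path)}"}
```

A Secret consumed as an environment variable is fixed at container start, so this pattern only works with a volume mount.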

Tuning the tag → mask map

piiTagToMask is the only product knob. Whatever OM tag you put on the left maps to whatever mask name you put on the right. The mask name must exist in the Policy engine mask_library (the default library ships sha256, md5, null, redact, identity and partial_last4).

piiTagToMask:
  PII.Sensitive:           sha256
  PII.NonSensitive:        identity
  PCI.PAN:                 partial_last4
  Compliance.HealthRecord: "null"  # quoted: a bare null is the YAML null value, not the mask name

A tag absent from the map produces no auto-propagation — the column passes through untouched.
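
The lookup described above can be sketched as follows. The function name, the input shape (`{column: [tags]}`) and the "first mapped tag wins" rule are assumptions made for illustration; the source does not say how the daemon resolves a column that carries several mapped tags.

```python
MASK_LIBRARY = {"sha256", "md5", "null", "redact", "identity", "partial_last4"}

PII_TAG_TO_MASK = {
    "PII.Sensitive": "sha256",
    "PII.NonSensitive": "identity",
    "PCI.PAN": "partial_last4",
    "Compliance.HealthRecord": "null",
}


def masks_for_columns(column_tags, tag_to_mask=PII_TAG_TO_MASK):
    """Map {column: [tags]} to {column: mask}; unmapped tags yield no entry."""
    result = {}
    for column, tags in column_tags.items():
        for tag in tags:
            mask = tag_to_mask.get(tag)
            if mask is None:
                continue  # tag absent from the map: column passes through
            if mask not in MASK_LIBRARY:
                raise ValueError(f"unknown mask {mask!r} for tag {tag!r}")
            result[column] = mask
            break  # assumption: first mapped tag wins
    return result
```

Failing loudly on an unknown mask name is a design choice of this sketch; it surfaces a typo in the map at sync time rather than at bundle reload.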

Exempting subjects

Subjects listed in adminSubjects are skipped: the propagator writes no masks for them, so their queries return plaintext. The default is [akko-admin, AD_admin]. Add any subject that must always see plaintext (auditors, on-call DBAs, break-glass accounts).
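
As an illustration of the exemption, the sketch below selects the subjects that would receive a mask for a given table. `subjects_to_mask` and the `{subject: set_of_tables}` scope shape are hypothetical.

```python
ADMIN_SUBJECTS = {"akko-admin", "AD_admin"}


def subjects_to_mask(subject_scopes, table, admin_subjects=ADMIN_SUBJECTS):
    """Return non-admin subjects whose data scope covers `table`, sorted."""
    return sorted(
        subject
        for subject, tables in subject_scopes.items()
        if table in tables and subject not in admin_subjects
    )
```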

Operating

Running a sync from the cockpit

Open https://demo.<your-domain>/#data-access, scroll down to PII Auto-Propagation, and click Run sync now. The grid below the toolbar updates with the run summary:

  • last_run timestamp + duration
  • tables_processed
  • subjects_affected
  • publish_failures (should stay at 0)
  • next scheduled run

The trigger is restricted to akko-admin and emits an OCSF 3007 event to the Logs layer (VictoriaLogs) with activity_id = OTHER and target = service:policy-sync. The Audit tab lists every manual run alongside the human-driven Data Access edits.

Inspecting from kubectl

kubectl -n akko port-forward svc/akko-akko-policy-sync 8080:8080 &
curl -s localhost:8080/api/v1/status   | jq .
curl -s localhost:8080/api/v1/config   | jq .
curl -s -X POST localhost:8080/api/v1/sync | jq .

Metrics layer (Prometheus)

akko_policy_sync_runs_total{outcome="success"}
akko_policy_sync_runs_total{outcome="failure"}
akko_policy_sync_tables_processed_total
akko_policy_sync_subjects_affected_total
akko_policy_sync_publish_failures_total
akko_policy_sync_last_run_timestamp_seconds

Publish failures on two consecutive runs (akko_policy_sync_publish_failures_total increasing each time) should page on-call. The most common root causes:

| Cause | Symptom | Fix |
|---|---|---|
| OM bot JWT expired | All runs fail with HTTP 401 from OM | Rotate the token and update the Secret |
| OM unreachable | connect-refused in pod logs | Check the OM Service and the NetworkPolicy egress on 8585 |
| Policy engine ConfigMap drift | HTTP 409 on PATCH | A second writer is editing the bundle; check the Data Access editor logs |
| piiTagToMask references a missing mask | Policy engine reload silently keeps the old bundle | Add the mask to the Policy engine mask_library overlay |
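
Because a `_total` metric is a cumulative counter, "failing for two consecutive runs" has to be read as two consecutive increases. One way to express that paging condition is sketched below, assuming one counter reading is taken after each completed run; `should_page` is a hypothetical helper, not a shipped alert rule.

```python
def should_page(publish_failure_samples):
    """True when the counter rose in each of the two most recent runs.

    `publish_failure_samples` holds one cumulative counter reading per
    completed run, oldest first.
    """
    if len(publish_failure_samples) < 3:
        return False  # need three readings to see two consecutive increases
    a, b, c = publish_failure_samples[-3:]
    return b > a and c > b
```

A counter reset after a pod restart would hide an increase, so a production rule would need to account for resets.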

Manual masks vs auto masks

The daemon tags every entry it writes with source: "auto-pii". On every sync cycle, only entries with that source are replaced. Manual masks added through the cockpit Data Access wizard (Sprint 67 PR-C) carry no source field, so the daemon leaves them untouched. Stewards can mix the two freely: a manual partial_last4 mask on customers.phone coexists with an auto-propagated sha256 mask on customers.email with no contention.
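
The replace-only-auto rule can be sketched as a small merge function. The entry shape (`column`, `mask`, optional `source`) and the name `merge_masks` are assumptions made for illustration; the real bundle layout may differ.

```python
AUTO_SOURCE = "auto-pii"


def merge_masks(existing, fresh_auto):
    """Replace auto-pii entries with the freshly computed set; keep the rest."""
    manual = [entry for entry in existing if entry.get("source") != AUTO_SOURCE]
    tagged = [{**entry, "source": AUTO_SOURCE} for entry in fresh_auto]
    return manual + tagged
```

An auto entry whose tag was removed in Catalog simply drops out of `fresh_auto` on the next cycle, while manual entries survive every sync.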

Audit trail

Every sync run produces one OCSF 3007 event with:

  • class_uid = 3007 (Data Access Management)
  • activity_id = 99 (OTHER)
  • actor.user.name = system:policy-sync for periodic runs, the admin's preferred_username for manual triggers
  • metadata.product.name = akko-policy-sync
  • unmapped.tables_processed, unmapped.subjects_affected, unmapped.publish_failures
  • resources = the full list of subjects whose policy was touched
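
The fields listed above can be assembled as in the following sketch. It covers only those fields; a real event likely carries more (timestamps, for example), and both the function names and the shape of `resources` (a flat list of subject names) are assumptions.

```python
import json


def build_audit_event(actor, tables_processed, subjects_affected,
                      publish_failures, resources):
    """Assemble the documented fields into one JSON-serialisable dict."""
    return {
        "class_uid": 3007,
        "activity_id": 99,  # OTHER
        "actor": {"user": {"name": actor}},
        "metadata": {"product": {"name": "akko-policy-sync"}},
        "unmapped": {
            "tables_processed": tables_processed,
            "subjects_affected": subjects_affected,
            "publish_failures": publish_failures,
        },
        "resources": list(resources),
    }


def to_jsonline(event):
    # One compact JSON object per line, as a JSON-lines endpoint expects.
    return json.dumps(event, separators=(",", ":")) + "\n"
```

For a periodic run the actor would be `system:policy-sync`; for a manual run, the triggering admin's username.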

Query in the Logs layer:

_msg:AKKO_AUDIT class_uid:3007 metadata.product.name:akko-policy-sync

Or open the cockpit Audit tab — entries appear in the same stream as human-driven edits.

Security

| Vector | Mitigation |
|---|---|
| Privilege escalation | Role scoped to a single ConfigMap by resourceNames, verbs limited to get / patch / update (no list / create / delete / *) |
| Tag injection from a compromised OM | The propagator can only set masks present in mask_library; a rogue tag mapped to no mask is a no-op |
| Network exfiltration | NetworkPolicy egress allow-list: OM 8585, VL 9428, kube API 443 + 6443 (kube-router post-DNAT); everything else denied |
| Audit bypass | Every run emits OCSF 3007; manual triggers carry the calling admin's identity in actor.user.name |
| Bot JWT theft | Token sits in a namespace-scoped Secret; rotate every 12 months |