akko-policy-sync — PII auto-propagation¶
akko-policy-sync closes the loop between data classification and
access enforcement. A data steward classifies a column as
PII.Sensitive in Catalog. The daemon picks up the new tag on
its next sync cycle, computes the matching column masks for every
non-admin subject whose data scope covers the table, and PATCHes the
Policy engine bundle ConfigMap. The query engine reloads the bundle and starts redacting
the column for analyst queries, within 30 minutes at the default sync
interval (syncIntervalSeconds: 1800), or under 5 seconds when the steward
triggers a manual sync from the cockpit.
The daemon is off by default. Flip enabled: true only after
provisioning a Catalog bot JWT.
Sprint 67 PR-D / ADR-055 — V1 timer + manual trigger
V1 ships the periodic scheduler (30 min), the manual trigger, the cockpit panel and OCSF 3007 audit. V2 will add the OpenMetadata (OM) EventListener webhook once customer feedback shows the 30-min window is too long. See ADR-055.
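The V1 combination of a timer and a manual trigger can be sketched with asyncio. This is a minimal illustration, assuming the timer runs in-process; `sync_loop`, `run_sync`, and `max_runs` are hypothetical names, not the daemon's actual code.

```python
import asyncio


async def sync_loop(run_sync, trigger: asyncio.Event, interval_s: float, max_runs: int):
    """Call run_sync(reason) every interval_s seconds, or sooner when trigger is set.

    max_runs bounds the loop so the sketch terminates; a real daemon would loop forever.
    """
    for _ in range(max_runs):
        try:
            # Wake up early if a manual trigger arrives before the timer expires.
            await asyncio.wait_for(trigger.wait(), timeout=interval_s)
            reason = "manual"
        except asyncio.TimeoutError:
            reason = "timer"
        trigger.clear()
        await run_sync(reason)


async def demo():
    seen = []

    async def run_sync(reason):
        seen.append(reason)

    trigger = asyncio.Event()
    trigger.set()  # simulate a manual sync request from the cockpit
    await sync_loop(run_sync, trigger, interval_s=0.01, max_runs=2)
    return seen
```

In this sketch a manual request simply shortens the wait, so the periodic and manual paths share the same sync routine.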
When to use it¶
Turn it on when all four are true :
- Catalog is the source of truth for column-level PII tagging in your deployment.
- The Policy engine bundle holds your row + column masking policies (default AKKO topology — Sprint 67 PR-C).
- You want classification work to drive enforcement automatically, without manual policy edits.
- You can mint an OM bot user with read access to
/api/v1/tables and /api/v1/classifications.
If any of those is false, leave enabled: false. Stewards keep
editing column masks manually through the Data Access cockpit page
(Sprint 67 PR-C wizard) — that loop already works on its own.
Architecture¶
flowchart LR
subgraph Triggers
T1[CronJob 30 min]
T2[POST /api/v1/sync<br/>cockpit panel]
end
T1 --> SVC
T2 --> SVC
SVC[akko-policy-sync<br/>FastAPI + asyncio]
SVC -->|GET tags<br/>Bearer JWT| OM[OpenMetadata<br/>:8585]
SVC -->|PATCH ConfigMap<br/>If-Match resourceVersion| OPA[OPA bundle<br/>ConfigMap]
SVC -->|HTTP /insert/jsonline<br/>OCSF 3007| VL[VictoriaLogs<br/>:9428]
OPA --> Trino[Trino enforcement<br/>SHA-256 redaction]
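The resourceVersion label on the ConfigMap edge implies optimistic concurrency: a write only succeeds if the object has not changed since it was read. The sketch below models that with an in-memory stand-in; `FakeConfigMap`, `Conflict`, and `publish` are hypothetical names, this is not the Kubernetes client, and the retry loop is an assumption (the daemon may simply record a publish failure instead).

```python
from dataclasses import dataclass, field


class Conflict(Exception):
    """Stand-in for the HTTP 409 returned on a stale resourceVersion."""


@dataclass
class FakeConfigMap:
    """In-memory stand-in for the bundle ConfigMap (hypothetical)."""
    data: dict = field(default_factory=dict)
    resource_version: int = 1

    def patch(self, new_data: dict, expected_version: int) -> int:
        # Reject the write if another writer changed the object since it was read.
        if expected_version != self.resource_version:
            raise Conflict(f"expected {expected_version}, found {self.resource_version}")
        self.data = {**self.data, **new_data}
        self.resource_version += 1
        return self.resource_version


def publish(cm: FakeConfigMap, build_patch, max_attempts: int = 3) -> int:
    """Read, compute a patch, write; on conflict re-read and try again."""
    for _ in range(max_attempts):
        seen_version = cm.resource_version
        patch = build_patch(dict(cm.data))
        try:
            return cm.patch(patch, seen_version)
        except Conflict:
            continue  # someone else wrote first: recompute against fresh data
    raise Conflict("gave up after repeated conflicts")
```

A persistent conflict in this model corresponds to a second writer editing the bundle at the same time.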
Configuration¶
The umbrella values.yaml exposes akko-policy-sync. The minimum
viable overlay is :
akko-policy-sync:
  enabled: true
  openmetadata:
    baseUrl: http://akko-openmetadata-server:8585
    jwtSecret:
      name: akko-openmetadata-bot-jwt
      key: token
  piiTagToMask:
    PII.Sensitive: sha256
    PII.NonSensitive: identity
  adminSubjects:
    - akko-admin
    - AD_admin
  syncIntervalSeconds: 1800
akko-cockpit-backend:
  policySync:
    url: http://akko-akko-policy-sync:8080
Replace akko with your release name if it differs.
Bot JWT — minting a Catalog token¶
# 1. Open the OM admin UI as a user with `Settings` / `Bots` access.
# 2. Create a new bot, e.g. `akko-policy-sync`.
# 3. Generate a JWT with role `DataStewardRole` (read tables/tags) +
# a 1-year expiry.
# 4. Persist it in the namespace AKKO runs in :
kubectl -n akko create secret generic akko-openmetadata-bot-jwt \
--from-literal=token="<paste-jwt-here>"
When the token rotates, update the Secret. The daemon re-reads the file every sync cycle ; no restart required.
Tuning the tag → mask map¶
piiTagToMask is the only product knob. Whatever OM tag you put
on the left maps to whatever mask name you put on the right. The
mask name must exist in the Policy engine mask_library (default ships
sha256, md5, null, redact, identity, partial_last4).
piiTagToMask:
  PII.Sensitive: sha256
  PII.NonSensitive: identity
  PCI.PAN: partial_last4
  # Quoted on purpose: a bare null is YAML's null value, not the mask named "null".
  Compliance.HealthRecord: "null"
A tag absent from the map produces no auto-propagation — the column passes through untouched.
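The lookup described above can be sketched as a small function. This is an illustration under assumptions: `resolve_masks` and the `{column: [tags]}` input shape are hypothetical, the "first mapped tag wins" rule is a guess (precedence is not specified here), and validating against mask_library up front is one possible design rather than confirmed daemon behavior.

```python
# Default mask names listed for the Policy engine mask_library.
MASK_LIBRARY = {"sha256", "md5", "null", "redact", "identity", "partial_last4"}


def resolve_masks(column_tags, tag_to_mask, mask_library=MASK_LIBRARY):
    """Map {column: [tags]} to {column: mask}; columns without a mapped tag get no entry."""
    masks = {}
    for column, tags in column_tags.items():
        for tag in tags:
            mask = tag_to_mask.get(tag)
            if mask is None:
                continue  # tag absent from the map: the column passes through untouched
            if mask not in mask_library:
                raise ValueError(f"mask {mask!r} for tag {tag!r} is not in mask_library")
            masks[column] = mask
            break  # assumption: the first mapped tag wins
    return masks
```

Failing fast on an unknown mask name would surface a typo in piiTagToMask before the bundle is published.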
Exempting subjects¶
Subjects in adminSubjects are skipped : the propagator never
generates masks for them, so their queries return plaintext.
Default = [akko-admin, AD_admin]. Add any subject that must always
see plaintext (auditors, on-call DBA, break-glass accounts).
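Subject selection can be pictured as a filter over data scopes. A minimal sketch, assuming scopes are available as a `{subject: set_of_tables}` mapping; that shape and the name `subjects_to_mask` are hypothetical.

```python
def subjects_to_mask(subject_scopes, table, admin_subjects):
    """Return the non-admin subjects whose data scope covers `table`, sorted by name."""
    admins = set(admin_subjects)
    return sorted(
        subject
        for subject, tables in subject_scopes.items()
        if subject not in admins and table in tables
    )
```

For example, with scopes `{"akko-admin": {"sales.customers"}, "analyst": {"sales.customers"}}` only `analyst` would receive a mask on `sales.customers`.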
Operating¶
Running a sync from the cockpit¶
Open https://demo.<your-domain>/#data-access, scroll down to PII
Auto-Propagation, click Run sync now. The grid below the
toolbar updates with the run summary :
- last_run timestamp + duration
- tables_processed
- subjects_affected
- publish_failures (should stay at 0)
- next scheduled run
The trigger is restricted to akko-admin and emits an OCSF 3007 event
to the Logs layer (VictoriaLogs) with activity_id = OTHER and target =
service:policy-sync. The Audit tab lists every manual run
alongside the human-driven Data Access edits.
Inspecting from kubectl¶
kubectl -n akko port-forward svc/akko-akko-policy-sync 8080:8080 &
curl -s localhost:8080/api/v1/status | jq .
curl -s localhost:8080/api/v1/config | jq .
curl -s -X POST localhost:8080/api/v1/sync | jq .
Metrics layer (Prometheus)¶
akko_policy_sync_runs_total{outcome="success"}
akko_policy_sync_runs_total{outcome="failure"}
akko_policy_sync_tables_processed_total
akko_policy_sync_subjects_affected_total
akko_policy_sync_publish_failures_total
akko_policy_sync_last_run_timestamp_seconds
An akko_policy_sync_publish_failures_total counter that increases on
two consecutive runs should page on-call. The most common root causes :
| Cause | Symptom | Fix |
|---|---|---|
| OM bot JWT expired | All runs fail with HTTP 401 from OM | Rotate the token + update the Secret |
| OM unreachable | connect-refused in pod logs | Check the OM Service + NetworkPolicy egress on 8585 |
| Policy engine configmap drift | HTTP 409 on PATCH | A second writer is editing the bundle ; check the Data Access editor logs |
| pii_tag_to_mask references a missing mask | Policy engine reload silently keeps the old bundle | Add the mask to the Policy engine mask_library overlay |
Manual masks vs auto masks¶
The daemon tags every entry it writes with source: "auto-pii".
On every sync cycle, only entries with that source are replaced.
Manual masks added through the cockpit Data Access wizard (Sprint
67 PR-C) carry no source field, so the daemon never modifies or
removes them. Stewards can mix the two freely : a manual
partial_last4 mask on customers.phone co-exists with an
auto-propagated sha256 mask on customers.email with no contention.
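The replace-only-auto rule can be sketched as a merge over mask entries. This is an illustration, not the daemon's code: the entry shape (`column`, `mask`, optional `source`) and the name `merge_masks` are hypothetical, and letting a manual mask win when both target the same column is an assumption.

```python
AUTO_SOURCE = "auto-pii"


def merge_masks(existing, auto_masks):
    """Replace auto-managed entries with a fresh set and leave manual entries untouched.

    existing:   list of {"column": ..., "mask": ..., optional "source": ...}
    auto_masks: {column: mask} computed by the current sync run
    """
    # Entries without the auto source are manual and are carried over as-is.
    manual = [entry for entry in existing if entry.get("source") != AUTO_SOURCE]
    manual_columns = {entry["column"] for entry in manual}
    fresh = [
        {"column": column, "mask": mask, "source": AUTO_SOURCE}
        for column, mask in sorted(auto_masks.items())
        if column not in manual_columns  # assumption: a manual mask wins on the same column
    ]
    return manual + fresh
```

Because stale auto entries are dropped rather than patched, removing a tag in Catalog removes the corresponding mask on the next run in this model.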
Audit trail¶
Every sync run produces one OCSF 3007 event with :
- class_uid = 3007 (Data Access Management)
- activity_id = 99 (OTHER)
- actor.user.name = system:policy-sync for periodic runs, the admin's preferred_username for manual triggers
- metadata.product.name = akko-policy-sync
- unmapped.tables_processed, unmapped.subjects_affected, unmapped.publish_failures
- resources = the full list of subjects whose policy was touched
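To make the field layout concrete, here is a sketch that assembles such an event as a JSON line. The field names come from the list above; the `time` field in epoch milliseconds, the `{"name": ...}` shape of each resource, and the function names are assumptions.

```python
import json
import time


def build_sync_event(summary, triggered_by=None, now=None):
    """Build one audit event for a sync run.

    summary:      {"tables_processed": int, "subjects": [str], "publish_failures": int}
    triggered_by: the admin's preferred_username for manual runs, None for periodic runs
    """
    ts_ms = int((now if now is not None else time.time()) * 1000)
    return {
        "class_uid": 3007,
        "activity_id": 99,  # OTHER
        "time": ts_ms,  # assumption: epoch milliseconds
        "actor": {"user": {"name": triggered_by or "system:policy-sync"}},
        "metadata": {"product": {"name": "akko-policy-sync"}},
        "unmapped": {
            "tables_processed": summary["tables_processed"],
            "subjects_affected": len(summary["subjects"]),
            "publish_failures": summary["publish_failures"],
        },
        # assumption: one resource object per subject whose policy was touched
        "resources": [{"name": subject} for subject in summary["subjects"]],
    }


def to_jsonline(event):
    """Serialize one event as a single newline-terminated JSON line."""
    return json.dumps(event, separators=(",", ":")) + "\n"
```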
Query in Logs layer :
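A minimal sketch of building such a query from Python, assuming the VictoriaLogs LogsQL HTTP endpoint `/select/logsql/query` on port 9428 and assuming the events are stored under flattened dotted field names; `audit_query_url`, the base URL, and the filter itself are illustrative.

```python
from urllib.parse import urlencode


def audit_query_url(base_url="http://localhost:9428", limit=20):
    """Return a LogsQL query URL selecting policy-sync audit events."""
    # The field name is quoted because it contains dots.
    logsql = 'class_uid:3007 AND "metadata.product.name":"akko-policy-sync"'
    params = urlencode({"query": logsql, "limit": limit})
    return f"{base_url}/select/logsql/query?{params}"
```

The resulting URL can be fetched with any HTTP client once the VictoriaLogs service is reachable, for example through a port-forward.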
Or open the cockpit Audit tab — entries appear in the same stream as human-driven edits.
Security¶
| Vector | Mitigation |
|---|---|
| Privilege escalation | RoleBinding scoped to a single ConfigMap by resourceNames, verbs limited to get / patch / update (no list / create / delete / *) |
| Tag-injection from a compromised OM | The propagator can only set masks present in mask_library ; a rogue tag mapped to no mask is a no-op |
| Network exfil | NetworkPolicy egress allow-list : OM 8585, VL 9428, kube API 443 + 6443 (kube-router post-DNAT) ; everything else denied |
| Audit bypass | Every run emits OCSF 3007 ; manual triggers carry the calling admin's sub |
| Bot JWT theft | Token sits in a namespace-scoped Secret ; rotate every 12 months |