Encryption at rest — runbook

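AKKO supports encryption at rest at three layers. Layer 1 (PostgreSQL columns) is enabled by default. Layer 2 (SeaweedFS volumes) is opt-in via a single chart value and off by default to keep the demo box simple. Layer 3 (Kubernetes Secrets) is configured by the cluster operator, outside the chart. A fourth layer, native Iceberg encryption, is a Sprint 53 candidate.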

Layer 1 — PostgreSQL column-level (Sprint 46 A2)

pgcrypto is shipped and initialised on the postgresql-data instance. The banking demo PII columns (customers.tax_id, customers.dob, customers.email) are stored encrypted with a per-row symmetric key derived from the AKKO secret.

Already enabled by default — no operator action needed for fresh installs. Existing populated columns are migrated on first read.

Verify:

kubectl -n akko exec deploy/akko-postgresql-data -- \
  psql -U akko -d analytics -c \
  "SELECT data_type FROM information_schema.columns
   WHERE table_name='customers' AND column_name='tax_id';"
# Expected: bytea (encrypted) — not text
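
The check above covers tax_id only. A minimal sketch that repeats it for all three PII columns and returns a non-zero exit code when any of them is not bytea. The function names are illustrative, not part of the chart, and the table and column arguments are assumed to be trusted input:

```shell
# Print the declared type of one column (psql -At gives bare, unaligned output).
column_type() {
  kubectl -n akko exec deploy/akko-postgresql-data -- \
    psql -U akko -d analytics -At -c \
    "SELECT data_type FROM information_schema.columns
     WHERE table_name='$1' AND column_name='$2';"
}

# Succeed only when the reported type is exactly bytea.
is_encrypted_type() {
  [ "$(printf '%s' "$1" | tr -d '[:space:]')" = "bytea" ]
}

# Report every PII column that is not stored as bytea; non-zero if any is found.
check_pii_columns() {
  rc=0
  for col in tax_id dob email; do
    if ! is_encrypted_type "$(column_type customers "$col")"; then
      echo "customers.$col is not bytea" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

Running check_pii_columns after sourcing the snippet prints one line per unencrypted column and exits 0 only when all three are bytea.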

Layer 2 — SeaweedFS volume-level (Sprint 52 P1)

When akko-storage.seaweedfs.encryptionAtRest=true, the volume server is launched with -volume.encrypted=true. Each needle (file blob) gets a random AES-GCM per-file key encrypted with the cluster master key.

Activation on a fresh install

helm install akko helm/akko -n akko --create-namespace \
  -f helm/examples/values-domain.yaml \
  -f helm/examples/values-dev-secrets.yaml \
  --set akko-storage.seaweedfs.encryptionAtRest=true

Done. Every Spark/Trino/MLflow/RAG read+write is server-side encrypted from the first byte.
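
The same flag can live in a values file instead of a --set argument, which keeps it under version control. A minimal sketch; the file name values-encryption.yaml is a hypothetical choice, and the key path simply mirrors the --set argument above:

```shell
# values-encryption.yaml is a hypothetical file name; the nesting mirrors
# --set akko-storage.seaweedfs.encryptionAtRest=true.
cat > values-encryption.yaml <<'EOF'
akko-storage:
  seaweedfs:
    encryptionAtRest: true
EOF

# Then pass it as one more -f layer:
#   helm install akko helm/akko -n akko --create-namespace \
#     -f helm/examples/values-domain.yaml \
#     -f helm/examples/values-dev-secrets.yaml \
#     -f values-encryption.yaml
```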

Activation on an existing populated cluster (migration)

The flag does NOT retroactively encrypt files that were written to disk while encryption was off. Migrate as follows:

  1. Stop writers: kubectl scale deploy -l app.kubernetes.io/component=writer --replicas=0 -n akko. This pauses Spark, Airflow DAGs, and MLflow logging.
  2. Provision a second SeaweedFS PVC in the chart with encryptionAtRest=true and a different serviceName (temporary, e.g. akko-seaweedfs-s3-encrypted).
  3. Sync data with the rclone Job we ship at tests/integration/seaweedfs-encrypted-migrate.yaml (TODO Sprint 53).
  4. Update the akko-s3 Secret to point at the new endpoint.
  5. Restart consumers so they pick up the new endpoint.
  6. Decommission the old volume PVC after a 7-day quarantine.
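
Steps 1, 4 and 5 above can be sketched as a dry-run script. This is an illustration under stated assumptions, not a shipped tool: the Secret key name endpoint and the component=consumer label are hypothetical and must be checked against the deployed chart. Steps 2, 3 and 6 are left out because they depend on chart changes and on the sync Job that is still a TODO.

```shell
# Dry-run by default: commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"
NS="akko"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Step 1: scale every writer Deployment to zero.
stop_writers() {
  run kubectl -n "$NS" scale deploy \
    -l app.kubernetes.io/component=writer --replicas=0
}

# Step 4: repoint the akko-s3 Secret at the new endpoint ($1).
# Assumption: the Secret stores the URL under a key named "endpoint".
point_secret_at() {
  run kubectl -n "$NS" patch secret akko-s3 --type merge \
    -p "{\"stringData\":{\"endpoint\":\"$1\"}}"
}

# Step 5: restart consumers so they re-read the Secret.
# Assumption: consumers carry a component=consumer label (hypothetical).
restart_consumers() {
  run kubectl -n "$NS" rollout restart deploy \
    -l app.kubernetes.io/component=consumer
}
```

With DRY_RUN left at 1, calling stop_writers only prints the kubectl command it would run. Set DRY_RUN=0 once the printed commands match the cluster's actual labels and Secret layout.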

Verify encryption is active

# Check the weed server CLI flags
kubectl -n akko exec statefulset/akko-seaweedfs -- \
  cat /proc/1/cmdline | tr '\0' '\n' | grep volume.encrypted
# Expected: -volume.encrypted=true

# Check raw bytes on disk are not human-readable
kubectl -n akko exec statefulset/akko-seaweedfs -- \
  hexdump -C /data/topology.toml | head -3
# Encrypted needles show as random bytes, not protobuf signatures
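
For CI or a post-install hook, the flag check above can be turned into an exit code instead of a visual grep. A minimal sketch with illustrative function names; it assumes the flag appears as a single argument spelled exactly -volume.encrypted=true:

```shell
# Read a NUL-separated /proc/<pid>/cmdline on stdin and succeed only if
# the exact argument -volume.encrypted=true is present.
has_encrypted_flag() {
  tr '\0' '\n' | grep -qx -- '-volume.encrypted=true'
}

# Live check against the StatefulSet used in the commands above.
seaweedfs_encrypted() {
  kubectl -n akko exec statefulset/akko-seaweedfs -- cat /proc/1/cmdline \
    | has_encrypted_flag
}
```

seaweedfs_encrypted returns 0 when the flag is set and non-zero otherwise, so it can gate a pipeline step directly.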

Layer 3 — Kubernetes Secret encryption (operator side)

Out of AKKO's chart scope: this layer is managed at the cluster level via the EncryptionConfiguration passed to the kube-apiserver. Most managed k8s services (EKS / GKE / AKS) enable it by default; k3s and k3d do not. See https://kubernetes.io/docs/tasks/administer-cluster/encrypt-data/.

For self-managed clusters, the operator should configure --encryption-provider-config with at least the aescbc provider so that the etcd backend stores AKKO Secrets encrypted.
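
A minimal sketch of generating such a configuration, following the upstream Kubernetes format. The output file name and the key name key1 are illustrative, and where the file must live on the control-plane host depends on how the cluster is installed:

```shell
# Generate a random 32-byte key, base64-encoded, as the aescbc provider expects.
KEY="$(head -c 32 /dev/urandom | base64)"

# The file holds key material: keep it readable by the owner only.
umask 077

# encryption-config.yaml and the key name "key1" are illustrative choices.
cat > encryption-config.yaml <<EOF
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - aescbc:
          keys:
            - name: key1
              secret: ${KEY}
      # identity stays last so Secrets written before the change remain readable.
      - identity: {}
EOF
```

After the kube-apiserver is restarted with --encryption-provider-config pointing at this file, existing Secrets are only encrypted once they are rewritten, for example with kubectl get secrets -n akko -o json | kubectl replace -f -.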

Layer 4 — Iceberg metadata + Parquet at rest (Sprint 53 candidate)

Currently the SeaweedFS volume-level encryption (Layer 2) covers Iceberg files transparently: every Parquet block lives inside a SeaweedFS needle that is encrypted on disk once Layer 2 is enabled. Native Iceberg column-level encryption (Apache Spec v2) is a separate, more complex topic deferred to Sprint 53+, because it requires key management infrastructure, catalog integration, and a breaking change for existing tables.

Compliance mapping

| Control | AKKO surface |
|---|---|
| SOC2 CC6.1 (encryption at rest) | Layer 2 (SeaweedFS) + Layer 1 (Postgres pgcrypto) |
| ISO27001 A.10.1.1 | Layers 1 + 2 + 3 (operator) |
| PCI-DSS Req. 3.4 (PAN encryption) | Layer 1 pgcrypto on banking columns |
| GDPR Art. 32 1(a) | Layers 1 + 2 |
| HIPAA §164.312(a)(2)(iv) | Layers 1 + 2 + 3 |

Pair with docs/admin/mtls.md (data in transit) for full SOC2 CC6.1 coverage.

Reference

  • helm/akko/charts/akko-storage/values.yaml: encryptionAtRest flag
  • helm/akko/charts/akko-postgres/templates/init-pgcrypto.yaml — Sprint 46 A2 init
  • ADR-037 — mTLS (sister-document for data in transit)
  • ADR-032 — image signing + SBOM (supply chain)