akko-charter-runner — in-cluster charter validation¶
akko-charter-runner is a sovereign Kubernetes CronJob that runs the
AKKO charter test suite (campaigns C-10..C-15 + W2/W4/W5) from inside
the cluster, against in-cluster Service URLs (no public ingress, no
Bearer-JWT bypass). It closes the gap left by ADR-057 (public
ingress, no Bearer-JWT bypass): external test runners cannot validate
~25 charter campaigns end-to-end because the public oauth2-proxy
refuses session-less JWTs by design. The in-cluster runner is the only
security-acceptable way to produce a green-line that includes those
campaigns.
The chart ships disabled by default. Operators flip
`akko-charter-runner.enabled: true` in their values overlay after the
runner image is in their Harbor registry.

Phase 1 — runner only; Postgres sink + cockpit UI in Phase 2

The first PR ships the CronJob + minimal RBAC + NetworkPolicy.
Run results live in `kubectl logs job/...` and the JUnit XML the
runner writes to `/tmp/charter-out`. Phase 2 adds a Postgres sink
and a cockpit "last charter run" widget so the green-line is
visible without `kubectl`.
When to enable it¶
Enable it when all three are true:

- The platform is past initial install and stable enough that pytest surfaces real regressions, not bootstrap noise.
- Your Harbor registry has the `akko-charter-runner:2026.05` image (built by `helm/scripts/build-images.sh`).
- You want a daily attestation that the charter campaigns are green end-to-end, including the campaigns that external runners can't reach (C-11 / C-15 / W2 / W4 / W5).

Otherwise leave it `enabled: false`. The cockpit and ADEN don't read
its output; nothing else in the platform depends on it being on.
Architecture¶
```mermaid
flowchart LR
    Cron[CronJob<br/>04:00 UTC daily]
    Job[akko-charter-runner Job pod]
    CB[akko-cockpit-backend:8080]
    T[akko-trino:8080]
    KC[akko-keycloak:8080]
    OPA[akko-opa:8181]
    PS[akko-policy-sync:8080]
    Cron --> Job
    Job -->|in-cluster Service URL| CB
    Job -->|in-cluster Service URL| T
    Job -->|in-cluster Service URL| KC
    Job -->|in-cluster Service URL| OPA
    Job -->|in-cluster Service URL| PS
    Job -->|stdout + JUnit XML| Logs[kubectl logs job/...]
```
The runner never reaches the public ingress (`trino.<domain>`,
`federation.<domain>`, `demo.<domain>`). It uses the Service DNS
names that resolve only inside the cluster — `akko-akko-trino:8080`,
etc. — so a leaked runner ServiceAccount token is useless from
outside the cluster.
Configuration¶
The umbrella `values.yaml` exposes `akko-charter-runner`. Minimal viable
overlay:

```yaml
akko-charter-runner:
  enabled: true
  schedule: "0 4 * * *"   # 04:00 UTC daily, set "" to disable
  suites: "charter"       # or "all" to widen
  inClusterTargets:
    cockpitBackend: http://akko-akko-cockpit-backend:8080
    trino: http://akko-akko-trino:8080
    keycloak: http://akko-akko-keycloak:8080
    opa: http://akko-akko-opa:8181
    policySync: http://akko-akko-policy-sync:8080
  personaSecret:
    name: akko-test-passwords
    optional: true
```
Replace `akko-akko-` with your release name when not running the
default `helm install akko ...`.
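The `akko-akko-` prefix is just `<release>-akko-<service>`. As a minimal sketch of how the in-cluster targets are formed (the default release name `akko`, namespace `akko`, and the `<release>-akko-<service>` naming are assumptions taken from the overlay above):

```python
def target_url(service: str, port: int,
               release: str = "akko", namespace: str = "akko") -> str:
    # Fully qualified Service DNS name: resolves from any pod in the
    # cluster, never from outside it — which is the point of the runner.
    host = f"{release}-akko-{service}.{namespace}.svc.cluster.local"
    return f"http://{host}:{port}"

print(target_url("trino", 8080))
# http://akko-akko-trino.akko.svc.cluster.local:8080
```

With `helm install platform ...` the same helper would yield a `platform-akko-trino...` host, matching the prefix-replacement note above.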
Trigger a manual run¶
```bash
# One-shot Job from the CronJob template:
kubectl -n akko create job --from=cronjob/akko-akko-charter-runner \
  manual-$(date +%Y%m%d-%H%M%S)

# Then watch the logs:
kubectl -n akko logs -f job/manual-...

# The summary line at the end gives the counts:
# [charter-runner] SUMMARY tests=42 pass=37 fail=0 err=0 skip=5
```
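If you gate a pipeline on a manual run, the SUMMARY line is easy to parse. A sketch, assuming the field order shown above and no extra fields:

```python
import re

# Matches the runner's final log line, e.g.:
# [charter-runner] SUMMARY tests=42 pass=37 fail=0 err=0 skip=5
SUMMARY_RE = re.compile(
    r"\[charter-runner\] SUMMARY "
    r"tests=(\d+) pass=(\d+) fail=(\d+) err=(\d+) skip=(\d+)"
)

def parse_summary(line: str) -> dict:
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError("no SUMMARY line found")
    keys = ("tests", "pass", "fail", "err", "skip")
    return dict(zip(keys, map(int, m.groups())))

counts = parse_summary("[charter-runner] SUMMARY tests=42 pass=37 fail=0 err=0 skip=5")
# Green-line condition: nothing failed, nothing errored.
assert counts["fail"] == 0 and counts["err"] == 0
```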
Security posture¶
| Vector | Mitigation |
|---|---|
| Privilege escalation | Namespace-scoped Role only, no cluster-scoped resources, no wildcard verbs, no pods/exec. |
| Arbitrary Secret read | resourceNames: lock — only the persona Secret named in personaSecret.name is readable. |
| Egress to internet | NetworkPolicy egress allow-list — DNS, in-cluster AKKO services, kube-apiserver only. No 0.0.0.0/0. |
| Public ingress bypass | The runner never uses public Service URLs; it hits in-cluster Service DNS by design. ADR-057 contract intact. |
| Runtime tampering | runAsNonRoot, allowPrivilegeEscalation: false, capabilities.drop: [ALL]. Image signed via cosign (Kyverno verify policy). |
| Concurrency abuse | concurrencyPolicy: Forbid — a long run can't be raced by the next CronJob tick. |
The trust model is the same as a long-lived service in the namespace, not an admin agent. A compromised runner can read the AKKO Service ClusterIPs and the persona Secret; it cannot escalate, cannot exec into other pods, and cannot reach the internet.
Operating¶
Where to find run output¶
- Live tail: `kubectl -n akko logs -f job/<name>`
- JUnit XML (for CI ingestion): the runner writes `/tmp/charter-out/charter-<host>-<ts>.xml` inside the pod. Use `kubectl cp` to extract it during the brief window before the Job completes (Phase 2 publishes this to Postgres automatically).
- Summary line: the last log line is `[charter-runner] SUMMARY tests=N pass=N fail=N err=N skip=N`.
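Once extracted, the JUnit XML can be tallied without a CI server. A sketch using the common JUnit attribute names (`tests`/`failures`/`errors`/`skipped`); whether the runner emits a single `<testsuite>` or a `<testsuites>` wrapper is an assumption, so both are handled:

```python
import xml.etree.ElementTree as ET

def junit_counts(xml_text: str) -> dict:
    root = ET.fromstring(xml_text)
    # Single-suite files have <testsuite> as the root; multi-suite
    # files wrap several <testsuite> elements in <testsuites>.
    suites = [root] if root.tag == "testsuite" else root.iter("testsuite")
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    for suite in suites:
        for key in totals:
            totals[key] += int(suite.get(key, 0))
    return totals

sample = '<testsuite tests="42" failures="0" errors="0" skipped="5"/>'
print(junit_counts(sample))
# {'tests': 42, 'failures': 0, 'errors': 0, 'skipped': 5}
```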
Common failure modes¶
| Symptom | Likely cause | Fix |
|---|---|---|
| `kubectl logs` shows connect-refused for `akko-akko-trino:8080` | Query engine (Trino) is not yet enabled in this deployment | Flip `trino.enabled: true`, or set `inClusterTargets.trino` to the actual Service DNS |
| Most tests SKIP with `AKKO_TEST_PASSWORD_ALICE missing` | Persona Secret not mounted | Apply the demo Secret from `akko-init`, or set `personaSecret.optional: false` to fail loudly |
| All tests FAIL with connect-refused for the kube-API | Sprint 73 NetworkPolicy regression | Check `kubectl get networkpolicy -n akko -l app.kubernetes.io/name=akko-charter-runner` |
| Runner pod itself stuck `Pending` | `resources` requests too high for the cluster | Shrink `resources.requests`; defaults are 200m / 512Mi |