
akko-charter-runner — in-cluster charter validation

akko-charter-runner is a sovereign Kubernetes CronJob that runs the AKKO charter test suite (campaigns C-10..C-15 plus W2/W4/W5) from inside the cluster, against in-cluster Service URLs, with no public ingress and no Bearer-JWT bypass. It closes the gap left by ADR-057 (public ingress without Bearer-JWT bypass): external test runners cannot validate ~25 charter campaigns end-to-end, because the public oauth2-proxy refuses session-less JWTs by design. The in-cluster runner is the only security-acceptable way to produce a green-line that includes those campaigns.

The chart ships disabled by default. Operators flip akko-charter-runner.enabled: true in their values overlay after the runner image is in their Harbor registry.

Phase 1 — runner only; Postgres sink + cockpit UI in Phase 2

The first PR ships the CronJob + minimal RBAC + NetworkPolicy. Run results live in kubectl logs job/... and in the JUnit XML the runner writes to /tmp/charter-out. Phase 2 adds a Postgres sink and a cockpit "last charter run" widget so the green-line is visible without kubectl.

When to enable it

Enable it when all three are true:

  1. The platform is past initial install and stable enough that pytest surfaces real regressions, not bootstrap noise.
  2. Your Harbor registry has the akko-charter-runner:2026.05 image (built by helm/scripts/build-images.sh).
  3. You want a daily attestation that the charter campaigns are green end-to-end, including the campaigns that external runners can't reach (C-11 / C-15 / W2 / W4 / W5).

Otherwise leave it at enabled: false. The cockpit and ADEN don't read its output; nothing else in the platform depends on it being on.

Architecture

```mermaid
flowchart LR
  Cron[CronJob<br/>04:00 UTC daily]
  Job[akko-charter-runner Job pod]
  CB[akko-cockpit-backend:8080]
  T[akko-trino:8080]
  KC[akko-keycloak:8080]
  OPA[akko-opa:8181]
  PS[akko-policy-sync:8080]
  Cron --> Job
  Job -->|in-cluster Service URL| CB
  Job -->|in-cluster Service URL| T
  Job -->|in-cluster Service URL| KC
  Job -->|in-cluster Service URL| OPA
  Job -->|in-cluster Service URL| PS
  Job -->|stdout + JUnit XML| Logs[kubectl logs job/...]
```

The runner never reaches the public ingress (trino.<domain>, federation.<domain>, demo.<domain>). It uses the Service DNS names that resolve only inside the cluster — akko-akko-trino:8080, etc. — so a leaked runner ServiceAccount token is useless from outside the cluster.
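The Service naming the runner relies on follows the usual Helm pattern of release name plus chart prefix. As a sketch only, this hypothetical helper (not part of the chart) prints the fully-qualified cluster-local base URLs for a given release name and namespace, assuming the default `akko-` chart prefix and the ports from the diagram above:

```shell
# in_cluster_urls: print the cluster-local base URLs the runner targets,
# derived from the Helm release name and namespace. Hypothetical helper;
# the service names and ports mirror the architecture diagram above.
in_cluster_urls() {
  release="${1:-akko}"; ns="${2:-akko}"
  for entry in cockpit-backend:8080 trino:8080 keycloak:8080 opa:8181 policy-sync:8080; do
    svc="${entry%%:*}"; port="${entry##*:}"
    echo "http://${release}-akko-${svc}.${ns}.svc.cluster.local:${port}"
  done
}

in_cluster_urls            # default release "akko" in namespace "akko"
```

These FQDNs resolve only via the cluster's DNS, which is exactly why a leaked runner token is useless from outside.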

Configuration

The umbrella values.yaml exposes akko-charter-runner. Minimal viable overlay:

```yaml
akko-charter-runner:
  enabled: true
  schedule: "0 4 * * *"            # 04:00 UTC daily, set "" to disable
  suites: "charter"                # or "all" to widen
  inClusterTargets:
    cockpitBackend: http://akko-akko-cockpit-backend:8080
    trino:          http://akko-akko-trino:8080
    keycloak:       http://akko-akko-keycloak:8080
    opa:            http://akko-akko-opa:8181
    policySync:     http://akko-akko-policy-sync:8080
  personaSecret:
    name: akko-test-passwords
    optional: true
```

The akko-akko- prefix is <release-name>-akko-; adjust it when you are not running the default helm install akko ....
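For example, a hypothetical overlay for a release installed as helm install prod ... would point the targets at the prod-prefixed Services (release name "prod" is illustrative):

```yaml
# Illustrative overlay for a non-default release named "prod".
akko-charter-runner:
  enabled: true
  inClusterTargets:
    cockpitBackend: http://prod-akko-cockpit-backend:8080
    trino:          http://prod-akko-trino:8080
    keycloak:       http://prod-akko-keycloak:8080
    opa:            http://prod-akko-opa:8181
    policySync:     http://prod-akko-policy-sync:8080
```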

Trigger a manual run

```shell
# One-shot Job from the CronJob template:
kubectl -n akko create job --from=cronjob/akko-akko-charter-runner \
        manual-$(date +%Y%m%d-%H%M%S)

# Then watch the logs:
kubectl -n akko logs -f job/manual-...

# The summary line at the end gives the counts:
#   [charter-runner] SUMMARY tests=42 pass=37 fail=0 err=0 skip=5
```
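The summary line is easy to gate on in scripts. A minimal sketch, assuming the SUMMARY format shown above (the parse_summary helper is hypothetical, not shipped with the chart), that succeeds only when no test failed or errored:

```shell
# parse_summary: read the runner's final SUMMARY log line and succeed only
# when both fail= and err= are zero. Hypothetical helper for CI gating.
parse_summary() {
  line="$1"
  fail=$(printf '%s\n' "$line" | sed -n 's/.*fail=\([0-9]*\).*/\1/p')
  err=$(printf '%s\n' "$line" | sed -n 's/.*err=\([0-9]*\).*/\1/p')
  [ "${fail:-0}" -eq 0 ] && [ "${err:-0}" -eq 0 ]
}

# Example: gate a CI step on the last log line of a manual run.
parse_summary "[charter-runner] SUMMARY tests=42 pass=37 fail=0 err=0 skip=5" \
  && echo "charter green" || echo "charter red"
```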

Security posture

| Vector | Mitigation |
| --- | --- |
| Privilege escalation | Namespace-scoped Role only; no cluster-scoped resources, no wildcard verbs, no pods/exec. |
| Arbitrary Secret read | resourceNames lock: only the persona Secret named in personaSecret.name is readable. |
| Egress to internet | NetworkPolicy egress allow-list: DNS, in-cluster AKKO services, kube-apiserver only. No 0.0.0.0/0. |
| Public ingress bypass | The runner never uses public Service URLs; it hits in-cluster Service DNS by design. ADR-057 contract intact. |
| Runtime tampering | runAsNonRoot, allowPrivilegeEscalation: false, capabilities.drop: [ALL]. Image signed via cosign (Kyverno verify policy). |
| Concurrency abuse | concurrencyPolicy: Forbid; a long run can't be raced by the next CronJob tick. |
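The Secret-read lock can be sketched as a namespace-scoped Role with resourceNames pinned to the persona Secret (names and extra rules are illustrative; the chart's generated template may differ):

```yaml
# Illustrative sketch only -- the chart's actual Role template may differ.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: akko-charter-runner
  namespace: akko
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["akko-test-passwords"]   # only the persona Secret
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["get", "list"]                   # discover in-cluster targets
```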

The trust model is that of a long-lived service in the namespace, not an admin agent. A compromised runner can read the AKKO Service ClusterIPs and the persona Secret; it cannot escalate privileges, cannot exec into other pods, and cannot reach the internet.
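The egress allow-list from the table can be sketched roughly as follows (illustrative only: selectors and the kube-apiserver endpoint are cluster-specific, and the chart's shipped NetworkPolicy may differ):

```yaml
# Illustrative egress policy -- the kube-apiserver CIDR below is a
# placeholder; the chart's shipped NetworkPolicy may differ.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: akko-charter-runner
  namespace: akko
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: akko-charter-runner
  policyTypes: ["Egress"]
  egress:
    - to:                                  # in-cluster AKKO services
        - podSelector: {}
    - ports:                               # DNS
        - { protocol: UDP, port: 53 }
        - { protocol: TCP, port: 53 }
    - to:                                  # kube-apiserver (cluster-specific)
        - ipBlock: { cidr: 10.0.0.1/32 }
      ports:
        - { protocol: TCP, port: 443 }
```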

Operating

Where to find run output

  • Live tail: kubectl -n akko logs -f job/<name>
  • JUnit XML (for CI ingestion): the runner writes /tmp/charter-out/charter-<host>-<ts>.xml inside the pod. Use kubectl cp to extract it during the brief window before the Job completes (Phase 2 publishes this to Postgres automatically).
  • Summary line: the last log line is [charter-runner] SUMMARY tests=N pass=N fail=N err=N skip=N.
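For the kubectl cp step, a hypothetical helper (not part of the chart; namespace and destination defaults are assumptions) that builds the extraction command for a given pod:

```shell
# junit_cp_cmd: emit the kubectl cp invocation that copies the runner's
# JUnit XML directory out of a still-running Job pod. Hypothetical helper;
# /tmp/charter-out is the path the runner writes to.
junit_cp_cmd() {
  pod="$1"; ns="${2:-akko}"; dest="${3:-./charter-out}"
  echo "kubectl -n ${ns} cp ${pod}:/tmp/charter-out ${dest}"
}

# Print the command for an example pod name; pipe to sh to execute it.
junit_cp_cmd manual-20260501-040000
```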

Common failure modes

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| kubectl logs shows connect-refused for akko-akko-trino:8080 | Query engine (Trino) is not yet enabled in this deployment | Flip trino.enabled: true, or set inClusterTargets.trino to the actual Service DNS |
| Most tests SKIP with AKKO_TEST_PASSWORD_ALICE missing | Persona Secret not mounted | Apply the demo Secret from akko-init, or set personaSecret.optional: false to fail loudly |
| All tests FAIL with connect-refused for the kube-API | Sprint 73 NP regression | Check kubectl get networkpolicy -n akko -l app.kubernetes.io/name=akko-charter-runner |
| Runner pod itself stuck Pending | Resource requests too high for the cluster | Shrink resources.requests; defaults are 200m / 512Mi |