Dashboards

AKKO embeds a first-party dashboarding layer so every metric the platform emits has a curated visualisation out of the box. The deployment lives in the akko-observability sub-chart and runs in the same akko namespace as the rest of the platform.

The layer is exposed at https://metrics.akko-ai.com (or the value of global.functionalAliases.metrics in your overrides) — gated by oauth2-proxy + Keycloak SSO.
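
For example, a values override that serves the layer from a different hostname might look like this (a minimal sketch; only the key path named above, and `dashboards.example.com` is a placeholder):

```yaml
# values override (sketch): change the hostname the layer is exposed at.
global:
  functionalAliases:
    metrics: dashboards.example.com
```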

The cockpit Overview iframe pulls the cluster-overview dashboard directly from this endpoint, so the chart bootstraps a default project, datasource, and dashboards on every helm upgrade.


What ships out of the box

The post-install Job loads every JSON file from helm/akko/charts/akko-observability/dashboards/*.json into the akko project. Twelve dashboards ship by default:

Name             | Story                              | What it shows
cluster-overview | Platform health (cockpit Overview) | Pods ready, ingress 2xx, restarts, node memory
aden-slo         | ADEN AI service                    | Query latency p50/p95, success rate, LLM tokens
storage-layer    | Object storage                     | Volume server stats
trino            | Federation flagship                | Running queries, failures/min, latency p50/p95, container CPU/mem
audit-trail      | DORA / NIS2 / GDPR                 | OPA decisions, Trino queries by user, login activity, ADEN questions
litellm          | AI gateway                         | Requests/sec, tokens/5min, errors/min, latency p95, models
pipelines        | Orchestration                      | Run status, task duration p95, scheduler heartbeat, top failing pipelines
platform-slo     | Top-level summary                  | Pods ready, ingress 2xx, 5xx by service, ingress p95 by service
trino-slo        | Federation SLO                     | Success rate, queued queries, p99 latency, memory headroom, queries by catalog
mlflow           | ML tracking                        | Runs/h, registered models, model versions, tracking HTTP rate, pod CPU/mem
jupyterhub       | IDE pods                           | Active users, running notebooks, spawn p95, per-pod memory, hub HTTP rate
trino-ai-plugin  | Plugin internals                   | AI HTTP requests/errors, cache hit rate, circuit breaker, latency by function

Every panel falls back to vector(0) when its underlying metric is absent — an idle deployment shows a clean baseline rather than "No data".
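
The fallback is standard PromQL: the `or` operator substitutes its right-hand side when the left-hand vector is empty. A hypothetical panel query following this convention looks like:

```promql
# When the metric is absent, rate() yields an empty vector,
# so `or vector(0)` supplies a constant zero series instead.
sum(rate(http_requests_total{namespace="akko"}[5m])) or vector(0)
```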


Add your own dashboard

Drop a JSON file alongside the existing ones. The bootstrap Job picks it up automatically on the next helm upgrade.

# helm/akko/charts/akko-observability/dashboards/13-my-team.json
{
  "kind": "Dashboard",
  "metadata": { "name": "my-team", "project": "akko" },
  "spec": {
    "display": { "name": "My team — KPIs" },
    "duration": "1h",
    "refreshInterval": "30s",
    "panels": {
      "rps": {
        "kind": "Panel",
        "spec": {
          "display": { "name": "Requests / sec" },
          "plugin": { "kind": "TimeSeriesChart", "spec": {} },
          "queries": [{
            "kind": "TimeSeriesQuery",
            "spec": { "plugin": { "kind": "PrometheusTimeSeriesQuery", "spec": {
              "datasource": { "kind": "PrometheusDatasource", "name": "prometheus" },
              "query": "sum(rate(http_requests_total{namespace=\"my-team\"}[5m]))"
            } } }
          }]
        }
      }
    },
    "layouts": [{ "kind": "Grid", "spec": { "items": [
      { "x": 0, "y": 0, "width": 24, "height": 8, "content": { "$ref": "#/spec/panels/rps" } }
    ] } }]
  }
}

The default prometheus datasource is already wired (HTTPProxy mode), so any new dashboard can simply reference {"name": "prometheus", "kind": "PrometheusDatasource"}.
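
Before running helm upgrade, it can be worth linting a new file locally so the bootstrap Job never sees malformed JSON. A minimal sketch using jq (the `check_dashboard` helper name is ours; the field checks mirror the example above):

```shell
# check_dashboard FILE -- hedged lint: verifies the fields the example
# above relies on (kind, project, at least one panel and one layout).
check_dashboard() {
  jq -e '
    .kind == "Dashboard"
    and .metadata.project == "akko"
    and (.spec.panels | length) > 0
    and (.spec.layouts | length) > 0
  ' "$1" >/dev/null 2>&1 && echo "ok: $1" || echo "invalid: $1"
}
```

Run it on the new file before upgrading, e.g. `check_dashboard helm/akko/charts/akko-observability/dashboards/13-my-team.json`.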


Smoke test

To verify the bootstrap succeeded without opening a browser:

kubectl -n akko exec deploy/akko-akko-cockpit -- \
  curl -s http://akko-dashboards:8080/api/v1/projects/akko/dashboards \
  | jq -r '.[].metadata.name' | sort

Expected output (one name per line, in any order):

aden-slo
audit-trail
cluster-overview
jupyterhub
litellm
mlflow
pipelines
platform-slo
storage-layer
trino
trino-ai-plugin
trino-slo

If the count is below 12, the bootstrap Job logs (kubectl -n akko logs job/akko-dashboards-bootstrap) will show which file failed to POST.
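
To pinpoint which names are absent rather than eyeballing the list, the smoke-test output can be diffed against the expected set. A small helper sketch (the `missing_dashboards` name is ours; the expected names are the twelve above):

```shell
# missing_dashboards -- reads dashboard names on stdin (one per line)
# and prints every expected name that is missing from the input.
missing_dashboards() {
  tmp=$(mktemp)
  sort > "$tmp"   # capture and sort the piped-in names
  printf '%s\n' aden-slo audit-trail cluster-overview jupyterhub litellm \
    mlflow pipelines platform-slo storage-layer trino trino-ai-plugin trino-slo \
    | sort | comm -23 - "$tmp"   # lines only in the expected set
  rm -f "$tmp"
}
```

Pipe the smoke-test command into it instead of `sort`; empty output means all twelve dashboards are present.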


Why the layer is swappable

The platform does not depend on a specific dashboarding engine. The only contract is:

  • The engine speaks PromQL natively.
  • It exposes a REST API to POST a dashboard JSON.
  • It supports a public iframe behind oauth2-proxy.

That contract is satisfied by several Apache 2.0 / MIT dashboarding projects, so the engine can be swapped without rewriting any dashboard JSON or moving any metric. The twelve dashboards above are the contract the platform commits to; the binary that renders them is an implementation detail.