AKKO Control Plane¶
AKKO integrates ~42 components (storage, catalog, compute, orchestration, BI, AI, RBAC). Exposing each one's native UI made the platform read like "a stack of open-source bricks" rather than "one product". The Control Plane fixes that: a thin FastAPI service that owns the product abstractions and proxies downstream to the right component.
Principle: the cockpit and external clients only talk to /api/v1/*. Each native component keeps its URL for direct operator use, but day-to-day product flows never need it.
What it exposes¶
| Abstraction | Endpoint | What's behind |
|---|---|---|
| Dataset | GET /api/v1/datasets | OpenMetadata tables + OPA allow/deny + Trino FQN |
| Workspace | GET /api/v1/workspaces, POST /api/v1/workspaces | JupyterHub single-user server (spawn / stop / describe) |
| Pipeline | GET /api/v1/pipelines | Airflow DAGs + DagRun history (trigger coming Phase 1.5) |
| Agent | POST /api/v1/agents/ask | Forwards to ADEN with caller identity |
OpenAPI spec: /api/openapi.json · Swagger UI: /api/docs.
Architecture¶
flowchart LR
Client[Cockpit React / CLI / SDK] --> Traefik[Traefik + oauth2-proxy]
Traefik -->|X-Auth-Request-* headers| CP[akko-control-plane]
CP --> OM[OpenMetadata]
CP --> JH[JupyterHub]
CP --> AF[Airflow 3]
CP --> AD[ADEN]
CP --> OPA[OPA]
CP --> TR[Trino]
The Control Plane is stateless: no database, no background jobs. Identity comes from the X-Auth-Request-Email and X-Auth-Request-Groups headers injected by oauth2-proxy forward-auth. A NetworkPolicy limits ingress to cockpit + Traefik pods, so the headers cannot be forged by other workloads in the namespace.
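As an illustration of the identity step, here is a minimal sketch of turning the two forwarded headers into a caller object. The `Caller` and `caller_from_headers` names are hypothetical, and the sketch assumes the groups header carries a comma-separated list; it is not the service's actual code.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Caller:
    """Hypothetical value object for the authenticated principal."""
    email: str
    groups: tuple[str, ...]


class MissingIdentity(Exception):
    """Raised when the forward-auth headers are absent."""


def caller_from_headers(headers: Mapping[str, str]) -> Caller:
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {k.lower(): v for k, v in headers.items()}
    email = lowered.get("x-auth-request-email", "").strip()
    if not email:
        # Without an email there is no principal; a real handler would map
        # this to an HTTP 401 rather than guess an identity.
        raise MissingIdentity("X-Auth-Request-Email header is missing")
    # Assumption: groups arrive as a comma-separated list.
    raw_groups = lowered.get("x-auth-request-groups", "")
    groups = tuple(g.strip() for g in raw_groups.split(",") if g.strip())
    return Caller(email=email, groups=groups)
```

Because the service keeps no state, a function like this would run on every request; the headers are only trustworthy because the NetworkPolicy restricts who can reach the pod.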
What it is NOT¶
- Not a god service. No business logic duplicated from downstream components. If a call can be a 1-line proxy, it is. The role of the Control Plane is translation, not re-implementation.
- Not on the hot path. Trino query execution, JupyterHub WebSocket kernel traffic, and Airflow task logs all stream directly from their native ingress. The Control Plane is for product intent (list / create / describe), not sub-second data flow.
- Not the auth layer. oauth2-proxy + Keycloak do the JWT validation; the Control Plane only reads the trusted headers that are set once validation has succeeded.
Role of each abstraction¶
Dataset¶
A Dataset is the product-level view of a governed data asset. Each Dataset record composes:
- the OpenMetadata entity (name, owner, description, tags, domain)
- the Trino FQN the caller can paste into any notebook or BI client (catalog.schema.table)
- the OPA allow/deny flag for the caller (arriving in Phase 1.2)
- the PII / classification tags from OpenMetadata
The cockpit never shows iceberg.analytics.transactions as a raw path — it shows "Transactions dataset, owner @carol, PII tagged, you can read".
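The composition described above can be sketched as a small record type. Everything here is illustrative: the `Dataset` fields, the `compose_dataset` and `cockpit_label` helpers, and the flat `om_entity` dict are assumptions, not the real OpenMetadata payload or the Control Plane's schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Dataset:
    """Hypothetical product-level projection of a governed table."""
    name: str
    owner: str
    description: str
    domain: str
    tags: tuple[str, ...]
    trino_fqn: str
    can_read: Optional[bool]  # None while the OPA decision is not available


def compose_dataset(om_entity: dict, catalog: str, schema: str,
                    can_read: Optional[bool] = None) -> Dataset:
    # om_entity is a simplified stand-in for an OpenMetadata table entity.
    return Dataset(
        name=om_entity["name"],
        owner=om_entity.get("owner", "unknown"),
        description=om_entity.get("description", ""),
        domain=om_entity.get("domain", ""),
        tags=tuple(om_entity.get("tags", ())),
        trino_fqn=f"{catalog}.{schema}.{om_entity['name']}",
        can_read=can_read,
    )


def cockpit_label(ds: Dataset) -> str:
    # Build the human-readable line the cockpit shows instead of a raw path.
    parts = [f"{ds.name.capitalize()} dataset", f"owner @{ds.owner}"]
    if any(tag.upper().startswith("PII") for tag in ds.tags):
        parts.append("PII tagged")
    if ds.can_read is True:
        parts.append("you can read")
    elif ds.can_read is False:
        parts.append("no read access")
    return ", ".join(parts)
```

Keeping `can_read` optional lets the same record be served before and after the OPA enrichment lands, without changing its shape.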
Workspace¶
A Workspace is the user's isolated working area. Behind the scenes it composes:
- a JupyterHub single-user server (the notebook compute)
- a Trino session scoped to the caller's identity (so OPA row-filter + column-mask apply)
- a MinIO prefix under akko-users/<user>/ for artefacts
- metadata (name, description, created_at)
POST /api/v1/workspaces returns 501 until the JupyterHub admin-token wiring ships (Phase 1.4b). Read-only endpoints already work.
Pipeline¶
A Pipeline projects an Airflow DAG as a business pipeline:
- the DAG run status (healthy / stale / failing)
- the OpenLineage graph (upstream / downstream Datasets)
- the last N DagRun states — at a glance, not one click into Airflow
Users never learn the word "DAG". Trigger / pause / resume land in Phase 1.5b.
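The page does not define how healthy / stale / failing is derived, so the following is only one possible rule, sketched under stated assumptions: the newest DagRun state decides "failing", and the age of the last finished run decides "stale". The `pipeline_health` function and its one-day threshold are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional, Sequence


def pipeline_health(
    recent_states: Sequence[str],          # DagRun states, newest first
    last_run_end: Optional[datetime],      # end time of the newest finished run
    now: datetime,
    stale_after: timedelta = timedelta(days=1),  # assumed threshold
) -> str:
    # A failed latest run takes priority over everything else.
    if recent_states and recent_states[0] == "failed":
        return "failing"
    # No finished run at all, or nothing recent enough: the data may be old.
    if last_run_end is None or now - last_run_end > stale_after:
        return "stale"
    return "healthy"
```

A rule like this keeps the projection cheap: it needs only the last few DagRun records, not task-level detail.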
Agent¶
Today, a single Agent: ADEN. POST /api/v1/agents/ask forwards a natural-language question to ADEN with the caller headers injected so ADEN's per-user OPA decisions see the right principal. Multiple Agents (SQL duel validator, auto-doc, auto-remediation) plug in here in Phase 3 without breaking the cockpit contract.
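A minimal sketch of the forwarding step follows. It only builds the outgoing request; the `/ask` path on ADEN, the `{"question": ...}` payload shape, and the `build_aden_request` name are assumptions made for illustration.

```python
import json
from typing import Mapping

# Only the identity headers are passed on; cookies and other inbound
# headers are deliberately not forwarded.
FORWARDED_HEADERS = ("X-Auth-Request-Email", "X-Auth-Request-Groups")


def build_aden_request(
    aden_url: str, question: str, incoming_headers: Mapping[str, str]
) -> tuple[str, dict, bytes]:
    """Return (url, headers, body) for the downstream call to ADEN."""
    lowered = {k.lower(): v for k, v in incoming_headers.items()}
    headers = {"Content-Type": "application/json"}
    for name in FORWARDED_HEADERS:
        value = lowered.get(name.lower())
        if value is not None:
            headers[name] = value
    body = json.dumps({"question": question}).encode("utf-8")
    # Hypothetical downstream path; the real ADEN route may differ.
    url = f"{aden_url.rstrip('/')}/ask"
    return url, headers, body
```

Separating request construction from the HTTP call keeps the identity-forwarding rule easy to unit test without a running ADEN instance.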
Deployment¶
The Control Plane is part of the umbrella chart when enabled and ships as its own sub-chart:
# helm/akko/charts/akko-control-plane/values.yaml
image:
  repository: akko-control-plane
  tag: "2026.04"
env:
  openmetadataUrl: "http://openmetadata:8585"
  trinoUrl: "http://akko-trino:8080"
  airflowUrl: "http://akko-api-server:8080"
  jupyterhubUrl: "http://proxy-public"
  jupyterhubHubUrl: "http://hub:8081"
  adenUrl: "http://akko-akko-aden:8000"
  opaUrl: "http://akko-akko-opa:8181"
Every URL is overridable for air-gapped / non-standard naming.
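On the service side, overridable URLs could be read roughly as follows. This is a sketch only: the environment variable names (`OPENMETADATA_URL` and so on) and the `load_urls` helper are assumptions about how the chart values might be surfaced to the process, with the chart defaults above used as fallbacks.

```python
import os
from typing import Mapping

# Defaults mirror the chart values; the variable names are hypothetical.
DEFAULT_URLS = {
    "OPENMETADATA_URL": "http://openmetadata:8585",
    "TRINO_URL": "http://akko-trino:8080",
    "AIRFLOW_URL": "http://akko-api-server:8080",
    "JUPYTERHUB_URL": "http://proxy-public",
    "JUPYTERHUB_HUB_URL": "http://hub:8081",
    "ADEN_URL": "http://akko-akko-aden:8000",
    "OPA_URL": "http://akko-akko-opa:8181",
}


def load_urls(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve each downstream URL, preferring an override from env."""
    resolved = {}
    for name, default in DEFAULT_URLS.items():
        # An unset or empty variable falls back to the default.
        value = env.get(name) or default
        # Strip a trailing slash so paths can be appended consistently.
        resolved[name] = value.rstrip("/")
    return resolved
```

For example, an air-gapped install could set only `TRINO_URL` and leave the other six at their defaults.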
Roadmap¶
The Control Plane is Phase 1 of the broader product-over-tech plan documented in the AKKO planning repository (private). Timeline:
| Phase | Period | Goal |
|---|---|---|
| 1.1 (done) | 2026-04-22 | Service skeleton + Helm chart + 4 endpoint stubs |
| 1.2 | Sprint 42 | OPA enrichment, projection polish, Trino FQN rewrite |
| 1.3-1.5 | Sprint 42 | Full abstraction coverage (Dataset / Workspace / Pipeline mutating verbs) |
| 2 | Sprint 43-44 | Cockpit canvas + Cmd+K built on top of the Control Plane |
| 3 | Sprint 45 | Additional Agents (SQL duel, auto-remediation, auto-doc) |
See also¶
- Editions — Lab / Data / AI — the three product editions the Control Plane enables
- Overview — component map
- Security — how Keycloak + OPA apply through the Control Plane
- ADR-022 (planning repo) — decision record