AKKO Editions

"Is it a lab? A data engineering platform? An AI platform?" — every data architect evaluating AKKO asks the same three questions. This page answers them.

AKKO is one sovereign control plane that deploys as three progressive editions. Pick one, upgrade later — each edition is a strict superset of the previous one, same chart, same operator commands, same on-prem sovereignty.

The three editions at a glance

| | Lab | Data Platform | AI Platform |
|---|---|---|---|
| Positioning | Sovereign sandbox | Databricks/Snowflake alternative | Databricks Genie / Snowflake Copilot alternative |
| Primary persona | Data scientist | Data / platform engineer | Business analyst + data engineer + AI product owner |
| Core loop | "Explore a dataset, notebook, publish" | "Ingest, transform, orchestrate, govern, visualise" | "Ask a question in natural language, get a governed answer + dashboard" |
| Time-to-first-value | 30 min | 1 day | 1 day + verified queries |
| Bill of materials | ~8 components | ~20 components | ~30+ components |
| Airgap-ready | Yes | Yes | Yes (Ollama + vLLM on-prem) |

Edition 1 — AKKO Lab

For: a small team (1-5 engineers) that needs object storage + a catalog + a notebook + a SQL engine, but refuses to adopt a cloud warehouse.

Stack: MinIO + Polaris (Iceberg REST catalog) + JupyterHub + Trino + Keycloak SSO + Cockpit.

Deploy with the layer-0-infra + layer-1-auth + layer-2-data profile. You get S3-compatible storage, Iceberg tables behind a REST catalog, a federated SQL engine, SSO, and a notebook UI: everything you need to explore a dataset, run ad-hoc queries, and publish a notebook.
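As a concrete sketch, the Lab profile selection could be expressed as umbrella-chart values. The key names below are hypothetical; the authoritative schema lives in the AKKO chart itself.

```yaml
# values-lab.yaml (hypothetical key names; check the AKKO chart schema)
layer-0-infra:
  enabled: true    # MinIO object storage
layer-1-auth:
  enabled: true    # Keycloak SSO
layer-2-data:
  enabled: true    # Polaris catalog + Trino + JupyterHub + Cockpit
layer-3-analytics:
  enabled: false   # Lab has no orchestration or BI
layer-4-ai:
  enabled: false   # and no AI
```

This follows the standard Helm convention of enabling subcharts via a per-chart enabled flag; install would look something like helm install akko ./akko -f values-lab.yaml.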

What Lab is not: no orchestration, no BI, no AI. It is the sovereign starting point.

Edition 2 — AKKO Data Platform

For: a data team running production ELT, dashboards, quality monitoring, and lineage — the typical "Databricks / Snowflake on-prem" ask.

Adds to Lab: Airflow + dbt (+ OpenLineage) + Superset + Spark Connect + OpenMetadata + OPA + Grafana/Prometheus/Loki.

Deploy with profiles layer-3-analytics + layer-4-ai (AI flag off) + layer-5-monitoring + layer-6-governance. You get production orchestration (DAGs), transformations-as-code (dbt models), federated BI, distributed compute (Spark Connect), active metadata (lineage, glossary, data products), row-level + column-mask policies (OPA), and full observability.
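A hypothetical values sketch for that profile selection, assuming each layer follows the usual Helm subchart enabled toggle. The nested key expressing "AI flag off" is an assumption, not the chart's documented schema:

```yaml
# values-data-platform.yaml (hypothetical keys)
layer-3-analytics:
  enabled: true    # Airflow + dbt + Superset + Spark Connect
layer-4-ai:
  enabled: true    # layer deployed...
  generative:
    enabled: false # ...but with the AI flag off: no LLM serving
layer-5-monitoring:
  enabled: true    # Grafana + Prometheus + Loki
layer-6-governance:
  enabled: true    # OpenMetadata + OPA
```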

What Data Platform is not: no generative AI, no natural-language querying. The catalog is active (lineage, PII tags) but the queries still come from humans.

Edition 3 — AKKO AI Platform

For: organisations that want Databricks Genie / Snowflake Copilot without sending their data or their prompts outside the perimeter.

Adds to Data Platform: ADEN (natural language → SQL → dashboard) + LiteLLM (router) + Ollama + vLLM (GPU inference) + MCP servers (Trino + OpenMetadata) + 21 Trino ai_* scalar UDFs + Streamlit dashboard publisher + Whisper (audio) + Docling (PDF/DOCX parser).

Deploy the full umbrella. ADEN answers questions like "Top 10 regions by transaction volume last week" with a reasoning trace (pipeline_steps), an OPA-governed SQL plan, a cost check via EXPLAIN (TYPE IO), and a Streamlit dashboard — all on-prem, with a local LLM. The 21 ai_* Trino functions (akko_ai_sentiment, akko_ai_classify, akko_ai_pii, akko_ai_parse_document, akko_ai_ocr…) run inference inside federated SQL, so a business user can write SELECT akko_ai_sentiment(review) FROM feedback.
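"Deploy the full umbrella" could reduce to a values file with every layer on plus the local inference backends. All key names here are illustrative assumptions, not the chart's real schema:

```yaml
# values-ai-platform.yaml (illustrative only)
layer-4-ai:
  enabled: true
  generative:
    enabled: true  # ADEN + LiteLLM router + MCP servers
  ollama:
    enabled: true  # local models for CPU or modest GPUs
  vllm:
    enabled: true  # high-throughput GPU inference
```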

What AI Platform is not: it is not a GPU cluster manager or an MLOps foundry — MLflow is there for experiment tracking, but the positioning is governed generative AI over your own data, not training from scratch.

Deciding which edition fits

Three quick tests:

| Question | If YES → you need at least… |
|---|---|
| Do you have production pipelines that must run on a schedule? | Data Platform |
| Do stakeholders ask questions in plain English and expect answers from data? | AI Platform |
| Do you only need a place to land data and explore it? | Lab is enough |

You can start at Lab and flip Data Platform or AI Platform on later: each additional layer is a single enabled: true toggle in the chart values. No migration, no re-ingestion.
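That upgrade path can be sketched as a values override, assuming each layer exposes a standard Helm enabled toggle (an assumption; the real key names live in the AKKO chart):

```yaml
# upgrade-to-data-platform.yaml (hypothetical keys)
# Flip the extra layers on; layers 0-2 from Lab are left untouched.
layer-3-analytics:
  enabled: true
layer-5-monitoring:
  enabled: true
layer-6-governance:
  enabled: true
```

Applied with something like helm upgrade akko ./akko -f values-lab.yaml -f upgrade-to-data-platform.yaml. Because the object store and the Iceberg catalog keep running through the upgrade, existing tables stay in place, which is why no migration or re-ingestion is needed.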

See also