AKKO vs Commercial Data Platforms¶
AKKO competes on two axes where commercial vendors fall short :
sovereignty (100% self-hosted, open source) and AI-native integration
(native Trino ai_*() functions + MCP servers out of the box).
At-a-glance matrix¶
| Capability | AKKO | Dremio | Starburst Galaxy | Databricks | Snowflake | Cloudera | Stackable |
|---|---|---|---|---|---|---|---|
| 100% self-hosted (air-gap possible) | ✅ | ✅ | ❌ SaaS only | ❌ SaaS/VPC | ❌ SaaS only | ✅ | ✅ |
| Fully Apache 2.0 open source | ✅ | Partial (BSL since v2) | Partial | ❌ Proprietary | ❌ Proprietary | Partial | ✅ |
| AI-native (LLM + vector + MCP) | ✅ 17 Trino ai_*() + 2 MCP servers |
❌ | Partial (Starburst AI add-on) | ✅ (Unity AI, paid) | ✅ (Cortex, paid) | ❌ | ❌ |
| 1-command Helm deploy | ✅ bash helm/scripts/deploy.sh |
Complex multi-step | Control plane + account binding | ❌ | ❌ | Complex | Complex (manifests manuels) |
| SSO pré-configuré (Keycloak) | ✅ 5 rôles, 13 clients | Manual SAML/OIDC | SAML/OIDC | OAuth/OIDC | OAuth | LDAP/AD | Manual |
| Fine-grained RBAC (row+column) | ✅ OPA + Keycloak | Apache Ranger (add-on) | Priv Separation (add-on) | Unity Catalog (locked in) | Row Access Policies | Apache Ranger | Manual |
| 12 demos métier turnkey | ✅ banking/healthcare/retail/telecom/manufacturing | ❌ | ❌ | Notebooks samples only | ❌ | ❌ | ❌ |
| Data lake natif (Iceberg + MinIO) | ✅ Apache Polaris 1.3 | ✅ (Dremio Arctic) | ✅ Iceberg REST | ✅ (Delta Lake, locked) | Iceberg (external) | ✅ Iceberg via HMS | ✅ Trino + Iceberg |
| Governance catalog (lineage, PII) | ✅ OpenMetadata (Apache 2.0) | Dremio Arctic (proprio) | Starburst Glossary (ltd) | Unity Catalog (locked) | Horizon (paid) | Apache Atlas | Nessie (ltd) |
| Observability stack included | ✅ Prometheus + Grafana + Loki + AlertManager + Promtail | Partial | Partial | External only | External only | Cloudera Manager | ✅ |
| 2 MCP Servers (AI agent ready) | ✅ Trino + OpenMetadata | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
| LLM RBAC matrix (role × model) | ✅ Unique: 5 roles × 4 models | ❌ | ❌ | Limited | ❌ | ❌ | ❌ |
| Air-gap deployment supported | ✅ Harbor local + docs dedicated | Partial (à configurer) | ❌ | ❌ | ❌ | ✅ | Partial |
| Backup/DR one-command | ✅ restore.sh --date --component |
Manual | Backup SaaS | Manual | Automatic (paid) | Manual | Manual |
| License cost / TCO | 0 € | $$$ (enterprise edition) | $$$$ | $$$$$ | $$$$$ | $$ | 0 € (but no support) |
| Enterprise support | 🏢 Via partners (planned) | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
| Kubernetes-first | ✅ | ✅ | ❌ (SaaS) | Managed K8s | ❌ | ✅ | ✅ |
| Works on k3s/edge | ✅ Netcup VPS validated | ❌ (needs big iron) | ❌ | ❌ | ❌ | Complex | ✅ |
| Bilingual (EN+FR) docs | ✅ 164 pages EN + 54 FR | English only | English only | English only | English only | English/multiple | English only |
AKKO's unique advantages¶
1. Native AI integration via Trino (no other vendor has this)¶
-- Run AI directly in SQL, no Python glue, no external service
SELECT
customer_id,
akko_ai_sentiment(last_review) AS sentiment,
akko_ai_classify(last_review, ARRAY['billing', 'support', 'product']) AS category,
ai_redact_pii(last_review) AS clean_text
FROM iceberg.crm.reviews
WHERE review_date > DATE '2026-01-01';
Provided out of the box by the akko-trino image with our custom plugin.
Dremio, Starburst, Cloudera : you'd build this yourself with UDFs + external
calls. Databricks/Snowflake charge per-token for their equivalent.
2. MCP Servers = your data becomes an AI agent¶
AKKO ships two MCP (Model Context Protocol) servers :
- akko-mcp-trino : exposes 8 tools (list tables, describe, run query,
ai_* functions) to any MCP-compatible client (Claude Desktop,
Cursor, Cline, etc.).
- akko-mcp-openmetadata : exposes catalog search, lineage, glossary
to AI agents.
Your data analyst types in Claude Desktop : "What's our Q1 revenue by sector?" — Claude calls MCP Trino, queries Iceberg, returns a chart. Zero integration work.
No other vendor (Dremio, Starburst, Databricks, Snowflake, Cloudera, Stackable) ships MCP servers today.
3. LLM RBAC matrix (who can use which AI model)¶
| Role | Chat | Embed | Code | GPU (vLLM) |
|---|---|---|---|---|
| admin | ✅ | ✅ | ✅ | ✅ |
| engineer | ✅ | ✅ | ✅ | ❌ |
| analyst | ✅ | ✅ | ❌ | ❌ |
| user | ✅ | ❌ | ❌ | ❌ |
| viewer | ❌ | ❌ | ❌ | ❌ |
Editable live via Cockpit Governance → LLM Access Matrix. No vendor offers this granularity of LLM access control today.
4. Harbor indépendant¶
In AKKO, the container registry lives in its own namespace, has its
own PostgreSQL and own PVC. You can helm uninstall akko and
rebuild from scratch — Harbor + all your pre-built images survive.
Other stacks tie the registry to the platform lifecycle.
5. Bilingual docs (EN + FR)¶
AKKO is the only vendor shipping 164 documentation pages in both English and French with proper UTF-8 accents. Unique for European users and compliance (RGPD documentation).
Where commercial vendors still win¶
| Dimension | Winner | Why it matters |
|---|---|---|
| Scale (petabytes+) | Snowflake / Databricks | Elastic compute without ops |
| ML ops maturity | Databricks (MLflow origin) | End-to-end ML lifecycle |
| BI ecosystem | Snowflake / Databricks | Integrations with Tableau, Looker, etc. |
| 24/7 vendor support | All commercial | Enterprise SLA |
| Certifications (SOC 2/ISO/HIPAA) | All commercial | Compliance out-of-the-box |
→ AKKO's path to close these gaps : Sprint 23 (hardening + SOC2 checklist) and partner support network.
TCO simulation (100 TB data, 50 users, 3 years)¶
| Platform | License cost (3y) | Infra (3y) | Ops team | Total |
|---|---|---|---|---|
| AKKO (Netcup 32 GB VPS ×3 + Longhorn) | 0 € | ~6 000 € | 0.5 FTE | ~150 000 € |
| Dremio Enterprise | ~180 000 € | ~40 000 € | 0.5 FTE | ~340 000 € |
| Starburst Galaxy | ~300 000 € | — (SaaS) | 0.5 FTE | ~420 000 € |
| Databricks Lakehouse | ~450 000 € (usage) | — | 0.5 FTE | ~570 000 € |
| Snowflake | ~600 000 € (credits) | — | 0.5 FTE | ~720 000 € |
| Cloudera CDP | ~240 000 € | ~60 000 € | 1 FTE | ~540 000 € |
AKKO estimate assumes : - Full-time dev-ops 0.5 (partial) engineer to maintain the cluster - Netcup 32 GB VPS × 3 at ~40 €/month = 1 440 €/year - Self-generated TLS (no Let's Encrypt bandwidth costs) - Free GitHub Actions (2 000 min/month)
Savings vs Snowflake over 3 years : ~570 000 €. Savings vs Databricks : ~420 000 €.
When NOT to choose AKKO¶
- You need petabyte-scale autoscaling compute with zero ops (→ Snowflake)
- Your team has zero Kubernetes experience and refuses to learn (→ Databricks SaaS)
- You need FedRAMP / FISMA / other gov certifications today, not in 6 months (→ commercial vendor with pre-existing cert)
- You have strict vendor lock-in requirements from a procurement policy (AKKO is open source + community)
When to choose AKKO¶
- European sovereignty mandate (RGPD, data stays on-shore)
- Healthcare HDS / defense / air-gapped environments
- Budget-constrained startup / SME / public sector
- AI-native use cases (building AI agents, RAG pipelines)
- Kubernetes-first ops culture (DevOps team comfortable with Helm)
- Hackable stack (you want to extend Trino, add your own plugins)
- Multi-sector demos needed to prove value fast to business stakeholders
References for comparison claims¶
- Dremio licensing : https://www.dremio.com/pricing (BSL switch 2024)
- Starburst Galaxy : https://www.starburst.io/platform/starburst-galaxy/ (SaaS only)
- Databricks Unity Catalog : https://docs.databricks.com/en/data-governance/unity-catalog/index.html
- Snowflake Cortex : https://docs.snowflake.com/en/user-guide/snowflake-cortex/overview
- Cloudera Data Platform : https://www.cloudera.com/products/open-data-lakehouse.html
- Stackable : https://stackable.tech/ (CNCF Sandbox)
- Apache Iceberg : https://iceberg.apache.org
- Apache Polaris : https://polaris.apache.org
- MCP protocol : https://modelcontextprotocol.io