Skip to content

Apache Superset

Business intelligence, dashboards, and SQL exploration.

URL https://bi.akko.local
Authentication Keycloak SSO (akko realm)
Helm sub-chart superset (Apache community chart)

Overview

Apache Superset is the BI layer of the AKKO platform. It connects to Trino to query Iceberg tables and PostgreSQL data, providing:

  • Interactive chart building with a drag-and-drop interface
  • SQL Lab for ad-hoc queries
  • Pre-built dashboards auto-provisioned at startup
  • Support for query federation (Iceberg + PostgreSQL in a single chart)

Auto-Provisioned Content

AKKO includes a superset-init sidecar that automatically provisions a complete demo environment on first startup. No manual setup is required.

What Gets Created

Category Count Details
Database connection 1 Trino (AKKO) -- SQLAlchemy URI: trino://trino@trino:8080/iceberg
Physical datasets 4 customers, accounts, transactions, advisors (from iceberg.analytics.*)
Virtual datasets 4 balance_by_account_type, monthly_volume_by_operation, spending_by_category, revenue_by_branch
Charts 8 KPIs, bar charts, line charts, pie chart, tables
Dashboard 1 AKKO Banking Overview

Charts Breakdown

Chart Type Dataset
Total Customers Big Number customers
Active Balance Big Number accounts
Balance by Account Type Bar balance_by_account_type (virtual)
Monthly Transaction Volume Line monthly_volume_by_operation (virtual)
Spending by Category Donut Pie spending_by_category (virtual)
Revenue by Branch (Federation) Bar revenue_by_branch (virtual -- joins Iceberg + PostgreSQL)
Top 10 Customers by Volume Table customers
Advisor Performance Table advisors

Data must exist first

The auto-provisioning script creates datasets that reference Iceberg tables. If the tables do not exist yet (the banking demo notebook has not been run), datasets are skipped with a warning. Run the akko-banking-demo.ipynb notebook first, then restart Superset to complete provisioning.

Federation Example

The Revenue by Branch chart demonstrates Trino's query federation -- it joins data from the Iceberg catalog (iceberg.analytics.customers, iceberg.analytics.accounts, iceberg.analytics.transactions) with PostGIS data from PostgreSQL (postgresql.geospatial.branches) in a single SQL query.


Trino Connection

Superset connects to Trino using the following configuration:

Database name:   Trino (AKKO)
SQLAlchemy URI:  trino://trino@trino:8080/iceberg

This connection provides access to all Trino catalogs:

Catalog Backend Contents
iceberg Apache Polaris (Iceberg REST) Analytics tables (customers, accounts, transactions, advisors)
postgresql PostgreSQL Geospatial data (branches with PostGIS), pgvector
system Trino internal Runtime metadata, query history

SQL Lab

SQL Lab is Superset's interactive SQL editor. Access it from the top menu: SQL > SQL Lab.

Features:

  • Multi-tab SQL editor with syntax highlighting
  • Query history and saved queries
  • Table schema explorer (left sidebar)
  • Export results to CSV

Example Query

-- Revenue by region from Iceberg tables
SELECT c.segment,
       COUNT(DISTINCT c.customer_id) AS customers,
       ROUND(SUM(ABS(t.amount)), 2) AS total_volume
FROM iceberg.analytics.customers c
JOIN iceberg.analytics.accounts a ON a.customer_id = c.customer_id
JOIN iceberg.analytics.transactions t ON t.account_id = a.account_id
WHERE a.status = 'active'
GROUP BY c.segment
ORDER BY total_volume DESC;

Pre-built SQL queries

The file superset/assets/akko-banking-queries.sql contains ready-to-use SQL queries for SQL Lab exploration.


Creating Custom Charts

  1. Navigate to Charts > + Chart
  2. Select a dataset (or create a new one from SQL Lab)
  3. Choose a visualization type (bar, line, pie, table, etc.)
  4. Configure metrics and dimensions
  5. Click Save and optionally add to a dashboard

Creating a Custom Dataset

A physical dataset maps directly to a table:

  1. Go to Datasets > + Dataset
  2. Select database Trino (AKKO)
  3. Select schema (e.g., analytics)
  4. Select table
  5. Click Add

A virtual dataset is defined by a SQL query:

  1. Write a query in SQL Lab
  2. Click Save > Save as Dataset
  3. The query results become a reusable dataset for charts

Authentication

Superset uses Keycloak SSO for authentication. Users log in through the AKKO realm and are mapped to Superset roles based on their Keycloak group membership.


Known Issues

Metrics on virtual datasets must use aggregate functions

When creating charts on virtual datasets (SQL-defined), metrics must use aggregate functions like SUM(col), COUNT(*), or AVG(col). Using a bare column name (e.g., total_balance instead of SUM(total_balance)) will fail with a Trino EXPRESSION_NOT_AGGREGATE error.

Dashboard chart association

The Superset REST API's POST endpoint for dashboards does not create the many-to-many relationship between dashboards and charts. The auto-provisioning script works around this by using the Superset ORM directly (dash.slices = slices). If you create dashboards via the REST API, charts may appear unlinked -- re-save the dashboard in the UI to fix.

superset-init runs at every startup

The provisioning script is idempotent. It checks for existing databases, datasets, charts, and dashboards before creating new ones. Restarting Superset will not duplicate content.