Apache Superset¶
Business intelligence, dashboards, and SQL exploration.
| URL | https://bi.akko.local |
| Authentication | Keycloak SSO (akko realm) |
| Helm sub-chart | superset (Apache community chart) |
Overview¶
Apache Superset is the BI layer of the AKKO platform. It connects to Trino to query Iceberg tables and PostgreSQL data, providing:
- Interactive chart building with a drag-and-drop interface
- SQL Lab for ad-hoc queries
- Pre-built dashboards auto-provisioned at startup
- Support for query federation (Iceberg + PostgreSQL in a single chart)
Auto-Provisioned Content¶
AKKO includes a superset-init sidecar that automatically provisions a complete demo environment on first startup. No manual setup is required.
What Gets Created¶
| Category | Count | Details |
|---|---|---|
| Database connection | 1 | Trino (AKKO) -- SQLAlchemy URI: trino://trino@trino:8080/iceberg |
| Physical datasets | 4 | customers, accounts, transactions, advisors (from iceberg.analytics.*) |
| Virtual datasets | 4 | balance_by_account_type, monthly_volume_by_operation, spending_by_category, revenue_by_branch |
| Charts | 8 | KPIs, bar charts, line charts, pie chart, tables |
| Dashboard | 1 | AKKO Banking Overview |
Charts Breakdown¶
| Chart | Type | Dataset |
|---|---|---|
| Total Customers | Big Number | customers |
| Active Balance | Big Number | accounts |
| Balance by Account Type | Bar | balance_by_account_type (virtual) |
| Monthly Transaction Volume | Line | monthly_volume_by_operation (virtual) |
| Spending by Category | Donut Pie | spending_by_category (virtual) |
| Revenue by Branch (Federation) | Bar | revenue_by_branch (virtual -- joins Iceberg + PostgreSQL) |
| Top 10 Customers by Volume | Table | customers |
| Advisor Performance | Table | advisors |
Data must exist first
The auto-provisioning script creates datasets that reference Iceberg tables. If the tables do not exist yet (the banking demo notebook has not been run), datasets are skipped with a warning. Run the akko-banking-demo.ipynb notebook first, then restart Superset to complete provisioning.
Federation Example¶
The Revenue by Branch chart demonstrates Trino's query federation -- it joins data from the Iceberg catalog (iceberg.analytics.customers, iceberg.analytics.accounts, iceberg.analytics.transactions) with PostGIS data from PostgreSQL (postgresql.geospatial.branches) in a single SQL query.
Trino Connection¶
Superset connects to Trino using the following configuration:
This connection provides access to all Trino catalogs:
| Catalog | Backend | Contents |
|---|---|---|
iceberg |
Apache Polaris (Iceberg REST) | Analytics tables (customers, accounts, transactions, advisors) |
postgresql |
PostgreSQL | Geospatial data (branches with PostGIS), pgvector |
system |
Trino internal | Runtime metadata, query history |
SQL Lab¶
SQL Lab is Superset's interactive SQL editor. Access it from the top menu: SQL > SQL Lab.
Features:
- Multi-tab SQL editor with syntax highlighting
- Query history and saved queries
- Table schema explorer (left sidebar)
- Export results to CSV
Example Query¶
-- Revenue by region from Iceberg tables
SELECT c.segment,
COUNT(DISTINCT c.customer_id) AS customers,
ROUND(SUM(ABS(t.amount)), 2) AS total_volume
FROM iceberg.analytics.customers c
JOIN iceberg.analytics.accounts a ON a.customer_id = c.customer_id
JOIN iceberg.analytics.transactions t ON t.account_id = a.account_id
WHERE a.status = 'active'
GROUP BY c.segment
ORDER BY total_volume DESC;
Pre-built SQL queries
The file superset/assets/akko-banking-queries.sql contains ready-to-use SQL queries for SQL Lab exploration.
Creating Custom Charts¶
- Navigate to Charts > + Chart
- Select a dataset (or create a new one from SQL Lab)
- Choose a visualization type (bar, line, pie, table, etc.)
- Configure metrics and dimensions
- Click Save and optionally add to a dashboard
Creating a Custom Dataset¶
A physical dataset maps directly to a table:
- Go to Datasets > + Dataset
- Select database Trino (AKKO)
- Select schema (e.g.,
analytics) - Select table
- Click Add
A virtual dataset is defined by a SQL query:
- Write a query in SQL Lab
- Click Save > Save as Dataset
- The query results become a reusable dataset for charts
Authentication¶
Superset uses Keycloak SSO for authentication. Users log in through the AKKO realm and are mapped to Superset roles based on their Keycloak group membership.
Known Issues¶
Metrics on virtual datasets must use aggregate functions
When creating charts on virtual datasets (SQL-defined), metrics must use aggregate functions like SUM(col), COUNT(*), or AVG(col). Using a bare column name (e.g., total_balance instead of SUM(total_balance)) will fail with a Trino EXPRESSION_NOT_AGGREGATE error.
Dashboard chart association
The Superset REST API's POST endpoint for dashboards does not create the many-to-many relationship between dashboards and charts. The auto-provisioning script works around this by using the Superset ORM directly (dash.slices = slices). If you create dashboards via the REST API, charts may appear unlinked -- re-save the dashboard in the UI to fix.
superset-init runs at every startup
The provisioning script is idempotent. It checks for existing databases, datasets, charts, and dashboards before creating new ones. Restarting Superset will not duplicate content.