Skip to content

Trino

Overview

Trino 480 is the distributed SQL query engine for AKKO. It provides federated queries across the Iceberg lakehouse (via Polaris), PostgreSQL, MySQL, Oracle, SQL Server, Snowflake, BigQuery, Databricks, Hive + HDFS (with Kerberos for Cloudera CDP 7.1.9), Kafka and more — 18 connectors exposed through a single SQL endpoint with dynamic catalog management: admins can add, update or remove catalogs through the cockpit Catalog Manager Pro with zero downtime (no rolling restart required).

Architecture

  Notebooks / Superset / Airflow
              |
     +--------+--------+
     |     Trino        |  coordinator + worker (single node)
     |  federation.akko.local  |
     +--------+---------+
              |
     +--------+---------+
     |                  |
  Iceberg           PostgreSQL
  (Polaris REST)    (jdbc)

Trino runs as a single-node deployment (coordinator with scheduling enabled) on port 8080 internally, exposed via Traefik at federation.akko.local.

Catalogs

AKKO ships with three built-in catalogs (shipped in the chart) and supports adding any number of client-owned catalogs at runtime via the Catalog Manager Pro cockpit page (see ADR-021 (see docs/adr/ADR-021-catalog-manager-pro.md)).

Built-in catalogs:

iceberg -- Lakehouse

Connects to the Apache Polaris REST catalog for Iceberg table management.

trino/catalog/iceberg.properties
connector.name=iceberg
iceberg.catalog.type=rest
iceberg.rest-catalog.uri=http://akko-akko-polaris:8181/api/catalog
iceberg.rest-catalog.warehouse=akko-warehouse
iceberg.rest-catalog.security=OAUTH2
iceberg.rest-catalog.oauth2.credential=root:${ENV:POLARIS_ROOT_SECRET}
iceberg.rest-catalog.oauth2.scope=PRINCIPAL_ROLE:ALL
iceberg.file-format=PARQUET
fs.native-s3.enabled=true
s3.endpoint=http://akko-minio:9000
s3.path-style-access=true
s3.aws-access-key=${ENV:MINIO_ROOT_USER}
s3.aws-secret-key=${ENV:MINIO_ROOT_PASSWORD}
s3.region=us-east-1

OAuth2 Scope

The iceberg.rest-catalog.oauth2.scope=PRINCIPAL_ROLE:ALL is required. Without it, Polaris rejects the default scope with an invalid_scope error.

postgresql -- Operational Database

Connects directly to the AKKO PostgreSQL database.

trino/catalog/postgresql.properties
connector.name=postgresql
connection-url=jdbc:postgresql://akko-postgresql:5432/akko
connection-user=akko
connection-password=${ENV:POSTGRES_AKKO_PASSWORD}

system -- Trino Internal

Built-in catalog exposing Trino runtime information (nodes, queries, tasks). Available to all roles in read-only mode.

Federation

Trino's key strength is cross-catalog federation -- querying Iceberg and PostgreSQL tables in a single SQL statement:

-- Join Iceberg lakehouse data with PostgreSQL operational data
SELECT
    i.customer_id,
    i.total_amount,
    p.email
FROM iceberg.analytics.transactions AS i
JOIN postgresql.public.customers AS p
    ON i.customer_id = p.id;

RBAC: File-Based Access Control

Trino uses file-based access control with five roles, refreshed every 5 minutes. For the full RBAC guide (Keycloak, OPA, column masking, row filtering), see the RBAC Administration Guide.

Configuration Files

File Purpose
trino/etc/access-control.properties Enables file-based access control
trino/etc/rules.json Catalog, schema, and table permission rules
trino/etc/group-provider.properties Enables file-based group provider
trino/etc/group.txt User-to-group mappings

Group Memberships

trino/etc/group.txt
akko-admin:admin,alice,trino,airflow
akko-engineer:bob
akko-analyst:carol
akko-user:eve
akko-viewer:dave

Service Users

The trino and airflow users must be in the akko-admin group. Trino uses the trino user for internal operations, and Airflow connects as airflow to run pipeline queries.

Permission Matrix

Catalog Access

Role Iceberg PostgreSQL System
akko-admin full full full
akko-engineer full full read-only
akko-analyst read-only read-only read-only
akko-user read-only (PII masked) read-only read-only
akko-viewer read-only -- read-only
(default) -- -- read-only

Schema Access (Iceberg)

Role Schemas Owner
akko-admin all yes
akko-engineer raw, staging, analytics, sandbox yes
akko-analyst all (read) no
akko-user all (read, PII masked) no
akko-viewer analytics, reporting, public no

Table Privileges (Iceberg)

Role Schemas Privileges
akko-admin all SELECT, INSERT, DELETE, UPDATE, OWNERSHIP, GRANT_SELECT
akko-engineer raw, staging, analytics, sandbox SELECT, INSERT, DELETE, UPDATE
akko-analyst all SELECT
akko-user all SELECT (PII masked via OPA)
akko-viewer analytics, reporting, public SELECT (row-filtered, PII masked)

Rules Evaluation

Rules in rules.json are evaluated top-to-bottom, first match wins. The file contains three sections: catalogs, schemas, and tables. A catch-all deny rule at the end of the catalogs section blocks any unmatched access:

{
  "catalog": ".*",
  "allow": "none"
}

Connecting from Notebooks

from trino.dbapi import connect

conn = connect(
    host="akko-trino",
    port=8080,
    user="alice",         # Keycloak username
    catalog="iceberg",
    schema="analytics",
)
cursor = conn.cursor()
cursor.execute("SELECT * FROM transactions LIMIT 10")
rows = cursor.fetchall()

User Identity

Always use the Keycloak username (e.g., alice, bob) as the user parameter. This determines which RBAC group applies. Never use akko-admin or trino as the user from notebooks.

Connecting from Superset

Superset connects to Trino using the trino user, which is a member of akko-admin and therefore has full access to all catalogs and schemas.

Connection string format:

trino://alice@akko-trino:8080/iceberg

Configuration Reference

File Setting Value
config.properties coordinator true
config.properties node-scheduler.include-coordinator true
config.properties http-server.http.port 8080
config.properties http-server.process-forwarded true
config.properties web-ui.authentication.type FIXED
access-control.properties security.refresh-period 5m
group-provider.properties file.refresh-period 5m

Known Issues

Important Gotchas

  • akko-admin group must include trino and airflow: These service users need admin-level access for internal operations and pipeline queries.
  • Trino password persists in PG volume: If you change the PostgreSQL password in .env, the old password is still stored in the Postgres volume. Fix with ALTER USER akko WITH PASSWORD '...'.
  • Polaris invalid_scope: The Iceberg catalog must set iceberg.rest-catalog.oauth2.scope=PRINCIPAL_ROLE:ALL explicitly.
  • DROP TABLE forbidden by Polaris: Use CALL iceberg.system.unregister_table('schema', 'table') instead of DROP TABLE.
  • Dynamic secrets: Trino uses ${ENV:VARIABLE} syntax to read secrets from environment variables at startup. Never hardcode credentials in .properties files.