Trino¶
Overview¶
Trino 480 is the distributed SQL query engine for AKKO. It provides federated queries across the Iceberg lakehouse (via Polaris), PostgreSQL, MySQL, Oracle, SQL Server, Snowflake, BigQuery, Databricks, Hive + HDFS (with Kerberos for Cloudera CDP 7.1.9), Kafka and more — 18 connectors exposed through a single SQL endpoint with dynamic catalog management: admins can add, update or remove catalogs through the cockpit Catalog Manager Pro with zero downtime (no rolling restart required).
Architecture¶
Notebooks / Superset / Airflow
|
+--------+--------+
| Trino | coordinator + worker (single node)
| federation.akko.local |
+--------+---------+
|
+--------+---------+
| |
Iceberg PostgreSQL
(Polaris REST) (jdbc)
Trino runs as a single-node deployment (coordinator with scheduling enabled)
on port 8080 internally, exposed via Traefik at federation.akko.local.
Catalogs¶
AKKO ships with three built-in catalogs (shipped in the chart) and supports
adding any number of client-owned catalogs at runtime via the Catalog Manager
Pro cockpit page (see ADR-021 (see docs/adr/ADR-021-catalog-manager-pro.md)).
Built-in catalogs:
iceberg -- Lakehouse¶
Connects to the Apache Polaris REST catalog for Iceberg table management.
connector.name=iceberg
iceberg.catalog.type=rest
iceberg.rest-catalog.uri=http://akko-akko-polaris:8181/api/catalog
iceberg.rest-catalog.warehouse=akko-warehouse
iceberg.rest-catalog.security=OAUTH2
iceberg.rest-catalog.oauth2.credential=root:${ENV:POLARIS_ROOT_SECRET}
iceberg.rest-catalog.oauth2.scope=PRINCIPAL_ROLE:ALL
iceberg.file-format=PARQUET
fs.native-s3.enabled=true
s3.endpoint=http://akko-minio:9000
s3.path-style-access=true
s3.aws-access-key=${ENV:MINIO_ROOT_USER}
s3.aws-secret-key=${ENV:MINIO_ROOT_PASSWORD}
s3.region=us-east-1
OAuth2 Scope
The iceberg.rest-catalog.oauth2.scope=PRINCIPAL_ROLE:ALL is required.
Without it, Polaris rejects the default scope with an invalid_scope error.
postgresql -- Operational Database¶
Connects directly to the AKKO PostgreSQL database.
connector.name=postgresql
connection-url=jdbc:postgresql://akko-postgresql:5432/akko
connection-user=akko
connection-password=${ENV:POSTGRES_AKKO_PASSWORD}
system -- Trino Internal¶
Built-in catalog exposing Trino runtime information (nodes, queries, tasks). Available to all roles in read-only mode.
Federation¶
Trino's key strength is cross-catalog federation -- querying Iceberg and PostgreSQL tables in a single SQL statement:
-- Join Iceberg lakehouse data with PostgreSQL operational data
SELECT
i.customer_id,
i.total_amount,
p.email
FROM iceberg.analytics.transactions AS i
JOIN postgresql.public.customers AS p
ON i.customer_id = p.id;
RBAC: File-Based Access Control¶
Trino uses file-based access control with five roles, refreshed every 5 minutes. For the full RBAC guide (Keycloak, OPA, column masking, row filtering), see the RBAC Administration Guide.
Configuration Files¶
| File | Purpose |
|---|---|
trino/etc/access-control.properties |
Enables file-based access control |
trino/etc/rules.json |
Catalog, schema, and table permission rules |
trino/etc/group-provider.properties |
Enables file-based group provider |
trino/etc/group.txt |
User-to-group mappings |
Group Memberships¶
akko-admin:admin,alice,trino,airflow
akko-engineer:bob
akko-analyst:carol
akko-user:eve
akko-viewer:dave
Service Users
The trino and airflow users must be in the akko-admin group.
Trino uses the trino user for internal operations, and Airflow connects
as airflow to run pipeline queries.
Permission Matrix¶
Catalog Access¶
| Role | Iceberg | PostgreSQL | System |
|---|---|---|---|
| akko-admin | full | full | full |
| akko-engineer | full | full | read-only |
| akko-analyst | read-only | read-only | read-only |
| akko-user | read-only (PII masked) | read-only | read-only |
| akko-viewer | read-only | -- | read-only |
| (default) | -- | -- | read-only |
Schema Access (Iceberg)¶
| Role | Schemas | Owner |
|---|---|---|
| akko-admin | all | yes |
| akko-engineer | raw, staging, analytics, sandbox |
yes |
| akko-analyst | all (read) | no |
| akko-user | all (read, PII masked) | no |
| akko-viewer | analytics, reporting, public |
no |
Table Privileges (Iceberg)¶
| Role | Schemas | Privileges |
|---|---|---|
| akko-admin | all | SELECT, INSERT, DELETE, UPDATE, OWNERSHIP, GRANT_SELECT |
| akko-engineer | raw, staging, analytics, sandbox |
SELECT, INSERT, DELETE, UPDATE |
| akko-analyst | all | SELECT |
| akko-user | all | SELECT (PII masked via OPA) |
| akko-viewer | analytics, reporting, public |
SELECT (row-filtered, PII masked) |
Rules Evaluation¶
Rules in rules.json are evaluated top-to-bottom, first match wins. The
file contains three sections: catalogs, schemas, and tables. A catch-all
deny rule at the end of the catalogs section blocks any unmatched access:
Connecting from Notebooks¶
from trino.dbapi import connect
conn = connect(
host="akko-trino",
port=8080,
user="alice", # Keycloak username
catalog="iceberg",
schema="analytics",
)
cursor = conn.cursor()
cursor.execute("SELECT * FROM transactions LIMIT 10")
rows = cursor.fetchall()
User Identity
Always use the Keycloak username (e.g., alice, bob) as the user
parameter. This determines which RBAC group applies. Never use akko-admin
or trino as the user from notebooks.
Connecting from Superset¶
Superset connects to Trino using the trino user, which is a member of
akko-admin and therefore has full access to all catalogs and schemas.
Connection string format:
Configuration Reference¶
| File | Setting | Value |
|---|---|---|
config.properties |
coordinator |
true |
config.properties |
node-scheduler.include-coordinator |
true |
config.properties |
http-server.http.port |
8080 |
config.properties |
http-server.process-forwarded |
true |
config.properties |
web-ui.authentication.type |
FIXED |
access-control.properties |
security.refresh-period |
5m |
group-provider.properties |
file.refresh-period |
5m |
Known Issues¶
Important Gotchas
akko-admingroup must includetrinoandairflow: These service users need admin-level access for internal operations and pipeline queries.- Trino password persists in PG volume: If you change the PostgreSQL
password in
.env, the old password is still stored in the Postgres volume. Fix withALTER USER akko WITH PASSWORD '...'. - Polaris
invalid_scope: The Iceberg catalog must seticeberg.rest-catalog.oauth2.scope=PRINCIPAL_ROLE:ALLexplicitly. DROP TABLEforbidden by Polaris: UseCALL iceberg.system.unregister_table('schema', 'table')instead ofDROP TABLE.- Dynamic secrets: Trino uses
${ENV:VARIABLE}syntax to read secrets from environment variables at startup. Never hardcode credentials in.propertiesfiles.