Catalog Manager Pro

AKKO Catalog Manager Pro is the self-service backend + cockpit page that lets admins register new Trino catalogs at runtime. It pairs with Trino 480 dynamic catalog management (catalog.management=dynamic) so adding or removing a catalog incurs zero downtime and no helm upgrade.

See the full design in ADR-021 (docs/adr/ADR-021-catalog-manager-pro.md).

Supported connectors (15)

Category   | Connectors
Relational | PostgreSQL, MySQL, SQL Server, Oracle
Lakehouse  | Iceberg (REST), Hive + HDFS + Kerberos (Cloudera CDP 7.1.9), Delta Lake
Cloud DW   | Snowflake, BigQuery, Redshift, Databricks
NoSQL      | MongoDB, Cassandra, Elasticsearch
Streaming  | Kafka

Architecture

Cockpit Admin → Catalogs page
    │ HTTPS + Keycloak JWT
nginx (akko-cockpit)  /api/cockpit/catalog-manager/
    │  forwards X-User-Id / Authorization Bearer
FastAPI backend (akko-catalog-manager, permissively licensed stack)
    │   1. verify JWT → require akko-admin role
    │   2. dry-run connection via Trino /v1/catalog/test
    │   3. store credentials in a K8s Secret (never ConfigMap)
    │   4. POST /v1/catalog to Trino 480 — dynamic, no restart
    │   5. generate OPA rule fragment (akko-opa watcher picks it up)
    │   6. trigger OpenMetadata ingestion (best-effort)
    │   7. emit structured audit JSON → logs layer
Trino 480  +  OpenMetadata  +  OPA
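
Step 1 of the chain above (require the akko-admin role) can be sketched as follows. This is a minimal illustration, not the backend's actual code: it assumes the JWT signature has already been verified elsewhere and that roles sit under Keycloak's default realm_access.roles claim. The function names are hypothetical.

```python
from typing import Any, Mapping

REQUIRED_ROLE = "akko-admin"  # role name taken from the architecture above


def realm_roles(claims: Mapping[str, Any]) -> set[str]:
    """Return the Keycloak realm roles found in already-verified JWT claims."""
    # Assumption: roles live under realm_access.roles (Keycloak's default layout).
    realm_access = claims.get("realm_access") or {}
    roles = realm_access.get("roles") or []
    return {role for role in roles if isinstance(role, str)}


def is_catalog_admin(claims: Mapping[str, Any]) -> bool:
    """True when the caller holds the role needed to manage catalogs."""
    return REQUIRED_ROLE in realm_roles(claims)
```

A request whose claims fail this check would be rejected before any Trino, Secret, or OPA step runs.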

Add a catalog (cockpit)

  1. Log in to the cockpit as a user with the akko-admin role.
  2. Open the sidebar entry Data Catalogs.
  3. Click + Add catalog.
  4. Fill the form:
     • Name (lower-case, [a-z0-9-], ≤ 40 chars). E.g. client-pg-01.
     • Connector: choose from the dropdown.
     • Host / Port / Database / User / Password: connection parameters.
     • Allowed roles: comma-separated Keycloak realm roles allowed to query this catalog (OPA auto-rule). Default: akko-admin,akko-engineer,akko-analyst.
  5. Click Test connection to dry-run.
  6. Click Register catalog. The catalog is live in Trino in < 2 s.
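
The form rules for Name and Allowed roles can be checked before submitting. The sketch below applies the stated name rule and splits the comma-separated role list. The helper names are hypothetical, and falling back to the default roles on empty input is an assumption about how the form behaves.

```python
import re

# Lower-case letters, digits and hyphens, 1 to 40 characters (rule from the form).
CATALOG_NAME_RE = re.compile(r"[a-z0-9-]{1,40}")
DEFAULT_ROLES = ["akko-admin", "akko-engineer", "akko-analyst"]


def is_valid_catalog_name(name: str) -> bool:
    """Check a candidate catalog name against the form's naming rule."""
    return CATALOG_NAME_RE.fullmatch(name) is not None


def parse_allowed_roles(raw: str) -> list[str]:
    """Split a comma-separated role string, dropping blanks and duplicates."""
    roles: list[str] = []
    for part in raw.split(","):
        role = part.strip()
        if role and role not in roles:
            roles.append(role)
    # Assumption: an empty field means "use the documented default roles".
    return roles or list(DEFAULT_ROLES)
```

For example, is_valid_catalog_name("client-pg-01") accepts the name used throughout this page, while "Client_PG" is rejected for its upper-case letters and underscore.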

Add a catalog (API)

curl -X POST https://cockpit.my-company.example/api/cockpit/catalog-manager/api/admin/catalogs \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "client-pg-01",
    "connector": "postgresql",
    "properties": {
      "connection-url": "jdbc:postgresql://db.customer.example:5432/sales",
      "connection-user": "trino_readonly"
    },
    "credentials": {
      "connection-password": "<redacted>"
    },
    "description": "Customer sales read-only mirror",
    "allow_roles": ["akko-admin", "akko-engineer", "akko-analyst"]
  }'

Response: 201 Created with the catalog metadata. Trino sees it immediately — run SHOW SCHEMAS FROM "client-pg-01" to verify.
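
The same call can be prepared from Python with only the standard library. This is a sketch that mirrors the curl example: the host name is the placeholder used on this page, build_create_request is a hypothetical helper, and the request is built but not sent.

```python
import json
import urllib.request

# Placeholder host from the curl example above.
BASE_URL = "https://cockpit.my-company.example/api/cockpit/catalog-manager"


def build_create_request(token: str, catalog: dict) -> urllib.request.Request:
    """Build (but do not send) the POST that registers a catalog."""
    return urllib.request.Request(
        url=f"{BASE_URL}/api/admin/catalogs",
        data=json.dumps(catalog).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


catalog = {
    "name": "client-pg-01",
    "connector": "postgresql",
    "properties": {
        "connection-url": "jdbc:postgresql://db.customer.example:5432/sales",
        "connection-user": "trino_readonly",
    },
    "credentials": {"connection-password": "<redacted>"},
    "description": "Customer sales read-only mirror",
    "allow_roles": ["akko-admin", "akko-engineer", "akko-analyst"],
}
request = build_create_request("example-token", catalog)
# To send it: urllib.request.urlopen(request), then expect status 201.
```

Keeping the password under credentials rather than properties matches the payload above, where credentials end up in a K8s Secret.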

Hive + HDFS + Kerberos (Cloudera CDP 7.1.9) — end-to-end

Fully automated since v2026.04 — no operator ever has to touch YAML, kubectl, or the Trino coordinator. Below is the complete chain the backend executes when you submit the Kerberos form.

Backend flow

  1. The cockpit form encodes the keytab as base64 and sends it along with the pasted krb5.conf under credentials.kerberos.*.
  2. akko-catalog-manager reads those fields:
     • Base64-decodes the keytab, merges it into the shared akko-catalog-keytabs Secret under the key <catalog>.keytab (read-modify-write; multiple Kerberos catalogs coexist).
     • Stores the krb5.conf content in the shared akko-catalog-krb5-conf ConfigMap.
  3. The Trino coordinator already mounts both objects at /etc/security/keytabs/ and /etc/krb5.conf (see helm/akko/values-trino.yaml). Kubernetes refreshes the mount automatically, typically within 30–60 s.
  4. The backend sleeps 45 s (configurable), then runs a Trino dry-run. If kinit fails, the whole operation rolls back.
  5. Trino catalog properties are enriched automatically:
     • hive.metastore.authentication.type = KERBEROS
     • hive.hdfs.authentication.type = KERBEROS
     • hive.hdfs.impersonation.enabled = true
     • hive.metastore.client.principal = akko-trino@$AKKO_KERBEROS_REALM
     • hive.metastore.client.keytab = /etc/security/keytabs/<catalog>.keytab
  6. OPA live rule + OpenMetadata ingestion follow the same path as non-Kerberos catalogs.
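
Two parts of the flow above lend themselves to a short sketch: the read-modify-write merge of a keytab into the shared Secret, and the automatic property enrichment. This is an illustration under assumptions, not the backend's code. The Secret is modelled as its data map (whose values Kubernetes stores base64-encoded), the Kubernetes API call itself is left out, and both function names are hypothetical.

```python
import base64


def merge_keytab(secret_data: dict[str, str], catalog: str, keytab_b64: str) -> dict[str, str]:
    """Return a copy of the Secret's data map with <catalog>.keytab added or replaced."""
    # Reject malformed uploads early; raises binascii.Error (a ValueError) on bad input.
    base64.b64decode(keytab_b64, validate=True)
    merged = dict(secret_data)  # keep the keytabs of other Kerberos catalogs
    merged[f"{catalog}.keytab"] = keytab_b64
    return merged


def kerberos_properties(catalog: str, realm: str) -> dict[str, str]:
    """Catalog properties added for a Kerberized Hive catalog, as listed above."""
    return {
        "hive.metastore.authentication.type": "KERBEROS",
        "hive.hdfs.authentication.type": "KERBEROS",
        "hive.hdfs.impersonation.enabled": "true",
        "hive.metastore.client.principal": f"akko-trino@{realm}",
        "hive.metastore.client.keytab": f"/etc/security/keytabs/{catalog}.keytab",
    }
```

Copying the existing map before adding the new key is what lets several Kerberos catalogs share one Secret without overwriting each other.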

Form fields expected from the operator

Field                | Example                                 | Notes
Metastore URI        | thrift://hms.customer.example:9083      | Reachable from the Trino pod
Service principal    | hive/_HOST@CLIENT.LOCAL                 | _HOST expands per HMS node
Keytab (file upload) | akko-trino.keytab                       | Must contain akko-trino@CLIENT.LOCAL
krb5.conf (textarea) | content of /etc/krb5.conf               | Realms + KDC hostnames
Allowed roles        | akko-admin, akko-engineer, akko-analyst | OPA auto-rule

Propagation window — 45 s

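Kubernetes can take up to 60 s to make a newly updated Secret visible inside a pod that mounts it. The backend waits 45 s by default before the dry-run, and the operator sees a single spinner for the whole wait. On clusters where propagation regularly exceeds 45 s, the dry-run can start before the keytab is visible, so raise the delay with AKKO_KEYTAB_PROPAGATION_SECONDS=60 on the catalog-manager Deployment.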
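
One way such an override could be read is sketched below. The variable name and the 45 s default come from this page. The helper name and the choice to fall back to the default on invalid, negative, or non-finite values are assumptions.

```python
import math
import os
from typing import Mapping

DEFAULT_PROPAGATION_SECONDS = 45.0  # default wait documented above


def propagation_seconds(env: Mapping[str, str] = os.environ) -> float:
    """Read AKKO_KEYTAB_PROPAGATION_SECONDS, falling back to 45 s on bad input."""
    raw = env.get("AKKO_KEYTAB_PROPAGATION_SECONDS", "")
    try:
        value = float(raw)
    except ValueError:
        return DEFAULT_PROPAGATION_SECONDS
    # Assumption: negative or non-finite values are treated as misconfiguration.
    if not math.isfinite(value) or value < 0:
        return DEFAULT_PROPAGATION_SECONDS
    return value
```

The returned value would then be passed to time.sleep() before the dry-run is attempted.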

Using the demo cluster

demos/cloudera-simulation/ ships a full Kerberized Cloudera-like stack runnable on the same Netcup host. Start it with docker compose up -d && ./seed/seed.sh, then either:

  • Upload exports/akko-trino.keytab and paste exports/krb5.conf in the cockpit form, or
  • Run AKKO_ADMIN_TOKEN=<jwt> demos/cloudera-simulation/bootstrap-akko-catalog.sh which posts the same payload to the REST API.

The end-to-end pipeline is exercised by tests/post-deploy/05-cloudera-federation-e2e.sh (PASS/FAIL with partition-pruning assertion).

List / delete a catalog

From the cockpit: the table shows every registered catalog, its health, and a Remove button. Removal deregisters the catalog from Trino and drops the K8s Secret.

From the API:

curl -H "Authorization: Bearer $TOKEN" \
  https://cockpit.my-company.example/api/cockpit/catalog-manager/api/admin/catalogs
# → JSON array of {name, connector, allow_roles, health}

curl -X DELETE -H "Authorization: Bearer $TOKEN" \
  https://cockpit.my-company.example/api/cockpit/catalog-manager/api/admin/catalogs/client-pg-01
# → 204 No Content
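
The listing response is easy to post-process. As a small sketch, the hypothetical helper below groups catalog names by connector. It relies only on the name and connector fields shown in the comment above and ignores the rest.

```python
import json
from collections import defaultdict


def names_by_connector(listing_json: str) -> dict[str, list[str]]:
    """Group catalog names by connector from the GET /api/admin/catalogs body."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for entry in json.loads(listing_json):
        grouped[entry["connector"]].append(entry["name"])
    # Sort the names so the output is stable regardless of listing order.
    return {connector: sorted(names) for connector, names in grouped.items()}


# Illustrative body; real responses also carry allow_roles and health.
sample = json.dumps([
    {"name": "client-pg-02", "connector": "postgresql"},
    {"name": "client-pg-01", "connector": "postgresql"},
    {"name": "events", "connector": "kafka"},
])
summary = names_by_connector(sample)
```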

Observability

Prometheus / VictoriaMetrics scrapes /metrics every 30 s:

  • akko_catalog_operations_total{operation="create|delete|test", result="success|failed"}
  • akko_catalog_operation_duration_seconds{operation=...}
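
To illustrate how the counter can be used, the sketch below computes a failure ratio for one operation from the Prometheus text exposition format. The metric and label names are those listed above. The helper name and the sample payload are illustrative only.

```python
import re
from typing import Optional

SAMPLE_RE = re.compile(r"^akko_catalog_operations_total\{([^}]*)\}\s+(\S+)")
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def failure_ratio(metrics_text: str, operation: str) -> Optional[float]:
    """Share of failed runs for one operation, or None if it has no samples."""
    totals = {"success": 0.0, "failed": 0.0}
    for line in metrics_text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match:
            continue  # comments, other metrics, blank lines
        labels = dict(LABEL_RE.findall(match.group(1)))
        if labels.get("operation") == operation and labels.get("result") in totals:
            totals[labels["result"]] += float(match.group(2))
    count = totals["success"] + totals["failed"]
    return totals["failed"] / count if count else None


# Illustrative scrape output, not real data.
sample_metrics = """\
# TYPE akko_catalog_operations_total counter
akko_catalog_operations_total{operation="create",result="success"} 3
akko_catalog_operations_total{operation="create",result="failed"} 1
akko_catalog_operations_total{operation="delete",result="success"} 2
"""
create_failures = failure_ratio(sample_metrics, "create")
```

In practice the same ratio would more likely be expressed as a query in Prometheus or VictoriaMetrics. The Python version only shows which labels matter.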

Audit JSON (logs layer):

{
  "audit_type": "CATALOG_MANAGER",
  "event": "CATALOG_CREATED",
  "name": "client-pg-01",
  "connector": "postgresql",
  "user": "alice",
  "allow_roles": ["akko-admin", "akko-engineer", "akko-analyst"],
  "timestamp": 1713542400.123
}
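
A record of this shape could be produced as sketched below. The field names follow the example above. The function name is hypothetical, and emitting one compact JSON object per line is an assumption about how the logs layer ingests events.

```python
import json
import time
from typing import Iterable, Optional


def audit_event(
    event: str,
    name: str,
    connector: str,
    user: str,
    allow_roles: Iterable[str],
    now: Optional[float] = None,
) -> str:
    """Serialize one catalog-manager audit record as a single JSON line."""
    record = {
        "audit_type": "CATALOG_MANAGER",
        "event": event,
        "name": name,
        "connector": connector,
        "user": user,
        "allow_roles": list(allow_roles),
        # Epoch seconds as a float, matching the example record.
        "timestamp": time.time() if now is None else now,
    }
    return json.dumps(record, separators=(",", ":"))
```

Printing the returned string to stdout is usually enough for a log collector to pick it up. The optional now argument exists only to make the function easy to test.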

Troubleshooting

Symptom | Likely cause | Fix
401 on POST | Not logged in as akko-admin | Check JWT role via cockpit user-menu
400 "connection test failed" | Wrong host / port / creds / firewall | Run the test from a Trino pod: kubectl exec ... -- nc -zv <host> <port>
Catalog appears in cockpit but not in Trino | Trino coordinator not on 480, or catalog.management=static | Bump values-trino.yaml image to 480 + add catalog.management=dynamic
OpenMetadata empty for new catalog | Ingestion endpoint unreachable | Re-trigger via POST /api/admin/catalogs/{name}/reindex (roadmap)
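
Where nc is not available, the reachability check from the second row can be approximated with Python's standard library. This is a sketch with a hypothetical helper name. Like nc -zv, it only proves that a TCP connection can be opened, and it is only meaningful when run from the same network location as the Trino pod.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False
```

A False result points at host, port, DNS or firewall problems. A True result with a still-failing connection test suggests wrong credentials or database name instead.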

Commercial consideration

Catalog Manager Pro is a first-class AKKO feature that replaces Starburst Mission Control for self-service data onboarding. All components are under permissive licenses (Trino 480 under Apache 2.0, FastAPI and pydantic under MIT, httpx under BSD 3-Clause), which makes them safe for commercial redistribution.