Banking Fraud Demo

The banking-fraud demo is AKKO's reference end-to-end use case. It exercises every layer of the platform — ingestion, lakehouse, compute, ML, governance, BI, AI — in under 10 minutes on a laptop-sized k3d cluster, 100 % offline.

What It Proves

  • Airflow → Iceberg (Polaris) → Trino federation → dbt marts
  • MLflow experiment tracking + model registry backed by MinIO
  • OPA column masking on PII (card_pan, customer_name) per Keycloak role
  • ADEN natural-language question answering on the curated layer
  • Trino AI functions (akko_ai_anomaly, akko_ai_sentiment) inside a single SQL statement
  • Superset dashboard auto-provisioned by the akko-init job
  • Tempo end-to-end tracing of the Airflow → Trino → ADEN span

Architecture

flowchart LR
    GEN[generate_transactions.py<br/>Faker + PostgreSQL seed]
    AF[Airflow DAG<br/>banking_fraud_demo]
    SPK[Spark Connect<br/>JDBC → Iceberg MERGE]
    POL[Polaris<br/>Iceberg REST catalog]
    MINIO[(MinIO<br/>akko-warehouse)]
    TRI[Trino 480<br/>+ ai_* plugin]
    DBT[dbt Core<br/>marts/fct_transactions]
    ML[MLflow<br/>Isolation Forest]
    OPA[OPA<br/>PII mask per role]
    SUP[Superset dashboard<br/>AKKO Banking Fraud]
    ADEN[ADEN<br/>NL → SQL → Streamlit]

    GEN --> AF --> SPK --> POL --> MINIO
    POL --> TRI --> DBT --> POL
    TRI --> ML
    TRI --> SUP
    TRI --> ADEN
    OPA -.policy.-> TRI

Run It

# 1. Deploy AKKO on k3d
bash helm/scripts/generate-domain-values.sh akko.local
bash helm/scripts/generate-dev-secrets.sh
bash helm/scripts/deploy.sh

# 2. Trigger the demo DAG
kubectl exec -n akko deploy/akko-airflow-scheduler -- \
  airflow dags trigger banking_fraud_demo

# 3. Watch the run
kubectl exec -n akko deploy/akko-airflow-scheduler -- \
  airflow dags list-runs -d banking_fraud_demo

# 4. Open Superset
open https://bi.akko.local/superset/dashboard/akko-banking-fraud/
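
Steps 2 and 3 can also be scripted. The following is a minimal sketch, not a file from the repo: it builds the same kubectl commands and summarises run states. It assumes that `airflow dags list-runs` is called with `-o json` and that each returned entry carries a `state` field; the helper names (`airflow_cmd`, `count_states`, `trigger_and_summarise`) are hypothetical.

```python
import json
import subprocess
from collections import Counter

NAMESPACE = "akko"
SCHEDULER = "deploy/akko-airflow-scheduler"
DAG_ID = "banking_fraud_demo"


def airflow_cmd(*args: str) -> list:
    """Build the kubectl argv that runs an Airflow CLI command in the scheduler."""
    return ["kubectl", "exec", "-n", NAMESPACE, SCHEDULER, "--", "airflow", *args]


def count_states(list_runs_json: str) -> Counter:
    """Count DAG runs per state.

    Assumes the input is a JSON array of objects with a "state" key, as
    produced by `airflow dags list-runs -o json` (an assumption, not verified
    against the AKKO image).
    """
    runs = json.loads(list_runs_json)
    return Counter(run.get("state", "unknown") for run in runs)


def run(argv: list) -> str:
    """Execute a command and return its stdout; raises if the command fails."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


def trigger_and_summarise() -> dict:
    """Trigger the demo DAG once, then return a {state: count} summary."""
    run(airflow_cmd("dags", "trigger", DAG_ID))
    output = run(airflow_cmd("dags", "list-runs", "-d", DAG_ID, "-o", "json"))
    return dict(count_states(output))
```

Calling `trigger_and_summarise()` requires a working kubectl context for the k3d cluster; the two pure helpers can be used on their own to inspect saved CLI output.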

Pipeline Steps

  1. Seed: generate_transactions.py uses Faker to insert 100 000 transactions into postgresql.akko.raw_transactions.
  2. Replicate to Iceberg: Spark Connect reads from PostgreSQL and runs MERGE INTO iceberg.banking.transactions, which is partitioned by region.
  3. dbt models: dbt run --models marts.fct_transactions marts.dim_customers rebuilds the curated layer.
  4. Score anomalies: an Isolation Forest is trained in a JupyterHub notebook, logged to MLflow with artifacts on MinIO, and registered as banking-fraud/production.
  5. Enrich with AI: a single SQL statement calls akko_ai_anomaly(amount, ...) and akko_ai_sentiment(memo) to enrich each row.
  6. Refresh dashboard: the Superset dataset is backed by the Trino view, and its cache is invalidated when the DAG succeeds.
  7. Ask ADEN: "Which regions saw a fraud rate above 5 % last week?"
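
To make the seed step more concrete, here is a rough sketch of what a synthetic-transaction generator could look like. It is not the repo's generate_transactions.py: the real script uses Faker and writes to PostgreSQL, whereas this version swaps in the standard-library `random` module and returns plain dicts so that it runs without dependencies. Column names follow the RBAC matrix on this page; `transaction_id`, the regions other than EU and APAC, the `closed` status, and all value ranges are assumptions made for illustration.

```python
import random

# EU and APAC appear in the row-filter rule; the other regions are assumed.
REGIONS = ["EU", "APAC", "NA", "LATAM"]
# "active" appears in the row-filter rule; "closed" is assumed.
STATUSES = ["active", "closed"]


def make_transaction(rng: random.Random, tx_id: int) -> dict:
    """Return one synthetic transaction as a plain dict."""
    customer_id = rng.randint(1, 5_000)
    return {
        "transaction_id": tx_id,  # hypothetical key column
        "customer_id": customer_id,
        "customer_name": f"Customer {customer_id}",
        # 16 random digits; not a Luhn-valid card number.
        "card_pan": "".join(rng.choice("0123456789") for _ in range(16)),
        # Log-normal amounts: many small payments, a few large ones.
        "amount": round(rng.lognormvariate(3.5, 1.0), 2),
        "region": rng.choice(REGIONS),
        "status": rng.choice(STATUSES),
    }


def generate_transactions(n: int, seed: int = 42) -> list:
    """Generate n transactions; the same seed always yields the same rows."""
    rng = random.Random(seed)
    return [make_transaction(rng, tx_id) for tx_id in range(1, n + 1)]
```

`generate_transactions(100_000)` would produce the volume used by the demo. Fixing the seed keeps reruns reproducible, which helps when comparing anomaly scores between runs.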

RBAC Matrix on iceberg.banking.transactions

| Column | akko-admin | akko-engineer | akko-analyst | akko-user | akko-viewer |
| --- | --- | --- | --- | --- | --- |
| customer_id | clear | clear | clear | clear | clear |
| customer_name | clear | clear | clear | ***MASKED*** | ***MASKED*** |
| card_pan | clear | clear | ***MASKED*** | ***MASKED*** | ***MASKED*** |
| amount | clear | clear | clear | clear | clear |
| region | clear | clear | clear | clear | row filter (EU/APAC) |
| fraud_score | clear | clear | clear | clear | clear |

Row filter: akko-viewer sees only rows where region is EU or APAC and status = 'active'.
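
The matrix and row filter above can be read as a small lookup. The sketch below restates them in plain Python purely for illustration; in the platform the rules are enforced by OPA, and the names `CLEAR_ROLES`, `VIEWER_REGIONS`, and `visible_rows` are hypothetical. One assumption goes beyond the matrix: a role that is not listed is treated as having no clear-text access to PII columns.

```python
MASK = "***MASKED***"

# Roles that see each PII column in clear text, transcribed from the matrix.
CLEAR_ROLES = {
    "customer_name": {"akko-admin", "akko-engineer", "akko-analyst"},
    "card_pan": {"akko-admin", "akko-engineer"},
}

# Regions that remain visible to akko-viewer.
VIEWER_REGIONS = {"EU", "APAC"}


def visible_rows(rows, role):
    """Apply the akko-viewer row filter, then mask PII columns for the role."""
    result = []
    for row in rows:
        # Row filter: only akko-viewer is restricted to active EU/APAC rows.
        if role == "akko-viewer" and not (
            row.get("region") in VIEWER_REGIONS and row.get("status") == "active"
        ):
            continue
        # Column mask: replace PII values unless the role is allowed to see them.
        masked = {
            col: MASK if col in CLEAR_ROLES and role not in CLEAR_ROLES[col] else val
            for col, val in row.items()
        }
        result.append(masked)
    return result
```

For example, an akko-analyst gets every row with card_pan masked, while an akko-viewer gets only active EU and APAC rows with both customer_name and card_pan masked.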

Files in the Repo

| File | Role |
| --- | --- |
| airflow/dags/akko_banking_fraud_demo.py | Airflow DAG orchestrating the full pipeline |
| airflow/scripts/generate_transactions.py | Faker-powered synthetic transactions |
| notebooks/akko-banking-demo.ipynb | Isolation Forest training, MLflow logging |
| dbt/models/marts/fct_transactions.sql | Fact table over the Iceberg base |
| superset/assets/bootstrap_dashboard.py | Auto-provisioning of dataset + charts + dashboard |
| keycloak/realm-akko.json | Keycloak roles + clients used by the demo |

Troubleshooting (Top 5)

| Symptom | Fix |
| --- | --- |
| DAG stuck at generate_transactions | Verify that akko-postgresql-data is Running and that the seed user alice exists |
| Spark job fails with NoClassDefFoundError: Iceberg | The custom akko-spark:2026.03 image was not pulled; run bash helm/scripts/build-images.sh |
| MERGE INTO returns 0 rows | raw_transactions is empty; re-run generate_transactions |
| MLflow run not visible | Check that the MinIO bucket mlflow-artifacts exists (it is created by minio-init) |
| ADEN answers with "no tables" | The OpenMetadata ingest has not run yet; trigger openmetadata_ingest_dag |