
Public sector — Batch ingestion with audit trail

Persona path: alice (setup) → bob (engineer) → carol (analyst) → alice (audit) · Catalogs: postgres_oltp_publicsector (96 INSEE departments) · Duration: ~30 min · Difficulty: ★★★

This demo walks through a full batch ingest cycle. Bob drops a CSV in the landing zone, an Orchestration DAG picks it up, a Transform layer normalizes it, the new table is registered in the Catalog, and Carol queries it. Alice closes the loop by inspecting the OCSF audit trail.

What this proves

  • A CSV batch ingest is a 5-step DAG: land → validate → transform → register → federate.
  • Every read and write is captured as an OCSF event in the audit trail.
  • The Catalog auto-registers the new table; Carol does not need a ticket.
  • The OCSF events are filterable per persona and per resource.
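The five-step chain in the first bullet can be sketched as an ordered list of callables that stops at the first failure. This is a plain-Python illustration under stated assumptions, not the Airflow DAG shipped in the repo: the `STEPS` list and the `run_pipeline` helper are hypothetical names introduced here.

```python
from typing import Callable

# Hypothetical step names mirroring land -> validate -> transform -> register -> federate.
STEPS = ["land", "validate", "transform", "register", "federate"]


def run_pipeline(steps: dict[str, Callable[[], bool]]) -> list[tuple[str, str]]:
    """Run the steps in the fixed order and stop at the first one that returns False."""
    results = []
    for name in STEPS:
        ok = steps[name]()
        results.append((name, "OK" if ok else "FAILED"))
        if not ok:
            break  # downstream steps are skipped, as in a linear DAG
    return results


if __name__ == "__main__":
    # Toy run in which every step succeeds.
    happy = {name: (lambda: True) for name in STEPS}
    for name, status in run_pipeline(happy):
        print(f"{name:<10} {status}")
```

A failing step leaves the later steps unexecuted, which is why a rejected batch would never reach the Catalog in this sketch.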

Pre-requisites

  • Demo URL: https://demo.akko-ai.com
  • Catalog postgres_oltp_publicsector already federated (96 INSEE departments).
  • 3 personas provisioned: alice, bob, carol.
  • Object storage bucket landing-publicsector exists (created by akko-init).

Step 1 — Alice prepares the bucket and ingest role

Sign in as alice. Navigate to Governance → Object storage. Confirm the bucket landing-publicsector is present.

Navigate to Governance → Roles. Confirm akko-engineer has WRITE on landing-publicsector/inbox/ and READ on landing-publicsector/processed/.
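The grants checked above amount to prefix-scoped permissions. A minimal sketch of how such a check could work follows; the `GRANTS` table and the `is_allowed` function are hypothetical and do not reflect the platform's actual policy engine.

```python
# Hypothetical grant table: (role, path prefix) -> allowed actions.
GRANTS = {
    ("akko-engineer", "landing-publicsector/inbox/"): {"WRITE"},
    ("akko-engineer", "landing-publicsector/processed/"): {"READ"},
}


def is_allowed(role: str, action: str, path: str) -> bool:
    """Return True if any grant for the role covers both the path prefix and the action."""
    return any(
        grant_role == role and path.startswith(prefix) and action in actions
        for (grant_role, prefix), actions in GRANTS.items()
    )


if __name__ == "__main__":
    target = "landing-publicsector/inbox/subventions_2026_q1.csv"
    print(is_allowed("akko-engineer", "WRITE", target))
    print(is_allowed("akko-engineer", "WRITE", "landing-publicsector/processed/x.csv"))
```

Under these assumptions, a write to inbox/ is accepted and a write to processed/ is refused, which matches the split between the two prefixes.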

Screenshot: tests/e2e/playwright/artefacts/demos/publicsector-ingest/01-alice-bucket.png

Step 2 — Bob uploads the CSV batch

Sign out, sign in as bob. Navigate to DevHub → Object browser.

Drop subventions_2026_q1.csv (8 412 rows, 7 columns) in landing-publicsector/inbox/.

Expected: an upload toast reading "Uploaded 8412 rows in 1.2s". The Orchestration layer detects the new file within 30 s.

landing-publicsector/
  inbox/
    subventions_2026_q1.csv      <-- new
  processed/

Screenshot: tests/e2e/playwright/artefacts/demos/publicsector-ingest/02-bob-upload.png

Step 3 — Bob triggers the ingestion DAG

Navigate to DevHub → Orchestration. Locate the DAG publicsector_csv_ingest. Click Trigger.

Expected: 5 tasks succeed in order:

sense_landing        OK  (0.8s)
validate_schema      OK  (1.2s)  --> 0 rejected rows
transform_normalize  OK  (3.4s)  --> uppercase region codes, parse dates
register_catalog     OK  (0.9s)  --> table postgres_oltp_publicsector.public.subventions_2026_q1 registered
publish_event        OK  (0.2s)  --> OCSF event sent to audit log

Total: 6.5 s.
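The validate_schema and transform_normalize tasks can be pictured with a small sketch. It is an assumption-laden illustration, not the code in the repo DAG: only region_code and amount appear in this walkthrough, so the `granted_on` column, the DD/MM/YYYY input format, and the helper names `validate`, `normalize`, and `ingest` are all hypothetical.

```python
import csv
import io
from datetime import datetime
from decimal import Decimal

# Assumed subset of columns; the real file has 7 columns whose names are not listed here.
REQUIRED = ("region_code", "amount", "granted_on")


def validate(row: dict) -> bool:
    """Accept a row only if every required field is present and non-empty."""
    return all((row.get(col) or "").strip() for col in REQUIRED)


def normalize(row: dict) -> dict:
    """Uppercase the region code, parse the amount, and parse the date (assumed DD/MM/YYYY)."""
    return {
        "region_code": row["region_code"].strip().upper(),
        "amount": Decimal(row["amount"].strip()),
        "granted_on": datetime.strptime(row["granted_on"].strip(), "%d/%m/%Y").date(),
    }


def ingest(text: str) -> tuple[list[dict], list[dict]]:
    """Split CSV rows into a normalized list and a rejected list."""
    accepted, rejected = [], []
    for row in csv.DictReader(io.StringIO(text)):
        (accepted if validate(row) else rejected).append(row)
    return [normalize(r) for r in accepted], rejected


# Made-up sample: the second row is missing its amount and should be rejected.
SAMPLE = "region_code,amount,granted_on\n2a,1500.00,14/01/2026\n75,,02/02/2026\n"

if __name__ == "__main__":
    rows, rejected = ingest(SAMPLE)
    print(rows)
    print(f"{len(rejected)} rejected rows")
```

Keeping validation separate from normalization means the rejected-row count can be reported before any value is rewritten.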

Screenshot: tests/e2e/playwright/artefacts/demos/publicsector-ingest/03-dag-run.png

Step 4 — Bob runs the transform

Navigate to DevHub → Transform. The DAG has already invoked dbt run --select +subventions_q1_aggregated, so there is nothing to launch by hand. Confirm the model is green.

Expected output:

Running with dbt=1.8.0
Found 1 model, 0 tests
1 of 1 START sql view model marts.subventions_q1_aggregated  [RUN]
1 of 1 OK   created sql view model marts.subventions_q1_aggregated  [SELECT in 0.43s]
Completed successfully

Screenshot: tests/e2e/playwright/artefacts/demos/publicsector-ingest/04-dbt-run.png

Step 5 — Carol queries the new table

Sign out, sign in as carol. Open DevHub → SQL editor.

Run:

SELECT region_code, count(*) AS n_subventions, sum(amount) AS total_eur
FROM postgres_oltp_publicsector.public.subventions_2026_q1
GROUP BY region_code
ORDER BY total_eur DESC;

Expected: 96 rows, one per department (region_code holds the department code), sorted by total_eur in descending order. Top 5:

| region_code | n_subventions | total_eur     |
| 75          | 412           | 18 472 000.00 |
| 13          | 289           | 12 105 000.00 |
| 69          | 267           | 11 820 000.00 |
| 33          | 251           | 10 940 000.00 |
| 59          | 244           | 10 612 000.00 |
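To see the shape of this aggregation without access to the demo, the same GROUP BY can be tried on an in-memory SQLite table. This is only a sketch: the four rows are made up, the table is local rather than federated, and the `aggregate` helper is a hypothetical name.

```python
import sqlite3


def aggregate(rows: list[tuple[str, float]]) -> list[tuple[str, int, float]]:
    """Load (region_code, amount) rows into SQLite and run the walkthrough's aggregation."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE subventions_2026_q1 (region_code TEXT, amount REAL)")
        conn.executemany("INSERT INTO subventions_2026_q1 VALUES (?, ?)", rows)
        return conn.execute(
            """
            SELECT region_code, count(*) AS n_subventions, sum(amount) AS total_eur
            FROM subventions_2026_q1
            GROUP BY region_code
            ORDER BY total_eur DESC
            """
        ).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Made-up amounts, chosen only to exercise grouping and ordering.
    sample = [("75", 1000.0), ("13", 400.0), ("75", 250.0), ("69", 900.0)]
    for region_code, n, total in aggregate(sample):
        print(region_code, n, total)
```

With the sample above, department 75 comes first because its two rows sum to the largest total.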

Screenshot: tests/e2e/playwright/artefacts/demos/publicsector-ingest/05-carol-query.png

Step 6 — Alice inspects the audit trail

Sign out, sign in as alice. Navigate to Governance → Audit trail.

Filter by resource = postgres_oltp_publicsector.public.subventions_2026_q1.

Expected: 5 OCSF events visible:

| ts                  | actor  | action            | resource                         | result |
| 2026-05-17 10:42:01 | bob    | object.put        | landing-publicsector/inbox/...   | ok     |
| 2026-05-17 10:42:31 | system | dag.start         | publicsector_csv_ingest          | ok     |
| 2026-05-17 10:42:38 | system | table.register    | postgres_oltp_publicsector...    | ok     |
| 2026-05-17 10:42:39 | system | transform.run     | dbt subventions_q1_aggregated    | ok     |
| 2026-05-17 10:45:14 | carol  | query.execute     | postgres_oltp_publicsector...    | ok     |

Click any event for the raw OCSF JSON.
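Filtering per persona can be sketched over the five rows shown in the table. The flat `AuditEvent` record below mirrors the table columns only; it is not the OCSF schema, and the `filter_events` helper is hypothetical. Truncated resource values are kept exactly as displayed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AuditEvent:
    """Simplified record with the table's columns (not the real OCSF field layout)."""
    ts: str
    actor: str
    action: str
    resource: str
    result: str


EVENTS = [
    AuditEvent("2026-05-17 10:42:01", "bob", "object.put", "landing-publicsector/inbox/...", "ok"),
    AuditEvent("2026-05-17 10:42:31", "system", "dag.start", "publicsector_csv_ingest", "ok"),
    AuditEvent("2026-05-17 10:42:38", "system", "table.register", "postgres_oltp_publicsector...", "ok"),
    AuditEvent("2026-05-17 10:42:39", "system", "transform.run", "dbt subventions_q1_aggregated", "ok"),
    AuditEvent("2026-05-17 10:45:14", "carol", "query.execute", "postgres_oltp_publicsector...", "ok"),
]


def filter_events(
    events: list[AuditEvent],
    actor: Optional[str] = None,
    action_prefix: Optional[str] = None,
) -> list[AuditEvent]:
    """Keep events matching the given actor and/or action prefix; None means no constraint."""
    return [
        e
        for e in events
        if (actor is None or e.actor == actor)
        and (action_prefix is None or e.action.startswith(action_prefix))
    ]


if __name__ == "__main__":
    for event in filter_events(EVENTS, actor="system"):
        print(event.ts, event.action, event.resource)
```

Filtering on actor "system" isolates the three automated steps, while an action prefix such as "query." isolates Carol's read.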

Screenshot: tests/e2e/playwright/artefacts/demos/publicsector-ingest/06-audit-trail.png

Cleanup

  • Optional: archive the batch by triggering the DAG publicsector_csv_archive, which moves the CSV to landing-publicsector/processed/.
  • Sign out.

What this proves

  • CSV batch ingest is a self-service DAG, not a ticket.
  • The Catalog stays up to date without manual registration.
  • Audit trail captures the full chain of custody, persona by persona, in OCSF.
  • Carol pulls from a federated table moments after Bob uploads.

Files in the repo

| File                                                          | Role                     |
| airflow/dags/publicsector_csv_ingest.py                       | The 5-step ingestion DAG |
| dbt/models/marts/publicsector/subventions_q1_aggregated.sql   | Curated aggregation      |
| helm/akko/charts/akko-init/templates/landing-buckets-job.yaml | Creates the bucket       |