Skip to content

Healthcare — Cohort analysis walkthrough

Persona path: carol (analyst) → eve (steward) · Catalogs: postgres_oltp_publicsector (schema healthcare) · Duration: ~25 min · Difficulty: star star

This demo shows how a clinical analyst can size a patient cohort in natural language. The data is synthetic — generated with the same INSEE-style probabilistic model as the public sector demo. No real personal health information is used.

What this proves

  • ADEN can navigate a sensitive domain when the Catalog declares it correctly.
  • Sensitive columns (patient_name, national_id) are masked by default for analysts.
  • A cohort query produces both a count and an exportable list of pseudonymous IDs.
  • The steward signs off the enrichment before any downstream usage.

Pre-requisites

  • Demo URL: https://demo.akko-ai.com
  • Catalog postgres_oltp_publicsector already federated.
  • Schema healthcare seeded with 25 000 synthetic patients (script scripts/seed-healthcare.sh).
  • 2 personas provisioned: carol, eve.

Step 1 — Carol signs in

Open https://demo.akko-ai.com, click Sign in, enter:

  • Username: carol
  • Password: read from the Secret kubectl get secret -n akko akko-demo-personas -o jsonpath='{.data.carol}' | base64 -d

Expected: Cockpit Home shows 20/20 services healthy and a analyst badge.

Screenshot: tests/e2e/playwright/artefacts/demos/healthcare-cohort/01-carol-home.png

Step 2 — Carol explores the healthcare schema in the Catalog

Click the Governance tab, then Catalog explorer. Navigate to postgres_oltp_publicsector → healthcare.

Expected tables:

healthcare.patients         (25 000 rows)
healthcare.encounters       (180 000 rows)
healthcare.diagnoses        (240 000 rows)
healthcare.prescriptions    (310 000 rows)
healthcare.lab_results      (450 000 rows)

Click patients and verify the column glossary:

  • patient_id — pseudonymous ULID
  • birth_year — integer
  • sexM or F
  • region_insee_code — 2 char department code
  • patient_namePII.Sensitive (masked for akko-analyst)
  • national_idPII.Sensitive (masked for akko-analyst)

Screenshot: tests/e2e/playwright/artefacts/demos/healthcare-cohort/02-catalog-patients.png

Step 3 — Carol opens ADEN

Click the AI tab, then ADEN — natural language to SQL.

Set:

  • Catalog: postgres_oltp_publicsector
  • Schema scope: healthcare
  • Model: qwen2.5-coder:7b

Screenshot: tests/e2e/playwright/artefacts/demos/healthcare-cohort/03-aden-scope.png

Step 4 — Carol asks the cohort question

In the prompt box, type:

find diabetic patients aged 50 to 65 with at least two comorbidities,
return cohort size and the breakdown by sex and region

Click Ask. Expected: ADEN proposes a multi-CTE SQL similar to:

WITH diabetes AS (
    SELECT DISTINCT d.patient_id
    FROM postgres_oltp_publicsector.healthcare.diagnoses d
    WHERE d.icd10_code LIKE 'E11%'
), comorbids AS (
    SELECT d.patient_id, count(DISTINCT d.icd10_code) AS n_comorbids
    FROM postgres_oltp_publicsector.healthcare.diagnoses d
    WHERE d.icd10_code NOT LIKE 'E11%'
    GROUP BY d.patient_id
    HAVING count(DISTINCT d.icd10_code) >= 2
), cohort AS (
    SELECT p.patient_id, p.sex, p.region_insee_code
    FROM postgres_oltp_publicsector.healthcare.patients p
    JOIN diabetes USING (patient_id)
    JOIN comorbids USING (patient_id)
    WHERE date_diff('year', date_from_parts(p.birth_year, 1, 1), current_date)
          BETWEEN 50 AND 65
)
SELECT sex, region_insee_code, count(*) AS n_patients
FROM cohort
GROUP BY sex, region_insee_code
ORDER BY n_patients DESC;

Expected result summary panel:

Cohort size: 1 247 patients
Sex split:   F 651 / M 596
Top regions: 13 (87) / 75 (74) / 33 (61) / 59 (58) / 69 (54)

Screenshot: tests/e2e/playwright/artefacts/demos/healthcare-cohort/04-aden-cohort.png

Step 5 — Carol exports the pseudonymous list

In the ADEN result toolbar, click Export → Pseudonymous IDs (CSV).

Expected: a CSV with patient_id (ULID) and sex only. Columns patient_name, national_id and birth_year are stripped by the Governance layer because Carol is akko-analyst.

patient_id,sex
01HZW...A1,F
01HZW...B2,M
...

Screenshot: tests/e2e/playwright/artefacts/demos/healthcare-cohort/05-export-csv.png

Step 6 — Carol promotes the cohort to a BI dashboard

Click Promote to dashboard. Expected: 5 charts auto-built:

  1. KPI — Cohort size (1 247)
  2. Pie — Sex split
  3. Bar — Top 10 regions
  4. Histogram — Age distribution
  5. Table — Breakdown by sex and region
+-------------------- AKKO Healthcare — Diabetes cohort ---------------+
| 1 247 patients | F 52% / M 48%                                       |
+----------------------------------------------------------------------+
| [ Pie sex ]  [ Bar top regions ]  [ Histogram ages 50-65 ]           |
+----------------------------------------------------------------------+

Screenshot: tests/e2e/playwright/artefacts/demos/healthcare-cohort/06-bi-cohort.png

Step 7 — Eve reviews the catalog enrichment

Sign out, sign in as Eve.

Open Governance → NORA. Expected: NORA shows pending reviews on healthcare.diagnoses.icd10_code and healthcare.patients.region_insee_code.

Approve the suggested glossary term ICD-10 Code and the suggested owner bob.engineer@akko-ai.com. Reject the suggested description on national_id if it leaks examples; rewrite as Pseudonymous national identifier - synthetic only.

Screenshot: tests/e2e/playwright/artefacts/demos/healthcare-cohort/07-nora-review.png

Cleanup

  • Sign out.
  • Optional: drop the exported CSV from your laptop after the demo.

What this proves

  • Sensitive columns stay masked end-to-end for analysts.
  • Pseudonymous IDs are the only data ever leaving the Query layer for cohort exports.
  • NORA gives the steward a clear approval queue with audit trail.
  • The BI dashboard is created in one click after ADEN converges on the SQL.

Files in the repo

File Role
scripts/seed-healthcare.sh Seeds 25 000 synthetic patients into postgres_oltp_publicsector.healthcare
dbt/models/marts/healthcare/cohort_diabetes.sql Materialized view backing the dashboard
keycloak/realm-akko.json Personas carol and eve wired to roles