Healthcare — Cohort analysis walkthrough¶
Persona path: carol (analyst) → eve (steward) · Catalogs:
postgres_oltp_publicsector(schemahealthcare) · Duration: ~25 min · Difficulty: star star
This demo shows how a clinical analyst can size a patient cohort in natural language. The data is synthetic — generated with the same INSEE-style probabilistic model as the public sector demo. No real personal health information is used.
What this proves¶
- ADEN can navigate a sensitive domain when the Catalog declares it correctly.
- Sensitive columns (
patient_name,national_id) are masked by default for analysts. - A cohort query produces both a count and an exportable list of pseudonymous IDs.
- The steward signs off the enrichment before any downstream usage.
Pre-requisites¶
- Demo URL:
https://demo.akko-ai.com - Catalog
postgres_oltp_publicsectoralready federated. - Schema
healthcareseeded with 25 000 synthetic patients (scriptscripts/seed-healthcare.sh). - 2 personas provisioned:
carol,eve.
Step 1 — Carol signs in¶
Open https://demo.akko-ai.com, click Sign in, enter:
- Username:
carol - Password: read from the Secret
kubectl get secret -n akko akko-demo-personas -o jsonpath='{.data.carol}' | base64 -d
Expected: Cockpit Home shows 20/20 services healthy and a analyst badge.
Screenshot: tests/e2e/playwright/artefacts/demos/healthcare-cohort/01-carol-home.png
Step 2 — Carol explores the healthcare schema in the Catalog¶
Click the Governance tab, then Catalog explorer. Navigate to postgres_oltp_publicsector → healthcare.
Expected tables:
healthcare.patients (25 000 rows)
healthcare.encounters (180 000 rows)
healthcare.diagnoses (240 000 rows)
healthcare.prescriptions (310 000 rows)
healthcare.lab_results (450 000 rows)
Click patients and verify the column glossary:
patient_id— pseudonymous ULIDbirth_year— integersex—MorFregion_insee_code— 2 char department codepatient_name—PII.Sensitive(masked forakko-analyst)national_id—PII.Sensitive(masked forakko-analyst)
Screenshot: tests/e2e/playwright/artefacts/demos/healthcare-cohort/02-catalog-patients.png
Step 3 — Carol opens ADEN¶
Click the AI tab, then ADEN — natural language to SQL.
Set:
- Catalog:
postgres_oltp_publicsector - Schema scope:
healthcare - Model:
qwen2.5-coder:7b
Screenshot: tests/e2e/playwright/artefacts/demos/healthcare-cohort/03-aden-scope.png
Step 4 — Carol asks the cohort question¶
In the prompt box, type:
find diabetic patients aged 50 to 65 with at least two comorbidities,
return cohort size and the breakdown by sex and region
Click Ask. Expected: ADEN proposes a multi-CTE SQL similar to:
WITH diabetes AS (
SELECT DISTINCT d.patient_id
FROM postgres_oltp_publicsector.healthcare.diagnoses d
WHERE d.icd10_code LIKE 'E11%'
), comorbids AS (
SELECT d.patient_id, count(DISTINCT d.icd10_code) AS n_comorbids
FROM postgres_oltp_publicsector.healthcare.diagnoses d
WHERE d.icd10_code NOT LIKE 'E11%'
GROUP BY d.patient_id
HAVING count(DISTINCT d.icd10_code) >= 2
), cohort AS (
SELECT p.patient_id, p.sex, p.region_insee_code
FROM postgres_oltp_publicsector.healthcare.patients p
JOIN diabetes USING (patient_id)
JOIN comorbids USING (patient_id)
WHERE date_diff('year', date_from_parts(p.birth_year, 1, 1), current_date)
BETWEEN 50 AND 65
)
SELECT sex, region_insee_code, count(*) AS n_patients
FROM cohort
GROUP BY sex, region_insee_code
ORDER BY n_patients DESC;
Expected result summary panel:
Cohort size: 1 247 patients
Sex split: F 651 / M 596
Top regions: 13 (87) / 75 (74) / 33 (61) / 59 (58) / 69 (54)
Screenshot: tests/e2e/playwright/artefacts/demos/healthcare-cohort/04-aden-cohort.png
Step 5 — Carol exports the pseudonymous list¶
In the ADEN result toolbar, click Export → Pseudonymous IDs (CSV).
Expected: a CSV with patient_id (ULID) and sex only. Columns patient_name, national_id and birth_year are stripped by the Governance layer because Carol is akko-analyst.
Screenshot: tests/e2e/playwright/artefacts/demos/healthcare-cohort/05-export-csv.png
Step 6 — Carol promotes the cohort to a BI dashboard¶
Click Promote to dashboard. Expected: 5 charts auto-built:
- KPI — Cohort size (1 247)
- Pie — Sex split
- Bar — Top 10 regions
- Histogram — Age distribution
- Table — Breakdown by sex and region
+-------------------- AKKO Healthcare — Diabetes cohort ---------------+
| 1 247 patients | F 52% / M 48% |
+----------------------------------------------------------------------+
| [ Pie sex ] [ Bar top regions ] [ Histogram ages 50-65 ] |
+----------------------------------------------------------------------+
Screenshot: tests/e2e/playwright/artefacts/demos/healthcare-cohort/06-bi-cohort.png
Step 7 — Eve reviews the catalog enrichment¶
Sign out, sign in as Eve.
Open Governance → NORA. Expected: NORA shows pending reviews on healthcare.diagnoses.icd10_code and healthcare.patients.region_insee_code.
Approve the suggested glossary term ICD-10 Code and the suggested owner bob.engineer@akko-ai.com. Reject the suggested description on national_id if it leaks examples; rewrite as Pseudonymous national identifier - synthetic only.
Screenshot: tests/e2e/playwright/artefacts/demos/healthcare-cohort/07-nora-review.png
Cleanup¶
- Sign out.
- Optional: drop the exported CSV from your laptop after the demo.
What this proves¶
- Sensitive columns stay masked end-to-end for analysts.
- Pseudonymous IDs are the only data ever leaving the Query layer for cohort exports.
- NORA gives the steward a clear approval queue with audit trail.
- The BI dashboard is created in one click after ADEN converges on the SQL.
Files in the repo¶
| File | Role |
|---|---|
scripts/seed-healthcare.sh |
Seeds 25 000 synthetic patients into postgres_oltp_publicsector.healthcare |
dbt/models/marts/healthcare/cohort_diabetes.sql |
Materialized view backing the dashboard |
keycloak/realm-akko.json |
Personas carol and eve wired to roles |