NORA: catalog AI reviews + steward governance

NORA is the steward review layer on top of the catalog-sync daemon. The daemon proposes catalog enrichments via a local LLM (a description of at most 20 words, plus PII tags). NORA queues them as pending proposals, which a steward (data owner, analyst, or compliance officer) reviews in the cockpit and then accepts as-is, edits, merges with the current value, or rejects.

Accepted proposals propagate synchronously to the Catalog (description / classification PATCH) and to the Vector store (embedding upsert in the catalog collection used by ADEN's semantic search), so downstream AI agents see the new description as soon as the accept completes rather than after the next 6-hour catalog-sync window.

Sprint 71-75: end-to-end chain shipped 2026-05-14

The full chain (cockpit drawer → cockpit-backend DB → Catalog PATCH → Vector store upsert) is live on demo.akko-ai.com and validated end-to-end by the akko-e2e-tester agent (verify_nora_full_chain.py). See ADR-060 NORA rebrand + Lego discipline.

Why a steward review layer?

Auto-enrichment alone is not enough for a team with compliance obligations. AKKO's positioning is governance-first, not "LLM writes to your catalog and you trust it". NORA captures the exact moment a human takes responsibility for a metadata change:

OCSF audit trail: every accept / reject is logged with the steward identity, the original LLM proposal, the accepted value, and the downstream write-through receipts (OM PATCH status, Vector store upsert ack).
PII gate: a PII tag proposed by the LLM never auto-applies. The steward decides which rows are sensitive, with full traceability for GDPR Art. 30 + DORA Art. 13.
Steward edit: the proposal is a starting point, not a final value. Stewards routinely combine the LLM phrasing with domain knowledge ("AML risk score [...] verified by Compliance 2026-05-14").
Rollback path: every accepted proposal is replayable from the audit table. A bad accept can be reverted without touching the warehouse.
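
The rollback path can be pictured with a small sketch. It assumes each audit row keeps the value that was in the catalog before the decision (the audit endpoint lists before / after values); the `AuditRow` shape, its field names, and `build_revert` are hypothetical and not taken from the NORA code base.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AuditRow:
    """Hypothetical shape of one audit entry; field names are illustrative only."""
    proposal_id: str
    steward: str
    asset_fqn: str
    field: str               # "description" or "tag"
    mode: str                # as_is | edited | merged | rejected
    before: Optional[str]    # catalog value before the steward decision
    llm_proposal: str        # what the LLM originally proposed
    after: Optional[str]     # value written on accept (None on reject)


def build_revert(row: AuditRow) -> dict:
    """Return the compensating catalog write that undoes an accepted proposal."""
    if row.mode == "rejected":
        # A rejection left the catalog untouched, so there is nothing to undo.
        raise ValueError("rejected proposals have no write to revert")
    # Restoring `before` only touches catalog metadata, never the warehouse.
    return {"asset_fqn": row.asset_fqn, "field": row.field, "value": row.before}
```

The returned dictionary is only a description of the write to replay; how it is sent to the Catalog and the Vector store is left out of the sketch.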

Architecture

flowchart LR
  subgraph Daemon["akko-catalog-sync"]
    ENR[Enrich<br/>local LLM<br/>≤20 words + PII]
  end
  subgraph DB["catalog_sync schema (Postgres)"]
    PROP[proposals_pending]
  end
  subgraph Cockpit["cockpit /#nora"]
    LIST[Pending list<br/>+ KPI badge]
    DRAW[Drawer<br/>Current / Proposed / Editor]
    BTN["4 buttons<br/>Reject · Accept as-is<br/>Apply edit · Merge"]
  end
  subgraph Backend["cockpit-backend"]
    API["POST /api/nora/proposals/&lcub;id&rcub;/accept"]
    OM_PATCH[OM PATCH client]
    MV_UPS[Milvus upsert client]
  end
  subgraph Sinks
    OM[(OpenMetadata)]
    MV[(Milvus<br/>collection: catalog)]
    AUDIT[(audit_log<br/>OCSF)]
  end
  ENR --> PROP
  PROP --> LIST
  LIST --> DRAW --> BTN
  BTN --> API
  API --> OM_PATCH --> OM
  API --> MV_UPS --> MV
  API --> AUDIT

The steward acceptance handler in cockpit-backend performs three writes in order: the DB update (proposals_pending.status='accepted' + decision_mode + accepted_value), the Catalog PATCH (description / classification), and the Vector store upsert (re-embed via the AI gateway (LiteLLM), then replace the vector in the catalog collection). If the Catalog PATCH or the Vector store upsert fails, the DB write remains authoritative and a reconciler can replay the downstream writes later, so the steward decision is never lost.
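
The ordering above can be sketched as follows. This is a minimal illustration, not the cockpit-backend handler: `accept_proposal`, `AcceptResult`, and the callables passed in are hypothetical, and the sketch assumes that a failed Catalog PATCH does not prevent the Vector store upsert from being attempted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AcceptResult:
    db_committed: bool = False
    sink_status: Dict[str, str] = field(default_factory=dict)
    needs_reconcile: List[str] = field(default_factory=list)


def accept_proposal(
    commit_decision: Callable[[], None],
    sinks: Dict[str, Callable[[], None]],
) -> AcceptResult:
    """Persist the steward decision first, then attempt each downstream write."""
    result = AcceptResult()
    # 1. DB write: if this raises, no downstream write is attempted.
    commit_decision()
    result.db_committed = True
    # 2 and 3. Downstream writes in insertion order (catalog, then vector store).
    for name, write in sinks.items():
        try:
            write()
            result.sink_status[name] = "ok"
        except Exception as exc:
            # The decision is already stored; flag the sink for a later replay.
            result.sink_status[name] = f"failed: {exc}"
            result.needs_reconcile.append(name)
    return result
```

A reconciler would then only need the list in `needs_reconcile` (or an equivalent column on the proposal row) to know which downstream writes to replay.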

The four decision modes

The drawer exposes four buttons mapped to the catalog-sync schema's decision_mode enum:

Button	`decision_mode`	What gets persisted
Reject	`rejected`	Proposal dropped, current OM value untouched.
Accept as-is	`as_is`	LLM proposal becomes the new OM description verbatim.
Apply edit	`edited`	Steward textarea value (free-edit) becomes the new OM description.
Merge	`merged`	Custom-merged value (current + proposal + steward) becomes the new OM description. Useful when the LLM caught a missing facet but the current description has compliance-grade phrasing.
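
As an illustration of the table, the sketch below maps each mode to the value that ends up in the catalog. The enum values come from the table; `DecisionMode`, `resolve_value`, and the rule that `edited` and `merged` require a non-empty steward value are assumptions made for the example.

```python
from enum import Enum
from typing import Optional


class DecisionMode(str, Enum):
    REJECTED = "rejected"
    AS_IS = "as_is"
    EDITED = "edited"
    MERGED = "merged"


def resolve_value(
    mode: DecisionMode,
    current: Optional[str],
    proposed: str,
    steward_value: Optional[str] = None,
) -> Optional[str]:
    """Return the description the catalog should hold after the decision."""
    if mode is DecisionMode.REJECTED:
        return current            # proposal dropped, current value untouched
    if mode is DecisionMode.AS_IS:
        return proposed           # LLM proposal taken verbatim
    # edited and merged both persist text supplied by the steward
    if not steward_value or not steward_value.strip():
        raise ValueError(f"mode {mode.value!r} requires a non-empty steward value")
    return steward_value.strip()
```

In this sketch `edited` and `merged` differ only in the recorded mode, which keeps the audit trail able to distinguish a free edit from a merge.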

Cockpit UI

The /#nora page in the cockpit shows:

Table with 8 columns: Asset FQN, Field (description / tag), Current, Proposed value, Confidence, Enricher, Proposed-at, Actions.
KPI badge in the sidebar: pending count (e.g. NORA Reviews 6).
Drawer on row click: three sections (Current OM value / Proposed LLM value / Editable textarea) + four footer buttons.
Placeholder fallback: tables with no documentation yet render (not documented yet) in italic muted gray, so the empty state is explicit rather than confusing.
Real-time pending list: after an accept, the row disappears from the pending filter and the badge decrements.

API surface

Backend endpoints (cockpit-backend, FastAPI):

GET /api/nora/proposals?status=pending: lists pending reviews, scoped to the steward's allowed namespaces via the Policy engine (OPA) (data owners only see their domain, akko-admin sees all).
POST /api/nora/proposals/{id}/accept: the body is {mode, value?, note?} where mode ∈ {as_is, edited, merged, rejected}. Returns the OM PATCH status, the Vector store upsert ack, and the audit row id.
GET /api/nora/audit: paginated audit log (steward, mode, before / after values, OM PATCH status, Vector store upsert status, timestamp).
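
A client call to the accept endpoint might be prepared as in the sketch below, using only the standard library. The path and the body fields (`mode`, `value`, `note`) come from the list above; `build_accept_request`, the base URL, and the proposal id format are hypothetical, and authentication is left out because this page does not describe it.

```python
import json
from typing import Optional
from urllib.request import Request

ALLOWED_MODES = {"as_is", "edited", "merged", "rejected"}


def build_accept_request(
    base_url: str,
    proposal_id: str,
    mode: str,
    value: Optional[str] = None,
    note: Optional[str] = None,
) -> Request:
    """Build (but do not send) the POST request for a steward decision."""
    if mode not in ALLOWED_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    body = {"mode": mode}
    # value and note are optional in the documented body, so omit them when unset.
    if value is not None:
        body["value"] = value
    if note is not None:
        body["note"] = note
    url = f"{base_url.rstrip('/')}/api/nora/proposals/{proposal_id}/accept"
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the response is expected to carry the OM PATCH status, the Vector store upsert ack, and the audit row id, but their exact field names are not given here.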

Wiring (chart values)

# helm/akko/charts/akko-cockpit-backend/values.yaml
nora:
  dsnSecretRef:
    name: "akko-catalog-sync-pg"   # reuse catalog-sync's DSN
    key:  "dsn"
  schema: "catalog_sync"

openmetadata:
  baseUrl: "http://openmetadata:8585/api/v1"
  servicePrefix: "trino-akko"      # 3-segment FQN → 4-segment OM
  oidc:
    tokenUrl: "http://akko-akko-keycloak:8080/realms/akko/protocol/openid-connect/token"
    existingSecret: "akko-openmetadata-oidc"

milvus:
  uri: "http://akko-akko-milvus:19530"
  collection: "catalog"

litellm:
  baseUrl: "http://akko-akko-litellm:4000"
  embedModel: "akko-embed"
  apiKeyValue: "akko-dev-litellm-key"   # or apiKey.secretName for prod

The chart enforces imagePullPolicy: Always on the cockpit (and cockpit-backend when its image is rebuilt mid-sprint) to avoid stale layer regressions documented in gotcha_image_pull_policy_always_dev_demo.
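
The servicePrefix comment in the values above ("3-segment FQN → 4-segment OM") can be read as prepending the OpenMetadata service name to the warehouse FQN. The sketch below shows that reading; `to_om_fqn` is a hypothetical helper, the catalog.schema.table segment order is an assumption, and names that themselves contain dots are not handled.

```python
def to_om_fqn(fqn: str, service_prefix: str = "trino-akko") -> str:
    """Turn a 3-segment FQN into a 4-segment OpenMetadata FQN."""
    parts = fqn.split(".")
    # Assumed layout: catalog.schema.table, with no empty segment.
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected a 3-segment FQN, got {fqn!r}")
    return f"{service_prefix}.{fqn}"
```

For example, `to_om_fqn("iceberg.finance.transactions")` yields `trino-akko.iceberg.finance.transactions` under these assumptions.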

Known limits (Sprint 71-75 ship state)

Accept latency: the current SLO is 5 s, but cold runs measure 5–8 s because the OM PATCH and the Vector store upsert run synchronously before the toast. The planned fix is asynchronous propagation (toast < 500 ms, downstream writes in the background); the alternative is raising the SLO to 10 s. See task #347 follow-up.
Reseed dependency: NORA only surfaces proposals that catalog-sync has actually emitted. On a fresh cluster (or after a warehouse wipe), run the catalog-sync DAG before opening /#nora.
Cross-tenant visibility: the Policy engine gate is per-namespace; a steward sees only proposals whose FQN is in their allowed list. Multi-tenant scoping for proposals_pending itself (e.g. customer multi-realm) is a Phase 2 task.
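
The per-namespace gate can be illustrated with a plain Python stand-in. The real decision is made by the Policy engine (OPA), not by application code like this; `in_namespace`, `visible_proposals`, the `asset_fqn` key, and the prefix-on-a-dot-boundary matching rule are all assumptions made for the sketch.

```python
from typing import Iterable, List


def in_namespace(fqn: str, namespace: str) -> bool:
    """True when fqn equals the namespace or sits below it on a dot boundary."""
    return fqn == namespace or fqn.startswith(namespace + ".")


def visible_proposals(proposals: Iterable[dict], allowed: Iterable[str]) -> List[dict]:
    """Keep only the proposals whose asset FQN falls in an allowed namespace."""
    allowed = list(allowed)  # allow a generator to be reused for every proposal
    return [
        p for p in proposals
        if any(in_namespace(p["asset_fqn"], ns) for ns in allowed)
    ]
```

Matching on a dot boundary keeps a steward scoped to `iceberg.finance` from seeing `iceberg.finance_archive`, which a naive string prefix check would let through.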

akko-catalog-sync — the daemon producing NORA proposals.
ADEN — the natural-language → SQL agent that consumes the Vector store catalog collection.
OpenMetadata — the canonical catalog NORA writes to.
ADR-060 — naming + Lego discipline behind NORA.
Runbook: akko-technical-map/runbooks/nora-om-milvus-aden-end-to-end.md.