LLM RBAC: Role-Based Access Control for AI models

AKKO maps its platform roles (akko-admin, akko-engineer, akko-analyst, akko-user, akko-viewer) to LiteLLM model access groups and quotas. The resulting AI access matrix is managed in one place and exposed in the React cockpit under LLM roles (route /x/llm-roles, feature llm-governance).

Default matrix

Role	Chat (qwen2.5:3b)	Embed (nomic)	Code (qwen-coder)	GPU (vLLM)	Daily quota	Rate limit
`akko-admin`	✅	✅	✅	✅	unlimited	unlimited
`akko-engineer`	✅	✅	✅	❌	50 k tokens	120 req/min
`akko-analyst`	✅	✅	❌	❌	20 k tokens	60 req/min
`akko-user`	✅	❌	❌	❌	5 k tokens	30 req/min
`akko-viewer`	❌	❌	❌	❌	none	—

Source of truth: global.llmRbac in helm/akko/values.yaml (tiers + models), rendered into the ConfigMap akko-llm-rbac in namespace akko and mounted by the cockpit backend.
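
To make the table concrete, here is a minimal sketch of the default matrix as a Python structure with an access check. The capability labels (chat, embed, code, gpu), the Tier class, and the is_allowed helper are hypothetical names introduced for illustration; they are not the schema of global.llmRbac. The sketch also assumes that "50 k tokens" means 50,000 and models the akko-viewer row as a zero quota.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    models: frozenset            # capability columns the role may call
    daily_tokens: Optional[int]  # None means unlimited
    req_per_min: Optional[int]   # None means unlimited


# Values copied from the default matrix table; labels are illustrative only.
DEFAULT_MATRIX = {
    "akko-admin": Tier(frozenset({"chat", "embed", "code", "gpu"}), None, None),
    "akko-engineer": Tier(frozenset({"chat", "embed", "code"}), 50_000, 120),
    "akko-analyst": Tier(frozenset({"chat", "embed"}), 20_000, 60),
    "akko-user": Tier(frozenset({"chat"}), 5_000, 30),
    "akko-viewer": Tier(frozenset(), 0, 0),  # assumption: "none" modelled as 0
}


def is_allowed(role, capability, tokens_used_today=0):
    """Return True if the role may call the capability and has quota left."""
    tier = DEFAULT_MATRIX.get(role)
    if tier is None or capability not in tier.models:
        return False
    return tier.daily_tokens is None or tokens_used_today < tier.daily_tokens
```

In the platform itself the check is enforced by the LiteLLM gateway; a local function like this is only useful for reasoning about the matrix or for tests.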

Architecture

React cockpit (LLM roles page /x/llm-roles)
  │  fetch, credentials: include
  ▼
Cockpit backend (routes_llm_governance.py)
  GET    /api/governance/llm-access/models    → models + backend + served flag
  GET    /api/governance/llm-access/policies   → groups × models matrix
  POST   /api/governance/llm-access/policies    → create a group policy
  PUT    /api/governance/llm-access/policies/{group} → update
  DELETE /api/governance/llm-access/policies/{group} → remove
  POST   /api/governance/llm-access/mint-key    → scoped virtual key
  │
  ▼
LiteLLM gateway — enforces access_groups per model at call time

The backend reads the baseline matrix from the mounted akko-llm-rbac ConfigMap (rendered from global.llmRbac). Every read/write is SSO-gated through the platform ForwardAuth and audited.
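
The endpoints above can be exercised from a script. The sketch below only builds the HTTP requests with the standard library and does not send them. The host name and the JSON payload shape are assumptions, and because every call is SSO-gated, a real client would also need to attach a valid session, which is not shown.

```python
import json
import urllib.request

BASE_URL = "https://cockpit.example.internal"  # hypothetical host
PREFIX = "/api/governance/llm-access"          # path prefix from the diagram


def build_request(method, path, payload=None):
    """Build (but do not send) a request to the governance API."""
    data = None if payload is None else json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(BASE_URL + PREFIX + path, data=data, method=method)
    req.add_header("Accept", "application/json")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


# Read the groups x models matrix.
list_policies = build_request("GET", "/policies")

# Update one group; the payload fields here are illustrative, not the real schema.
update_analyst = build_request(
    "PUT", "/policies/akko-analyst", {"models": ["akko-chat"]}
)
```

Passing either object to urllib.request.urlopen would perform the call, subject to the ForwardAuth check.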

File layout

File	Role
`helm/akko/values.yaml` (`global.llmRbac`)	Source of truth (tiers + models)
`helm/akko/charts/akko-cockpit-backend/templates/deployment.yaml`	Mounts ConfigMap `akko-llm-rbac` at `/etc/akko/llm-rbac/rbac.json`
`branding/cockpit-react/src/features/llm-governance/`	Matrix UI (page + i18n + gateway)
`branding/cockpit-react/src/features/llm-governance/data/httpGateway.ts`	Calls `/api/governance/llm-access/*`

Using the matrix from code

From a notebook (jupyter-alice)

import os

import requests

# LiteLLM applies access_groups based on the user behind the bearer token.
resp = requests.post(
    "http://akko-akko-litellm:4000/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['LITELLM_USER_KEY']}"},
    json={"model": "akko-chat", "messages": [{"role": "user", "content": "hi"}]},
    timeout=30,
)
resp.raise_for_status()  # raises if the gateway rejects the call
print(resp.json()["choices"][0]["message"]["content"])

If the user's role doesn't match the model's access_groups, LiteLLM returns 403 Forbidden with a message indicating the missing group.
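
A notebook can turn that denial into a readable message instead of a stack trace. The helper below is a sketch: the function name is hypothetical, and the 429 branch assumes that quota or rate-limit rejections use the conventional HTTP 429 status, which this page does not state.

```python
def describe_llm_response(status_code, body_text):
    """Map a gateway HTTP status to a short, user-facing message."""
    if status_code == 200:
        return "ok"
    if status_code == 403:
        # The gateway's message names the missing access group.
        return f"access denied by LLM RBAC: {body_text.strip()}"
    if status_code == 429:
        # Assumption: quota and rate-limit rejections arrive as 429.
        return "rate limit or quota reached; retry later"
    return f"unexpected status {status_code}"
```

With the requests call above, this would be used as describe_llm_response(resp.status_code, resp.text) before calling resp.json().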

From Trino (via `ai_*()` functions)

-- The ai_service middleware passes the Trino user's Keycloak role to LiteLLM.
-- A user with role akko-viewer gets an empty result (no model access).
SELECT akko_ai_sentiment(comment) FROM reviews LIMIT 10;

From Claude Desktop (via MCP)

The MCP Trino server propagates the user's role in an HTTP header, and the LiteLLM gateway enforces it in the same way as for the other clients.

Modifying the matrix

Via the cockpit (admin)

1. Sign in as alice_admin (akko-admin).
2. Open LLM roles (route /x/llm-roles).
3. The full matrix renders: models × groups, with quotas and rate limits.
4. Create, update, or remove a group policy inline. Changes are written through the backend (POST/PUT/DELETE /api/governance/llm-access/policies) and applied to the LiteLLM gateway.

Via GitOps (baseline defaults)

Edit global.llmRbac in helm/akko/values.yaml (or an overlay) → commit → push → deploy. This sets the baseline matrix that ships with the platform; per-group edits made in the cockpit are layered on top and audited.
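
The layering of cockpit edits over the GitOps baseline can be pictured as a per-group overlay. This is a sketch under assumptions: the function name, the dictionary shape, and the field-by-field merge rule are illustrative, and the backend's actual merge logic is not documented here.

```python
import copy


def effective_matrix(baseline, overrides):
    """Overlay per-group cockpit edits on the baseline without mutating it."""
    merged = copy.deepcopy(baseline)
    for group, policy in overrides.items():
        # Fields set in the cockpit win; untouched fields keep baseline values.
        merged[group] = {**merged.get(group, {}), **policy}
    return merged


baseline = {"akko-user": {"models": ["akko-chat"], "daily_tokens": 5000}}
overrides = {"akko-user": {"daily_tokens": 1000}}
result = effective_matrix(baseline, overrides)
```

Here result keeps the baseline model list for akko-user but carries the lowered quota, while baseline itself is unchanged.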

Adding a new model

1. Edit the LiteLLM model list (akko-litellm.config.model_list in your values overlay).
2. Add the model to the models section of global.llmRbac with its default access_groups.
3. Commit and push, then deploy.
4. LiteLLM picks up the new model automatically, and the matrix shows the new column.
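
Because a model has to be declared in two places, a small pre-commit check can catch one being forgotten. The sketch below assumes the LiteLLM model_list entries carry a model_name key (as in LiteLLM's config format) and that the models section of global.llmRbac can be read as a mapping keyed by model name; that second shape and the function name are assumptions.

```python
def check_models(model_list, rbac_models):
    """Report models declared in only one of the two places."""
    served = {entry["model_name"] for entry in model_list}
    governed = set(rbac_models)
    return {
        "missing_from_rbac": sorted(served - governed),
        "missing_from_litellm": sorted(governed - served),
    }


# Illustrative data only; "akko-new" is a made-up model name.
report = check_models(
    [{"model_name": "akko-chat"}, {"model_name": "akko-new"}],
    {"akko-chat": {"access_groups": ["akko-user"]}},
)
```

An empty list under both keys means the two declarations agree.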

Auditing LLM usage

Every LLM call is logged to the logs layer by the AI Service middleware and can be queried as follows:

{service="ai-service"} |= "llm_call" | json | line_format "{{.user}} {{.model}} {{.status}} tokens={{.tokens}}"

The LLM Usage by Role dashboard (under Dashboards, planned for Sprint 15.7) shows:
- Tokens/hour per role
- Top 10 users by volume
- Error rate per model
- Cost estimate (when commercial models are added)
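
Until the dashboard exists, the "top users by volume" view can be approximated from exported log lines. This sketch assumes each line is a JSON object with the user and tokens fields referenced by the query above; the function name and the sample lines are illustrative.

```python
import json
from collections import Counter


def top_users_by_tokens(log_lines, limit=10):
    """Sum tokens per user across JSON log lines and return the top entries."""
    totals = Counter()
    for line in log_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(event, dict) or "user" not in event:
            continue  # skip events without a user field
        totals[event["user"]] += int(event.get("tokens", 0))
    return totals.most_common(limit)


# Made-up sample lines for illustration.
sample = [
    '{"user": "alice", "model": "akko-chat", "status": 200, "tokens": 10}',
    '{"user": "bob", "model": "akko-chat", "status": 200, "tokens": 5}',
    '{"user": "alice", "model": "akko-chat", "status": 200, "tokens": 7}',
    "not json",
]
ranking = top_users_by_tokens(sample)
```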

Emergency lockout

To immediately deny all LLM access for a group, open LLM roles (/x/llm-roles) and set that group's model policy to none, or DELETE its policy via the backend. The change reaches the LiteLLM gateway without touching Keycloak or redeploying services. Persist it afterwards in global.llmRbac so the lockout survives a redeploy.
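
For a scripted lockout, the DELETE call could be prepared as in the sketch below. The host name and function name are hypothetical, the request is built but not sent, and a real call would also need an authenticated admin session because the endpoint is SSO-gated.

```python
import urllib.parse
import urllib.request

COCKPIT_URL = "https://cockpit.example.internal"  # hypothetical host


def lockout_request(group):
    """Build (but do not send) the DELETE request that removes a group's policy."""
    safe_group = urllib.parse.quote(group, safe="")  # escape unsafe characters
    url = f"{COCKPIT_URL}/api/governance/llm-access/policies/{safe_group}"
    return urllib.request.Request(url, method="DELETE")


req = lockout_request("akko-user")
```

Sending req with urllib.request.urlopen would apply the lockout; persisting it in global.llmRbac remains a separate GitOps step.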