# Choosing your LLM
AKKO does not pick an LLM for you. The platform runs any model Ollama can serve, and LiteLLM exposes them via OpenAI-compatible aliases. This page documents the current pre-pulled candidates and when to pick which.
## Candidates bundled on Netcup / dev
| Alias in LiteLLM | Underlying model | Origin | Size | Strength |
|---|---|---|---|---|
| `akko-chat` | `qwen2.5-coder:7b` (default) | Alibaba / Qwen | 4.7 GB | Best SQL generation on the current ADEN eval set. |
| `akko-chat-eu` | `mistral:7b` | Mistral AI (France) | 4.1 GB | European-sovereign LLM; strong reasoning, weaker SQL (TBD). |
| `akko-embed` | `nomic-embed-text` | Nomic (US) | 274 MB | 768-dim embeddings for `akko_ai_embed` / `akko_ai_search`. |
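Because LiteLLM exposes these aliases through an OpenAI-compatible API, any standard OpenAI client can target them. A minimal sketch of a chat-completions request body (the in-cluster service URL is illustrative; the payload shape is the standard OpenAI one, and only the alias name comes from this page):

```python
import json

# LiteLLM serves an OpenAI-compatible /v1/chat/completions endpoint.
# The URL below is illustrative; substitute your deployment's service name.
LITELLM_URL = "http://akko-akko-litellm:4000/v1/chat/completions"

payload = {
    "model": "akko-chat",  # the LiteLLM alias, not the raw Ollama model tag
    "messages": [
        {"role": "user", "content": "Top 5 merchants by flagged transactions?"}
    ],
}

body = json.dumps(payload)
# POST `body` to LITELLM_URL with any HTTP client, e.g.:
#   requests.post(LITELLM_URL, data=body,
#                 headers={"Content-Type": "application/json"})
```

Clients stay unchanged when the alias is re-pointed at a different underlying model; only the LiteLLM config moves.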
Add or remove candidates by editing `akko-ollama.models` in `helm/examples/values-netcup.yaml` (or your own values file). The list is also mirrored in the LiteLLM alias config (`helm/akko/charts/akko-litellm/values.yaml`).
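A minimal sketch of that values override, assuming `akko-ollama.models` is a plain list of Ollama model tags (the exact schema is defined by the chart):

```yaml
akko-ollama:
  models:
    - qwen2.5-coder:7b
    - mistral:7b
    - nomic-embed-text
```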
## When to pick which
European sovereignty buyers
Regulated EU organisations — banks, public sector, defence, health — routinely
reject Chinese-origin models as the default LLM in a sovereign deployment. If that
applies to your buyer, switch `akko-chat` to Mistral, even if the SQL numbers drop
slightly. The operational risk is a higher retry rate in ADEN, not a loss of
functionality.
SQL-heavy workloads
Qwen 2.5 Coder still leads on the AKKO ADEN eval set (SQL generation against
Iceberg + Trino). If your primary use case is natural-language-to-SQL and you
do not have the sovereignty constraint, keep Qwen as the akko-chat target.
General reasoning / RAG answers
Mistral 7B Instruct v0.3 is slightly stronger on open-ended answers (summarisation,
Q&A, entity extraction). For the akko_ai_ask / akko_ai_summarize / akko_ai_entities family,
point akko-chat at Mistral.
## How to switch the default
The default alias used by ADEN and the `ai_*` Trino functions is `akko-chat`. Switching
it is a values override; no code change is required:
```yaml
akko-litellm:
  config:
    model_list:
      - model_name: akko-chat
        litellm_params:
          model: "ollama/mistral:7b"  # <- was ollama/qwen2.5-coder:7b
          api_base: "http://akko-akko-ollama:11434"
```
Re-run `helm upgrade akko ...` (the standard Netcup cycle). No restart of the ADEN
pods or the Trino coordinator is required; LiteLLM hot-reloads the alias table.
## Benchmark methodology (work in progress)
The AKKO ADEN eval set (sprint 41) contains 60 NL → SQL prompts mirroring the three reference demos (banking fraud, healthcare cohorts, retail attribution). We score:
- SQL syntactic validity against Trino 480 (coordinator-parsed, zero execution).
- Semantic equivalence against a hand-written reference SQL.
- Row-level result match when executed on the seeded Iceberg catalog.
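The three checks above can be sketched as a per-prompt scoring pass. Everything here is hypothetical illustration: the real eval harness, its prompt format, and any weighting are sprint-41 internals not shown on this page.

```python
from dataclasses import dataclass

@dataclass
class EvalResult:
    """Outcome of one NL -> SQL prompt (hypothetical structure)."""
    parses: bool       # Trino coordinator accepted the SQL (no execution)
    equivalent: bool   # semantically matches the hand-written reference SQL
    rows_match: bool   # executed rows equal the reference rows

def score(results: list[EvalResult]) -> dict[str, float]:
    """Fraction of prompts passing each of the three checks."""
    n = len(results)
    return {
        "syntactic_validity": sum(r.parses for r in results) / n,
        "semantic_equivalence": sum(r.equivalent for r in results) / n,
        "result_match": sum(r.rows_match for r in results) / n,
    }
```

Note the checks are deliberately ordered from cheapest to most expensive: a query that does not parse is never executed against the seeded catalog.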
Results for Qwen 2.5 Coder 7B vs Mistral 7B v0.3 vs Llama 3.1 8B will be published as
the sprint 42 output. Until then, Qwen remains the default alias and Mistral is
available as an opt-in via akko-chat-eu.
## Internet-free / air-gapped deployments
All three candidates ship as GGUF layers inside the akko-ollama init job. On a fully
offline install, pull the image with the models baked in (`akko-ollama:2026.04-eu`
if you need the Mistral bundle) and set `akko-ollama.init.enabled: false`. Post-boot
operations have zero internet dependency.
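Put together, an offline install's override might look like this sketch. The `init.enabled` key is as named above; the `image.tag` nesting is an assumed chart layout:

```yaml
akko-ollama:
  image:
    tag: "2026.04-eu"  # image variant with the Mistral GGUF layers baked in
  init:
    enabled: false     # skip the model-pull init job; models ship in the image
```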