
Choosing your LLM

AKKO does not pick an LLM for you. The platform runs any model Ollama can serve, and LiteLLM exposes them through OpenAI-compatible aliases. This page documents the currently pre-pulled candidates and explains when to pick each.
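Because every alias is OpenAI-compatible, any OpenAI-style client can talk to it. The sketch below builds such a request with the standard library only; the LiteLLM service name and port (`http://akko-akko-litellm:4000`) are assumptions for illustration, and your deployment may additionally require an `Authorization` header:

```python
import json
import urllib.request

# Assumed in-cluster address of the LiteLLM proxy -- substitute your own
# Service name or a port-forwarded localhost address.
LITELLM_BASE = "http://akko-akko-litellm:4000"

def build_chat_request(alias: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible /v1/chat/completions request for an alias."""
    payload = {
        "model": alias,  # e.g. "akko-chat"; LiteLLM maps it to the Ollama model
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{LITELLM_BASE}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("akko-chat", "List all tables in the sales schema.")
```

Send it with `urllib.request.urlopen(req)` (or any HTTP client) once the proxy is reachable.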

Candidates bundled on Netcup / dev

| Alias in LiteLLM | Underlying model | Origin | Size | Strength |
| --- | --- | --- | --- | --- |
| akko-chat | qwen2.5-coder:7b (default) | Alibaba / Qwen | 4.7 GB | Best SQL generation on the current ADEN eval set. |
| akko-chat-eu | mistral:7b | Mistral AI (France) | 4.1 GB | European-sovereign LLM; strong reasoning, weaker SQL (TBD). |
| akko-embed | nomic-embed-text | Nomic (US) | 274 MB | 768-dim embeddings for akko_ai_embed / akko_ai_search. |
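The 768-dimensional vectors produced by akko-embed are typically compared by cosine similarity when ranking search results. A minimal, dependency-free sketch (toy 3-dim vectors stand in for real nomic-embed-text output):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embedding dimensions differ")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Identical directions score 1.0; orthogonal directions score 0.0.
print(cosine_similarity([1.0, 0.0, 0.0], [1.0, 0.0, 0.0]))  # 1.0
```

Whether akko_ai_search uses cosine or another distance internally is not documented here; this only illustrates the common convention for 768-dim text embeddings.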

Add or remove candidates by editing akko-ollama.models in helm/examples/values-netcup.yaml (or your own values file). The list is also reflected in the LiteLLM alias config (helm/akko/charts/akko-litellm/values.yaml).
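A hedged sketch of what the candidate list in helm/examples/values-netcup.yaml might look like; the exact key schema follows the akko-ollama chart, so treat this as illustrative only:

```yaml
akko-ollama:
  models:                 # models the init job pulls at install time
    - qwen2.5-coder:7b
    - mistral:7b
    - nomic-embed-text
```

Keep the LiteLLM alias config in sync whenever you change this list.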

When to pick which

European sovereignty buyers

Regulated EU organisations (banks, public sector, defence, health) routinely reject Chinese-origin models as the default LLM in a sovereign deployment. If that applies to your buyer, switch akko-chat to Mistral, even if the SQL numbers drop slightly. The operational risk is a higher retry rate in ADEN, not a loss of functionality.

SQL-heavy workloads

Qwen 2.5 Coder still leads on the AKKO ADEN eval set (SQL generation against Iceberg + Trino). If your primary use case is natural-language-to-SQL and you do not have the sovereignty constraint, keep Qwen as the akko-chat target.

General reasoning / RAG answers

Mistral 7B Instruct v0.3 is slightly stronger on open-ended answers (summarisation, Q&A, entity extraction). For the akko_ai_ask / akko_ai_summarize / akko_ai_entities family, point akko-chat at Mistral.

How to switch the default

The default alias used by ADEN and the ai_* Trino functions is akko-chat. Switching is a Helm values override; no code change is required:

akko-litellm:
  config:
    model_list:
      - model_name: akko-chat
        litellm_params:
          model: "ollama/mistral:7b"   # <- was ollama/qwen2.5-coder:7b
          api_base: "http://akko-akko-ollama:11434"

Re-run helm upgrade akko ... (the standard Netcup cycle). No pod restart of ADEN or the Trino coordinator is required — LiteLLM hot-reloads the alias table.

Benchmark methodology (work in progress)

The AKKO ADEN eval set (sprint 41) contains 60 NL → SQL prompts mirroring the three reference demos (banking fraud, healthcare cohorts, retail attribution). We score:

  • SQL syntactic validity against Trino 480 (parsed by the coordinator, never executed).
  • Semantic equivalence against a hand-written reference query.
  • Row-level result match when executed against the seeded Iceberg catalog.
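The real harness parses candidates with the Trino coordinator; as a simplified stand-in, the sketch below scores a batch of (generated, reference) pairs by naive string normalization. All names here are hypothetical and the normalization deliberately crude (it misses genuine semantic equivalence such as reordered predicates):

```python
import re

def normalize_sql(sql: str) -> str:
    """Naive normalization: drop a trailing ';', collapse whitespace, lowercase."""
    sql = sql.strip().rstrip(";")
    return re.sub(r"\s+", " ", sql).strip().lower()

def score(cases: list[tuple[str, str]]) -> float:
    """Fraction of (generated, reference) pairs that match after normalization."""
    hits = sum(normalize_sql(g) == normalize_sql(r) for g, r in cases)
    return hits / len(cases)

cases = [
    ("SELECT id FROM fraud.alerts;", "select id\nfrom fraud.alerts"),
    ("SELECT * FROM cohorts", "SELECT patient_id FROM cohorts"),
]
print(score(cases))  # 0.5
```

A production scorer would replace `normalize_sql` with a real parser-based comparison and add the execution-based row match described above.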

Results for Qwen 2.5 Coder 7B vs Mistral 7B v0.3 vs Llama 3.1 8B will be published as the sprint 42 output. Until then, Qwen remains the default alias and Mistral is available as an opt-in via akko-chat-eu.

Internet-free / air-gapped deployments

All three candidates ship as GGUF layers inside the akko-ollama init job. On a fully offline install, pull the image with the models baked in (akko-ollama:2026.04-eu if you need the Mistral bundle) and set akko-ollama.init.enabled: false. After boot, operations have no internet dependency.
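A values override for the air-gapped path might look like the following; the init.enabled key comes from the text above, while the image key path is an assumption about the chart's schema:

```yaml
akko-ollama:
  image:
    tag: "2026.04-eu"   # bundle that includes the Mistral GGUF layers
  init:
    enabled: false      # models are baked into the image; skip the pull job
```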