akko-rag: RAG document intelligence

AKKO's retrieval service. Clients upload documents into collections; akko-rag extracts the text, splits it into chunks, computes 768-dimensional embeddings through the sovereign LiteLLM gateway, and stores everything in pgvector on akko-postgresql-data. Callers (ADEN, cockpit, agent, notebook) then ask natural-language questions against a collection and receive ranked chunks with full provenance (document ID, filename, page).

Phase 0: tier 1 only

The service is designed to grow in phases behind the same contract. Phase 0 (shipped): pgvector, for collections of up to roughly 1,000 documents. Phases 2 and 3: hybrid OpenSearch, then Spark/Iceberg for big-data volumes, without breaking any caller. Full plan: rag-document-intelligence.

Architecture

flowchart LR
    subgraph Clients
        CK[Cockpit /#rag]
        ADEN[ADEN]
        NB[Notebook]
        AG[Agent]
    end
    subgraph RAG["akko-rag (FastAPI)"]
        API[/collections<br/>/documents<br/>/query/]
        EX[Extractor<br/>pypdf / txt / md]
        CHK[Chunker<br/>word window]
        EMB[Embedder<br/>retry 3x]
    end
    subgraph Backends
        PG[(akko-postgresql-data<br/>schema akko_rag<br/>pgvector HNSW + GIN)]
        LL[akko-litellm<br/>embed + chat]
    end
    Clients --> API
    API --> EX --> CHK --> EMB --> LL
    EMB --> PG
    API <--> PG
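The Chunker box in the diagram is described only as a word window. A minimal sketch of that technique follows, assuming whitespace-separated words and the documented defaults of 400 words per window with 60 words of overlap; the function name `chunk_words` is hypothetical and the real chunker may differ.

```python
from typing import List


def chunk_words(text: str, size: int = 400, overlap: int = 60) -> List[str]:
    """Split text into overlapping windows of whitespace-separated words.

    Hypothetical helper: splitting on whitespace is an assumption, not
    necessarily how akko-rag counts words.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # each window starts `step` words after the previous one
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reaches the end of the text
    return chunks
```

With the defaults, consecutive windows start 340 words apart, so the last 60 words of one chunk reappear at the start of the next.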

API (Phase 0)

Method	Path	Purpose
GET	`/health`	liveness
GET	`/ready`	database reachability
GET	`/metrics`	Prometheus counters + histogram
POST	`/collections`	create a collection with `allowed_roles`
GET	`/collections`	list collections (with document and chunk counts)
POST	`/collections/{slug}/documents`	upload → extract → chunk → embed → store
GET	`/collections/{slug}/documents`	list documents
POST	`/collections/{slug}/query`	top-k search by cosine similarity
GET	`/audit/queries?limit=N`	recent retrievals (audit trail)

Identity: the service trusts the X-Trino-User header (falling back to X-User-Id), the same convention as ADEN and the Trino AI plugin. Phase 1 replaces this with Keycloak JWT verification.
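The header fallback can be sketched as below. This is an illustration only: the function name `resolve_user` is hypothetical, and returning `None` when neither header is present is an assumption, since the page does not say how the service handles anonymous callers.

```python
from typing import Mapping, Optional


def resolve_user(headers: Mapping[str, str]) -> Optional[str]:
    """Return the caller identity from X-Trino-User, else X-User-Id, else None."""
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value.strip() for name, value in headers.items()}
    for name in ("x-trino-user", "x-user-id"):
        value = lowered.get(name)
        if value:  # skip missing or blank headers
            return value
    return None  # assumption: the caller decides how to treat anonymous requests
```

Because the header is trusted as-is, this scheme is only safe behind a gateway that sets or strips these headers, which is why Phase 1 moves to JWT verification.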

Example

# 1. create a collection
curl -s -X POST https://rag.akko-ai.com/collections \
     -H "X-Trino-User: alice" -H "Content-Type: application/json" \
     -d '{"slug":"kb","name":"Knowledge base",
          "allowed_roles":["akko-admin","akko-engineer"]}'

# 2. upload a PDF
curl -s -X POST https://rag.akko-ai.com/collections/kb/documents \
     -H "X-Trino-User: alice" \
     -F "file=@/path/to/policy.pdf"

# 3. query
curl -s -X POST https://rag.akko-ai.com/collections/kb/query \
     -H "X-Trino-User: alice" -H "Content-Type: application/json" \
     -d '{"question":"how do I refund a transaction?","top_k":5}'

Response:

{
  "question": "how do I refund a transaction?",
  "collection": "kb",
  "chunks": [
    {
      "chunk_id": "…",
      "document_id": "…",
      "filename": "policy.pdf",
      "page": 4,
      "text": "Refunds are issued within 14 days …",
      "score": 0.87
    }
  ],
  "latency_ms": 38
}
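To make the `score` field concrete, here is a brute-force sketch of top-k ranking by cosine similarity. It is not the service's code path: akko-rag delegates the search to pgvector with an HNSW index, which is approximate, whereas this exact version only illustrates what is being ranked. The names `cosine` and `top_k` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    k: int = 5,
) -> List[Tuple[str, float]]:
    """Score every (chunk_id, embedding) pair and keep the k best, highest first."""
    scored = [(chunk_id, cosine(query, vec)) for chunk_id, vec in chunks]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]
```

A call such as `top_k(query_vec, stored, k=5)` scans every stored vector, which is fine for a toy example but is exactly the cost an HNSW index avoids.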

Configuration

Each setting is resolved from AKKO_<NAME> or <NAME>, so that values-dev, values-netcup and local docker runs share the same defaults. Full list:

Variable	Default	Purpose
`AKKO_PG_HOST`	`akko-postgresql-data`	PostgreSQL host
`AKKO_PG_PORT`	`5432`	PostgreSQL port
`AKKO_PG_DATABASE`	`akko`	database name
`AKKO_PG_USER`	`akko`	database user
`AKKO_PG_PASSWORD`	(Secret)	read from the `akko-postgresql-data` Secret
`AKKO_EMBED_URL`	`http://akko-akko-litellm:4000`	LiteLLM, OpenAI-compatible endpoint
`AKKO_EMBED_MODEL`	`akko-embed`	LiteLLM routing alias
`AKKO_EMBED_DIM`	`768`	matches `nomic-embed-text`
`AKKO_CHAT_URL`	`http://akko-akko-litellm:4000`	answer generation (Phase 1)
`AKKO_CHAT_MODEL`	`akko-chat`	Phase 1
`AKKO_CHUNK_SIZE_TOKENS`	`400`	window size, counted in words despite the `_TOKENS` suffix
`AKKO_CHUNK_OVERLAP_TOKENS`	`60`	overlap between consecutive windows
`AKKO_MAX_UPLOAD_BYTES`	`52428800`	50 MiB upload cap
`AKKO_DEFAULT_TOP_K`	`5`	default top-k
`AKKO_MAX_TOP_K`	`50`	hard cap
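The two-name lookup can be sketched as follows. The precedence shown here (prefixed name first, then the bare name, then the default) is an assumption, since the page only says both forms are accepted; `resolve_setting` is a hypothetical helper, not the service's actual loader.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    name: str,
    default: str,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Look up AKKO_<NAME>, then <NAME>, then fall back to the default.

    Assumption: the prefixed variable wins when both are set.
    """
    env = os.environ if env is None else env
    for key in (f"AKKO_{name}", name):
        if key in env:
            return env[key]
    return default
```

For example, `resolve_setting("PG_HOST", "akko-postgresql-data")` returns the documented default when neither `AKKO_PG_HOST` nor `PG_HOST` is set in the environment.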

Enabling the service

Disabled by default in the umbrella chart. To enable it:

# values-dev.yaml (or values-netcup.yaml)
akko-rag:
  enabled: true

The schema is provisioned by postgres/init/11-akko-rag.sql on the first boot of akko-postgresql-data. Init scripts under /docker-entrypoint-initdb.d run only when the data directory is empty, so on a cluster that is already initialized, re-apply the file by hand against akko-postgresql-data:

kubectl -n akko exec deploy/akko-akko-postgresql-data -- \
    psql -U akko -d akko -f /docker-entrypoint-initdb.d/11-akko-rag.sql

Upcoming phases

Phase	Delivers	Backend
0 (shipped)	Ingest + chunk + embed + pgvector top-k	pgvector HNSW
1	Chat answers with citations, JWT auth, cockpit UI, OPA filtering	pgvector
2	Hybrid BM25 + dense search, watchfolder DAG, HPA, ServiceMonitor	OpenSearch knn
3	Spark embed UDF, Iceberg VECTOR column, Trino hybrid SQL	Spark + Iceberg

See also

Architecture / Unified data + AI
AI / Trino AI functions: akko_ai_search covers intra-catalog retrieval; akko-rag covers document retrieval with citations and audit
Services / AI Service: sibling service for text/multimodal generation