ADEN vs akko-rag — when to use which¶
The founder review on 2026-04-26 surfaced the question: "In ADEN there's an unstructured-data PDF part; what's the difference from what's in RAG?"
Both services answer questions over documents, but they sit at different layers of the stack and serve different audiences. This page exists so a CISO/operator/prospect can read one paragraph and pick the right tool.
TL;DR¶
| Capability | ADEN unstructured | akko-rag |
|---|---|---|
| Lifetime of the document | One conversation | Persistent collection |
| Who uploads | The chatting user, mid-question | An admin, ahead of time |
| Storage backend | In-process / ephemeral | PostgreSQL (akko_rag schema) + pgvector + SeaweedFS originals |
| Re-queryable | No — gone when the chat ends | Yes — every consumer can re-query |
| Multi-user shared knowledge | No | Yes (collection-level access via allowed_roles) |
| API surface | Embedded in ADEN's /query flow | Standalone FastAPI: /collections, /upload, /query |
| Cockpit page | Inside ADEN page | Dedicated RAG page |
| Best for | "Drop this PDF in, ask one question, move on" | "Build a corporate knowledge base, share with the team" |
When to use ADEN unstructured¶
- A user has a single PDF/CSV/DOCX in front of them, wants a quick answer, and won't need that document again.
- The document is sensitive (a customer NDA, a draft proposal) and must not persist beyond the chat.
- The conversation needs both structured SQL queries AND document context in the same flow — ADEN orchestrates Trino + the document together.
- No admin overhead — the user just drops the file in the ADEN UI.
Example session:
User: uploads Q3-bank-loan-terms.pdf
User: "Compare these loan terms to our customers.contracts table. Are any covenants violated by current customers?"
ADEN: (extracts terms from PDF context, runs SQL on Trino, joins the result, produces a dashboard)
The PDF is gone when the user closes the conversation. Audit log keeps the question + the SQL but not the PDF body.
When to use akko-rag¶
- The team will ask many questions against the same set of documents over time (a product manual, an internal wiki, regulatory texts).
- Multiple users / roles need to query the same collection: akko-rag enforces allowed_roles per collection.
- The collection deserves its own lifecycle (create / upload many docs / re-index / delete via the cockpit RAG page, see PR #63).
- A non-AKKO consumer (a notebook, an external service) needs to call the same retrieval API: akko-rag is a standalone FastAPI exposing /collections/{slug}/query with cosine similarity over pgvector.
Example session:
Admin: creates a collection internal-policies, uploads 30 HR policy PDFs.
Admin: assigns allowed_roles: [hr, akko-admin].
Employee Bob (role hr): asks ADEN or any RAG-aware tool "what's the parental leave policy?", retrieval hits internal-policies, answer cites the PDF.
Employee Eve (role analyst): same question, retrieval finds 0 hits because Eve isn't in the allowed_roles. ADEN replies "no relevant document in the collections you can read".
The collection persists across sessions. Other consumers (a notebook, a Slack bot, an external service) can call the same API.
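The Bob/Eve outcome above comes down to a role-overlap check per collection. The sketch below is only an illustration of that rule, not the akko-rag implementation: the `Collection` class and `readable_collections` helper are hypothetical names, and the real service may apply the check in SQL or in a FastAPI dependency.

```python
from dataclasses import dataclass, field


@dataclass
class Collection:
    """Hypothetical in-memory stand-in for a row of the collections table."""
    slug: str
    allowed_roles: set[str] = field(default_factory=set)


def readable_collections(collections, user_roles):
    """Return the collections that share at least one role with the user."""
    roles = set(user_roles)
    return [c for c in collections if c.allowed_roles & roles]


collections = [Collection("internal-policies", {"hr", "akko-admin"})]

bob = readable_collections(collections, ["hr"])        # role hr
eve = readable_collections(collections, ["analyst"])   # role analyst

print([c.slug for c in bob])  # ['internal-policies']
print([c.slug for c in eve])  # []
```

Retrieval then runs only over the collections returned for the caller, which is why Eve gets zero hits rather than an error.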
Implementation under the hood¶
┌──────────────────────────────────┐
│ ADEN (NL → SQL → Dashboard) │
│ docker/aden/ │
│ │
uploads file ───────▶│ /query │
(in-chat) │ ├─ Trino SQL │
│ ├─ "unstructured" handler │
│ │ └─ ephemeral text extract│
│ │ → context window │
│ └─ akko-rag client (optional)│──┐
│ for persistent collections│ │
└──────────────────────────────────┘ │
│
▼
┌──────────────────────────────────┐
│ akko-rag (standalone) │
│ docker/akko-rag/ │
│ │
uploads doc ────────▶│ POST /collections/{slug}/documents
(admin, ahead of │ └─ pdf/csv/docx → chunks → embeddings
time) │ └─ pgvector + SeaweedFS │
│ │
query ──────────────▶│ POST /collections/{slug}/query │
│ └─ vector similarity → top-k│
└──────────────────────────────────┘
ADEN unstructured¶
- Code lives in docker/aden/ next to the SQL pipeline.
- The PDF/CSV/DOCX bytes get parsed in-process (pypdf, python-docx, pandas for CSV) and the extracted text becomes part of the LLM context window for THAT question.
- No persistent table holds the document. The audit log keeps the question text + the SQL ADEN generated, not the document body.
- Soft 50 MB upload limit (max_upload_bytes).
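A minimal sketch of what such an in-process handler could look like, assuming the three parsers named above. The function names (`extract_for_context`, `build_prompt`), the prompt layout, and the choice to reject oversized files are assumptions made for illustration; because the limit is described as soft, the real handler may only warn.

```python
import io
from pathlib import PurePath

# Assumed to mirror max_upload_bytes (50 MB); the real value lives in ADEN's config.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def _pdf_text(data: bytes) -> str:
    from pypdf import PdfReader  # third-party, imported lazily
    reader = PdfReader(io.BytesIO(data))
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def _docx_text(data: bytes) -> str:
    from docx import Document  # provided by python-docx
    return "\n".join(p.text for p in Document(io.BytesIO(data)).paragraphs)


def _csv_text(data: bytes) -> str:
    import pandas as pd
    return pd.read_csv(io.BytesIO(data)).to_csv(index=False)


EXTRACTORS = {".pdf": _pdf_text, ".docx": _docx_text, ".csv": _csv_text}


def extract_for_context(filename: str, data: bytes, limit: int = MAX_UPLOAD_BYTES) -> str:
    """Parse an upload in memory and return its text; nothing is written to disk."""
    if len(data) > limit:
        raise ValueError(f"upload is {len(data)} bytes, limit is {limit}")
    suffix = PurePath(filename).suffix.lower()
    extractor = EXTRACTORS.get(suffix)
    if extractor is None:
        raise ValueError(f"unsupported file type: {suffix!r}")
    return extractor(data)


def build_prompt(question: str, document_text: str) -> str:
    """Place the extracted text next to the question for a single LLM call."""
    return f"Document:\n{document_text}\n\nQuestion: {question}"
```

The extracted text lives only in local variables for the duration of the request, which matches the "no persistent table" property described above.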
akko-rag¶
- Standalone FastAPI service (docker/akko-rag/).
- Postgres schema akko_rag with collections, documents, chunks, query_log tables (see postgres/init/11-akko-rag.sql).
- Each upload chunks the document, embeds chunks via the configured embedder (LiteLLM by default, on-prem Ollama in air-gapped mode), stores embeddings in pgvector.
- Originals stored in SeaweedFS bucket akko-rag/originals/.
- /query does cosine similarity, returns top-k chunks with citation metadata (file, page).
- Hard 50 MB upload limit (maxUploadBytes chart value).
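To make the chunk, embed, and rank steps concrete, here is a pure-Python sketch of fixed-size chunking and cosine-similarity top-k. It is an illustration under stated assumptions: the chunk size, the overlap, and the hand-written 2-d vectors are made up, and in the real service the embeddings come from the configured embedder and the ranking presumably happens inside Postgres (pgvector's `<=>` operator computes cosine distance).

```python
import math


def chunk_text(text, size=200, overlap=50):
    """Split text into fixed-size character windows that overlap by `overlap`."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, rows, k=3):
    """Rank (vector, metadata) rows by similarity to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), meta) for vec, meta in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy embeddings standing in for pgvector rows; metadata mimics (file, page) citations.
rows = [
    ([1.0, 0.0], {"file": "leave.pdf", "page": 3}),
    ([0.0, 1.0], {"file": "expenses.pdf", "page": 1}),
    ([0.7, 0.7], {"file": "handbook.pdf", "page": 12}),
]
hits = top_k([1.0, 0.1], rows, k=2)
print([meta["file"] for _, meta in hits])  # ['leave.pdf', 'handbook.pdf']
```

Overlapping windows are one common way to avoid cutting a sentence off from its context at a chunk boundary; the source does not say which chunking strategy akko-rag uses.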
When to MERGE them (and why we haven't)¶
The founder asked whether to merge the two. The current judgment is no:
keep them separate, but make ADEN's unstructured handler call
akko-rag under the hood for the persistent path.
Pros of keeping separate:
- ADEN unstructured is the right UX for "drop a file, ask one question, move on". Forcing every drop-in PDF through a collection create + name + roles + index step would kill the conversational flow.
- akko-rag has consumers beyond ADEN (notebooks, external services, a future Slack/Teams bot). It needs its own API and lifecycle.
- The two services have different security models: ADEN unstructured trusts the chatting user implicitly (single session); akko-rag enforces allowed_roles per collection.
The future merge is one feature: "save this conversation's PDF to a collection", a button in ADEN that POSTs the ephemeral document to akko-rag with a chosen collection + roles. This is a Sprint 54 follow-up, not a P0.
Quick reference — endpoints and pages¶
| Action | Where |
|---|---|
| Drop a PDF and ask one question | demo.akko-ai.com → ADEN page → upload widget |
| Create a persistent collection | demo.akko-ai.com → RAG page → "+ New collection" |
| Add a document to a collection | RAG page → select collection → upload doc |
| Delete a collection | RAG page → trash icon on the collection (PR #63) |
| Query a collection from a notebook | POST http://akko-akko-rag:8080/collections/{slug}/query |
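For the notebook row, a call could look like the sketch below, using only the standard library. The URL and path come from the table; the JSON field names (`question`, `top_k`) and the bearer-token header are assumptions, since this page does not document the request schema or how the caller's roles are conveyed.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://akko-akko-rag:8080"  # in-cluster address from the table above


def build_query_request(slug, question, top_k=5, token=None):
    """Build (but do not send) a POST to /collections/{slug}/query."""
    payload = {"question": question, "top_k": top_k}  # hypothetical field names
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"  # assumed auth scheme
    return urllib.request.Request(
        f"{BASE_URL}/collections/{urllib.parse.quote(slug, safe='')}/query",
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def query_collection(slug, question, **kwargs):
    """Send the request and decode the JSON response (needs network access)."""
    request = build_query_request(slug, question, **kwargs)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


req = build_query_request("internal-policies", "What's the parental leave policy?")
print(req.get_method(), req.full_url)
# POST http://akko-akko-rag:8080/collections/internal-policies/query
```

Calling `query_collection("internal-policies", "...")` from a notebook running inside the cluster would then return whatever JSON the service emits; check the akko-rag service page for the actual response shape.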
See also¶
- ADEN service page — full ADEN reference (NL→SQL→Dashboard)
- akko-rag service page — full RAG reference (3-tier architecture, chart values)
- ADR-021 — "ADEN as the conversational front door"
- ADR-026 — akko_ai_* SQL functions (used internally by ADEN)