PostgreSQL

Overview

AKKO deploys two separate PostgreSQL instances to enforce a strict separation between infrastructure metadata and business data. Both instances run the custom akko-postgres image, which bundles PostGIS (geospatial) and pgvector (AI embeddings) extensions.

Architecture

  ┌─────────────────────────────────┐
  │  akko-postgresql (infra)        │
  │  Databases:                     │
  │    keycloak, airflow, superset, │
  │    polaris, openmetadata,       │
  │    mlflow, jupyterhub, litellm  │
  └─────────────────────────────────┘

  ┌─────────────────────────────────┐
  │  akko-postgresql-data (business)│
  │  Databases:                     │
  │    analytics, geospatial, rag   │
  │  Extensions:                    │
  │    PostGIS, pgvector            │
  └─────────────────────────────────┘

Two PostgreSQL Instances — Mandatory

Never mix infrastructure and business data in the same server. This separation enables independent backup/restore, scaling, and security policies. Infrastructure databases hold service state (sessions, DAGs, dashboards); business databases hold user-facing analytics data.

Ports

Instance              Port  Purpose                            Exposed
akko-postgresql       5432  Infrastructure metadata databases  Internal only
akko-postgresql-data  5432  Business/analytics databases       Internal only (exposed to Trino for federation)

Databases

akko-postgresql (Infrastructure)

Database      Used by       Purpose
keycloak      Keycloak      SSO realm, users, sessions
airflow       Airflow       DAG metadata, task state, connections
superset      Superset      Dashboards, charts, datasets, users
polaris       Polaris       Iceberg catalog metadata
openmetadata  OpenMetadata  Data governance catalog
mlflow        MLflow        Experiment tracking, model registry
jupyterhub    JupyterHub    User state, spawner data
litellm       LiteLLM       AI gateway configuration, usage logs

akko-postgresql-data (Business)

Database    Extensions  Purpose
analytics   -           Business intelligence data, banking demo tables
geospatial  PostGIS     Geographic data, spatial queries
rag         pgvector    Vector embeddings for RAG pipelines
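Because the two instances share port 5432 and differ only by service name, a client has to pick the host from the database it wants. The following is a minimal sketch of that routing, using only the service and database names listed above. The helper names (host_for, connection_uri) and the example credentials are hypothetical and not part of AKKO.

```python
from urllib.parse import quote

# Service names and port are taken from the tables above.
INFRA_HOST = "akko-postgresql"
DATA_HOST = "akko-postgresql-data"
PORT = 5432

INFRA_DATABASES = {
    "keycloak", "airflow", "superset", "polaris",
    "openmetadata", "mlflow", "jupyterhub", "litellm",
}
DATA_DATABASES = {"analytics", "geospatial", "rag"}


def host_for(database: str) -> str:
    """Return the service name of the instance that hosts `database`."""
    if database in INFRA_DATABASES:
        return INFRA_HOST
    if database in DATA_DATABASES:
        return DATA_HOST
    raise ValueError(f"unknown database: {database!r}")


def connection_uri(database: str, user: str, password: str) -> str:
    """Build a libpq-style URI; user and password are percent-encoded."""
    credentials = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"postgresql://{credentials}@{host_for(database)}:{PORT}/{database}"
```

Raising on unknown names, rather than falling back to a default host, keeps a typo from silently creating business tables on the infrastructure server.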

Extensions

PostGIS

PostGIS adds geospatial types (geometry, geography) and functions (ST_Distance, ST_Within, etc.) for spatial analytics. Enabled in the geospatial database.
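To give a feel for what a distance query on the geography type returns, here is a small pure-Python sketch of the haversine formula. It is only an approximation: PostGIS computes ST_Distance on geography over the spheroid by default, while this sketch assumes a sphere with a fixed mean radius. The function names are hypothetical, and the radius check mirrors the idea behind ST_DWithin, a PostGIS function not mentioned above.

```python
import math

# Assumption for this sketch: a spherical Earth with the mean radius in metres.
EARTH_RADIUS_M = 6_371_008.8


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_radius(lat1: float, lon1: float,
                  lat2: float, lon2: float, radius_m: float) -> bool:
    """True if the two points are at most `radius_m` metres apart."""
    return haversine_m(lat1, lon1, lat2, lon2) <= radius_m
```

In the database itself the spatial functions should do this work, since they can use spatial indexes; the sketch is only meant for sanity-checking results on the client side.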

pgvector

pgvector adds the vector type and distance operators (<-> Euclidean/L2 distance, <=> cosine distance, <#> negative inner product) for AI embedding storage and similarity search. Enabled in the rag database.

postgres/init/05-pgvector.sql
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS embeddings (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding vector(768),
    metadata JSONB
);
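The pgvector operators can be easier to reason about next to their plain-Python equivalents. The sketch below reimplements the three distance measures (<-> is Euclidean distance, <=> is cosine distance, <#> is the negative inner product) and formats a Python list as a pgvector text literal. The function names are hypothetical, and the query string assumes a driver that uses %s placeholders (psycopg style) against the embeddings table defined above.

```python
import math
from typing import Sequence


def _check_dims(a: Sequence[float], b: Sequence[float]) -> None:
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Equivalent of the <-> operator."""
    _check_dims(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def negative_inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Equivalent of the <#> operator (note the sign flip)."""
    _check_dims(a, b)
    return -sum(x * y for x, y in zip(a, b))


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Equivalent of the <=> operator: 1 - cosine similarity."""
    _check_dims(a, b)
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


def to_vector_literal(values: Sequence[float]) -> str:
    """Format values as a pgvector text literal such as '[1.0,2.5]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


# Hypothetical nearest-neighbour query; parameters: vector literal, row limit.
NEAREST_SQL = (
    "SELECT id, content FROM embeddings "
    "ORDER BY embedding <=> %s::vector LIMIT %s"
)
```

A query vector passed to NEAREST_SQL would need the same 768 dimensions as the embedding column.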

Configuration

Custom Image

docker/postgres/Dockerfile
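FROM postgres:16.6
# Add the PostGIS and pgvector packages for PostgreSQL 16,
# then drop the apt package lists to keep the image small.
RUN apt-get update && apt-get install -y \
    postgresql-16-postgis-3 \
    postgresql-16-pgvector \
    && rm -rf /var/lib/apt/lists/*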

The image is built as akko-postgres:2026.03.

Init Scripts

Database initialization follows two mechanisms:

  1. postgres/init/ scripts — run once when the data volume is empty (standard docker-entrypoint-initdb.d behavior)
  2. ensure.sql sidecar — runs on every deployment to guarantee extensions, schemas, and tables exist (idempotent, survives restarts)

First-time vs Every-time

Scripts in postgres/init/ run only on first startup, when the data volume is empty. Anything that must always exist (extensions, schemas, tables) belongs in the ensure.sql sidecar, which runs on every helm upgrade.
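The property that makes an every-deployment script safe is idempotency: running it a second time must neither fail nor touch existing data. The sketch below demonstrates that pattern with IF NOT EXISTS guards. It uses an in-memory SQLite database purely as a stand-in so that it runs without a server; the real ensure.sql targets PostgreSQL and its contents are not shown here, so the statements and the index name are assumptions.

```python
import sqlite3

# Stand-in for an "ensure" script: every statement is guarded so that
# re-running it is a no-op. (SQLite has no CREATE EXTENSION; on PostgreSQL
# the same idea applies to CREATE EXTENSION IF NOT EXISTS.)
ENSURE_SQL = """
CREATE TABLE IF NOT EXISTS embeddings (
    id INTEGER PRIMARY KEY,
    content TEXT,
    metadata TEXT
);
CREATE INDEX IF NOT EXISTS embeddings_content_idx ON embeddings (content);
"""


def apply_ensure(conn: sqlite3.Connection) -> None:
    """Run the ensure script; safe to call any number of times."""
    conn.executescript(ENSURE_SQL)


def table_names(conn: sqlite3.Connection) -> list:
    """List user tables, to confirm nothing was duplicated or dropped."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]
```

Calling apply_ensure twice on the same connection leaves one embeddings table with its rows intact, which is the behavior a script that runs on every deployment needs.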

Helm Chart

PostgreSQL is deployed via the custom akko-postgres sub-chart, which replaces the Bitnami chart to avoid version conflicts and to guarantee PostGIS/pgvector support:

helm/akko/charts/akko-postgres/
├── Chart.yaml
├── values.yaml
├── files/
│   ├── ensure.sql
│   └── init-scripts/
└── templates/
    ├── statefulset.yaml
    ├── service.yaml
    ├── secret.yaml
    └── configmap.yaml

Key Values

values.yaml
akko-postgres:
  enabled: true
  image:
    repository: akko-postgres
    tag: "2026.03"
  persistence:
    enabled: true
    size: 20Gi
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
    limits:
      cpu: "1"
      memory: 1Gi

Troubleshooting

Common Issues

  • Password mismatch after secret rotation: PostgreSQL persists passwords in its data volume. If you change the secret in Kubernetes, you must also run ALTER USER ... PASSWORD '...' inside the running database. Simply restarting the pod is not enough.
  • Extension not found (pgvector or PostGIS): Ensure you are using the custom akko-postgres:2026.03 image, not the vanilla postgres image. Check with kubectl exec statefulset/akko-postgresql-data -- psql -c "SELECT * FROM pg_available_extensions WHERE name IN ('vector','postgis')". If psql rejects the connection because it defaults to the container's OS user, pass the database role explicitly with -U.
  • Connection refused from services: Verify the service DNS name (akko-postgresql for infra, akko-postgresql-data for business) and port (5432). Check that the StatefulSet pod is running and the readiness probe passes.
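For the connection-refused case, a quick TCP probe can separate network or DNS problems from authentication problems. This is a minimal sketch using only the standard library. The function name can_connect is hypothetical, and the probe has to run from a pod inside the cluster for the service names to resolve.

```python
import socket


def can_connect(host: str, port: int = 5432, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This checks reachability only; it does not authenticate or speak the
    PostgreSQL protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False


if __name__ == "__main__":
    for service in ("akko-postgresql", "akko-postgresql-data"):
        status = "reachable" if can_connect(service) else "unreachable"
        print(f"{service}:5432 is {status}")
```

If the probe succeeds but the application still fails, the cause is more likely credentials or the database name than the service or the readiness probe.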