Tempo — Distributed Tracing

Tempo is the distributed tracing backend for AKKO. It receives OpenTelemetry (OTLP) spans from ADEN and other instrumented services, stores them on the local filesystem or in S3, and exposes a query API that Dashboards reads through its Tempo datasource. This gives end-to-end trace visibility across the AKKO pipeline.

Architecture

ADEN / AI Service / Airflow
   |  (OTLP gRPC :4317 or HTTP :4318)
   |
+--v-----------+
|  Tempo       |
|  (:3200 query|
|   :4317 gRPC |
|   :4318 HTTP)|
+--+-----------+
   |
+--v-----------+
|  Dashboards  |
|  (Tempo      |
|   datasource)|
+--------------+
  • OTLP receivers — gRPC on port 4317 and HTTP on port 4318 for span ingestion
  • Query API on port 3200 consumed by the Dashboards Tempo datasource
  • Local filesystem storage for development; S3 backend for production
  • Apache 2.0 licensed (R27 compliant)

URLs

Tempo is an internal-only service (no ingress). Traces are visualized through Dashboards.

Endpoint    Port   Protocol
OTLP gRPC   4317   gRPC
OTLP HTTP   4318   HTTP
Query API   3200   HTTP

Configuration (Helm values)

akko-tempo:
  enabled: true
  image:
    repository: grafana/tempo
    tag: "2.6.1"
  replicas: 1
  service:
    otlpGrpcPort: 4317
    otlpHttpPort: 4318
    queryPort: 3200
  persistence:
    enabled: true
    size: 10Gi
  retention:
    blockRetention: 72h          # 3 days for local dev, 14-30d for production S3
  serviceMonitor:
    enabled: true
    interval: 30s
  resources:
    requests:
      cpu: 100m
      memory: 256Mi
    limits:
      cpu: 1
      memory: 1Gi

Health Check

Tempo exposes metrics and health on the query port:

livenessProbe:
  httpGet:
    path: /ready
    port: 3200
  initialDelaySeconds: 15
  periodSeconds: 30
readinessProbe:
  httpGet:
    path: /ready
    port: 3200
  initialDelaySeconds: 10
  periodSeconds: 10
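
The same readiness endpoint can be polled from a script, for example in a deployment smoke test. The following is a minimal sketch using only the Python standard library; the function name `wait_until_ready` and its timeout values are illustrative assumptions, not part of the AKKO chart.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(base_url, timeout_s=60.0, interval_s=2.0):
    """Poll <base_url>/ready until it returns HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(f"{base_url}/ready", timeout=5) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            # Connection refused, DNS failure, timeout, or a non-2xx answer:
            # treat all of these as "not ready yet" and retry.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example call from a pod inside the cluster (service name as used on this page):
# wait_until_ready("http://akko-akko-tempo:3200")
```

Because Tempo has no ingress, the call only works from inside the cluster or through a port-forward.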

RBAC (who can access)

Tempo is an internal-only service with no ingress. Access is controlled by:

  • NetworkPolicy — restricts which pods can send spans (OTLP) and query traces
  • Dashboards — traces are visualized through Dashboards (Dashboards is authenticated via Keycloak)

Sending Traces to Tempo

Services send OTLP spans to Tempo using environment variables:

OTEL_EXPORTER_OTLP_ENDPOINT=http://akko-akko-tempo:4317
OTEL_EXPORTER_OTLP_PROTOCOL=grpc
OTEL_SERVICE_NAME=<service-name>
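
A mismatch between protocol and port (for example `grpc` pointed at 4318) is an easy mistake to make with these variables. The sketch below shows one way a service could sanity-check them at startup. The helper `check_otlp_env` is hypothetical, and the port mapping is taken from the endpoint table on this page.

```python
import os
from urllib.parse import urlparse

# Ports documented on this page for each OTLP protocol value.
EXPECTED_PORTS = {"grpc": 4317, "http/protobuf": 4318, "http/json": 4318}


def check_otlp_env(env=None):
    """Return a list of problems found in the OTLP settings (empty if none)."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("OTEL_SERVICE_NAME"):
        problems.append("OTEL_SERVICE_NAME is not set")
    endpoint = env.get("OTEL_EXPORTER_OTLP_ENDPOINT")
    if not endpoint:
        problems.append("OTEL_EXPORTER_OTLP_ENDPOINT is not set")
        return problems
    protocol = env.get("OTEL_EXPORTER_OTLP_PROTOCOL")
    if protocol is None:
        # SDK defaults differ between languages, so skip the port check.
        return problems
    expected = EXPECTED_PORTS.get(protocol)
    port = urlparse(endpoint).port
    if expected is None:
        problems.append(f"unrecognized protocol {protocol!r}")
    elif port != expected:
        problems.append(
            f"protocol {protocol!r} is served on port {expected}, endpoint uses {port}"
        )
    return problems
```

Calling `check_otlp_env()` with no arguments inspects the process environment; passing a dict makes the check easy to unit test.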

For Python services (ADEN, AI Service):

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

# Export spans over plaintext gRPC to the in-cluster Tempo service.
provider = TracerProvider()
exporter = OTLPSpanExporter(endpoint="http://akko-akko-tempo:4317", insecure=True)
# Batch spans in the background instead of exporting each one synchronously.
provider.add_span_processor(BatchSpanProcessor(exporter))
# Register the provider globally so trace.get_tracer() uses it.
trace.set_tracer_provider(provider)
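
To exercise the HTTP receiver on port 4318 without installing the SDK, a single hand-built span can be posted to the standard OTLP/HTTP path `/v1/traces`. This is a rough sketch that assumes the receiver accepts the OTLP JSON encoding. The helper names and the `manual-smoke-test` scope name are illustrative only.

```python
import json
import secrets
import time
import urllib.request


def build_test_span(service_name, span_name="smoke-test"):
    """Build a minimal OTLP/JSON trace payload containing one span."""
    end_ns = time.time_ns()
    start_ns = end_ns - 1_000_000  # pretend the span lasted 1 ms
    return {
        "resourceSpans": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": service_name}},
            ]},
            "scopeSpans": [{
                "scope": {"name": "manual-smoke-test"},
                "spans": [{
                    "traceId": secrets.token_hex(16),  # 16 random bytes, hex-encoded
                    "spanId": secrets.token_hex(8),    # 8 random bytes, hex-encoded
                    "name": span_name,
                    "kind": 1,  # SPAN_KIND_INTERNAL
                    # 64-bit integers are carried as strings in OTLP JSON.
                    "startTimeUnixNano": str(start_ns),
                    "endTimeUnixNano": str(end_ns),
                }],
            }],
        }]
    }


def send_test_span(base_url, payload):
    """POST the payload to the OTLP/HTTP traces path and return the HTTP status."""
    req = urllib.request.Request(
        f"{base_url}/v1/traces",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.status


# Example, from a pod allowed by the NetworkPolicy:
# send_test_span("http://akko-akko-tempo:4318", build_test_span("smoke-test-service"))
```

If the request succeeds, the trace ID from the payload can then be looked up in Dashboards to confirm the full ingest path.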

Key Features

Feature                  Description
OTLP ingestion           Receives spans via gRPC (4317) and HTTP (4318)
Dashboards integration   Native Tempo datasource for trace visualization
S3 backend               Production deployments store traces in S3 object storage
Configurable retention   72h default (local), tunable for production
Prometheus metrics       ServiceMonitor scrapes /metrics on port 3200
Trace-to-logs            Links traces to their related logs in Dashboards

Resource Requirements

Component   Minimum RAM   Recommended
Tempo       256 Mi        1 Gi

Troubleshooting

No Traces in Dashboards

Symptoms: Dashboards Tempo datasource returns no results, or shows "No traces found".

Cause: Services are not sending spans, Tempo is not receiving them, or the Tempo datasource in Dashboards is misconfigured.

Solution:

# Verify Tempo pod is running and ready
kubectl get pods -n akko -l app.kubernetes.io/name=akko-tempo

# Check Tempo readiness
kubectl exec -n akko deploy/akko-akko-tempo -- wget -qO- http://localhost:3200/ready

# Verify the Tempo service is reachable from a service pod (this checks the query port 3200, not the OTLP ports)
kubectl exec -n akko deploy/akko-akko-aden -- curl -s http://akko-akko-tempo:3200/ready

# Check Tempo logs for ingestion errors
kubectl logs -n akko deploy/akko-akko-tempo --tail=50 | grep -i "error\|warn"
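
To separate an ingestion problem from a datasource problem, it can help to query Tempo's search API directly and bypass Dashboards. The sketch below assumes Tempo's HTTP search endpoint `/api/search` with its `tags` and `limit` parameters, and that each result carries a `traceID` field. The function names are illustrative.

```python
import json
import urllib.parse
import urllib.request


def search_url(base_url, service_name, limit=5):
    """Build a Tempo search URL that filters on the service.name tag."""
    query = urllib.parse.urlencode(
        {"tags": f"service.name={service_name}", "limit": limit}
    )
    return f"{base_url}/api/search?{query}"


def recent_trace_ids(base_url, service_name, limit=5):
    """Return the trace IDs Tempo reports for the given service."""
    with urllib.request.urlopen(search_url(base_url, service_name, limit), timeout=10) as resp:
        body = json.load(resp)
    return [t.get("traceID") for t in body.get("traces", [])]


# Example, from inside the cluster or through a port-forward to port 3200:
# recent_trace_ids("http://akko-akko-tempo:3200", "aden")
```

An empty list here points at the sending side or at ingestion. A non-empty list means the traces exist and the datasource configuration in Dashboards is the more likely culprit.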

Tempo Disk Full

Symptoms: Tempo pod restarts or traces are silently dropped. Logs show disk full or WAL write failure.

Cause: The PVC is full due to high trace volume or insufficient retention cleanup.

Solution:

# Check PVC usage
kubectl exec -n akko deploy/akko-akko-tempo -- df -h /var/tempo

# Reduce retention
# In values: akko-tempo.retention.blockRetention: 24h

# Increase PVC size (if supported by StorageClass)
kubectl edit pvc -n akko akko-akko-tempo
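
Before choosing between a shorter retention and a larger PVC, a back-of-the-envelope estimate of trace volume is useful. The sketch below multiplies span rate, average span size, and retention window. The span rate and the 500-byte span size are hypothetical inputs, and the result ignores compression and WAL overhead, so treat it as an order-of-magnitude figure only.

```python
def estimate_storage_gib(spans_per_second, avg_span_bytes, retention_hours):
    """Rough uncompressed volume of spans kept for the retention window, in GiB."""
    total_bytes = spans_per_second * avg_span_bytes * retention_hours * 3600
    return total_bytes / 2**30


# Hypothetical workload: 200 spans/s at about 500 bytes per span.
for hours in (24, 72):
    print(f"{hours}h retention: ~{estimate_storage_gib(200, 500, hours):.1f} GiB")
```

With these assumed inputs, 72h of retention comes to roughly 24 GiB before compression, which is more than the 10Gi PVC in the values above, while 24h comes to roughly 8 GiB.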

OTLP Connection Refused

Symptoms: Services log "OTLP exporter: connection refused" or "failed to export spans".

Cause: Tempo pod is not ready, or NetworkPolicy blocks OTLP traffic.

Solution:

# Check Tempo pod status
kubectl get pods -n akko -l app.kubernetes.io/name=akko-tempo -o wide

# Test gRPC port from the sending pod
kubectl exec -n akko deploy/akko-akko-aden -- nc -zv akko-akko-tempo 4317

# Check NetworkPolicy
kubectl get networkpolicy -n akko | grep tempo
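
Some service images do not ship `nc`. In that case the same TCP reachability test can be run with the Python interpreter already present in Python-based pods. This is a minimal sketch; `port_open` is an illustrative helper name.

```python
import socket


def port_open(host, port, timeout_s=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers connection refused, timeouts, and DNS resolution failures.
        return False


# Example, run inside the sending pod:
# for port in (4317, 4318, 3200):
#     print(port, port_open("akko-akko-tempo", port))
```

A timeout (rather than an immediate refusal) is often a sign that a NetworkPolicy is dropping the traffic, whereas an immediate refusal suggests nothing is listening on that port.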