Docling: Document Processing

Docling converts PDF, DOCX, HTML, and other documents (see Supported Formats) to structured text (Markdown or JSON) for RAG pipelines and document analysis in AKKO. It uses CPU-based layout analysis and OCR to extract text, tables, and figures from complex documents. Processing is fully offline, with no external API calls.

Architecture

ai-service / ADEN / Cockpit
            |
      +-----v-------+
      |   Docling   |  REST API (port 5001)
      | (Document   |  PDF/DOCX/HTML -> Markdown/JSON
      |  Processing)|
      +-------------+

Supported Formats

Input Format	Description
PDF	Native text + OCR for scanned pages
DOCX	Microsoft Word documents
PPTX	Microsoft PowerPoint presentations
HTML	Web pages
Images	PNG, JPEG, TIFF (via OCR)
Markdown	Pass-through with normalization
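
Before uploading, a client can check that a file falls within the table above and pick a matching content type. The following is a minimal sketch, not part of Docling or AKKO: `SUPPORTED_MIME_TYPES` and `mime_type_for` are hypothetical names, the extensions are derived from the table, and it is an assumption that the service pays attention to the declared MIME type at all.

```python
from pathlib import Path

# Extensions derived from the Supported Formats table. The MIME strings are
# standard registered types; whether the service inspects them is an assumption.
SUPPORTED_MIME_TYPES = {
    ".pdf": "application/pdf",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".html": "text/html",
    ".htm": "text/html",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
    ".md": "text/markdown",
}


def mime_type_for(filename: str) -> str:
    """Return the MIME type for a supported file, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    try:
        return SUPPORTED_MIME_TYPES[suffix]
    except KeyError:
        raise ValueError(f"unsupported input format: {suffix or filename!r}") from None
```

Rejecting unsupported files on the client side avoids spending an upload and a conversion attempt on input the service cannot handle.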

Usage

From ai-service (RAG Pipeline)

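import httpx

# Conversion is CPU-bound and can be slow, so override httpx's 5-second default timeout.
with open("report.pdf", "rb") as f:
    response = httpx.post(
        "http://akko-akko-docling:5001/v1/convert",
        files={"file": ("report.pdf", f, "application/pdf")},
        data={"output_format": "markdown"},
        timeout=300.0,
    )
response.raise_for_status()  # fail fast on 4xx/5xx instead of parsing an error body
structured_text = response.json()["content"]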
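
In a RAG pipeline, the Markdown returned by the conversion call is typically split into chunks before embedding. The sketch below shows one simple way to do that by cutting at heading lines. It is an illustration only: `split_markdown_by_heading` is a hypothetical helper, not something ai-service is known to contain, and heading-based chunking is an assumed strategy.

```python
import re

# ATX headings: 1-6 '#' characters, then spaces or tabs, then text.
# Limitation: a '# comment' line inside a fenced code block is also
# treated as a heading by this simple pattern.
HEADING = re.compile(r"^#{1,6}[ \t]+\S", re.MULTILINE)


def split_markdown_by_heading(markdown: str) -> list[str]:
    """Split Markdown into chunks that each start at a heading line."""
    starts = [m.start() for m in HEADING.finditer(markdown)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any text that precedes the first heading
    bounds = starts + [len(markdown)]
    chunks = [markdown[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    return [c for c in chunks if c]  # drop empty chunks
```

Splitting at headings keeps each chunk aligned with a section of the source document, which tends to make retrieved passages easier to attribute.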

From Notebooks

import requests

# Convert a PDF to structured Markdown
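with open("document.pdf", "rb") as f:
    resp = requests.post(
        "http://akko-akko-docling:5001/v1/convert",
        files={"file": ("document.pdf", f, "application/pdf")},
        data={"output_format": "markdown"},
        timeout=300,  # requests has no default timeout; large documents can be slow
    )
resp.raise_for_status()  # surface HTTP errors before reading the body
print(resp.json()["content"])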

Health Check

curl http://akko-akko-docling:5001/health

Configuration

Kubernetes (Helm)

akko-docling:
  enabled: true
  image:
    repository: quay.io/docling-project/docling-serve-cpu
    tag: "latest"  # Pin to a specific version in production
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
    limits:
      cpu: "2"
      memory: 4Gi

Memory Requirement

Docling loads ML models for layout analysis and OCR. It needs at least 512Mi of memory at startup (the request configured above) and can peak at 4Gi (the limit configured above) when processing large documents with many pages.

Network Access

Docling is an internal service with no internet access. It processes documents locally using CPU-based models. The NetworkPolicy restricts:

Ingress: Only ai-service, ADEN, and Cockpit can reach port 5001
Egress: DNS only (no internet access)

Troubleshooting

Docling Pod CrashLoopBackOff (OOMKilled)

Symptoms: The Docling pod enters CrashLoopBackOff status. kubectl describe pod shows OOMKilled as the last termination reason.

Cause: Docling loads several ML models at startup (layout analysis, table structure recognition, OCR). If memory is too low, the kernel OOM-killer terminates the process.

Solution:

# Check current memory limits
kubectl get pod -n akko -l app.kubernetes.io/name=akko-docling -o jsonpath='{.items[0].spec.containers[0].resources}'

# Ensure memory limit is at least 4Gi
helm upgrade akko helm/akko/ -n akko -f helm/examples/values-dev.yaml \
  --set akko-docling.resources.limits.memory=4Gi
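
The check above can also be automated. The following sketch parses the resources object printed by the `kubectl get pod` command and tests whether the memory limit reaches 4Gi. It assumes that output is a JSON object (as recent kubectl versions print for map values); `parse_memory` and `memory_limit_ok` are hypothetical helpers, and only binary suffixes such as Mi and Gi are handled.

```python
import json

# Binary suffixes used by Kubernetes memory quantities. Decimal suffixes
# such as "M" or "G" are deliberately not handled in this sketch.
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '512Mi' or '4Gi' to bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(float(quantity[: -len(suffix)]) * factor)
    return int(quantity)  # plain byte count, e.g. "4294967296"


def memory_limit_ok(resources_json: str, minimum: str = "4Gi") -> bool:
    """Return True if limits.memory in the resources JSON meets the minimum."""
    resources = json.loads(resources_json)
    limit = resources.get("limits", {}).get("memory")
    return limit is not None and parse_memory(limit) >= parse_memory(minimum)
```

A missing limit is reported as not OK here, since this page treats 4Gi as the expected explicit limit for Docling.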

Slow Document Processing

Symptoms: Document conversion takes 30+ seconds per page. CPU usage is at 100%.

Cause: Docling uses CPU-based OCR and layout analysis. Complex documents with many images, tables, or scanned pages require more processing time.

Solution:

# Check current CPU and memory usage (requires metrics-server; shows usage, not limits)
kubectl top pod -n akko -l app.kubernetes.io/name=akko-docling

# Increase CPU limits for faster processing
helm upgrade akko helm/akko/ -n akko -f helm/examples/values-dev.yaml \
  --set akko-docling.resources.limits.cpu=4
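
Because conversion time grows with page count, a fixed client timeout can cut off long documents even when the service is working normally. As a rough sketch, a caller could scale its timeout with the number of pages. Everything here is an assumption for illustration: `conversion_timeout` is a hypothetical helper, the 30 seconds per page figure is borrowed from the symptom described above rather than measured, and the base and ceiling values are arbitrary.

```python
def conversion_timeout(
    pages: int,
    seconds_per_page: float = 30.0,  # assumption: taken from the symptom above
    base: float = 30.0,              # assumption: fixed overhead per request
    ceiling: float = 1800.0,         # assumption: cap so callers never wait forever
) -> float:
    """Return a client timeout in seconds that grows with the page count."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    return min(base + pages * seconds_per_page, ceiling)
```

The result can be passed as the `timeout` argument of the `httpx.post` or `requests.post` calls shown in the Usage section.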

Health Endpoint Returns 503

Symptoms: The /health endpoint returns 503 Service Unavailable. The pod is running but not ready.

Cause: Docling is still loading ML models. The readiness probe has not passed yet.

Solution:

# Check pod events and readiness probe status
kubectl describe pod -n akko -l app.kubernetes.io/name=akko-docling

# Check logs for model loading progress
kubectl logs -n akko -l app.kubernetes.io/name=akko-docling --tail=50
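
Since a 503 during startup is expected while models load, callers can poll the health endpoint instead of failing on the first attempt. The sketch below uses only the Python standard library. `http_status` and `wait_until_healthy` are hypothetical helpers, and the attempt count and delay are assumed values that should be tuned to the observed model loading time.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

HEALTH_URL = "http://akko-akko-docling:5001/health"


def http_status(url: str = HEALTH_URL, timeout: float = 5.0) -> int:
    """Return the HTTP status code for url, or 0 if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # e.g. 503 while models are still loading
    except (urllib.error.URLError, OSError):
        return 0  # connection refused, DNS failure, or timeout


def wait_until_healthy(
    probe: Callable[[], int] = http_status,
    attempts: int = 30,   # assumption: 30 tries
    delay: float = 10.0,  # assumption: 10 seconds between tries
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll probe until it returns 200; give up after `attempts` tries."""
    for attempt in range(attempts):
        if probe() == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)  # no sleep after the final failed attempt
    return False
```

The `probe` and `sleep` parameters are injectable so the loop can be exercised without a running service. If the function returns False, the `kubectl describe` and `kubectl logs` commands above remain the next step.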