Docling: Document Processing
Docling converts PDF, DOCX, HTML, and other documents to structured text (Markdown, JSON) for RAG pipelines and document analysis in AKKO. It uses CPU-based layout analysis and OCR to extract text, tables, and figures from complex documents. It runs fully offline and makes no external API calls.
Architecture
ai-service / ADEN / Cockpit
            |
      +-----v------+
      |  Docling   |  REST API (port 5001)
      | (Document  |  PDF/DOCX/HTML -> Markdown/JSON
      | Processing)|
      +------------+
Supported Formats
| Input Format | Description |
|---|---|
| PDF | Native text + OCR for scanned pages |
| DOCX | Microsoft Word documents |
| PPTX | Microsoft PowerPoint presentations |
| HTML | Web pages |
| Images | PNG, JPEG, TIFF (via OCR) |
| Markdown | Pass-through with normalization |
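Each upload needs a content type that matches the input format. The following is a minimal sketch of a lookup for the formats in the table, assuming the service accepts the standard media types for them; only `application/pdf` appears in this page's own examples, and the helper name `content_type_for` is hypothetical.

```python
from pathlib import Path

# Standard media types for the formats in the table above.
# Assumption: Docling accepts these values; only application/pdf is
# confirmed by the usage examples in this document.
CONTENT_TYPES = {
    ".pdf": "application/pdf",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".html": "text/html",
    ".htm": "text/html",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".tif": "image/tiff",
    ".tiff": "image/tiff",
    ".md": "text/markdown",
}


def content_type_for(filename: str) -> str:
    """Return the media type for a supported file, or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    try:
        return CONTENT_TYPES[suffix]
    except KeyError:
        raise ValueError(f"unsupported input format: {filename!r}") from None
```

Rejecting unknown extensions on the client avoids sending a file that the service would have to refuse after the upload.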
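Usage
From ai-service (RAG Pipeline)
import httpx

# Upload the PDF and request Markdown output; the context manager
# closes the file handle once the request completes.
with open("report.pdf", "rb") as f:
    response = httpx.post(
        "http://akko-akko-docling:5001/v1/convert",
        files={"file": ("report.pdf", f, "application/pdf")},
        data={"output_format": "markdown"},
    )
response.raise_for_status()
structured_text = response.json()["content"]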
From Notebooks
import requests

# Convert a PDF to structured Markdown
with open("document.pdf", "rb") as f:
    resp = requests.post(
        "http://akko-akko-docling:5001/v1/convert",
        files={"file": ("document.pdf", f, "application/pdf")},
        data={"output_format": "markdown"},
    )
print(resp.json()["content"])
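Both snippets above rely on the client's default timeout. HTTPX defaults to 5 seconds and `requests` has no default at all, so a long conversion can either fail early or hang indefinitely. The sketch below wraps the same call with an explicit timeout and error checks. The function name `convert_to_markdown` and the 300-second default are assumptions, not part of AKKO; the endpoint and form fields are the ones used above.

```python
from pathlib import Path

# Endpoint taken from the examples above.
DOCLING_URL = "http://akko-akko-docling:5001/v1/convert"


def convert_to_markdown(path, post, timeout=300.0):
    """Upload a PDF and return the converted Markdown.

    `post` is any callable with the signature of httpx.post or
    requests.post, which makes the helper easy to test with a stub.
    The 300-second default is an assumed value, not a documented one.
    """
    path = Path(path)
    with path.open("rb") as fh:  # the handle is closed even if the upload fails
        response = post(
            DOCLING_URL,
            files={"file": (path.name, fh, "application/pdf")},
            data={"output_format": "markdown"},
            timeout=timeout,
        )
    response.raise_for_status()  # surface 4xx/5xx instead of a KeyError later
    payload = response.json()
    if "content" not in payload:
        raise ValueError("response has no 'content' field")
    return payload["content"]
```

A call would look like `convert_to_markdown("report.pdf", httpx.post)` or `convert_to_markdown("report.pdf", requests.post)`.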
Health Check
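The service exposes a `/health` endpoint on port 5001 (see Troubleshooting), which returns 503 while models are still loading. A minimal sketch of a client-side check using only the standard library follows. It assumes the same host name as the usage examples, and the retry count and delay are illustrative values rather than documented settings.

```python
import time
import urllib.error
import urllib.request

# Host and port come from the usage examples; the /health path is the one
# named in the troubleshooting section.
HEALTH_URL = "http://akko-akko-docling:5001/health"


def probe_health(url=HEALTH_URL, timeout=5.0):
    """Return the HTTP status code, or None if the service is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # e.g. 503 while models are still loading
    except (urllib.error.URLError, OSError):
        return None


def wait_until_ready(probe=probe_health, attempts=30, delay=2.0, sleep=time.sleep):
    """Poll until the probe returns 200; return True on success."""
    for attempt in range(attempts):
        if probe() == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Passing `probe` and `sleep` as parameters keeps the retry loop testable without a running service.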
Configuration
Kubernetes (Helm)
akko-docling:
  enabled: true
  image:
    repository: quay.io/docling-project/docling-serve-cpu
    tag: "latest"  # Pin to a specific version in production
  resources:
    requests:
      cpu: 250m
      memory: 512Mi
    limits:
      cpu: "2"
      memory: 4Gi
Memory Requirement
Docling loads ML models for layout analysis and OCR. It needs at least 512Mi at startup (the configured request) and can peak at 4Gi (the configured limit) when processing large documents with many pages.
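The figures above can be checked against a values file before deploying. The sketch below parses Kubernetes memory quantities with binary suffixes and compares a `resources` block with the 512Mi and 4Gi thresholds. The function names are hypothetical, and decimal suffixes such as `M` or `G` are deliberately not handled.

```python
# Binary suffixes used by Kubernetes memory quantities.
_BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def parse_memory(quantity: str) -> int:
    """Convert a quantity such as '512Mi' or '4Gi' to bytes.

    Only plain integers and binary suffixes are handled; decimal suffixes
    (M, G) and exponent notation are out of scope for this sketch.
    """
    quantity = quantity.strip()
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)


def memory_ok(resources: dict, min_request="512Mi", min_limit="4Gi") -> bool:
    """Check a Helm `resources` block against the figures quoted above."""
    request = parse_memory(resources["requests"]["memory"])
    limit = parse_memory(resources["limits"]["memory"])
    return request >= parse_memory(min_request) and limit >= parse_memory(min_limit)
```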
Network Access
Docling is an internal service with no internet access. It processes documents locally using CPU-based models. A NetworkPolicy restricts traffic as follows:
- Ingress: Only ai-service, ADEN, and Cockpit can reach port 5001
- Egress: DNS only (no internet access)
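The two rules above correspond to a standard `networking.k8s.io/v1` NetworkPolicy. The sketch below builds such a manifest as a Python dictionary. It is an illustration and not the policy shipped with the chart: the client label values are hypothetical placeholders, and only the `app.kubernetes.io/name=akko-docling` selector and the `akko` namespace are taken from the commands in this page.

```python
import json


def docling_network_policy(namespace="akko"):
    """Build a NetworkPolicy manifest matching the rules listed above.

    The client label values are hypothetical placeholders; the
    akko-docling selector is the one used by the kubectl commands
    in this document.
    """
    clients = ["ai-service", "aden", "cockpit"]  # assumed label values
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "akko-docling", "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": {"app.kubernetes.io/name": "akko-docling"}},
            "policyTypes": ["Ingress", "Egress"],
            "ingress": [{
                "from": [
                    {"podSelector": {"matchLabels": {"app.kubernetes.io/name": name}}}
                    for name in clients
                ],
                "ports": [{"protocol": "TCP", "port": 5001}],
            }],
            # DNS only: with no other egress rule, all other outbound
            # traffic from the pod is denied.
            "egress": [{
                "ports": [
                    {"protocol": "UDP", "port": 53},
                    {"protocol": "TCP", "port": 53},
                ],
            }],
        },
    }


if __name__ == "__main__":
    # kubectl accepts JSON manifests, so the output can be piped to
    # `kubectl apply -f -` for experimentation.
    print(json.dumps(docling_network_policy(), indent=2))
```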
Troubleshooting
Docling Pod CrashLoopBackOff (OOMKilled)
Symptoms: The Docling pod enters CrashLoopBackOff status. kubectl describe pod shows OOMKilled as the last termination reason.
Cause: Docling loads several ML models at startup (layout analysis, table structure recognition, OCR). If memory is too low, the kernel OOM-killer terminates the process.
Solution:
# Check current memory limits
kubectl get pod -n akko -l app.kubernetes.io/name=akko-docling -o jsonpath='{.items[0].spec.containers[0].resources}'
# Ensure memory limit is at least 4Gi
helm upgrade akko helm/akko/ -n akko -f helm/examples/values-dev.yaml \
  --set akko-docling.resources.limits.memory=4Gi
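The OOMKilled reason can also be confirmed programmatically. The sketch below inspects the parsed output of `kubectl get pod -n akko -l app.kubernetes.io/name=akko-docling -o json` and reports containers whose last termination reason was `OOMKilled`. The function name is hypothetical; the field path `status.containerStatuses[].lastState.terminated.reason` is part of the Kubernetes Pod API.

```python
def oom_killed_containers(pod_list: dict) -> list:
    """Return (pod, container) pairs whose last termination was OOMKilled.

    `pod_list` is the parsed JSON from `kubectl get pod ... -o json`.
    """
    hits = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        statuses = pod.get("status", {}).get("containerStatuses", [])
        for status in statuses:
            # lastState is empty for containers that have never restarted.
            terminated = status.get("lastState", {}).get("terminated") or {}
            if terminated.get("reason") == "OOMKilled":
                hits.append((pod_name, status["name"]))
    return hits
```

To use it, save the `kubectl` output to a file and pass the result of `json.load` to the function.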
Slow Document Processing
Symptoms: Document conversion takes 30+ seconds per page. CPU usage is at 100%.
Cause: Docling uses CPU-based OCR and layout analysis. Complex documents with many images, tables, or scanned pages require more processing time.
Solution:
# Check current CPU and memory usage (requires metrics-server)
kubectl top pod -n akko -l app.kubernetes.io/name=akko-docling
# Increase CPU limits for faster processing
helm upgrade akko helm/akko/ -n akko -f helm/examples/values-dev.yaml \
  --set akko-docling.resources.limits.cpu=4
Health Endpoint Returns 503
Symptoms: The /health endpoint returns 503 Service Unavailable. The pod is running but not ready.
Cause: Docling is still loading ML models. The readiness probe has not passed yet.
Solution: