Traefik
Purpose
Traefik is Lumie's active ingress controller. This page documents where Traefik is configured, how it receives traffic from OCI, how application Ingress objects attach middleware, and what operators should verify when edge routing changes.
Source Paths
| Path | Role |
|---|---|
| lumie-infra/platform/traefik-config/argocd.yaml | Argo CD application that manages the Traefik patch |
| lumie-infra/platform/traefik-config/helmchartconfig.yaml | HelmChartConfig for the k3s-bundled Traefik addon |
| lumie-infra/provision/terraform/nlb_0214.tf | TCP 443 passthrough from OCI NLB to workers |
| lumie-infra/applications/lumie/backend/manifests/strip-prefix.yaml | Production /api strip middleware |
| lumie-infra/applications/lumie/develop/common-values.yaml | Dev API ingress split and /api/grading strip middleware |
| lumie-infra/applications/lumie/frontend/manifests/ingress.yaml | Main frontend ingress |
| lumie-infra/applications/lumie/frontend/manifests/www-redirect.yaml | Traefik redirect middleware for www |
| lumie-infra/applications/lumie/frontend/manifests/custom-domain.example.yaml | Per-domain custom-domain ingress contract |
Public Surface
Traefik is not deployed as a standalone chart here. The active object is the k3s addon HelmChartConfig named traefik in kube-system.
| Surface | Contract |
|---|---|
| Listener ports | web and websecure entrypoints on the bundled addon |
| External path | OCI NLB forwards TCP 443 to worker nodes |
| Ingress class | traefik |
| Route sources | Standard Kubernetes Ingress objects plus Traefik CRD Middleware |
| TLS termination | At Traefik using cert-manager-managed secrets referenced by ingress rules |
Runtime Flow
K3s Addon Patch
The repo-managed Traefik configuration is currently a timeout patch:
# lumie-infra/platform/traefik-config/helmchartconfig.yaml
ports:
  web:
    transport:
      respondingTimeouts:
        readTimeout: 1200s
        writeTimeout: 1200s
        idleTimeout: 600s
The same values are applied to websecure. The inline comment explains the reason: slow multi-gigabyte image uploads to Zot were hitting HTTP 499 through the default Traefik timeouts.
This is cluster-wide behavior. It is not scoped to Zot alone.
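For orientation, the excerpt above would sit inside the addon's HelmChartConfig roughly as follows. This is a minimal sketch, not a copy of the checked-in file: the object name, namespace, and timeout values come from this page, while the surrounding layout (values nested under spec.valuesContent, with websecure repeating the web block) is assumed from the standard k3s HelmChartConfig format.

```yaml
# Sketch only: layout assumed from the k3s HelmChartConfig format.
# The checked-in file may contain additional keys or comments.
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    ports:
      web:
        transport:
          respondingTimeouts:
            readTimeout: 1200s
            writeTimeout: 1200s
            idleTimeout: 600s
      # Assumed to mirror the web block, per "the same values are applied to websecure".
      websecure:
        transport:
          respondingTimeouts:
            readTimeout: 1200s
            writeTimeout: 1200s
            idleTimeout: 600s
```

Because these timeouts are set per entrypoint rather than per route, every Ingress served through web or websecure inherits them.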
Ingress And Middleware Contract
Traefik behavior is driven by app manifests, not a centralized gateway config:
- Production backend traffic uses a Middleware that strips /api before forwarding to the monolith's /v1/** routes.
- The dev cluster uses two ingresses on dev.lumie-infra.com, one for /api/grading to grading-svc and one for /api to lumie-backend.
- Frontend www traffic uses a Traefik redirectRegex middleware to send www.lumie-edu.com to lumie-edu.com.
- White-label custom domains require one explicit Ingress per domain so Traefik loads the correct TLS secret into its SNI store.
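The strip and redirect patterns in the list above can be sketched as follows. This is an illustrative outline, not the contents of the checked-in manifests. The middleware name strip-api-prefix is taken from the search pattern in the Verification section; the namespace lumie, the Ingress name, the host app.example.com, the Service port, and the name www-redirect are hypothetical placeholders. The CRD apiVersion also depends on the bundled Traefik release (older releases use traefik.containo.us/v1alpha1).

```yaml
# Hypothetical sketch: namespace, host, port, and most names are placeholders.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: strip-api-prefix
  namespace: lumie
spec:
  stripPrefix:
    prefixes:
      - /api            # /api/v1/foo reaches the backend as /v1/foo
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: lumie-backend
  namespace: lumie
  annotations:
    # Reference format: <namespace>-<middleware-name>@kubernetescrd
    traefik.ingress.kubernetes.io/router.middlewares: lumie-strip-api-prefix@kubernetescrd
spec:
  ingressClassName: traefik
  rules:
    - host: app.example.com        # placeholder host
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: lumie-backend
                port:
                  number: 8080     # placeholder port
---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: www-redirect               # hypothetical name
  namespace: lumie
spec:
  redirectRegex:
    regex: '^https?://www\.lumie-edu\.com/(.*)'
    replacement: 'https://lumie-edu.com/${1}'
    permanent: true
```

A Middleware object has no effect on its own. It applies only to routers whose Ingress names it in the router.middlewares annotation, which is why a missing annotation leaves the /api prefix in place.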
Ownership Boundaries
| Concern | Owner |
|---|---|
| Node-level ingress reachability | Terraform NLB resources |
| Traefik addon shape and timeouts | platform/traefik-config |
| Host and path routing | Individual application ingress manifests |
| TLS secret issuance | cert-manager and the referenced issuer |
| Auth and tenancy enforcement | Application code, not Traefik |
lumie-infra/AGENTS.md is explicit here: there is no active API gateway in front of Traefik, and auth headers such as X-Tenant-Slug remain an application concern.
Operational Notes
- The live cluster inspected on June 14, 2026 had a running traefik pod in kube-system and a live HelmChartConfig matching the checked-in timeout patch.
- Dev ingress intentionally excludes the frontend app, so https://dev.lumie-infra.com/ returns 404 while /api routes work.
- Because the main OCI NLB is layer-4 only, SNI handling and certificate selection are entirely Traefik responsibilities.
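Since certificate selection happens in Traefik, each custom domain needs an Ingress whose tls block names that host and its secret. The following is a rough sketch of what such a per-domain Ingress might look like. It is not the contents of custom-domain.example.yaml: the domain academy.example.com, the object and secret names, the namespace, the issuer name, and the Service name and port are all hypothetical.

```yaml
# Hypothetical sketch: every name, host, and port below is a placeholder.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: custom-domain-academy-example-com
  namespace: lumie
  annotations:
    # Assumes a cert-manager ClusterIssuer exists under this name.
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  ingressClassName: traefik
  tls:
    - hosts:
        - academy.example.com
      # cert-manager writes the issued certificate here; Traefik serves it
      # when a client's SNI matches the host above.
      secretName: academy-example-com-tls
  rules:
    - host: academy.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: lumie-frontend
                port:
                  number: 3000
```

Without a matching tls entry, Traefik has no certificate to associate with the hostname and falls back to its default certificate.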
Failure Modes
| Failure point | Behavior |
|---|---|
| Timeout patch removed | Large Zot pushes can fail again with client disconnects |
| Middleware annotation missing | Backend sees /api/... instead of /v1/... and route handling breaks |
| Custom domain ingress omitted | Traefik serves the fallback self-signed cert for that hostname |
| Wrong ingress class | Rules exist in Git but are ignored by Traefik |
Verification
kubectl get helmchartconfig -n kube-system traefik -o yaml
kubectl get ingress -A
kubectl get middleware -A
kubectl logs -n kube-system deploy/traefik
rg -n "traefik|strip-api-prefix|redirectRegex|ingressClassName: traefik" \
  lumie-infra/platform/traefik-config \
  lumie-infra/applications/lumie