Traefik

Purpose

Traefik is Lumie's active ingress controller. This page documents where Traefik is configured, how it receives traffic from OCI, how application Ingress objects attach middleware, and what operators should verify when edge routing changes.

Source Paths

| Path | Role |
| --- | --- |
| lumie-infra/platform/traefik-config/argocd.yaml | Argo CD application that manages the Traefik patch |
| lumie-infra/platform/traefik-config/helmchartconfig.yaml | HelmChartConfig for the k3s-bundled Traefik addon |
| lumie-infra/provision/terraform/nlb_0214.tf | TCP 443 passthrough from OCI NLB to workers |
| lumie-infra/applications/lumie/backend/manifests/strip-prefix.yaml | Production /api strip middleware |
| lumie-infra/applications/lumie/develop/common-values.yaml | Dev API ingress split and /api/grading strip middleware |
| lumie-infra/applications/lumie/frontend/manifests/ingress.yaml | Main frontend ingress |
| lumie-infra/applications/lumie/frontend/manifests/www-redirect.yaml | Traefik redirect middleware for www |
| lumie-infra/applications/lumie/frontend/manifests/custom-domain.example.yaml | Per-domain custom-domain ingress contract |

Public Surface

Traefik is not deployed as a standalone chart here. The active object is the k3s addon HelmChartConfig named traefik in kube-system.

| Surface | Contract |
| --- | --- |
| Listener ports | web and websecure entrypoints on the bundled addon |
| External path | OCI NLB forwards TCP 443 to worker nodes |
| Ingress class | traefik |
| Route sources | Standard Kubernetes Ingress objects plus Traefik CRD Middleware |
| TLS termination | At Traefik using cert-manager-managed secrets referenced by ingress rules |

Runtime Flow

K3s Addon Patch

The repo-managed Traefik configuration is currently a timeout patch:

# lumie-infra/platform/traefik-config/helmchartconfig.yaml
ports:
  web:
    transport:
      respondingTimeouts:
        readTimeout: 1200s
        writeTimeout: 1200s
        idleTimeout: 600s

The same values are applied to websecure. The inline comment explains the reason: slow multi-gigabyte image uploads to Zot were failing with HTTP 499 (client closed request) under the default Traefik timeouts.

This is cluster-wide behavior. Every route served through these entrypoints inherits the longer timeouts, not only Zot.
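For orientation, a k3s HelmChartConfig normally carries chart values as a string under spec.valuesContent. The following is a sketch of how the timeout patch might look as a complete manifest, assuming that standard wrapper; the checked-in file may contain additional comments or keys that are not shown here.

```yaml
# Sketch only: assumes the standard k3s HelmChartConfig wrapper.
# The object name and namespace come from this page; everything else
# mirrors the excerpt above, repeated for websecure.
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: traefik
  namespace: kube-system
spec:
  valuesContent: |-
    ports:
      web:
        transport:
          respondingTimeouts:
            readTimeout: 1200s
            writeTimeout: 1200s
            idleTimeout: 600s
      websecure:
        transport:
          respondingTimeouts:
            readTimeout: 1200s
            writeTimeout: 1200s
            idleTimeout: 600s
```

Because the values sit under ports.web and ports.websecure rather than under a single router, the timeouts apply at the entrypoint level, which is why they affect every host behind Traefik.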

Ingress And Middleware Contract

Traefik behavior is driven by app manifests, not a centralized gateway config:

  • Production backend traffic uses a Middleware that strips /api before forwarding to the monolith's /v1/** routes.
  • The dev cluster uses two ingresses on dev.lumie-infra.com, one for /api/grading to grading-svc and one for /api to lumie-backend.
  • Frontend www traffic uses a Traefik redirectRegex middleware to send www.lumie-edu.com to lumie-edu.com.
  • White-label custom domains require one explicit Ingress per domain so Traefik loads the correct TLS secret into its SNI store.
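The strip and redirect behavior described above can be sketched with Traefik's stripPrefix and redirectRegex middlewares. The manifests below are illustrative, not copies of the repository files: the lumie namespace, the api.example.com host, the service port, the www-redirect name, and the permanent flag are all assumptions. The middleware name strip-api-prefix is taken from the search pattern used in the Verification section.

```yaml
# Sketch only: namespace, host, port, and the www-redirect name are hypothetical.
# Older Traefik releases use apiVersion traefik.containo.us/v1alpha1 instead.
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: strip-api-prefix
  namespace: lumie
spec:
  stripPrefix:
    prefixes:
      - /api          # /api/v1/users is forwarded as /v1/users
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: lumie-backend
  namespace: lumie
  annotations:
    # Format: <middleware-namespace>-<middleware-name>@kubernetescrd
    traefik.ingress.kubernetes.io/router.middlewares: lumie-strip-api-prefix@kubernetescrd
spec:
  ingressClassName: traefik
  rules:
    - host: api.example.com       # hypothetical host
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: lumie-backend
                port:
                  number: 8080    # hypothetical port
---
apiVersion: traefik.io/v1alpha1
kind: Middleware
metadata:
  name: www-redirect              # hypothetical name
  namespace: lumie
spec:
  redirectRegex:
    regex: '^https?://www\.lumie-edu\.com/(.*)'
    replacement: 'https://lumie-edu.com/${1}'
    permanent: true               # assumption: the real manifest may use a temporary redirect
```

The annotation value is the part most likely to go wrong in practice: it must combine the middleware's namespace and name, joined by a hyphen, followed by @kubernetescrd.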

Ownership Boundaries

| Concern | Owner |
| --- | --- |
| Node-level ingress reachability | Terraform NLB resources |
| Traefik addon shape and timeouts | platform/traefik-config |
| Host and path routing | Individual application ingress manifests |
| TLS secret issuance | cert-manager and the referenced issuer |
| Auth and tenancy enforcement | Application code, not Traefik |

lumie-infra/AGENTS.md is explicit here: there is no active API gateway in front of Traefik, and auth headers such as X-Tenant-Slug remain an application concern.

Operational Notes

  • The live cluster inspected on June 14, 2026, had a running traefik pod in kube-system and a live HelmChartConfig matching the checked-in timeout patch.
  • Dev ingress intentionally excludes the frontend app, so https://dev.lumie-infra.com/ returns 404 while /api routes work.
  • Because the main OCI NLB is layer-4 only, SNI handling and certificate selection are entirely Traefik responsibilities.
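The dev routing split, and the reason the root path returns 404, can be pictured with two Ingress objects that declare only /api paths. This is a sketch under assumptions: the lumie-dev namespace, the object names, the service ports, and the strip-grading-prefix middleware name are hypothetical, and the real definitions are generated from common-values.yaml.

```yaml
# Sketch only: namespace, object names, ports, and middleware name are hypothetical.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dev-grading
  namespace: lumie-dev
  annotations:
    # Refers to a stripPrefix middleware for /api/grading (name assumed)
    traefik.ingress.kubernetes.io/router.middlewares: lumie-dev-strip-grading-prefix@kubernetescrd
spec:
  ingressClassName: traefik
  rules:
    - host: dev.lumie-infra.com
      http:
        paths:
          - path: /api/grading
            pathType: Prefix
            backend:
              service:
                name: grading-svc
                port:
                  number: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: dev-backend
  namespace: lumie-dev
spec:
  ingressClassName: traefik
  rules:
    - host: dev.lumie-infra.com
      http:
        paths:
          - path: /api
            pathType: Prefix
            backend:
              service:
                name: lumie-backend
                port:
                  number: 8080
```

With no rule covering /, Traefik has no router for the root path and answers 404. By default Traefik gives longer rules higher priority, so /api/grading should win over /api without an explicit priority setting.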

Failure Modes

| Failure point | Behavior |
| --- | --- |
| Timeout patch removed | Large Zot pushes can fail again with client disconnects |
| Middleware annotation missing | Backend sees /api/... instead of /v1/... and route handling breaks |
| Custom domain ingress omitted | Traefik serves the fallback self-signed cert for that hostname |
| Wrong ingress class | Rules exist in Git but are ignored by Traefik |
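The last two failure modes both come down to fields on the per-domain Ingress. A minimal sketch of such an object follows, assuming a hypothetical custom domain academy.example.com, a hypothetical issuer named letsencrypt-prod, and a hypothetical frontend service name and port; the authoritative contract is custom-domain.example.yaml.

```yaml
# Sketch only: domain, issuer, secret, service name, and port are hypothetical.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: custom-domain-academy-example-com
  namespace: lumie
  annotations:
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  # Without this class, the rule is ignored by Traefik.
  ingressClassName: traefik
  tls:
    # Without this entry, Traefik has no certificate for the hostname
    # and falls back to its self-signed default.
    - hosts:
        - academy.example.com
      secretName: academy-example-com-tls
  rules:
    - host: academy.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: lumie-frontend
                port:
                  number: 3000
```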

Verification

kubectl get helmchartconfig -n kube-system traefik -o yaml
kubectl get ingress -A
kubectl get middleware -A
kubectl logs -n kube-system deploy/traefik
rg -n "traefik|strip-api-prefix|redirectRegex|ingressClassName: traefik" \
  lumie-infra/platform/traefik-config \
  lumie-infra/applications/lumie