KEDA
Purpose
KEDA provides autoscaling for Lumie workloads that scale on CPU or RabbitMQ queue depth. The platform layer installs the KEDA operator, while individual applications own their own ScaledObject manifests.
This page is a reference document for developers changing autoscaling triggers, trigger auth, or queue-depth scaling thresholds.
Source Paths
| Path | Role |
|---|---|
| lumie-infra/platform/keda/argocd.yaml | Argo CD application for KEDA |
| lumie-infra/platform/keda/helm-values.yaml | KEDA operator, metrics server, and webhook images and resources |
| lumie-infra/applications/lumie/backend/manifests/scaled-object.yaml | Backend CPU autoscaling |
| lumie-infra/applications/lumie/frontend/manifests/scaled-object.yaml | Frontend CPU autoscaling |
| lumie-infra/applications/lumie/worker/grading-svc/manifests/{scaled-object.yaml,trigger-auth.yaml} | Grading queue-depth scaling and shared RabbitMQ trigger auth |
| lumie-infra/applications/lumie/worker/report-svc/manifests/scaled-object.yaml | Report queue-depth scaling |
Public Surface
| Surface | Namespace | Notes |
|---|---|---|
| KEDA operator | keda-system | Reconciles ScaledObject and TriggerAuthentication resources |
| Metrics API server | keda-system | Feeds external metrics to HPA |
| Admission webhooks | keda-system | Validate and mutate KEDA resources |
| ScaledObject CRs | App namespaces | Define the real autoscaling contracts |
Runtime Flow
Active ScaledObjects
| Workload | Namespace | Trigger | Min | Max | Source |
|---|---|---|---|---|---|
| lumie-backend | lumie-backend | CPU utilization 70 | 2 | 5 | applications/lumie/backend/manifests/scaled-object.yaml |
| lumie-frontend | lumie-frontend | CPU utilization 70 | 2 | 5 | applications/lumie/frontend/manifests/scaled-object.yaml |
| grading-svc | lumie-worker | RabbitMQ queue grading.omr-request, queue length 20 | 4 | 8 | applications/lumie/worker/grading-svc/manifests/scaled-object.yaml |
| report-svc | lumie-worker | RabbitMQ queue report.generation-request, queue length 40 | 4 | 5 | applications/lumie/worker/report-svc/manifests/scaled-object.yaml |
The backend ScaledObject also defines explicit scale-up stabilization so JVM warmup spikes do not immediately drive the workload to maximum replicas.
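As a rough sketch of how the backend row and the stabilization note could be expressed together: the replica bounds and the CPU target of 70 come from the table above, while the Deployment name, the 120-second window, and the one-pod-per-minute policy are illustrative assumptions, not values read from scaled-object.yaml.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: lumie-backend
  namespace: lumie-backend
spec:
  scaleTargetRef:
    name: lumie-backend            # assumed Deployment name
  minReplicaCount: 2               # Min column
  maxReplicaCount: 5               # Max column
  advanced:
    horizontalPodAutoscalerConfig:
      behavior:
        scaleUp:
          # Assumed values: require sustained load before adding replicas,
          # so a short JVM warmup spike does not jump straight to max.
          stabilizationWindowSeconds: 120
          policies:
            - type: Pods
              value: 1
              periodSeconds: 60
  triggers:
    - type: cpu
      metricType: Utilization
      metadata:
        value: "70"                # target average CPU utilization (percent)
```

KEDA passes the `behavior` block through to the HPA it generates, so the scale-up window is evaluated by the HPA controller rather than by the KEDA operator. Check the real manifest for the actual window and policies before changing them.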
Queue Trigger Auth
The inspected Git-managed auth resource is:
kind: TriggerAuthentication
metadata:
  name: rabbitmq-auth
spec:
  secretTargetRef:
    - parameter: host
      name: rabbitmq-connection
      key: host
Live cluster inspection on June 14, 2026 showed:
- rabbitmq-auth exists in namespace lumie-worker;
- its status lists both grading-svc and report-svc as consumers;
- the referenced Secret rabbitmq-connection also exists in lumie-worker.
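For orientation, a queue-depth ScaledObject that consumes this shared auth resource might look like the sketch below. The queue name, threshold, and replica bounds are taken from the grading-svc row of the Active ScaledObjects table. The Deployment name and the `mode: QueueLength` setting are assumptions and may differ from the real manifest.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: grading-svc
  namespace: lumie-worker          # must match the namespace of rabbitmq-auth
spec:
  scaleTargetRef:
    name: grading-svc              # assumed Deployment name
  minReplicaCount: 4
  maxReplicaCount: 8
  triggers:
    - type: rabbitmq
      metadata:
        queueName: grading.omr-request
        mode: QueueLength          # assumed: scale on message count, not rate
        value: "20"                # target messages per replica
      authenticationRef:
        name: rabbitmq-auth        # supplies the host parameter from the Secret
```

A TriggerAuthentication is namespaced, so any ScaledObject in lumie-worker can reference rabbitmq-auth by name. That is consistent with both grading-svc and report-svc appearing as consumers in its status.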
Ownership Boundaries
| Responsibility | Owner |
|---|---|
| KEDA controller install | platform/keda/** |
| Per-workload scaling thresholds | Each workload's ScaledObject manifest |
| RabbitMQ host credential used by queue triggers | Shared rabbitmq-auth plus rabbitmq-connection Secret in lumie-worker |
Contract Drift
The inspected repo and the live cluster state do not line up perfectly:
| Source | Claim |
|---|---|
| applications/lumie/worker/grading-svc/kustomization.yaml | Includes manifests/trigger-auth.yaml |
| applications/lumie/worker/report-svc/kustomization.yaml | Does not include its own TriggerAuthentication, even though report-svc references rabbitmq-auth |
| Live cluster on June 14, 2026 | rabbitmq-auth.status.scaledobjects includes both grading-svc and report-svc |
| Inspected repo tree | No Git-managed source for the lumie-worker/rabbitmq-connection Secret was found |
Treat rabbitmq-auth as a shared queue-scaling dependency today, and do not assume the rabbitmq-connection Secret is fully declared in lumie-infra until its source is added or documented.
Verification
cd lumie-infra
rg -n "kind: ScaledObject|TriggerAuthentication|rabbitmq-auth|queueName|metricType: Utilization" \
  platform/keda applications/lumie
kubectl get scaledobject,triggerauthentication -A
kubectl get triggerauthentication rabbitmq-auth -n lumie-worker -o yaml