RabbitMQ
Purpose
RabbitMQ is Lumie's event bus. The platform installs the RabbitMQ Cluster Operator and Messaging Topology Operator, defines shared exchanges and queues in platform/rabbitmq, and lets consuming applications own their own user credentials and permissions in their application directories.
This page is a reference document for developers changing queue topology, broker auth, per-service credentials, or worker and backend messaging boundaries. For autoscaling against queue depth, see KEDA.
Source Paths
| Path | Role |
|---|---|
| lumie-infra/platform/rabbitmq-operator/argocd.yaml | Installs the RabbitMQ Cluster Operator and Messaging Topology Operator |
| lumie-infra/platform/rabbitmq-operator/helm-values.yaml | Operator image sources and topology-operator enablement |
| lumie-infra/platform/rabbitmq/base/rabbitmq-cluster.yaml | Base RabbitmqCluster spec and broker config |
| lumie-infra/platform/rabbitmq/base/topology/*.yaml | Shared exchanges, queues, bindings, and policies |
| lumie-infra/platform/rabbitmq/overlays/prod/kustomization.yaml | Production overlay in namespace lumie-event |
| lumie-infra/platform/rabbitmq/overlays/dev/{kustomization.yaml,app-users.yaml} | Development overlay in namespace lumie-dev |
| lumie-infra/applications/lumie/backend/manifests/rabbitmq-user.yaml | Backend user and permissions in consumer namespace |
| lumie-infra/applications/lumie/worker/{grading-svc,report-svc}/manifests/rabbitmq-user.yaml | Worker users and scoped permissions in lumie-worker |
Public Surface
| Surface | Namespace | Notes |
|---|---|---|
| RabbitMQ operator | rabbitmq-system | Installs cluster and topology CRDs |
| Production broker | lumie-event | Central broker for backend and worker traffic |
| Development broker | lumie-dev | Separate broker and app users for the shared dev environment |
| Shared exchanges | lumie.commands, lumie.dlx | Declared as Topology Operator Exchange CRs |
| Shared queues | grading.omr-request, grading.omr-callback, report.generation-request, report.generation-callback, DLQs | Declared as Topology Operator Queue CRs |
Runtime Flow
Shared Topology Contract
| Resource | Active contract |
|---|---|
| lumie.commands | Direct exchange for request and callback routing |
| lumie.dlx | Topic exchange for dead letters |
| grading.omr-request | Quorum queue, TTL 600000 ms (10 min), DLX lumie.dlx |
| report.generation-request | Quorum queue, TTL 1800000 ms (30 min), DLX lumie.dlx |
| grading.omr-callback | Quorum queue, TTL 600000 ms (10 min), DLX lumie.dlx |
| report.generation-callback | Quorum queue, TTL 1800000 ms (30 min), DLX lumie.dlx |
| work-queue-config policy | delivery-limit: 5 plus TTL enforcement on request and callback queues |
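As a rough illustration of how one row of this contract might be expressed as Messaging Topology Operator resources, the sketch below declares grading.omr-request and a policy carrying the delivery limit. The metadata names, the vhost, the policy pattern, and the choice to set the TTL and dead-letter exchange as queue arguments are assumptions made for the example; the authoritative manifests are the ones under platform/rabbitmq/base/topology/.

```yaml
# Sketch only: names, vhost, and argument placement are assumptions.
apiVersion: rabbitmq.com/v1beta1
kind: Queue
metadata:
  name: grading-omr-request        # hypothetical Kubernetes object name
  namespace: lumie-event
spec:
  name: grading.omr-request        # queue name as seen by the broker
  vhost: "/"                       # assumed default vhost
  type: quorum
  durable: true                    # quorum queues must be durable
  autoDelete: false
  arguments:
    x-dead-letter-exchange: lumie.dlx
    x-message-ttl: 600000          # milliseconds (10 min); assumed to be a message TTL
  rabbitmqClusterReference:
    name: rabbitmq
---
apiVersion: rabbitmq.com/v1beta1
kind: Policy
metadata:
  name: work-queue-config
  namespace: lumie-event
spec:
  name: work-queue-config
  vhost: "/"
  # Hypothetical pattern covering the four request and callback queues.
  pattern: '^(grading|report)\..*-(request|callback)$'
  applyTo: queues
  definition:
    delivery-limit: 5
    # TTL keys are left out here: the two queue families use different
    # values, and how the real policy enforces them is not shown on this page.
  rabbitmqClusterReference:
    name: rabbitmq
```

RabbitMQ applies at most one policy to a queue at a time, so any TTL carried by a single policy would have to be the same for every queue its pattern matches.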
The broker exposes per-object metrics through the PodMonitor in platform/rabbitmq/base/podmonitor.yaml. Per-queue alerting and KEDA inspection both depend on those metrics.
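For orientation, a PodMonitor that scrapes per-object metrics from an operator-managed broker could look roughly like the following. This is a sketch, not a copy of podmonitor.yaml: the object name, scrape interval, and label selector are assumptions, and it relies on the Cluster Operator's convention of naming the metrics port prometheus.

```yaml
# Sketch only: metadata, interval, and selector are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: rabbitmq                   # hypothetical name
  namespace: lumie-event
spec:
  podMetricsEndpoints:
    - port: prometheus             # metrics port name on operator-created pods
      path: /metrics/per-object    # per-queue series instead of aggregated ones
      interval: 30s                # assumed scrape interval
  selector:
    matchLabels:
      app.kubernetes.io/component: rabbitmq
      app.kubernetes.io/name: rabbitmq
```

Scraping the per-object endpoint is what yields one series per queue; the default /metrics endpoint returns aggregated values, which is not enough for per-queue alert rules.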
Ownership Boundaries
| Responsibility | Owner |
|---|---|
| Operator install and CRDs | platform/rabbitmq-operator/** |
| Shared exchanges, queues, bindings, and base broker config | platform/rabbitmq/base/** |
| Namespace wiring and cluster-reference adjustments | platform/rabbitmq/overlays/{prod,dev} |
| Per-consumer user credentials and permission regexes | Application-specific manifests under applications/lumie/** |
This worker permission excerpt is the key least-privilege invariant:
permissions:
  configure: '^grading\..*$'
  write: '^(grading\..*|lumie\..*)$'
  read: '^grading\..*$'
The backend user keeps broader access because it orchestrates flows and may use the management API.
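To show where that excerpt sits, here is a minimal sketch of a worker-owned User and Permission pair that points at the production broker from the lumie-worker namespace. The object names and the vhost are hypothetical; the actual definitions are in the rabbitmq-user.yaml files listed under Source Paths.

```yaml
# Sketch only: object names and vhost are assumptions.
apiVersion: rabbitmq.com/v1beta1
kind: User
metadata:
  name: grading-svc                # hypothetical user object name
  namespace: lumie-worker
spec:
  rabbitmqClusterReference:
    name: rabbitmq
    namespace: lumie-event         # broker lives in a different namespace
---
apiVersion: rabbitmq.com/v1beta1
kind: Permission
metadata:
  name: grading-svc-permission     # hypothetical
  namespace: lumie-worker
spec:
  vhost: "/"                       # assumed default vhost
  userReference:
    name: grading-svc              # refers to the User object above
  permissions:
    configure: '^grading\..*$'
    write: '^(grading\..*|lumie\..*)$'
    read: '^grading\..*$'
  rabbitmqClusterReference:
    name: rabbitmq
    namespace: lumie-event
```

A cross-namespace reference like this is only honored when the RabbitmqCluster carries the rabbitmq.com/topology-allowed-namespaces annotation listing the consumer namespace. With these regexes the worker can publish to lumie.* exchanges but can only read from and configure grading.* resources.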
Overlay Behavior And Drift
The base and overlays do not describe the same final broker contract:
| Source | Claim |
|---|---|
| platform/rabbitmq/base/rabbitmq-cluster.yaml and configmap.yaml | Still include a definitions-file path and a ConfigMap-backed user definition |
| platform/rabbitmq/overlays/prod/kustomization.yaml | Deletes the definitions ConfigMap and removes load_definitions, switching prod to operator-native default-user plus User CRs |
| platform/rabbitmq/overlays/dev/kustomization.yaml | Applies the same operator-native pattern to dev and creates app users in app-users.yaml |
Treat the overlay behavior as the active contract. The definitions resources in the base are staging artifacts that the overlays intentionally remove.
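The general shape of such an overlay can be sketched with standard Kustomize patches. Everything specific below is illustrative: the ConfigMap name, the replacement broker config, and the exact patch style are assumptions, and the real overlays may achieve the same result differently.

```yaml
# Sketch only: ConfigMap name and replacement config are hypothetical.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: lumie-event
resources:
  - ../../base
patches:
  # Drop the definitions ConfigMap inherited from the base.
  - patch: |-
      $patch: delete
      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: rabbitmq-definitions
  # Replace the broker config wholesale so load_definitions no longer appears.
  - target:
      group: rabbitmq.com
      version: v1beta1
      kind: RabbitmqCluster
      name: rabbitmq
    patch: |-
      - op: replace
        path: /spec/rabbitmq/additionalConfig
        value: |
          cluster_partition_handling = pause_minority
```

Because additionalConfig is a single multi-line string, a patch cannot remove one line from it; the overlay has to supply the whole value it wants. Rendering the overlay with kustomize build and searching the output for load_definitions is a quick way to confirm that nothing from the base leaks through.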
The live cluster adds one more operational note: on June 14, 2026, kubectl get rabbitmqcluster rabbitmq -n lumie-event -o yaml reported ReconcileSuccess=False with a message that scale-down from 3 nodes to 1 node is unsupported. That condition is a warning about live state and does not indicate a different desired-state manifest.
Verification
cd lumie-infra
rg -n "RabbitmqCluster|delivery-limit|grading\\.omr|report\\.generation|topology-allowed-namespaces|load_definitions" \
platform/rabbitmq platform/rabbitmq-operator applications/lumie
kubectl get rabbitmqclusters,users,permissions,queues,exchanges -A
kubectl get application rabbitmq rabbitmq-dev rabbitmq-operator -n argocd