Infra DB

infra-db is the shared PostgreSQL cluster for platform applications. It is separate from the main product database and lives under lumie-infra/storage/infra-db/**.

Source paths

  • lumie-infra/storage/infra-db/argocd.yaml
  • lumie-infra/storage/infra-db/manifests/cluster.yaml
  • lumie-infra/storage/infra-db/manifests/scheduled-backup.yaml
  • lumie-infra/storage/infra-db/manifests/r2-barman-vss.yaml
  • lumie-infra/storage/infra-db/manifests/vault-static-secret.yaml
  • Consumers:
    • lumie-infra/observability/grafana/common-values.yaml
    • lumie-infra/applications/coder/manifests/vault-static-secret.yaml
    • lumie-infra/applications/umami/common-values.yaml
    • lumie-infra/security/keycloak/helm-values.yaml

Runtime role

Cluster contract

  • instances: 3
  • imageName: zot.lumie-infra.com/storage/postgresql-wal2json:18.1
  • enableSuperuserAccess: true
  • wal_level: logical
  • max_replication_slots: "8"
  • storageClass: local-path-retain
  • R2-backed Barman backup enabled with one scheduled backup at 04:00 KST

Source path: lumie-infra/storage/infra-db/manifests/cluster.yaml
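
The contract fields above can be compared against the live Cluster object. The following is a minimal sketch, not a vetted runbook: the helper name infra_db_contract is hypothetical, and it assumes the Cluster resource is named infra-db in the infra-db namespace (the namespace matches the verification commands on this page; the resource name is not confirmed by the manifest excerpt here). The JSONPath expressions follow the CloudNativePG Cluster spec layout.

```shell
# Hypothetical helper: print the live values of the documented contract fields,
# one per line, in the order: instances, imageName, enableSuperuserAccess,
# wal_level, max_replication_slots, storageClass.
# Assumption: Cluster name "infra-db" in namespace "infra-db" unless overridden.
infra_db_contract() (
  cluster="${1:-infra-db}"   # assumed resource name
  ns="${2:-infra-db}"
  kubectl get clusters.postgresql.cnpg.io "$cluster" -n "$ns" \
    -o jsonpath='{.spec.instances}{"\n"}{.spec.imageName}{"\n"}{.spec.enableSuperuserAccess}{"\n"}{.spec.postgresql.parameters.wal_level}{"\n"}{.spec.postgresql.parameters.max_replication_slots}{"\n"}{.spec.storage.storageClass}{"\n"}'
)
```

Run infra_db_contract with no arguments for the assumed defaults, or pass a cluster name and namespace explicitly. Any value that differs from the list above points to either an out-of-date page or an unsynced Argo CD application.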

Bootstrap SQL and current drift

The repo manifest bootstraps service databases with inline postInitSQL, for example:

- "CREATE USER grafana WITH PASSWORD 'VAULT_PASSWORD'"
- "CREATE DATABASE grafana OWNER grafana"

Source path: lumie-infra/storage/infra-db/manifests/cluster.yaml

This bootstrap SQL does not match how consumers currently build their DSNs, which read credentials from Vault-managed secrets such as POSTGRES_PASSWORD. Treat this as contract drift in the repo:

  • the consumer side clearly expects Vault-managed credentials
  • the cluster bootstrap side still shows a literal VAULT_PASSWORD placeholder

Do not paper over this drift during operations work. Verify the live cluster state before changing platform app credentials or relying on bootstrap recreation behavior.
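
One way to check the live state is to list the login roles and databases that actually exist on the current primary. This is a sketch under stated assumptions: the helper name infra_db_live_roles is hypothetical, the Cluster resource is assumed to be named infra-db, the PostgreSQL container is assumed to be named postgres, and local psql access as the postgres user is assumed to work inside the pod. It only reads catalog tables and changes nothing.

```shell
# Hypothetical helper: list login roles and non-template databases on the
# current primary, as "role|<name>" and "database|<name>" lines.
# Assumptions: Cluster name "infra-db", container "postgres", local psql access.
infra_db_live_roles() (
  cluster="${1:-infra-db}"   # assumed resource name
  ns="${2:-infra-db}"
  # .status.currentPrimary holds the pod name of the current primary instance.
  primary="$(kubectl get clusters.postgresql.cnpg.io "$cluster" -n "$ns" \
    -o jsonpath='{.status.currentPrimary}')"
  if [ -z "$primary" ]; then
    echo "no primary reported for $cluster in $ns" >&2
    exit 1
  fi
  kubectl exec -n "$ns" "$primary" -c postgres -- \
    psql -U postgres -At \
      -c "SELECT 'role', rolname FROM pg_roles WHERE rolcanlogin ORDER BY rolname" \
      -c "SELECT 'database', datname FROM pg_database WHERE NOT datistemplate ORDER BY datname"
)
```

If a role such as grafana exists here, its password still has to be compared against the Vault-managed value by testing a consumer connection, since the catalog does not expose plaintext passwords.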

Ownership boundaries

  • infra-db is a platform database, not a product application database.
  • It is reconciled directly from raw manifests, not from the shared CNPG Helm template used by lumie-db.
  • It still depends on the shared CNPG operator installed from CloudNativePG.

Failure modes

  • If the R2 credential secret r2-barman-creds fails to refresh, backups stop even though the database itself can remain healthy.
  • Because of the bootstrap SQL drift, the source file that describes fresh-cluster recovery may not describe the credentials that consumers use in steady state.
  • enableSuperuserAccess: true is necessary for platform administration, but it also means this cluster must never be treated like a tenant-safe runtime data plane.
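
The backup failure mode above is easy to miss because the pods stay healthy. A minimal sketch of a check follows; the helper name infra_db_backup_health is hypothetical, and the second command assumes the secret is synced by a Vault Secrets Operator VaultStaticSecret, which the r2-barman-vss.yaml filename suggests but this page does not confirm.

```shell
# Hypothetical helper: show the R2 credential secret, its assumed sync object,
# and the Backup objects ordered oldest to newest.
infra_db_backup_health() (
  ns="${1:-infra-db}"
  # The Kubernetes Secret named in the failure mode; stop early if it is missing.
  kubectl get secret r2-barman-creds -n "$ns" || exit 1
  # Assumption: a Vault Secrets Operator VaultStaticSecret manages that secret.
  kubectl get vaultstaticsecrets.secrets.hashicorp.com -n "$ns"
  # Most recent Backup last; a stale newest entry suggests backups have stopped.
  kubectl get backups.postgresql.cnpg.io -n "$ns" --sort-by=.metadata.creationTimestamp
)
```

With one scheduled backup per day, a newest Backup object older than roughly a day is worth investigating even when every pod reports Running.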

Verification

kubectl get applications.argoproj.io -n argocd infra-db
kubectl get clusters.postgresql.cnpg.io -n infra-db
kubectl get scheduledbackups.postgresql.cnpg.io -n infra-db
kubectl get backups.postgresql.cnpg.io -n infra-db
kubectl get pods -n infra-db

Observability

  • CNPG cluster monitoring is enabled with a PodMonitor.
  • Grafana includes a CloudNativePG dashboard JSON that is also useful for infra-db.
  • Platform applications that depend on infra-db often show symptoms at the application layer first, so pair cluster checks with consumer logs.