Grafana

Grafana is Lumie's operator-facing visualization hub. It is deployed separately from Prometheus and stores its state in the shared infra-db cluster instead of a local SQLite database or PVC.

Source paths

  • lumie-infra/observability/grafana/argocd.yaml
  • lumie-infra/observability/grafana/helm-values.yaml
  • lumie-infra/observability/grafana/common-values.yaml
  • lumie-infra/observability/grafana/dashboards/*.json
  • lumie-infra/security/teleport/agent/helm-values.yaml

Runtime contract

  • Runs one replica in the grafana namespace.
  • Uses PostgreSQL backend storage in infra-db, not local persistence.
  • Reads database connection fields from the grafana-db-app secret rendered by VaultStaticSecret.
  • Exposes these datasources:
    • Thanos as the default Prometheus-compatible source
    • Prometheus as a secondary metrics source
    • Loki
    • Alertmanager
    • Tempo

Datasource surface

- name: Thanos
  url: http://thanos-query.thanos.svc.cluster.local:9090
  isDefault: true
- name: Loki
  url: http://loki.loki.svc.cluster.local:3100
- name: Tempo
  url: http://tempo.tempo.svc.cluster.local:3100

Source path: lumie-infra/observability/grafana/helm-values.yaml

Access boundary

  • Helm ingress is disabled.
  • Teleport publishes the grafana app and proxies http://grafana.grafana.svc.cluster.local:80.
  • Although ingress is disabled, grafana.ini.server.root_url is set to https://grafana.lumie-infra.com. That is the external origin Grafana expects when it is reached through the reverse proxy.
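
The access boundary above can be spot-checked from a shell with cluster access. The following is a minimal sketch, not part of the repo: the function names are hypothetical, and the Service name grafana and port 80 are inferred from the in-cluster URL that Teleport proxies.

```shell
#!/usr/bin/env bash
# Sketch: confirm that no Ingress exists for Grafana and that the Service
# Teleport proxies exposes port 80. Function names are illustrative only.

# Prints the number of Ingress objects in the grafana namespace (expected: 0).
grafana_ingress_count() {
  # grep -c exits non-zero when it counts 0 lines, so the status is ignored.
  kubectl get ingress -n grafana -o name | grep -c . || true
}

# Prints the ports exposed by the grafana Service, space-separated.
grafana_service_ports() {
  kubectl get svc grafana -n grafana -o jsonpath='{.spec.ports[*].port}'
}

# Returns 0 when there is no Ingress and port 80 is exposed, 1 otherwise.
check_access_boundary() {
  local ingresses ports
  ingresses="$(grafana_ingress_count)"
  ports="$(grafana_service_ports)"
  if [ "$ingresses" -ne 0 ]; then
    echo "unexpected: $ingresses Ingress object(s) in namespace grafana" >&2
    return 1
  fi
  case " $ports " in
    *" 80 "*) echo "ok: no Ingress, Service grafana exposes port 80" ;;
    *) echo "unexpected: Service grafana ports are '$ports'" >&2; return 1 ;;
  esac
}
```

Running check_access_boundary prints a single ok line when the boundary matches this page and returns non-zero otherwise. It does not inspect the Teleport agent configuration, which lives in lumie-infra/security/teleport/agent/helm-values.yaml.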

Dashboard management

  • JSON dashboards are stored in lumie-infra/observability/grafana/dashboards/.
  • The Helm values explicitly say dashboards are manually imported through the UI.
  • Because Grafana stores state in PostgreSQL, imported dashboards survive pod recreation even though persistence.enabled is false.

Failure modes

  • If infra-db is degraded or its secret wiring drifts, Grafana fails before dashboards can load even when Prometheus, Loki, and Tempo are healthy.
  • Dashboards do not auto-provision from the JSON files. The repo currently treats those files as reference artifacts, not an auto-mounted provisioning contract, so a dashboard that exists only in Git does not appear in Grafana until someone imports it.
  • Anonymous viewer access is enabled and the login form is disabled, so access control depends on the surrounding proxy (Teleport) and the operator access path, not on Grafana-local credentials.
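
To separate a database failure from a datasource failure, one option is Grafana's /api/health endpoint, which includes a "database" field in its JSON response. The sketch below is an assumption-laden helper rather than repo tooling: the function names, the local port 3000, and the two-second wait are all hypothetical, and it treats any value other than "ok" as unhealthy.

```shell
#!/usr/bin/env bash
# Sketch: ask Grafana whether its database connection is healthy.
# Function names, local port, and sleep duration are illustrative assumptions.

# Reads health JSON on stdin; returns 0 only if it contains "database": "ok".
health_reports_db_ok() {
  grep -Eq '"database"[[:space:]]*:[[:space:]]*"ok"'
}

# Port-forwards the grafana Service locally and queries /api/health.
check_grafana_db() {
  local port="${1:-3000}" body pf_pid
  kubectl port-forward -n grafana svc/grafana "${port}:80" >/dev/null 2>&1 &
  pf_pid=$!
  sleep 2  # crude wait for the port-forward to come up
  body="$(curl -sS --max-time 5 "http://127.0.0.1:${port}/api/health" || true)"
  kill "$pf_pid" 2>/dev/null || true
  if printf '%s' "$body" | health_reports_db_ok; then
    echo "ok: Grafana reports a healthy database connection"
  else
    echo "fail: /api/health did not report database ok: ${body:-no response}" >&2
    return 1
  fi
}
```

If check_grafana_db fails while the Thanos, Loki, and Tempo services are healthy, infra-db and the grafana-db-app secret are the first places to look. An empty response usually means the pod never became reachable, which is consistent with Grafana failing at startup.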

Verification

kubectl get applications.argoproj.io -n argocd grafana
kubectl get pods -n grafana
kubectl get secret -n grafana grafana-db-app
kubectl describe deploy -n grafana grafana

Success means the grafana Application is Synced and Healthy, the Grafana pods are Ready, the grafana-db-app secret exists and contains the database credentials, and kubectl describe shows the expected chart-managed container and environment wiring.
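
The first three checks can be scripted so they return a single pass or fail. This is a minimal sketch under stated assumptions: verify_grafana is a hypothetical name, the 60-second rollout timeout is arbitrary, and the script only confirms that the secret exists, not that its keys are correct.

```shell
#!/usr/bin/env bash
# Sketch: scripted version of the manual verification commands.
# The function name and the timeout value are illustrative assumptions.

verify_grafana() {
  local app_status
  # Argo CD records sync and health under .status on the Application object.
  app_status="$(kubectl get applications.argoproj.io -n argocd grafana \
    -o jsonpath='{.status.sync.status}/{.status.health.status}')"
  if [ "$app_status" != "Synced/Healthy" ]; then
    echo "fail: Application grafana is '${app_status}'" >&2
    return 1
  fi
  # Waits until the Deployment's pods are rolled out and Ready.
  if ! kubectl rollout status deploy/grafana -n grafana --timeout=60s >/dev/null; then
    echo "fail: Deployment grafana is not Ready" >&2
    return 1
  fi
  # Existence check only; secret values are never printed.
  if ! kubectl get secret -n grafana grafana-db-app >/dev/null; then
    echo "fail: secret grafana-db-app is missing" >&2
    return 1
  fi
  echo "ok: Application Synced/Healthy, Deployment Ready, secret present"
}
```

The environment wiring still needs a manual look with kubectl describe deploy -n grafana grafana. To list the secret's key names and sizes without revealing values, kubectl describe secret -n grafana grafana-db-app is a safe choice.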