Trivy

Trivy is deployed as the Trivy Operator in the trivy-system namespace. The checked-in configuration keeps the scope intentionally narrow: vulnerability scanning and config-audit scanning are enabled, while RBAC assessment, infra assessment, cluster compliance, exposed-secret scanning, and SBOM generation are disabled.

Responsibility

  • Watch deployed workloads and trigger vulnerability/config scans.
  • Run scan jobs against images mirrored through the internal Zot registry.
  • Publish operator metrics to Prometheus.
  • Keep scan cadence deployment-driven rather than TTL-driven.

Source paths

| Path | Role |
| --- | --- |
| lumie-infra/security/trivy/argocd.yaml | ArgoCD Application targeting namespace trivy-system |
| lumie-infra/security/trivy/helm-values.yaml | Operator behavior, scanner resources, exclusions, and metrics |

Runtime contract

| Surface | Contract |
| --- | --- |
| Enabled scanners | vulnerability and config audit |
| Disabled scanners | RBAC assessment, infra assessment, cluster compliance, exposed secret, SBOM |
| Scan job timeout | 10m |
| Concurrent scan jobs | 2 |
| Rescan cadence | TTL disabled; scans on new image deployment only |
| Excluded namespaces | kube-system, kube-public, kube-node-lease, lumie-worker |
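
The exclusion rule in the contract can be sketched as a small shell helper, which is handy when scripting checks around the scan surface. This is only an illustration: `is_scanned` and `EXCLUDED_NAMESPACES` are hypothetical names, the list is copied from the contract rather than read from helm-values.yaml, and the comparison is an exact match that ignores any pattern matching the operator itself may support.

```shell
# Namespaces listed as excluded in the runtime contract (copied by hand, an assumption).
EXCLUDED_NAMESPACES="kube-system kube-public kube-node-lease lumie-worker"

# is_scanned NAMESPACE
# Hypothetical helper: returns 0 if the namespace is inside the scan surface,
# 1 if it appears in the exclusion list.
is_scanned() {
  local ns="$1" excluded
  for excluded in $EXCLUDED_NAMESPACES; do
    if [ "$ns" = "$excluded" ]; then
      return 1
    fi
  done
  return 0
}
```

For example, `is_scanned lumie-worker || echo "outside the scan surface"` prints the message, because lumie-worker is excluded by design.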

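Runtime flow

When a workload is deployed, the operator detects it and starts a scan job. The job runs Trivy against the workload's image, mirrored through the internal Zot registry, and the results are stored in Trivy report custom resources. Operator metrics are published to Prometheus.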

Important implementation details

  • The operator image is mirrored to zot.lumie-infra.com/aquasec/trivy-operator:0.28.0.
  • Scanner jobs use standalone Trivy with 5Gi storage and 500Mi memory.
  • scannerReportTTL and OPERATOR_SCANNER_REPORT_TTL are both empty, so reports do not expire and are not regenerated on a timer.

Failure behavior and operational risks

  • Large images or slow registries can exceed the 10m scan-job timeout.
  • Because TTL-based rescans are disabled, old reports remain until a new deployment or another trigger causes a fresh scan.
  • Namespace exclusions are part of the desired contract; workloads in lumie-worker are deliberately outside this operator's scan surface.
  • The operator can be healthy while individual scan jobs fail from registry auth, resource pressure, or unsupported image layouts.
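
Two of the risks above, stale reports and failing scan jobs, can be spotted from the command line. The sketch below is a rough aid, not part of the checked-in tooling: the function names are hypothetical, and it assumes the operator's VulnerabilityReport resource is reachable as `vulnerabilityreports`. Depending on job-retention settings, finished jobs may already have been cleaned up, so an empty result from `failed_scan_jobs` does not prove that every scan succeeded.

```shell
# oldest_reports [COUNT]
# Prints the COUNT oldest VulnerabilityReport resources across all namespaces
# (default 10). Old entries are expected here because TTL rescans are disabled.
oldest_reports() {
  local count="${1:-10}"
  kubectl get vulnerabilityreports --all-namespaces \
    --sort-by=.metadata.creationTimestamp --no-headers | head -n "$count"
}

# failed_scan_jobs
# Prints the names of jobs in trivy-system whose status records at least one
# failed pod. kubectl prints "<none>" when .status.failed is unset.
failed_scan_jobs() {
  kubectl get jobs -n trivy-system --no-headers \
    -o custom-columns=NAME:.metadata.name,FAILED:.status.failed |
    awk '$2 != "<none>" && $2 + 0 > 0 { print $1 }'
}
```

Running `oldest_reports 5` gives a quick view of which reports have gone longest without a refresh.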

Observability

  • Prometheus scraping is enabled with serviceMonitor.enabled: true.
  • Operator logs explain scheduling, report generation, and scan-job failures.
  • Scan artifacts live in Trivy report custom resources installed by the operator.
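
To see where scan artifacts actually land, a per-namespace count of reports is often enough. The helper below is a sketch with a hypothetical name, and it assumes reports are stored in the namespace of the scanned workload. Under that assumption, excluded namespaces such as lumie-worker should not appear in the output.

```shell
# reports_per_namespace
# Prints "<namespace> <count>" for every namespace that holds at least one
# VulnerabilityReport, sorted by namespace name.
reports_per_namespace() {
  kubectl get vulnerabilityreports --all-namespaces --no-headers \
    -o custom-columns=NS:.metadata.namespace |
    awk 'NF { n[$1]++ } END { for (k in n) print k, n[k] }' |
    sort
}
```

The same pattern should work for config-audit results by swapping in `configauditreports`.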

Verification

kubectl get applications.argoproj.io -n argocd trivy
kubectl get deploy,pods -n trivy-system -l app.kubernetes.io/instance=trivy
kubectl logs -n trivy-system -l app.kubernetes.io/instance=trivy --all-containers --tail=200
kubectl api-resources | rg 'trivy|report'
kubectl get jobs -n trivy-system

Success signals:

  • The trivy Argo CD application is Healthy and Synced.
  • The Trivy operator deployment is available in trivy-system.
  • The report CRDs are present in kubectl api-resources.
  • Scan jobs complete without repeated scan-job timeouts or registry-auth failures in the operator logs.
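
The first and third success signals can be turned into scriptable checks. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the Argo CD Application is assumed to expose its state under `.status.sync.status` and `.status.health.status`, and the report CRDs are assumed to belong to the `aquasecurity.github.io` API group.

```shell
# argocd_app_ok APP
# Returns 0 only when the Argo CD Application reports Synced and Healthy.
argocd_app_ok() {
  local status
  status="$(kubectl get applications.argoproj.io -n argocd "$1" \
    -o jsonpath='{.status.sync.status} {.status.health.status}')"
  [ "$status" = "Synced Healthy" ]
}

# report_crds_present
# Returns 0 when both the vulnerability and config-audit report CRDs are
# registered with the API server.
report_crds_present() {
  local resources
  resources="$(kubectl api-resources --api-group=aquasecurity.github.io -o name)"
  printf '%s\n' "$resources" | grep -q '^vulnerabilityreports\.' &&
    printf '%s\n' "$resources" | grep -q '^configauditreports\.'
}
```

A combined check such as `argocd_app_ok trivy && report_crds_present && echo "trivy looks healthy"` covers the control-plane signals; scan-job outcomes still need a look at the operator logs.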