Trivy
Trivy is deployed as Trivy Operator in namespace trivy-system. The checked-in configuration keeps the scope intentionally narrow: vulnerability scanning and config-audit scanning are enabled, while RBAC assessment, infra assessment, cluster compliance, exposed-secret scanning, and SBOM generation are disabled.
Responsibility
- Watch deployed workloads and trigger vulnerability/config scans.
- Run scan jobs against images mirrored through the internal Zot registry.
- Publish operator metrics to Prometheus.
- Keep scan cadence deployment-driven rather than TTL-driven.
Source paths
| Path | Role |
|---|---|
| lumie-infra/security/trivy/argocd.yaml | ArgoCD Application targeting namespace trivy-system |
| lumie-infra/security/trivy/helm-values.yaml | Operator behavior, scanner resources, exclusions, and metrics |
Runtime contract
| Surface | Contract |
|---|---|
| Enabled scanners | vulnerability and config audit |
| Disabled scanners | RBAC assessment, infra assessment, cluster compliance, exposed secret, SBOM |
| Scan job timeout | 10m |
| Concurrent scan jobs | 2 |
| Rescan cadence | TTL disabled; scans on new image deployment only |
| Excluded namespaces | kube-system, kube-public, kube-node-lease, lumie-worker |
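The contract above can be checked mechanically against the operator's effective settings. The following is a minimal sketch, not the repository's own tooling: it assumes the settings are available as `KEY=VALUE` lines on stdin. `OPERATOR_SCANNER_REPORT_TTL` is named in this document; the other key names and the comma-separated exclusion format are assumptions based on upstream Trivy Operator conventions and should be verified against the deployed chart version.

```shell
#!/usr/bin/env bash
# Compare KEY=VALUE settings read from stdin with the expected contract.
# Prints one MISMATCH line per differing key and returns non-zero if any differ.
# A key that is absent from the input is treated the same as an empty value.
# NOTE: all key names except OPERATOR_SCANNER_REPORT_TTL are assumptions.
check_contract() {
  local input status=0 key want got
  input="$(cat)"
  while IFS='=' read -r key want; do
    # Extract the value of the first line that starts with "KEY=".
    got="$(printf '%s\n' "$input" | sed -n "s/^${key}=//p" | head -n 1)"
    if [ "$got" != "$want" ]; then
      echo "MISMATCH ${key}: want '${want}', got '${got}'"
      status=1
    fi
  done <<'EOF'
OPERATOR_SCAN_JOB_TIMEOUT=10m
OPERATOR_CONCURRENT_SCAN_JOBS_LIMIT=2
OPERATOR_SCANNER_REPORT_TTL=
OPERATOR_EXCLUDE_NAMESPACES=kube-system,kube-public,kube-node-lease,lumie-worker
EOF
  return "$status"
}
```

The input could come from the operator's ConfigMap rendered as `KEY=VALUE` lines, for example with `kubectl get configmap ... -o go-template`; the ConfigMap name depends on the Helm release and is not stated here.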
Runtime flow
Important implementation details
- The operator image is mirrored to `zot.lumie-infra.com/aquasec/trivy-operator:0.28.0`.
- Scanner jobs use standalone Trivy with `5Gi` storage and `500Mi` memory.
- `scannerReportTTL` and `OPERATOR_SCANNER_REPORT_TTL` are both empty, so reports are not periodically regenerated on a timer.
Failure behavior and operational risks
- Large images or slow registries can exceed the `10m` scan-job timeout.
- Because TTL-based rescans are disabled, old reports remain until a new deployment or another trigger causes a fresh scan.
- Namespace exclusions are part of the desired contract; workloads in `lumie-worker` are deliberately outside this operator's scan surface.
- The operator can be healthy while individual scan jobs fail from registry auth, resource pressure, or unsupported image layouts.
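Because reports are not refreshed on a timer, it can help to list reports older than a chosen cutoff. The sketch below is illustrative only: the function name `stale_reports` and the three-column input format are assumptions. It relies on the fact that Kubernetes creation timestamps are RFC 3339 UTC strings, which sort lexicographically.

```shell
#!/usr/bin/env bash
# Print "namespace/name" for every report created before a cutoff timestamp.
# Input (stdin): lines of "<namespace> <name> <creationTimestamp>".
# Timestamps such as 2024-05-01T12:00:00Z compare correctly as plain strings,
# so no date parsing is needed.
stale_reports() {
  local cutoff="$1"
  awk -v cutoff="$cutoff" 'NF >= 3 && $3 < cutoff { print $1 "/" $2 }'
}

# Example (requires cluster access; the cutoff value is arbitrary):
#   kubectl get vulnerabilityreports -A --no-headers \
#     -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,CREATED:.metadata.creationTimestamp \
#     | stale_reports "2024-05-01T00:00:00Z"
```

A report flagged here is not necessarily wrong; it only means the workload has not been redeployed or otherwise rescanned since the cutoff.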
Observability
- Prometheus scraping is enabled with `serviceMonitor.enabled: true`.
- Operator logs explain scheduling, report generation, and scan-job failures.
- Scan artifacts live in Trivy report custom resources installed by the operator.
Verification
kubectl get applications.argoproj.io -n argocd trivy
kubectl get deploy,pods -n trivy-system -l app.kubernetes.io/instance=trivy
kubectl logs -n trivy-system -l app.kubernetes.io/instance=trivy --all-containers --tail=200
kubectl api-resources | rg 'trivy|report'
kubectl get jobs -n trivy-system
Success signals:
- The `trivy` Argo CD application is `Healthy` and `Synced`.
- The Trivy operator deployment is available in `trivy-system`.
- The report CRDs are present in `kubectl api-resources`.
- Scan jobs complete without repeated `scan job timeout` or registry-auth failures in the operator logs.
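Two of these signals lend themselves to a scripted check. The helpers below are a hedged sketch, not part of the checked-in configuration: the function names `failed_jobs` and `argo_ok` are hypothetical, and the input formats are whatever the example `kubectl` invocations in the comments produce.

```shell
#!/usr/bin/env bash
# Print the names of jobs whose failed-pod count is greater than zero.
# Input (stdin): lines of "<job-name> <failed-count>". kubectl prints
# "<none>" when .status.failed is unset; such lines are skipped.
failed_jobs() {
  awk '$2 ~ /^[0-9]+$/ && $2 + 0 > 0 { print $1 }'
}

# Succeed only when the combined Argo CD status string is "Healthy Synced".
argo_ok() {
  [ "$1" = "Healthy Synced" ]
}

# Examples (require cluster access):
#   kubectl get jobs -n trivy-system --no-headers \
#     -o custom-columns=NAME:.metadata.name,FAILED:.status.failed | failed_jobs
#   argo_ok "$(kubectl get applications.argoproj.io -n argocd trivy \
#     -o jsonpath='{.status.health.status} {.status.sync.status}')"
```

An empty result from `failed_jobs` only covers jobs that still exist at the time of the query, so the operator logs remain the authoritative place to look for repeated timeouts.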