Ansible
Purpose
Ansible is Lumie's mutable bootstrap layer between raw OCI instances and a GitOps-managed cluster. This page is a reference document for the playbooks, roles, inventory generation, and bootstrap side effects that happen before Argo CD takes over.
Source Paths
| Path | Role |
|---|---|
| `lumie-infra/provision/ansible/ansible.cfg` | Runtime defaults, SSH behavior, role path, and parallelism |
| `lumie-infra/provision/ansible/inventory/terraform_inventory.py` | Dynamic inventory generator from `terraform output -json` |
| `lumie-infra/provision/ansible/group_vars/*.yml` | K3s version, master URL, node grouping, and shared defaults |
| `lumie-infra/provision/ansible/playbooks/site.yml` | Full end-to-end cluster bootstrap |
| `lumie-infra/provision/ansible/roles/common/tasks/main.yml` | OS prep, iptables reset, kernel modules, and sysctl |
| `lumie-infra/provision/ansible/roles/storage-setup/tasks/main.yml` | Formatting and mounting MinIO block devices |
| `lumie-infra/provision/ansible/roles/k3s-master/tasks/main.yml` | K3s server install and token export |
| `lumie-infra/provision/ansible/roles/k3s-worker/tasks/main.yml` | K3s agent join flow |
| `lumie-infra/provision/ansible/roles/argocd-bootstrap/tasks/main.yml` | Helm install, bootstrap secrets, Git clone, and root app apply |
| `lumie-infra/provision/ansible/playbooks/fetch-kubeconfig.yml` | Operator kubeconfig export |
Entrypoints
The main operator surface is the playbook set:
| Playbook | Purpose |
|---|---|
| `playbooks/site.yml` | Full bootstrap: prep, master, storage, workers, verify, Argo CD |
| `playbooks/k3s-master.yml` | Master-only install |
| `playbooks/k3s-workers.yml` | Worker-only rollout after a master already exists |
| `playbooks/fetch-kubeconfig.yml` | Save public and private kubeconfig files locally |
| `playbooks/k3s-reset.yml` | Destructive removal of K3s from all nodes |
site.yml is the authoritative sequence:
- name: Prepare all nodes
- name: Install K3s Master
- name: Setup storage partitions on workers
- name: Install K3s Workers (Account 0214)
- name: Install K3s Workers (Account 0213)
- name: Verify cluster
- name: Bootstrap ArgoCD and GitOps
Runtime Flow
Inventory Contract
`inventory/terraform_inventory.py` is the bridge between Terraform and Ansible:
- it shells out to `terraform output -json`;
- it builds `masters`, `workers_0214`, and `workers_0213` groups;
- it injects `ansible_host`, `private_ip`, and `k3s_master_url`;
- it derives the worker join target from the actual master private IP instead of a copied static file.
That means Ansible bootstrap implicitly depends on a successful and up-to-date Terraform apply.
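The mapping from Terraform outputs to inventory groups can be sketched as follows. This is a minimal illustration, not the actual script: the Terraform output names (`master`, `workers_0214`, `workers_0213`), the per-node keys (`name`, `public_ip`, `private_ip`), the choice of the public IP for `ansible_host`, and all sample addresses are assumptions. Only the group names and the injected variable names come from this page.

```python
import json

K3S_API_PORT = 6443  # API port referenced elsewhere on this page


def build_inventory(tf_outputs):
    """Convert parsed `terraform output -json` data into Ansible inventory JSON.

    Assumes (hypothetically) that Terraform exposes a `master` output holding
    one node object and `workers_0214` / `workers_0213` outputs holding lists
    of node objects, each with `name`, `public_ip`, and `private_ip` keys.
    """
    master = tf_outputs["master"]["value"]
    # The join URL is derived from the master's private IP, not a static file.
    master_url = f"https://{master['private_ip']}:{K3S_API_PORT}"
    inventory = {"_meta": {"hostvars": {}}}

    def add_host(group, node):
        inventory.setdefault(group, {"hosts": []})["hosts"].append(node["name"])
        inventory["_meta"]["hostvars"][node["name"]] = {
            "ansible_host": node["public_ip"],  # assumption: SSH over the public IP
            "private_ip": node["private_ip"],
            "k3s_master_url": master_url,
        }

    add_host("masters", master)
    for group in ("workers_0214", "workers_0213"):
        inventory.setdefault(group, {"hosts": []})  # keep empty groups visible
        for node in tf_outputs.get(group, {}).get("value", []):
            add_host(group, node)
    return inventory


# Hypothetical sample shaped like `terraform output -json`.
sample = {
    "master": {"value": {"name": "master-0", "public_ip": "203.0.113.10",
                         "private_ip": "10.0.0.241"}},
    "workers_0214": {"value": [{"name": "worker-a", "public_ip": "203.0.113.11",
                                "private_ip": "10.0.0.12"}]},
    "workers_0213": {"value": []},
}
inventory = build_inventory(sample)
print(json.dumps(inventory, indent=2))
```

Keeping the transformation a pure function of the parsed JSON makes the stale-output dependency easy to see: if the Terraform state changes and the outputs are not refreshed, every derived value, including `k3s_master_url`, is wrong in the same way.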
Role Behavior
common
The `common` role intentionally normalizes base OS state:
- waits for cloud-init completion;
- installs packages including `iptables`, `netfilter-persistent`, and `open-iscsi`;
- sets all default iptables chain policies to `ACCEPT`;
- flushes existing iptables rules;
- loads `br_netfilter` and `overlay`;
- applies K3s-related sysctls.
This is more invasive than a typical application bootstrap. It assumes these nodes are dedicated cluster hosts.
storage-setup
The storage role formats `/dev/sdb` as ext4, labels it per host as `minio-<hostname>`, and mounts it at `/mnt/minio-data`. The label-based mount is the idempotency guard: reruns do not reformat a correctly labeled and mounted disk.
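The guard can be pictured as a small decision function. This is a sketch of the reasoning only, not the role's tasks: the function name, the `mount` outcome for a labeled but unmounted disk, and the decision to format any disk without the expected label are assumptions. The role itself may handle those cases differently.

```python
EXPECTED_MOUNT = "/mnt/minio-data"


def plan_storage_action(hostname, current_label, mounted_at):
    """Return 'skip', 'mount', or 'format' for the data disk on one host.

    current_label: filesystem label found on the device, or None if absent.
    mounted_at: current mount point of the device, or None if unmounted.
    """
    expected_label = f"minio-{hostname}"
    if current_label == expected_label:
        # A correctly labeled disk is never reformatted on a rerun.
        return "skip" if mounted_at == EXPECTED_MOUNT else "mount"
    # Assumption: anything without the expected label is treated as unprepared.
    return "format"


# "worker-a" is a hypothetical hostname.
print(plan_storage_action("worker-a", "minio-worker-a", "/mnt/minio-data"))  # skip
print(plan_storage_action("worker-a", "minio-worker-a", None))               # mount
print(plan_storage_action("worker-a", None, None))                           # format
```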
k3s-master
The master role:
- downloads `get.k3s.io` only when needed;
- templates `/etc/rancher/k3s/registries.yaml`;
- installs K3s server;
- waits for the local API and node token;
- reads the token into an Ansible fact for worker joins.
k3s-worker
The worker role:
- delegates token reads to the master;
- waits for TCP reachability to `10.0.0.241:6443`;
- installs `k3s-agent`;
- waits until `kubectl get nodes` on the master shows the worker.
The playbooks apply workers serially within each tenancy group to reduce race conditions and make failures easier to pinpoint.
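The TCP reachability wait in the worker role behaves roughly like the loop below. This is an illustrative sketch using only the Python standard library. The function name, timeout, and polling interval are assumptions, and the role itself most likely relies on an Ansible module rather than custom code.

```python
import socket
import time


def wait_for_tcp(host, port, timeout=300.0, interval=5.0):
    """Poll until a TCP connection to host:port succeeds or timeout expires.

    Returns True once a connection is accepted, False if the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect proves the port is reachable; close at once.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused, unreachable, or timed out: wait and retry.
            time.sleep(interval)
    return False


# Example call against the master API address named on this page:
# wait_for_tcp("10.0.0.241", 6443)
```

A `False` result here corresponds to the case where SSH to the worker works but the join still times out, which usually points at the master address or network path rather than at the worker itself.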
argocd-bootstrap
The bootstrap role performs the last imperative steps before GitOps:
- Installs Helm if needed.
- Creates `argocd`, `minio`, and `vault` namespaces.
- Creates `minio-root-password` and `vault-config-secret`.
- Installs Argo CD from Helm with a minimal bootstrap values file.
- Clones `lumie-infra` into `/tmp`.
- Applies the configured root app manifests.
The bootstrap values intentionally enable anonymous Argo CD access for the first install. Git-managed Argo CD configuration is expected to replace that bootstrap posture afterward.
Kubeconfig Export
`fetch-kubeconfig.yml` reads `/etc/rancher/k3s/k3s.yaml` from the master and writes:
- `k3s-public.yaml` pointing at the master's public IP;
- `k3s-private.yaml` pointing at the master's private IP;
- `config` as a symlink to the public variant.
This makes local operator access explicit instead of reusing the server-local 127.0.0.1 kubeconfig.
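The export amounts to rewriting the `server:` URL and writing three entries. The sketch below illustrates that idea with plain string replacement. The helper names, the output directory, the public IP, and the assumption that the server-local file points at `https://127.0.0.1:6443` are illustrative, and the playbook may perform the rewrite differently.

```python
import tempfile
from pathlib import Path

LOCAL_SERVER = "https://127.0.0.1:6443"  # assumed server URL in the server-local kubeconfig


def retarget(kubeconfig_text, host):
    """Point the kubeconfig at `host` instead of the loopback address."""
    return kubeconfig_text.replace(LOCAL_SERVER, f"https://{host}:6443")


def export_kubeconfigs(kubeconfig_text, public_ip, private_ip, out_dir):
    """Write the public and private variants plus a `config` symlink."""
    out_dir.mkdir(parents=True, exist_ok=True)
    public = out_dir / "k3s-public.yaml"
    private = out_dir / "k3s-private.yaml"
    public.write_text(retarget(kubeconfig_text, public_ip))
    private.write_text(retarget(kubeconfig_text, private_ip))
    link = out_dir / "config"
    if link.is_symlink() or link.exists():
        link.unlink()  # replace a stale link from an earlier export
    link.symlink_to(public.name)  # relative link to the public variant
    return public, private, link


# Hypothetical kubeconfig fragment and addresses for illustration.
sample = "clusters:\n- cluster:\n    server: https://127.0.0.1:6443\n  name: default\n"
out = Path(tempfile.mkdtemp())
export_kubeconfigs(sample, "203.0.113.10", "10.0.0.241", out)
print((out / "config").read_text())
```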
Contract Drift
Two inspected mismatches are important:
- `roles/argocd-bootstrap/defaults/main.yml` still includes `web-apps/application.yaml` in `app_of_apps_paths`, but `lumie-infra/web-apps/` does not exist in the current repo. A fresh rerun of the bootstrap role would need that path fixed first.
- `provision/ansible/README.md` still states that Traefik is disabled, but the actual server role does not disable it and the live cluster inspected on June 14, 2026 has the bundled Traefik addon running.
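The first mismatch is the kind of drift a pre-flight check could catch before a rerun. The sketch below assumes `app_of_apps_paths` is a list of paths relative to the repository root. The function name, the temporary checkout, and the `platform/application.yaml` entry are hypothetical and only serve the illustration.

```python
import tempfile
from pathlib import Path


def missing_app_paths(repo_root, app_paths):
    """Return the configured root app paths that are not files in the checkout."""
    return [p for p in app_paths if not (repo_root / p).is_file()]


# Build a throwaway checkout containing one of two configured paths.
repo = Path(tempfile.mkdtemp())
(repo / "platform").mkdir()
(repo / "platform" / "application.yaml").write_text("kind: Application\n")

configured = ["platform/application.yaml", "web-apps/application.yaml"]
missing = missing_app_paths(repo, configured)
print(missing)  # ['web-apps/application.yaml']
```

Running a check like this against the cloned repository before applying the root apps would surface the missing path as a clear error instead of a failure partway through the Argo CD bootstrap.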
Failure Modes
| Failure point | Impact |
|---|---|
| Terraform outputs stale or missing | Dynamic inventory generation fails before any host work starts |
| `k3s_master_private_ip` incorrect | Worker join waits time out even though SSH works |
| `/dev/sdb` absent on a worker | Storage role hard-fails before the worker rollout continues |
| Vault secret bootstrap omitted | Vault cannot start with its S3 backend config, blocking downstream VaultStaticSecret consumers |
| Missing root app path | Argo CD bootstrap fails during initial app apply |
Verification
Inventory and connectivity:
cd lumie-infra/provision/ansible
./inventory/terraform_inventory.py --list | jq .
ansible all -i inventory/terraform_inventory.py -m ping
Dry-run the bootstrap logic:
ansible-playbook -i inventory/terraform_inventory.py playbooks/site.yml --check
Live cluster confirmation after a real run:
kubectl get nodes -o wide
kubectl get applications -n argocd
Success signals:
- `./inventory/terraform_inventory.py --list` includes the `masters`, `workers_0214`, and `workers_0213` groups plus `k3s_master_url`.
- `ansible all ... -m ping` returns `SUCCESS` for every host in the generated inventory.
- `ansible-playbook ... playbooks/site.yml --check` reaches the `PLAY RECAP` without failed hosts for the current bootstrap contract.
- After a real run, `kubectl get nodes -o wide` shows joined workers and `kubectl get applications -n argocd` shows the root GitOps apps created by `argocd-bootstrap`.