Coder

Coder is Lumie's self-hosted development workspace control plane. The server runs in the coder namespace, but the workspaces it creates run as pods in lumie-dev, where they share the stable dev-workspace service used by Teleport for the Tilt UI, code-server, and the agents UI.

Responsibility

  • Provide the control plane for per-developer workspaces.
  • Authenticate operators through the Keycloak infra realm.
  • Persist control-plane state in the shared infra-db PostgreSQL cluster.
  • Spawn workspace pods and home PVCs in lumie-dev.

Source paths

| Path | Role |
| --- | --- |
| lumie-infra/applications/coder/argocd.yaml | ArgoCD Application targeting namespace coder |
| lumie-infra/applications/coder/kustomization.yaml | Helm chart entrypoint plus extra manifests |
| lumie-infra/applications/coder/helm-values.yaml | Server settings, OIDC, RBAC, and database wiring |
| lumie-infra/applications/coder/manifests/vault-static-secret.yaml | Vault-to-Kubernetes secret sync for DB and OIDC credentials |
| lumie-infra/applications/coder/template/main.tf | Workspace template that creates the PVC and pod in lumie-dev |
| lumie-infra/applications/coder/template/Dockerfile.dev-workspace | Base image for the main dev container |
| lumie-infra/applications/lumie/develop/common-values.yaml | Stable dev-workspace ServiceAccount and namespace-scoped RBAC |
| lumie-infra/applications/lumie/develop/manifests/dev-workspace-service.yaml | Stable ClusterIP service for Tilt, code-server, and agents |
| lumie-infra/security/teleport/agent/helm-values.yaml | External access path at coder.lumie-infra.com |

Ownership boundaries

  • ArgoCD manages the Coder server, its namespace metadata, and the Vault-backed secrets in coder.
  • The Coder workspace template manages per-workspace PVCs and pods in lumie-dev.
  • The lumie-dev namespace overlay, not Coder, owns the persistent dev-workspace ServiceAccount and singleton dev-workspace service.
  • Teleport owns external browser access to the Coder server.

Public surface and contracts

| Surface | Contract |
| --- | --- |
| Teleport app | coder.lumie-infra.com, declared in security/teleport/agent/helm-values.yaml |
| In-cluster service | coder.coder.svc.cluster.local:80 |
| OIDC issuer | https://auth.lumie-edu.com/realms/infra |
| OIDC client | clientId: coder with redirect https://coder.lumie-infra.com/api/v2/users/oidc/callback |
| Database | postgresql://coder@infra-db-rw.infra-db.svc.cluster.local:5432/coder via coder-db-secret |
| Workspace namespace | lumie-dev only, through serviceAccount.workspaceNamespaces |
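The OIDC rows of this contract can be spot-checked from any machine that can reach Keycloak. The following is a minimal sketch, not part of the repo: the function names are hypothetical, and it assumes curl and jq are installed. It relies only on the standard OIDC discovery path that providers serve under the issuer URL.

```shell
# Values copied from the contract table; adjust them if the table changes.
ISSUER="https://auth.lumie-edu.com/realms/infra"
ACCESS_URL="https://coder.lumie-infra.com"

# Standard OIDC discovery document location for a given issuer.
oidc_discovery_url() {
  printf '%s/.well-known/openid-configuration\n' "${1%/}"
}

# Redirect URI that the coder client must allow in Keycloak.
oidc_redirect_url() {
  printf '%s/api/v2/users/oidc/callback\n' "${1%/}"
}

# Fetch the discovery document and compare the advertised issuer
# with the expected one. Returns non-zero on any mismatch or fetch error.
check_issuer() {
  advertised=$(curl -fsS "$(oidc_discovery_url "$ISSUER")" | jq -r '.issuer') || return 1
  [ "$advertised" = "$ISSUER" ]
}
```

Running `check_issuer && echo "issuer matches"` exercises the network path; the two URL helpers are pure and can be used to compare against the Keycloak client configuration by eye.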

Runtime flow

Workspace template behavior

The workspace template in template/main.tf is the real runtime contract for the developer pod:

  • It creates a home PVC named coder-<owner>-<workspace>-home with 30Gi on local-path.
  • It runs three containers:
    • dev for builds, shells, Tilt, kubectl, Docker CLI, and code-server.
    • agents for the browser UI on port 3000.
    • dind as a privileged sidecar for image builds.
  • It downloads the Coder agent from the in-cluster URL http://coder.coder.svc.cluster.local because the public Teleport URL cannot be used from inside the workspace pod.
  • It mounts the shared dev-workspace ServiceAccount, which is bound to the built-in Kubernetes admin ClusterRole through namespace-scoped RBAC, so its permissions apply inside lumie-dev only.

The stable dev-workspace service is intentionally kept out of the template because each workspace has separate Terraform state. If the template declared the service, every workspace would try to create the same singleton object, and the second workspace creation would race with the first.
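To compare a running workspace against the template behavior described above, a small helper can derive the expected PVC name and check the container set. This is a sketch under assumptions: the function names are hypothetical, and the pod name is passed in explicitly because the pod naming scheme is not stated here.

```shell
# PVC name pattern from the template: coder-<owner>-<workspace>-home.
home_pvc_name() {
  printf 'coder-%s-%s-home\n' "$1" "$2"
}

# Print the container names of a workspace pod, one per line.
# $1 is the pod name (an input, not derived, see the note above).
pod_containers() {
  kubectl get pod "$1" -n lumie-dev \
    -o jsonpath='{.spec.containers[*].name}' | tr ' ' '\n'
}

# Succeeds only when stdin lists exactly the three expected containers:
# dev, agents, and dind (order does not matter).
has_expected_containers() {
  [ "$(sort | paste -sd ' ' -)" = "agents dev dind" ]
}
```

For example, `kubectl get pvc -n lumie-dev "$(home_pvc_name alice main)"` looks up the home volume for a hypothetical owner alice and workspace main, and `pod_containers <pod> | has_expected_containers` confirms the container set.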

Secret and dependency wiring

  • coder-db-secret-vss renders the PostgreSQL connection URL from Vault path secret/infrastructure/coder.
  • coder-oidc-secret-vss renders the Keycloak client secret from the same Vault path.
  • CODER_PROXY_TRUSTED_HEADERS and CODER_PROXY_TRUSTED_ORIGINS trust Teleport's in-cluster forwarding headers.
  • CODER_WILDCARD_ACCESS_URL is explicitly blank, so Lumie does not use Coder's wildcard subdomain app proxying.
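The wiring above can be inspected without printing any secret values. The sketch below is illustrative only: the function names are hypothetical, jq is assumed to be installed, and the secret names coder-db-secret and coder-oidc-secret are taken from the failure notes in this page rather than from the manifests.

```shell
# Print each expected name (arguments) that is absent from the
# newline-separated list of names on stdin.
missing_names() {
  present=$(cat)
  for name in "$@"; do
    printf '%s\n' "$present" | grep -qxF -- "$name" || printf '%s\n' "$name"
  done
}

# List secret names in the coder namespace, one per line.
coder_secret_names() {
  kubectl get secrets -n coder \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{end}'
}

# List only the key names of a secret, never the values.
secret_keys() {
  kubectl get secret "$1" -n coder -o json | jq -r '.data | keys[]'
}

# List the environment variable names set on the first container of the
# coder deployment, to confirm the CODER_* settings are present.
coder_env_names() {
  kubectl get deploy coder -n coder \
    -o jsonpath='{range .spec.template.spec.containers[0].env[*]}{.name}{"\n"}{end}'
}
```

A typical check is `coder_secret_names | missing_names coder-db-secret coder-oidc-secret`, which prints nothing when both synced secrets exist.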

Failure behavior and operational risks

  • Missing or stale coder-db-secret or coder-oidc-secret blocks server startup or OIDC login.
  • If Teleport stops rewriting the Host header to coder.lumie-infra.com, browser sessions and WebSocket checks fail even when the Coder deployment is healthy.
  • Workspace creation can succeed while the user-facing tools fail if the shared dev-workspace service or its ports drift from the template.
  • The dind sidecar is privileged by design; admission-policy or runtime-policy changes in lumie-dev can break image-build workflows first.
  • Workspace pods are pinned away from master nodes, so worker-node pressure can leave workspaces pending.
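For the last risk in particular, pending workspaces and their scheduling events can be surfaced quickly. This is a rough sketch with hypothetical function names; it assumes the default column layout of `kubectl get pods --no-headers` (NAME, READY, STATUS, RESTARTS, AGE).

```shell
# From `kubectl get pods --no-headers` output on stdin, print the names
# of pods whose STATUS column is Pending.
pending_pods() {
  awk '$3 == "Pending" { print $1 }'
}

# List Pending workspace pods in lumie-dev.
pending_workspaces() {
  kubectl get pods -n lumie-dev --no-headers | pending_pods
}

# Show scheduler failures in lumie-dev, oldest first, which usually name
# the unsatisfied constraint (resources, affinity, or taints).
scheduling_events() {
  kubectl get events -n lumie-dev \
    --field-selector reason=FailedScheduling --sort-by=.lastTimestamp
}
```

Running `pending_workspaces` followed by `scheduling_events` is usually enough to tell worker-node pressure apart from a policy rejection of the privileged dind sidecar, which tends to appear as a pod creation error instead of a scheduling failure.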

Observability

  • Control-plane health is visible through the ArgoCD app coder and the deployment logs in coder.
  • Workspace failures are easiest to diagnose from the pod logs in lumie-dev.
  • There is no checked-in ServiceMonitor for Coder in this repo; logs and pod readiness are the primary signals.

Verification

kubectl get applications.argoproj.io -n argocd coder
kubectl get deploy,pods,secrets -n coder
kubectl get svc -n coder
kubectl get sa,rolebinding -n lumie-dev dev-workspace
kubectl get svc -n lumie-dev dev-workspace
kubectl get pods,pvc -n lumie-dev | rg '/coder-'
kubectl logs -n coder deploy/coder