OpenClaw

OpenClaw is an internal AI gateway and multi-agent workspace deployment. Unlike Coder, it is mostly hand-authored in manifests because it needs persistent state, seeded agent workspaces, network-policy control, and a custom service account with cluster read access.

Responsibility

  • Run the OpenClaw gateway in the openclaw namespace.
  • Seed persistent agent workspaces and agent definitions from ConfigMaps into a PVC.
  • Authenticate inbound traffic with a bearer token.
  • Read cluster state through a restricted service account.
  • Reach external LLM and messaging providers over controlled egress.

Source paths

Path | Role
lumie-infra/applications/openclaw/argocd.yaml | ArgoCD Application targeting openclaw
lumie-infra/applications/openclaw/kustomization.yaml | Shared chart plus raw manifests
lumie-infra/applications/openclaw/common-values.yaml | Service, RBAC, and Vault-backed secret wiring
lumie-infra/applications/openclaw/manifests/deployment.yaml | Main deployment, init-container seeding, probes, and env wiring
lumie-infra/applications/openclaw/manifests/configmap.yaml | openclaw.json runtime configuration and agent catalog
lumie-infra/applications/openclaw/manifests/configmap-workspace.yaml | Seed files copied into workspace directories on startup
lumie-infra/applications/openclaw/manifests/network-policy.yaml | Default-deny plus explicit ingress and egress rules
lumie-infra/applications/openclaw/manifests/pvc.yaml | Persistent state volume openclaw-data
lumie-infra/security/teleport/agent/helm-values.yaml | Teleport app entry and auth header injection

Public surface and contracts

Surface | Contract
Teleport app | openclaw
In-cluster service | openclaw.openclaw.svc.cluster.local:18789
Gateway auth mode | Static token via OPENCLAW_GATEWAY_TOKEN
Database | postgresql://openclaw@infra-db-rw.infra-db.svc:5432/openclaw
Persistent state | PVC openclaw-data, 500Mi, local-path-retain

The Teleport app rewrites both the Origin and Authorization headers, so browser requests arrive with the expected gateway origin and bearer token.

Runtime behavior

The deployment uses an init container to copy configuration and seed files into /home/node/.openclaw on the PVC before the main container starts. That startup path does three important things:

  • overwrites openclaw.json from the ConfigMap;
  • removes legacy workspace directories sales, growth, and cs;
  • seeds the main workspace plus 28 named agent workspaces and agent directories.
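The three startup steps above can be sketched as a small shell function. This is only an illustration under stated assumptions, not the checked-in init container: the function name seed_openclaw, the source layout (an openclaw.json file plus a workspaces/ directory of seed files), the workspaces/<name> layout on the PVC, and the choice to overwrite seeded files are all hypothetical. Agent directories would presumably follow the same copy pattern.

```shell
#!/bin/sh
# Hypothetical sketch of the init-container seeding steps.
# All paths and directory layouts here are assumptions for illustration.
set -eu

seed_openclaw() {
  src="$1"    # where the ConfigMap content is mounted (assumed layout)
  dest="$2"   # state directory on the PVC, e.g. /home/node/.openclaw

  mkdir -p "$dest/workspaces"

  # Step 1: always overwrite the runtime config from the ConfigMap copy.
  cp -f "$src/openclaw.json" "$dest/openclaw.json"

  # Step 2: remove the legacy workspace directories if they still exist.
  for legacy in sales growth cs; do
    rm -rf "$dest/workspaces/$legacy"
  done

  # Step 3: seed every known workspace. Directories on the PVC that are
  # not listed in the source are left untouched.
  for ws in "$src"/workspaces/*/; do
    [ -d "$ws" ] || continue          # glob matched nothing
    name=$(basename "$ws")
    mkdir -p "$dest/workspaces/$name"
    cp -R "$ws". "$dest/workspaces/$name/"
  done
}
```

An init container built along these lines would call something like seed_openclaw /seed /home/node/.openclaw, where /seed is a hypothetical mount point. Because step 3 only touches known directories, anything else already on the PVC persists across restarts.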

The main container then serves the gateway on port 18789 and reads the database password, gateway token, Telegram bot token, and API keys from Vault-backed Kubernetes secrets.

Security and boundary controls

  • The namespace uses a default-deny network policy.
  • Egress is narrowed to:
    • DNS in kube-system,
    • the Kubernetes API at 10.43.0.1:443,
    • external HTTPS on public IP space only,
    • PostgreSQL in namespace infra-db.
  • Ingress is limited to TCP 18789.
  • The service account receives a custom read-only ClusterRole for pods, services, nodes, namespaces, events, configmaps, workloads, ingresses, jobs, and ArgoCD Application objects.

Dependency notes

  • openclaw.json configures deepseek/deepseek-chat as the current primary model provider.
  • Secrets for Anthropic, OpenAI, and Google are also mounted, but the checked-in runtime config does not declare providers that use them today.

Failure behavior and operational risks

  • Missing OPENCLAW_GATEWAY_TOKEN or OPENCLAW_DB_PASSWORD leaves the pod running but unusable.
  • The init container only seeds known files and directories, so stale or corrupted content on the PVC, including old workspaces, can survive across deploys.
  • Health probes only test whether the process is accepting TCP connections on 18789; they do not validate database access or upstream API reachability.
  • Egress policy mistakes show up as provider timeouts, Telegram delivery failures, or database connection errors rather than admission failures.
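One way to turn the first failure mode into an early, visible error is a preflight check on the required variables. The sketch below is hypothetical and is not part of the checked-in deployment; the function name require_env is an assumption, and only the two variable names come from this page.

```shell
# Hypothetical preflight check; not part of the checked-in deployment.
# Returns non-zero and reports each required variable that is unset or empty.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable named by $name (POSIX sh via eval).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Called as require_env OPENCLAW_GATEWAY_TOKEN OPENCLAW_DB_PASSWORD before the gateway process starts, a check like this would make the container exit with a clear message instead of running in an unusable state.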

Observability

  • Deployment logs are the primary signal for startup seeding, provider errors, and database failures.
  • Probe failures show up at the pod level; there is no checked-in ServiceMonitor for OpenClaw.
  • Teleport access issues can be isolated by testing the in-cluster service separately from the public app route.
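To test the in-cluster service without Teleport in the path, the bearer header that Teleport normally injects has to be supplied by hand. The helpers below are a minimal sketch: the function names are hypothetical, and the request path "/" is an assumption, since this page does not name an HTTP endpoint on the gateway.

```shell
# Hypothetical helpers for probing the gateway directly.
# The "/" request path is an assumption; the source names no HTTP endpoint.

# Build the bearer header the gateway expects from a token value.
auth_header() {
  printf 'Authorization: Bearer %s' "$1"
}

# Print the HTTP status code the gateway returns for the given base URL.
probe_gateway() {
  base_url="$1"
  token="$2"
  curl -sS -o /dev/null -w '%{http_code}\n' \
    -H "$(auth_header "$token")" \
    "$base_url/"
}
```

With kubectl port-forward -n openclaw svc/openclaw 18789:18789 running in another terminal, probe_gateway http://127.0.0.1:18789 "$OPENCLAW_GATEWAY_TOKEN" exercises the service directly. If that succeeds while the Teleport route fails, the problem is more likely in the Teleport app entry or its header rewrites than in the gateway itself.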

Verification

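# ArgoCD Application sync and health status
kubectl get applications.argoproj.io -n argocd openclaw
# Core resources in the openclaw namespace
kubectl get deploy,pods,svc,pvc,configmap,networkpolicy -n openclaw
# Cluster-scoped RBAC objects for the service account
kubectl get clusterrole,clusterrolebinding | grep openclaw
# Gateway logs: startup seeding, provider errors, database failures
kubectl logs -n openclaw deploy/openclaw
# Pod events, including probe failures
kubectl describe pod -n openclaw -l app=openclaw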