OpenClaw
OpenClaw is an internal AI gateway and multi-agent workspace deployment. Unlike Coder, it is mostly hand-authored in manifests because it needs persistent state, seeded agent workspaces, network-policy control, and a custom service account with cluster read access.
Responsibility
- Run the OpenClaw gateway in the openclaw namespace.
- Seed persistent agent workspaces and agent definitions from ConfigMaps into a PVC.
- Authenticate inbound traffic with a bearer token.
- Read cluster state through a restricted service account.
- Reach external LLM and messaging providers over controlled egress.
Source paths
| Path | Role |
|---|---|
| lumie-infra/applications/openclaw/argocd.yaml | ArgoCD Application targeting openclaw |
| lumie-infra/applications/openclaw/kustomization.yaml | Shared chart plus raw manifests |
| lumie-infra/applications/openclaw/common-values.yaml | Service, RBAC, and Vault-backed secret wiring |
| lumie-infra/applications/openclaw/manifests/deployment.yaml | Main deployment, init-container seeding, probes, and env wiring |
| lumie-infra/applications/openclaw/manifests/configmap.yaml | openclaw.json runtime configuration and agent catalog |
| lumie-infra/applications/openclaw/manifests/configmap-workspace.yaml | Seed files copied into workspace directories on startup |
| lumie-infra/applications/openclaw/manifests/network-policy.yaml | Default-deny plus explicit ingress and egress rules |
| lumie-infra/applications/openclaw/manifests/pvc.yaml | Persistent state volume openclaw-data |
| lumie-infra/security/teleport/agent/helm-values.yaml | Teleport app entry and auth header injection |
Public surface and contracts
| Surface | Contract |
|---|---|
| Teleport app | openclaw |
| In-cluster service | openclaw.openclaw.svc.cluster.local:18789 |
| Gateway auth mode | Static token via OPENCLAW_GATEWAY_TOKEN |
| Database | postgresql://openclaw@infra-db-rw.infra-db.svc:5432/openclaw |
| Persistent state | PVC openclaw-data, 500Mi, local-path-retain |
The Teleport app rewrites both Origin and Authorization so browser requests arrive with the expected gateway origin and bearer token.
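To make the header contract concrete, the following is a minimal sketch of a request that carries the same two headers the Teleport app is described as injecting. The helper name `gateway_request`, the request path `/`, and the Origin value passed in are assumptions for illustration; none of them are taken from the checked-in configuration.

```bash
# Hypothetical helper: send one request with the Authorization and Origin
# headers that Teleport is described as rewriting, and print the HTTP status.
# $1 = base URL of the gateway, $2 = Origin value (assumed, not from the source).
gateway_request() {
  local base_url="$1" origin="$2"
  # Abort early if the token is not available in the environment.
  : "${OPENCLAW_GATEWAY_TOKEN:?OPENCLAW_GATEWAY_TOKEN must be set}"
  curl --silent --show-error --max-time 10 \
    --output /dev/null --write-out '%{http_code}\n' \
    --header "Authorization: Bearer ${OPENCLAW_GATEWAY_TOKEN}" \
    --header "Origin: ${origin}" \
    "${base_url}/"
}

# Example, from a shell that can reach the service:
#   gateway_request http://openclaw.openclaw.svc.cluster.local:18789 "$EXPECTED_ORIGIN"
```

Comparing the status returned with and without the headers is one way to tell a gateway auth rejection apart from a Teleport routing problem.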
Runtime behavior
The deployment uses an init container to copy configuration and seed files into /home/node/.openclaw on the PVC before the main container starts. That startup path does three important things:
- overwrites openclaw.json from the ConfigMap;
- removes legacy workspace directories sales, growth, and cs;
- seeds the main workspace plus 28 named agent workspaces and agent directories.
The main container then serves the gateway on port 18789 and reads the database password, gateway token, Telegram bot token, and API keys from Vault-backed Kubernetes secrets.
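The three init-container steps can be sketched roughly as follows. This is not the checked-in script: the function name, the idea of one seed subdirectory per workspace, and the placement of the legacy directories directly under the state directory are all assumptions made for illustration.

```bash
# Sketch of the seeding steps under the assumptions stated above.
# $1 = directory holding openclaw.json (ConfigMap mount, hypothetical path)
# $2 = directory with one subdirectory of seed files per workspace (hypothetical layout)
# $3 = state directory on the PVC, e.g. /home/node/.openclaw
seed_openclaw_state() {
  local config_src="$1" seed_src="$2" state_dir="$3"
  local legacy src name

  mkdir -p "$state_dir"

  # 1. Always overwrite the runtime config so the ConfigMap stays authoritative.
  cp -f "$config_src/openclaw.json" "$state_dir/openclaw.json"

  # 2. Remove the legacy workspace directories.
  for legacy in sales growth cs; do
    rm -rf "${state_dir:?}/$legacy"
  done

  # 3. Seed every workspace present in the seed source.
  #    Directories already on the PVC that are not in the seed set are left alone.
  for src in "$seed_src"/*/; do
    [ -d "$src" ] || continue
    name=$(basename "$src")
    mkdir -p "$state_dir/$name"
    cp -R "$src." "$state_dir/$name/"
  done
}
```

Because step 3 only touches directories it knows about, anything else on the PVC persists across restarts, which is how stale workspaces can outlive a deploy.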
Security and boundary controls
- The namespace uses a default-deny network policy.
- Egress is narrowed to:
  - DNS in kube-system,
  - the Kubernetes API at 10.43.0.1:443,
  - external HTTPS on public IP space only,
  - PostgreSQL in namespace infra-db.
- Ingress is limited to TCP 18789.
- The service account receives a custom read-only ClusterRole for pods, services, nodes, namespaces, events, configmaps, workloads, ingresses, jobs, and ArgoCD Application objects.
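One way to confirm that the ClusterRole is read-only in practice is to impersonate the service account with `kubectl auth can-i`. The sketch below assumes the service account is named `openclaw` in the openclaw namespace, which the source does not state, and it requires the caller to have impersonation rights.

```bash
# Hypothetical helper; pass the real service account name as $1 if it is not "openclaw".
check_cluster_access() {
  local subject="system:serviceaccount:openclaw:${1:-openclaw}"
  local resource
  # Reads the ClusterRole is described as granting; expect "yes".
  for resource in pods services nodes namespaces events configmaps; do
    printf 'list %s: ' "$resource"
    # can-i exits non-zero on "no", so keep going regardless of the answer.
    kubectl auth can-i list "$resource" --all-namespaces --as="$subject" || true
  done
  # A write that a read-only role should refuse; expect "no".
  printf 'delete pods: '
  kubectl auth can-i delete pods --all-namespaces --as="$subject" || true
}
```

Any "yes" on the final line would suggest that a broader role is bound to the service account than the one described here.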
Dependency notes
- openclaw.json configures deepseek/deepseek-chat as the current primary model provider.
- Secrets for Anthropic, OpenAI, and Google are also mounted, but the checked-in runtime config does not declare providers that use them today.
Failure behavior and operational risks
- Missing OPENCLAW_GATEWAY_TOKEN or OPENCLAW_DB_PASSWORD leaves the pod running but unusable.
- If the PVC becomes stale or corrupted, old workspaces can survive across deploys because the init container only seeds known files and directories.
- Health probes only test whether the process is accepting TCP connections on 18789; they do not validate database access or upstream API reachability.
- Egress policy mistakes show up as provider timeouts, Telegram delivery failures, or database connection errors rather than admission failures.
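Since the probes stop at the TCP level, a manual preflight can cover the two gaps named above: missing secrets and an unreachable database. The sketch below is hypothetical and not part of the checked-in manifests; it assumes bash and coreutils `timeout` are available wherever it runs, for example in a shell subject to the same network policy as the pod.

```bash
# Return non-zero if any named variable is unset or empty in the environment.
require_env() {
  local name missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# TCP-level reachability only, the same depth as the existing probes.
# Uses bash's /dev/tcp redirection; host and port are passed as arguments.
tcp_accepts() {
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}

# Combine both checks against the database endpoint listed in the contracts table.
preflight() {
  require_env OPENCLAW_GATEWAY_TOKEN OPENCLAW_DB_PASSWORD || return 1
  tcp_accepts infra-db-rw.infra-db.svc 5432 || {
    echo "database port not reachable" >&2
    return 1
  }
}
```

This still does not prove that the database password is valid or that upstream providers respond; it only separates "secret not wired" and "egress blocked" from other startup failures.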
Observability
- Deployment logs are the primary signal for startup seeding, provider errors, and database failures.
- Probe failures show up at the pod level; there is no checked-in ServiceMonitor for OpenClaw.
- Teleport access issues can be isolated by testing the in-cluster service separately from the public app route.
Verification
kubectl get applications.argoproj.io -n argocd openclaw
kubectl get deploy,pods,svc,pvc,configmap,networkpolicy -n openclaw
kubectl get clusterrole,clusterrolebinding | rg openclaw
kubectl logs -n openclaw deploy/openclaw
kubectl describe pod -n openclaw -l app=openclaw