Teams already running OpenClaw in production fear three things in 2026: floating latest, upgrading without a full state backup, and stopping the Gateway in the wrong order so tokens and volumes diverge. This is not a greenfield install guide. It complements the Windows/macOS/Linux install, Docker production, post-install triage, and Linux systemd + Tunnel articles with pain-point framing, a local vs container upgrade matrix, a symptom–rollback table, backup snippets, a six-step upgrade runbook, and three on-call lines. For Secrets and exposure, continue with the advanced Gateway runbook.
The install guide covers “how to install”; Docker production covers “how to stay up and fix common container issues”; post-install covers “it won’t start: triage by symptom”; Linux + Tunnel covers “systemd without binding the Gateway to the public internet.” This article covers only upgrades, host migration, changing deployment shape, and rollback, written so a change ticket can be signed off.

Six pains recur in real on-call work; paste them straight into the change description.
- Images run on latest or a floating tag, so rollback cannot reproduce the previous digest.
- ~/.openclaw (or the documented directory) with keys, workspace, and gateway config is not snapshotted with volumes.

Use this in review to pick a path per release; field names follow your fork or release notes.
| Dimension | Local path | Docker Compose path |
|---|---|---|
| Version anchor | Pin npm/pnpm and lockfile; record Node minor | Pin image tag or digest; ban silent latest |
| State location | ~/.openclaw and local workspace paths | Bind mount or named volume mapped to a host path |
| Keys and tokens | Env, keychain, or .env (never commit) | .env, Docker secrets, or orchestration vars; export before upgrade |
| Health | CLI / local port probes | compose ps, in-container health command, host ports |
| Rollback lever | Reinstall pinned package + restore directory tarball | Revert image digest + restore volume snapshot |
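
The "Version anchor" row can be enforced with a small pre-flight check. The sketch below is a minimal illustration, not part of OpenClaw: the function name `is_pinned_ref` and the image names in the usage note are hypothetical. It accepts a digest reference or an explicit tag other than latest, and rejects an untagged reference because Docker resolves that to latest. A non-latest tag can still float (for example a tag named "stable"), so a digest remains the stronger anchor.

```shell
# is_pinned_ref IMAGE_REF
# Returns 0 when the reference carries a digest or an explicit non-latest tag,
# and 1 when it is untagged or tagged latest. (Hypothetical helper.)
is_pinned_ref() {
  ref="$1"
  case "$ref" in
    *@sha256:*) return 0 ;;   # digest pin: rollback can reproduce it exactly
  esac
  # Drop the registry/path prefix so a registry port (host:5000/img)
  # is not mistaken for a tag.
  last="${ref##*/}"
  case "$last" in
    *:latest) return 1 ;;     # explicit floating tag
    *:*)      return 0 ;;     # explicit tag other than latest
    *)        return 1 ;;     # no tag: Docker falls back to latest
  esac
}
```

For example, `is_pinned_ref "example/openclaw:1.2.3"` succeeds while `is_pinned_ref "example/openclaw"` fails, so a change script can stop before `docker compose pull` when the anchor is missing.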
Complements post-install triage: here the focus is decision order inside an upgrade window. For Control UI and public exposure, align with the advanced runbook.
| Symptom | Suspect first | Do first | If still failing |
|---|---|---|---|
| Gateway restart loop | Bind address vs UI allowlist misaligned | Try loopback binding or fix allowlist-style settings, then retry | Revert previous image digest and restore volume snapshot |
| Pairing / device auth odd | Two instances or rotated tokens clients did not pick up | Fully stop the old instance; list devices in CLI and follow vendor pairing flow | Restore token files from backup and briefly roll back the version |
| Model connectivity timeout | Egress, proxy, or key change | curl from container and host; smallest key rotation test | Check vendor status and firewall policy |
| Disk filled then upgrade fails | Workspace and logs without rotation | Snapshot before aggressive cleanup; free space per vendor guidance before upgrade | Resize disk or move workspace to a dedicated volume |
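
The last row (disk filled, then the upgrade fails) is easier to catch before the window opens. The following is a rough sketch under assumptions: the helper name `has_free_kb` and the threshold in the usage note are illustrative, and the right minimum depends on your image and workspace sizes. It reads the "Available" column from the POSIX `df -Pk` output.

```shell
# has_free_kb PATH MIN_KB
# Returns 0 when the filesystem holding PATH has at least MIN_KB
# kilobytes available, 1 otherwise. (Hypothetical helper.)
has_free_kb() {
  path="$1"
  need_kb="$2"
  # df -P prints one header line, then one line per filesystem;
  # with -k the fourth column is available space in 1024-byte blocks.
  avail_kb="$(df -Pk "$path" | awk 'NR==2 {print $4}')"
  [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$need_kb" ]
}
```

A change script might run `has_free_kb "$HOME" 5242880 || { echo "less than 5 GiB free" >&2; exit 1; }` before pulling images; the 5 GiB figure is only a placeholder.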

    # Example: tarball state before upgrade (path per your environment; never commit archives to git)
    tar czf openclaw-state-$(date +%Y%m%d).tgz -C "$HOME" .openclaw

    # Compose: pin digest or minor tag in .env or compose, then pull & up
    # docker compose pull && docker compose up -d
Note: restrict backup files that contain secrets to ops roles; rotate any token that might have leaked after restore; keep Secrets audit cadence aligned with the advanced runbook.
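
One way to apply the note above is to create the archive with owner-only permissions and confirm it is readable before the upgrade starts. This is a sketch, assuming the state lives in a single directory such as ~/.openclaw; the function name `backup_state` is hypothetical, and it uses only standard `tar`, `umask`, and `chmod`.

```shell
# backup_state PARENT_DIR NAME OUT_FILE
# Archives PARENT_DIR/NAME into OUT_FILE with owner-only permissions,
# then checks that the archive can be listed. (Hypothetical helper.)
backup_state() {
  parent="$1"
  name="$2"
  out="$3"
  if [ ! -d "$parent/$name" ]; then
    echo "missing state dir: $parent/$name" >&2
    return 1
  fi
  # The subshell keeps the restrictive umask from leaking into the caller.
  ( umask 077; tar czf "$out" -C "$parent" "$name" ) || return 1
  # Tighten permissions even if OUT_FILE already existed with a looser mode.
  chmod 600 "$out" || return 1
  # A listing pass catches truncated or corrupt archives early.
  if ! tar tzf "$out" >/dev/null; then
    echo "archive unreadable: $out" >&2
    return 1
  fi
}
```

Usage would look like `backup_state "$HOME" .openclaw "openclaw-state-$(date +%Y%m%d).tgz"`. Listing the archive proves it is readable, not that a restore works; a rehearsal restore on a spare host is still the real test.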
- Do not switch to latest mid-window.
- Back up .env (redacted copy for audit).

If laptops and servers each run a Gateway, tag instances in monitoring; otherwise alerts only say “down” and DNS cutover is guesswork. That extends the token and exposure contract from the Docker production article.
Sleep, OS updates, and surprise disk use turn upgrade windows into random events. A throwaway VPS without snapshots aligned to term length often lacks a disk to roll back to. Treating the Gateway as production requires predictable uptime, recoverable disks, and a region choice, with the runbook wired to monitoring.
Fragmented hosts also fight Secrets governance and audit: who last rotated a token, which machine runs digest X. MACCOME offers Mac Mini M4 / M4 Pro bare-metal nodes across regions with flexible terms—as a stable execution layer or dedicated Gateway host. Review with the multi-region and rental-term guide and SSH vs VNC access decisions, then align plans with rental rates and help center billing and access wording.
For pilots, short-term rent in the target region to rehearse backup–upgrade–rollback before locking a longer contract.
FAQ
What are the three things you must not skip before an upgrade?
Pin image or package version anchors, fully back up the state directory and volumes, and record the rollback owner and validation cases in the window. For commercial terms, see the rental rates page and the help center.
After a Gateway upgrade, Control UI misbehaves—what should you check first?
Binding address and health checks; in Docker, loopback vs allowlist-style settings; on local installs, ports and tokens. Step-by-step triage stays in the post-install article.
How does this work with the Docker production guide?
Production covers residency and common container failures; this covers backup order, anchors, and rollback around upgrades. Keep both next to the Docker production article in the same handbook index.