Who hits this: OpenClaw is installed via Docker or locally, but the openclaw CLI keeps throwing WebSocket 1006/1008 during first pairing, onboarding, or in-container runs, or logs show token mismatch even after you edit config files.

Takeaway: align environment-variable overrides, the actual WebSocket URL the CLI uses, and the pairing state machine on one matrix, then chain openclaw doctor with the Docker networking article.

Outline: six common misreads, symptom matrix, fingerprint commands, six-step runbook, three KPIs, and a hosting decision close.
gateway.auth.token still yield mismatch? Six frequent misreads

In 2025–2026 community triage, OpenClaw Gateway pairing and auth issues are often conflated with “the network is down”: logs mix close codes with model errors, so people assume the upstream LLM failed. Print these six on-call traps next to post-install doctor and Gateway health checks on your wiki landing page.
- OPENCLAW_GATEWAY_TOKEN silently overrides the config file: environment variables injected by containers or launchd units win; you changed gateway.auth.token on disk but the process still reads the old value, so you see “restart and still mismatch.”
- 127.0.0.1: if Compose never points the CLI Gateway URL at the openclaw-gateway service name, the handshake fails early; logs may only show 1006/1008 without an app-layer error, overlapping the Docker network triage checklist.
- .openclaw tree under the user profile and another under the project; the path the CLI actually loads is not the file you have open in the editor.

Relationship to official install scripts and npm global paths: the install article guarantees “binaries and Node versions are on PATH”; this article guarantees “CLI and Gateway speak the same token and the same WebSocket endpoint.” Both belong in the same first-day runbook, in order.
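The loopback misread can be caught before any handshake is attempted. The following is a minimal sketch, assuming the CLI's Gateway URL is available to you as a plain `ws://` or `wss://` string; `ws_url_host` and `is_loopback_host` are hypothetical helper names, not part of the openclaw CLI, and IPv6 literals are not handled.

```shell
#!/bin/sh
# Extract the host part from a ws:// or wss:// URL (no IPv6 bracket handling).
ws_url_host() {
  url=$1
  rest=${url#*://}              # drop the scheme
  rest=${rest%%/*}              # drop any path
  rest=${rest##*@}              # drop userinfo if present
  printf '%s\n' "${rest%%:*}"   # drop the port
}

# Return 0 when the host is a loopback name or IPv4 loopback address.
is_loopback_host() {
  case $1 in
    localhost|127.*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example: a URL that is likely wrong when the CLI runs in its own container.
host=$(ws_url_host "ws://127.0.0.1:18789")
if is_loopback_host "$host"; then
  echo "warning: $host is loopback; inside a container this targets the container itself"
fi
```

If the check fires inside a CLI container, the URL host most likely needs to be the Compose service name (openclaw-gateway in this article) instead.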
Use this matrix for first-pass triage: if a row matches, produce reproducible command output for that check before going deeper; avoid changing token, Compose, and reverse proxy all at once.
| Surface symptom you see | Likely stack first | Immediate check | Next doc |
|---|---|---|---|
| Logs: token mismatch and editing the file does nothing | Environment overrides / multiple configs | Print OPENCLAW_GATEWAY_* in the process environment; compare the actually loaded path | This article §3 fingerprint script; post-install doctor article |
| Fails only in the container; host works | Loopback / service name / DNS | From the container, curl or nc the Gateway port; verify the WebSocket URL host | Docker network triage checklist |
| 1008 plus 401/403 semantics or explicit auth failure | Auth config or reverse proxy stripping headers | Reproduce on loopback direct; compare response headers with and without the proxy | Nginx/Caddy reverse proxy and WebSocket article |
| Frequent 1006 with no clear auth error | Idle disconnects, probes killing sessions, version skew | Align CLI and Gateway versions; check Gateway logs for deliberate session recycle | Gateway no-reply and doctor article |
| Onboarding UI/CLI appears stuck | State machine incomplete / port collision | Check listen-port conflicts; before re-pairing, clear transient state per upstream guidance | Official troubleshooting; this article runbook |
| Reinstall “connects once” then immediately drops | Old token still injected somewhere | Inspect systemd drop-ins, shell profiles, CI variables | Install script article pin and proxy fallback section |
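The two close codes in the matrix have fixed meanings in RFC 6455, which is why they work as labels rather than diagnoses. A minimal sketch follows; `describe_close_code` is a hypothetical helper, and the suggested first checks paraphrase the matrix rows rather than any OpenClaw output.

```shell
#!/bin/sh
# Map a WebSocket close code to its RFC 6455 meaning plus a first check.
# The hints after the colon are taken from the symptom matrix, not from OpenClaw logs.
describe_close_code() {
  case $1 in
    1006) echo "1006 abnormal closure (connection dropped without a close frame): check reachability, URL host, version skew" ;;
    1008) echo "1008 policy violation (endpoint rejected a message under its policy): check token source and proxy headers" ;;
    *)    echo "$1 not covered here: read adjacent log lines" ;;
  esac
}

describe_close_code 1006
describe_close_code 1008
```

Because 1006 is only ever reported locally and never sent by the peer, it says nothing about why the connection dropped, so an auth problem can still sit behind it.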
Paste outputs into the ticket; replace placeholder roots with your config root. When reviewing with Docker volumes and permissions, confirm mounts are not masking a new volume with an old config directory.
    # A) Environment variables visible in this shell (watch case and prefixes)
    env | sort | grep -i OPENCLAW || true

    # B) Example only: if systemd manages the gateway, check drop-ins for injected tokens
    # systemctl show openclaw-gateway --property=Environment 2>/dev/null || true

    # C) CLI version and doctor (shallow first; avoid blind --fix in production)
    openclaw --version || true
    openclaw doctor 2>/dev/null | sed -n '1,40p' || true

    # D) Print the CLI-side gateway URL (exact subcommand depends on your installed build)
    # openclaw config get gateway.remoteUrl   # example name, placeholder

    # E) Docker: in the container that runs the CLI, confirm the ws target is not a mistaken 127.0.0.1:18789
    # docker compose exec cli sh -lc 'env | grep -i OPENCLAW; getent hosts openclaw-gateway || true'
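Pasting raw `env` output into a ticket exposes the token itself. One way to compare the environment value with the file value without revealing either is to compare short hashes. This is a sketch under assumptions: `token_fingerprint` is a hypothetical helper, and `FILE_TOKEN` is a placeholder you fill by hand from gateway.auth.token, because the config file location and format depend on your install.

```shell
#!/bin/sh
# Print a short SHA-256 fingerprint of a token so two values can be compared
# in a ticket without exposing the secret itself.
token_fingerprint() {
  if command -v sha256sum >/dev/null 2>&1; then
    printf '%s' "$1" | sha256sum | cut -c1-12
  else
    # macOS ships shasum instead of sha256sum
    printf '%s' "$1" | shasum -a 256 | cut -c1-12
  fi
}

# FILE_TOKEN is a placeholder: set it to the value of gateway.auth.token
# copied from the config file your process actually loads.
FILE_TOKEN=${FILE_TOKEN:-}
if [ -n "${OPENCLAW_GATEWAY_TOKEN:-}" ] && [ -n "$FILE_TOKEN" ]; then
  env_fp=$(token_fingerprint "$OPENCLAW_GATEWAY_TOKEN")
  file_fp=$(token_fingerprint "$FILE_TOKEN")
  if [ "$env_fp" = "$file_fp" ]; then
    echo "tokens match ($env_fp)"
  else
    echo "dual track: env=$env_fp file=$file_fp"
  fi
fi
```

Differing fingerprints point to the environment-override row of the matrix; matching fingerprints let you move on to the URL and network checks.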
Note: In community issues, mismatched tokens between environment variables and files often cause long onboarding stalls; capture full output from steps A/B before debating Compose changes.
- OPENCLAW_GATEWAY_TOKEN and similar injections one by one, restart processes until the environment is clean, then restore a single source of truth in the config file.
- doctor --deep exists in your build, use it inside the change window and archive the output.

Engineering alignment note (community and ops experience, not lab benchmarks): in public issues, dual token tracks and container loopback mis-targeting stay near the top of “first deploy failed” themes; after teams add environment-variable audits to their change templates, the mean number of triage rounds usually drops. More importantly, these failures are only weakly correlated with CPU clock speed, and more RAM alone rarely fixes a bad handshake.
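The environment audit from the runbook can be sketched as a check that reports variable names only, so the output is safe to archive. `list_gateway_overrides` is a hypothetical helper, and it inspects only the current shell; an already running Gateway process may carry a different environment (on Linux, `/proc/<pid>/environ` shows it).

```shell
#!/bin/sh
# List the names (not values) of OPENCLAW_GATEWAY_* variables in the current
# environment, so each injection can be traced to its source and removed.
list_gateway_overrides() {
  env | sed -n 's/^\(OPENCLAW_GATEWAY_[A-Za-z0-9_]*\)=.*/\1/p' | sort
}

overrides=$(list_gateway_overrides)
if [ -n "$overrides" ]; then
  echo "still injected in this shell:"
  echo "$overrides"
else
  echo "no OPENCLAW_GATEWAY_* variables in this shell"
fi
```

Running it before and after each restart gives a simple record that the injections were actually removed rather than shadowed.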
If the Gateway must stay online 24/7 without fighting laptop sleep or power settings, put “stable dedicated execution” and “pairing/upgrade windows” in the same SRE doc; this matches how enterprises keep agent gateways on remote Macs.
On personal hardware the Gateway is exposed to sleep, VPN flips, and enterprise certificate churn, which makes pairing state machines harder to audit and replay; when token rotation spans multiple people’s CI, laptops also lack a stable hostname and loopback boundary, so logs fragment.
Placing the Gateway on a dedicated remote Mac that you can restart predictably, with known disk and log behavior, and on the same network as team runners, usually converges onboarding issues faster than drifting across several personal machines. Teams that need Apple Silicon online continuously with a CI-aligned secret model can use MACCOME Mac mini M4 / M4 Pro multi-region nodes and flexible rental terms to keep “pairing triage” and “stable execution” on one invoice and change cadence. Read the public pricing page first, then align operations with the remote Mac unattended operations checklist.
Pilot idea: pick one remote host in the same region as primary CI, deploy only Gateway plus a minimal smoke job, run this article’s six-step runbook in a bi-weekly review, then decide whether to move interactive development into the same topology.
FAQ
Should I run doctor first or change the token first?
Triage pairing versus token first, using the symptom matrix (Table 1) in this article; if you have already confirmed a network-layer issue, run doctor’s network checks in parallel. Public pricing and regions: Mac mini rental rates.
Is 1006 always “less serious” than 1008?
Not necessarily. Read adjacent log lines and whether the failure is stable; treat close codes as labels, not conclusions, so you do not skip auth checks.
Is it okay to export a long-lived token in production containers?
Not recommended. Prefer short-lived credentials injected by the orchestrator or a secrets sidecar, with a single declared source of truth; otherwise rotations almost always create dual tracks.
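One way to avoid a shell-wide `export` is to read the token from a file the orchestrator mounts and scope it to a single command. This is a sketch, not a recommended production pattern: `run_with_token_file` is a hypothetical helper, the temporary file stands in for an orchestrator-provided secret file, and it assumes the child process reads OPENCLAW_GATEWAY_TOKEN from its environment as described earlier in this article.

```shell
#!/bin/sh
# Run one command with OPENCLAW_GATEWAY_TOKEN set from a file, without
# exporting the token into the calling shell.
run_with_token_file() {
  token_file=$1
  shift
  [ -r "$token_file" ] || { echo "token file not readable: $token_file" >&2; return 1; }
  # The assignment prefix applies only to the command that follows.
  OPENCLAW_GATEWAY_TOKEN=$(cat "$token_file") "$@"
}

# Demo with a throwaway file standing in for an orchestrator-mounted secret.
tmp=$(mktemp)
printf '%s' 'example-token' > "$tmp"
run_with_token_file "$tmp" sh -c 'echo "child sees a token of length ${#OPENCLAW_GATEWAY_TOKEN}"'
rm -f "$tmp"
```

Because the calling shell never holds the value, a rotation only has to replace the file, which keeps a single declared source of truth.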