Teams running OpenClaw on Docker Compose rarely fail because images cannot be pulled. They fail because Gateway logs look fine while browsers or the CLI report connection refused, failed WebSocket handshakes, or token errors—often due to listen addresses (bind), port publishing, and whether the CLI shares the Gateway network namespace, not the model API key. This article provides six on-call symptom classes, two matrices for “not running” vs “running but unroutable,” a bind/firewall/publish map, copy-paste diagnostics, a six-step runbook, and three log KPIs. Pair it with the Docker production runbook, doctor post-install triage, and Kubernetes probe guide: production answers how to deploy; this answers why containers cannot see each other or the host the way you think.
The control plane combines Gateway WebSocket/HTTP with CLI / Control UI, plus multiple containers, custom bridges, and network_mode: service:.... Without layered triage, teams churn across .openclaw files without checking whether the listener is reachable from the CLI namespace.
- Gateway listens on 127.0.0.1 inside the container; the host reaches it via published ports, yet another service in the same compose file resolves a different path.
- The CLI targets gateway:18789 via bridge DNS while Gateway only exposes loopback to the shared service stack: the classic “works once, breaks after restart.”
- Host curl intermittently succeeds while in-container probes fail after a rolling upgrade leaves old NAT rules.
- Name resolution hits ::1 or bad AAAA records on slim images.

Track these on the same change ticket as volumes, image digests, and health checks from production: reachability vs version correctness.
Maintain a one-page network topology in the compose repo: which services sit on which network, who publishes ports, and how dev laptops vs CI probe the stack. DNS on custom networks differs from the default bridge; when service names fail, run getent hosts or nslookup inside the container before blaming OpenClaw.
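The name-resolution check described above can be sketched as a small shell helper. This is a minimal sketch, not an official tool: the helper name `check_dns` is hypothetical, and the service names `openclaw-cli` and `openclaw-gateway` are borrowed from the diagnostics in this article and may differ in your compose file.

```shell
# Resolve NAME from inside compose service SVC before blaming the application.
# Assumption: the image ships getent; some slim images only have nslookup.
check_dns() {
  svc=$1
  name=$2
  # -T disables TTY allocation so the output can be captured in scripts.
  if out=$(docker compose exec -T "$svc" getent hosts "$name" 2>/dev/null) && [ -n "$out" ]; then
    # getent prints "ADDRESS  NAME ..."; after word splitting, $1 is the address.
    set -- $out
    echo "OK $name -> $1"
  else
    echo "FAIL $name did not resolve from $svc"
    return 1
  fi
}
```

Calling `check_dns openclaw-cli openclaw-gateway` from the compose project directory reports whether the CLI container can resolve the Gateway service name on its network.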
Always run official doctor and gateway status first (see the post-install guide). Do not rotate tokens before you know a listener exists.
| Signal | Likely class | First action | Anti-pattern |
|---|---|---|---|
| No Gateway container or CrashLoop | Not up | docker logs, OOM, probes killing pods | Endless pull without resource checks |
| Running but no ss listener inside | Config/bind failure | Check OPENCLAW_GATEWAY_BIND and compose command vs docs | Editing host /etc/hosts only |
| Listener OK, CLI wget fails | Cross-namespace routing | Consider network_mode: "service:openclaw-gateway" | Blind 0.0.0.0 without threat modeling |
| Host browser fails, container succeeds | Publish / proxy | Validate ports:, VPN, PAC files | Disabling TLS randomly |
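The matrix above reads top to bottom as a decision order, which can be sketched as a shell function. The function name `classify_symptom` and its yes/no arguments are hypothetical conventions for illustration; the classes and first actions are the ones listed in the table.

```shell
# Map four yes/no observations onto the triage classes from the table.
# Arguments, each "yes" or "no":
#   $1 Gateway container is running
#   $2 a listener is visible in ss inside the container
#   $3 the CLI container can reach the Gateway
#   $4 the host browser or curl can reach the Gateway
classify_symptom() {
  running=$1 listener=$2 cli_ok=$3 host_ok=$4
  if [ "$running" != yes ]; then
    echo "not-up: check docker logs, OOM, and probes"
  elif [ "$listener" != yes ]; then
    echo "bind-failure: check OPENCLAW_GATEWAY_BIND and the compose command"
  elif [ "$cli_ok" != yes ]; then
    echo "cross-namespace: consider network_mode service:openclaw-gateway"
  elif [ "$host_ok" != yes ]; then
    echo "publish-or-proxy: validate ports:, VPN, PAC files"
  else
    echo "reachable: look beyond networking"
  fi
}
```

Checking in this order keeps the earlier rule intact: nobody rotates tokens or edits proxy settings before a listener is known to exist.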
Align with official gateway.bind values such as loopback, lan, tailnet, and auto; compose must also state who publishes ports.
| Goal | Bind / env intent | Compose notes | Security |
|---|---|---|---|
| Local laptop only | Loopback-first; host hits published port | 127.0.0.1:18789:18789 | Do not assume other compose services inherit loopback reachability |
| CLI tightly coupled to Gateway | Share the network stack | network_mode: "service:openclaw-gateway" | Shared port space—avoid duplicate binds |
| LAN debugging | lan or equivalent | Bind 0.0.0.0 vs specific NIC explicitly | Pair with upstream firewall rules |
| Tunnel / reverse proxy | Gateway loopback; TLS at edge | Split networks; verify WebSocket pass-through | No naked admin ports on the public Internet |
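To compare the intent in the map above with what the Gateway actually listens on, the local-address column of `ss -lnt` can be classified. A minimal sketch follows; `bind_scope` is a hypothetical helper name, and the three output labels are illustrative rather than official gateway.bind values.

```shell
# Classify one "Local Address:Port" value as printed by `ss -lnt`.
bind_scope() {
  case $1 in
    127.*:*|"[::1]":*)        echo loopback ;;          # IPv4 or IPv6 loopback
    0.0.0.0:*|"*":*|"[::]":*) echo all-interfaces ;;    # wildcard listeners
    *)                        echo specific-address ;;  # a named NIC address
  esac
}
```

One way to use it, assuming the `openclaw-gateway` service name from this article, is to pipe `docker compose exec -T openclaw-gateway ss -lnt` through `awk 'NR>1 {print $4}'` and pass each value to `bind_scope`. An `all-interfaces` result on a stack meant to be loopback-first is worth an exposure review.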
```shell
# 1) Host: is the port actually published?
docker compose ps
curl -sv --max-time 2 http://127.0.0.1:18789/ || true

# 2) Inside gateway container
docker compose exec openclaw-gateway sh -lc 'ss -lntp 2>/dev/null || netstat -lntp'

# 3) From CLI container (rename services)
docker compose exec openclaw-cli sh -lc 'wget -qO- --timeout=2 http://openclaw-gateway:18789/ || echo FAIL'

# 4) Inspect effective compose
docker compose config | sed -n '1,200p'
```
Heads-up: Community reports tie “CLI cannot reach Gateway” to compose files that never put the CLI in the Gateway network namespace. Prove the command block on a test stack before merging to production.
If install paths are unclear, start with the three-platform install guide.
Verify proxy Upgrade headers and path rewrites before touching Gateway TLS knobs. The runbook is compatible with HTTP probes from the Kubernetes health-check article when you promote the same stack to orchestration.
- ss local address, process name, compose service: any delta needs a ticket.
- docker port vs iptables NAT after rollouts: stale chains still bite in 2026.
- Tag Gateway logs for handshake failures separately from upstream 429/5xx; if the latter dominates, pivot to the provider failover guide.
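Separating handshake failures from upstream errors can be sketched as a log tally. The Gateway log format is not shown in this article, so both regular expressions below are assumptions to replace with patterns from your own logs; `tally_log_classes` is a hypothetical helper name.

```shell
# Hypothetical patterns: adjust to the real log lines your Gateway emits.
HANDSHAKE_RE=${HANDSHAKE_RE:-'handshake (failed|error)'}
UPSTREAM_RE=${UPSTREAM_RE:-' (429|5[0-9][0-9]) '}

# Read log lines on stdin and count the two classes separately.
tally_log_classes() {
  awk -v hs="$HANDSHAKE_RE" -v up="$UPSTREAM_RE" '
    tolower($0) ~ hs { h++; next }   # handshake failures, case-insensitive
    $0 ~ up          { u++ }         # upstream 429 or 5xx status codes
    END { printf "handshake=%d upstream=%d\n", h, u }
  '
}
```

For example, `docker compose logs --no-color openclaw-gateway | tally_log_classes` prints one summary line. If the upstream count dominates, the problem is probably not reachability.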
If HTTP probes target loopback while user traffic enters from another interface, you can see all-green probes with all-red users; align probe URLs with the bind policy from table 2 before blaming a release.
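A quick way to catch this mismatch is to check whether a probe URL only exercises loopback. The sketch below uses hypothetical helper names (`probe_host`, `probe_is_loopback`) and handles plain http(s) URLs without userinfo.

```shell
# Extract the host part of an http(s) URL.
probe_host() {
  rest=${1#*://}      # drop the scheme
  rest=${rest%%/*}    # drop the path
  case $rest in
    "["*"]"*) echo "${rest%%"]"*}]" ;;   # bracketed IPv6 literal, keep brackets
    *)        echo "${rest%%:*}" ;;      # strip an optional :port
  esac
}

# Succeed (exit 0) when the probe never leaves loopback.
probe_is_loopback() {
  case $(probe_host "$1") in
    127.*|localhost|"[::1]") return 0 ;;
    *) return 1 ;;
  esac
}
```

If `probe_is_loopback http://127.0.0.1:18789/` succeeds while users arrive via a published port or a proxy, the probe proves only that the process is alive, not that the user path is routable.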
Docker Desktop sleep, VPN toggles, and local proxies change how localhost resolves. Production-style automation needs repeatable listen policies, audited compose revisions, and stable host boundaries. Ad-hoc laptops also rarely deliver multi-region egress with bare-metal isolation, which conflicts with always-on Gateway expectations.
For teams that need a reachable, on-call control plane, hosting Gateway on professional cloud Macs usually beats fragile personal hardware. MACCOME provides Mac Mini M4 / M4 Pro bare-metal nodes across Singapore, Japan, Korea, Hong Kong, US East, and US West. After network triage, pair SSH vs VNC with the help center, then finalize rental rates and regional pages.
Pilot on a dedicated test host, archive logs, then promote to the shared compose repo—avoid tribal network_mode knowledge.
Any temporary 0.0.0.0 bind needs a documented rollback and exposure review; triage aims to align who should see the control plane with namespace design, not to maximize listen scope.
FAQ
How is this different from the Docker production runbook?
Production covers images, volumes, and rollouts; this covers reachability. Use the help center plus the production runbook together.
Does the same matrix apply on WSL2?
Same order of operations, different localhost forwarding—stack the WSL2 triage article on top.
Where should I read about regions and rental terms?
If Gateway moves to a cloud Mac, align with the multi-region guide and rental rates before locking SSH egress.