2026 OpenClaw Docker Compose connectivity & pairing: 1006 vs 1008, one token source, Unix sockets & shared namespaces

~20 min read · MACCOME

Audience: teams running the Gateway and the openclaw CLI in separate Compose services who see healthy Gateway logs while the CLI reports gateway closed (WebSocket 1006/1008), pairing required, or two competing token tracks (env versus file). This page argues for three moves: fix the gateway URL semantics (service-name TCP or a shared Unix socket), collapse OPENCLAW_GATEWAY_TOKEN and gateway.auth.* into a single source of truth, and then, only if justified, narrow the calling surface via shared network namespaces. Scope versus April’s sub-agent 1008 + trustedProxies article: that one optimizes trusted CIDR paths; here we focus on how each container literally addresses the Gateway and whether the tokens match. Cross-read with the Docker network checklist, pairing & tokens, and volume permissions guides.

Five “looks-like-networking” root causes inside Compose splits

  1. Hard-coded 127.0.0.1: loopback names the current container only; use Compose DNS such as http://openclaw-gateway:18789 (adapt service names).
  2. Dual token sources drifting: OPENCLAW_GATEWAY_TOKEN may override filesystem config; mismatched mounts look like intermittent 401/1008 flips.
  3. Unix socket not co-mounted: if Gateway emits gateway.sock, both sides need the identical host path mapped with aligned UID/GID (see volumes guide).
  4. 1006 misread as “flaky websocket”: 1006 usually marks abnormal closure—crash, upstream timeout, or reverse-proxy retry—audit Gateway restarts before touching pairing knobs.
  5. Need loopback parity but stayed on default bridge: docs that demand localhost semantics may require network_mode: service:<gateway>; otherwise stick consistently to DNS TCP URLs. Never mix both approaches in the environment variables of a single change ticket.

Treat the five bullets as release-blocker checkboxes until each is green or explicitly waived—changing image tags first rarely fixes unresolved namespace/token contracts.
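The first bullet can be turned into a pre-flight check. The sketch below is a minimal illustration, assuming the CLI reads its Gateway address from a URL-shaped setting such as the OPENCLAW_GATEWAY_URL variable used in the Compose illustration on this page; the helper name is hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit


def is_loopback_url(url: str) -> bool:
    """Return True if the URL's host points back at the calling container."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        # Covers 127.0.0.0/8 and ::1 when the host is an IP literal.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, e.g. a Compose service name such as openclaw-gateway.
        return False
```

A CI hook could fail the build when this returns True for a service that does not share the Gateway's network namespace.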

Add a timestamp triad when WebSocket closes: align Gateway stderr, CLI stdout, and docker events to the millisecond—“random” outages often correlate with restarts or aggressive health checks.
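One way to build that triad is to merge the three streams into a single list and ask what happened around a given close. This is a sketch under stated assumptions: parsing each log format into timestamps is left to you, and the Event type, the source labels, and the 500 ms default window are choices made for the illustration.

```python
from datetime import datetime, timedelta
from typing import NamedTuple


class Event(NamedTuple):
    at: datetime
    source: str   # "gateway", "cli", or "docker" (labels chosen for this sketch)
    message: str


def neighbours(events: list[Event], anchor: Event, window_ms: int = 500) -> list[Event]:
    """Return events from other sources within window_ms of the anchor, oldest first."""
    window = timedelta(milliseconds=window_ms)
    nearby = [
        e for e in events
        if e.source != anchor.source and abs(e.at - anchor.at) <= window
    ]
    return sorted(nearby, key=lambda e: e.at)
```

If a CLI close event has a docker "die" or "restart" event as its neighbour, the restart is the first thing to explain.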

| Symptom | Investigate next (top-down) | Usually lands here |
| --- | --- | --- |
| 1006 plus fresh Gateway PID / OOM | same-window host/kernel logs → restart budgets → timeouts | transport closed, not a bad pairing policy |
| 1008 plus pairing required | token truth → Gateway URL env → persisted pairing bits on disk | dual mounts or unreadable state dirs |
| Connection refused inside CLI container only | move probes into that container; host curls are anecdotes | loopback misconception or unpublished port mapping |
| Unix socket ENOENT / permission denied | ls -la mounts; confirm parent dirs share the bind | socket missing from child namespace path |

Warning: closure codes cannot stand alone. 1006 and 1008 are labels; you must correlate them with Gateway logs, pairing state machines, and any reverse-proxy retry budgets. Otherwise, widening bind or blindly re-onboarding only grows exposure.
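The triage order in the symptom table can be written down as a small lookup so that on-call notes stay consistent. This is a sketch, not a diagnosis: the function name and its two boolean inputs are hypothetical, and the returned strings only point at where to look first.

```python
# Meanings per RFC 6455, section 7.4.1.
ABNORMAL_CLOSURE = 1006   # no close frame was received; the transport dropped
POLICY_VIOLATION = 1008   # the endpoint rejected a message on policy grounds


def first_check(code: int, gateway_restarted: bool, pairing_required: bool) -> str:
    """Suggest where to look first for a given WebSocket close code."""
    if code == ABNORMAL_CLOSURE:
        if gateway_restarted:
            return "host/kernel logs, restart budgets, timeouts"
        return "reverse-proxy retries and idle timeouts"
    if code == POLICY_VIOLATION:
        if pairing_required:
            return "token source, Gateway URL env, pairing state on disk"
        return "token source and Gateway logs"
    return "Gateway logs"
```

Keeping 1006 on the transport branch makes it harder to reach for pairing knobs before restarts have been ruled out.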

Six steps—prove reachability before celebrating pairing

  1. Declare one authoritative token outlet (config file XOR env—not both half-written); verify rendered output with docker compose config plus CI grep hooks.
  2. Freeze URL semantics: either Compose DNS TCP end-to-end, or Unix socket everywhere with one shared bind—do not interleave protocols per environment.
  3. Mount .openclaw state identically: Gateway and CLI must read the same writable tree; pairing appears “stuck on step zero” when the state directory is mounted read-only. Walk the volume checklist.
  4. Gateway status → doctor: openclaw gateway status, then openclaw doctor; align timestamps between host daemon logs and inner-container logs.
  5. When localhost parity truly matters: document network_mode: service:<gateway> OR mesh-side alternatives plus firewall rows consistent with trustedProxies playbooks.
  6. Closure evidence bundle: attach CLI-namespace probe screenshots plus the expanded Gateway URL and a token hash fingerprint. A verbal “curl works on my laptop” is grounds to reject the ticket.
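For the CLI-namespace probe in step 6, a transport-level check is often enough to separate "cannot reach" from "reached but rejected". The sketch below uses only the Python standard library and assumes the failing image has a Python interpreter; it proves that a TCP or Unix socket connection can be opened, not that the WebSocket upgrade or pairing succeeds. Function names are hypothetical.

```python
import socket


def can_connect_tcp(host: str, port: int, timeout: float = 3.0) -> bool:
    """True if a TCP connection to host:port opens from this namespace."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failure, connection refused, and timeouts.
        return False


def can_connect_unix(path: str, timeout: float = 3.0) -> bool:
    """True if a stream connection to the Unix socket at path opens."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(path)
            return True
        except OSError:
            # Covers ENOENT (socket not mounted here) and permission denied.
            return False
```

Run it inside the failing service, for example through docker compose exec against the CLI container, with the host and port taken from the same Gateway URL the CLI uses.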

If an outer Nginx/Caddy fronts the Gateway, reconcile Upgrade/idle timeouts between reverse-proxy guidance and the Compose-facing URL—or intermittent 1006 becomes indistinguishable from auth bugs.

```yaml
# Placeholder illustration: replace names/paths, add image: or build: for each
# service, then validate with docker compose config.
services:
  openclaw-gateway:
    environment:
      - OPENCLAW_CONFIG_DIR=/data/.openclaw
      # Single token source: both services read the same variable.
      - OPENCLAW_GATEWAY_TOKEN=${OPENCLAW_GATEWAY_TOKEN}
    volumes:
      # Same named volume and same target path in both services.
      - oc-data:/data/.openclaw
    networks: [oc-net]

  openclaw-cli:
    environment:
      - OPENCLAW_CONFIG_DIR=/data/.openclaw
      # Compose DNS name of the Gateway service, not 127.0.0.1.
      - OPENCLAW_GATEWAY_URL=http://openclaw-gateway:18789
      - OPENCLAW_GATEWAY_TOKEN=${OPENCLAW_GATEWAY_TOKEN}
    volumes:
      - oc-data:/data/.openclaw
    networks: [oc-net]
    depends_on:
      - openclaw-gateway

networks:
  oc-net:
    driver: bridge

volumes:
  oc-data:
```
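The "verify rendered output" step can be automated against the JSON form of the rendered file (docker compose config --format json). The sketch below assumes the rendered shape emitted by recent Compose v2 releases, where each service's environment is a mapping and each volume is a long-form object with source and target keys; check this against your Compose version. The function names are hypothetical.

```python
def _mounts(service: dict) -> set[tuple[str, str]]:
    """Collect (source, target) pairs from long-form volume entries."""
    return {
        (v["source"], v["target"])
        for v in service.get("volumes") or []
        if isinstance(v, dict) and "source" in v and "target" in v
    }


def contract_problems(rendered: dict, gateway: str, cli: str,
                      token_var: str = "OPENCLAW_GATEWAY_TOKEN") -> list[str]:
    """List token and state-mount mismatches between two rendered services."""
    services = rendered.get("services") or {}
    gw = services.get(gateway) or {}
    cl = services.get(cli) or {}
    problems = []

    gw_token = (gw.get("environment") or {}).get(token_var)
    cli_token = (cl.get("environment") or {}).get(token_var)
    if not gw_token or not cli_token:
        problems.append(f"{token_var} is empty or missing in at least one service")
    elif gw_token != cli_token:
        problems.append(f"{token_var} differs between {gateway} and {cli}")

    if not (_mounts(gw) & _mounts(cl)):
        problems.append("no volume shares the same source and target in both services")
    return problems
```

In CI, pipe the rendered JSON into a small script that calls contract_problems(json.load(sys.stdin), "openclaw-gateway", "openclaw-cli") and fails when the list is non-empty.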

Four engineering facts for your change request (not vendor SLAs)

  • Probes must execute in the failing namespace—otherwise you keep a Schrödinger Gateway that is “green on laptop, red in CI.”
  • Hash both token surfaces (config file + env) before ship; short fingerprints beat eyeballing strings.
  • Pairing failures correlate with restart storms: thrashing liveness probes amplify 1006 long before auth work helps, so tune readiness first.
  • Rule zero for cross-container debugging: any command not run inside the broken image is chat, not RCA.
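The fingerprint bullet can be sketched as follows. Extracting the token from the config file is left to the caller because the file layout is not shown here; both function names are hypothetical. Stripping surrounding whitespace is a choice made for this sketch, since a trailing newline is a common cause of mismatches; drop it if your Gateway compares raw bytes.

```python
import hashlib
import hmac


def fingerprint(token: str, length: int = 12) -> str:
    """Short SHA-256 prefix to quote in tickets in place of the token itself."""
    digest = hashlib.sha256(token.strip().encode("utf-8")).hexdigest()
    return digest[:length]


def same_token(env_token: str, file_token: str) -> bool:
    """Compare full digests in constant time rather than eyeballing strings."""
    return hmac.compare_digest(fingerprint(env_token, 64), fingerprint(file_token, 64))
```

A short prefix is reasonable for long random tokens; for short or guessable tokens, treat even the fingerprint as sensitive.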

Why throwaway laptop containers rarely scale OpenClaw production chains

Demos are fine on a single host; production automation plus rotating secrets needs predictable restarts, disk baselines, and audit-friendly pairing history. Parking the Gateway on a dedicated remote Mac with known six-region coverage and monthly/quarterly rental math usually pairs better with CI secret cadence than personal machines that sleep. MACCOME Apple Silicon nodes keep “Gateway always on” and “human pairing windows” on one operational contract. Start at the public Mac mini rental rates page and help center, then map env vars from this runbook.

Close: contract first, heroics second

Almost every Compose failure here is namespace + token + state directory written as three different documents. Until you show a repeatable in-container handshake capture, do not “swap images” or “disable TLS”; when code vs infra debates flare, elevate whichever side can replay the successful websocket inside the container—everything else waits in queue.

After contracts hold, integrate with GHCR + Control UI flows and hardened reverse proxies.

FAQ

Overlap with April’s trustedProxies pairing article?

That playbook covers coarse CIDRs and sub-agent traffic; here we wire URLs, sockets, namespaces, and single-source tokens—read both, discard neither.

If host curl succeeds, is CI necessarily fine?

No. Re-run probes inside the same container image that failed; mismatched namespaces make host curls noise.