2026 OpenClaw Agent sandbox: Docker image, `docker.user` & volume permissions, missing tools & OOM/exit-137 triage

About 15 min read · MACCOME

You already run OpenClaw Gateway on Docker or Compose, but turning on the Agent sandbox brings containers that fail to start, a read-only or unwritable workspace, tools that exist on the host but are missing in the sandbox, and builds that die with OOM exit code 137. This article condenses 2026 mainline practice into an order you can actually run: sandbox off, sandbox on, verify, fix. It covers the key settings, docker.user and volume permissions, a build/OOM triage table, and links back to our existing Docker, volume, and doctor articles. After reading, you will be able to tell a bad image, a mount UID problem, and a cap-and-log-signature problem apart.

The pain: why the sandbox “has everything locally but not in the container”

Usually the root cause is the process boundary, not a broken install:

  1. The root file system is not the host’s: the sandbox runs its own image. Tools you brew install on the host do not appear on the sandbox PATH unless they are baked into the image or you add extra mounts and layers.
  2. The workspace is bind-mounted: if the host directory is owned by 501:20 but the process in the container runs as 1000:1000, you often get read or write denials, logged as Permission denied or EROFS (see the quick check after this list).
  3. The sandbox is another security boundary: even if the Gateway can reach the network, DNS, egress, and the toolset inside the sandbox are their own design. It is a different path from “SSH in and install software on the host.”
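
A quick check of that boundary, sketched with a placeholder path and container name (substitute your own):

# On the host: numeric owner of the directory you bind-mount
ls -ldn /path/to/workspace                  # e.g. prints 501 20

# Inside the sandbox: the identity and PATH the tools actually get
docker exec my-openclaw-sandbox id
docker exec my-openclaw-sandbox sh -c 'echo "$PATH"; command -v rg || echo "rg is not in the image"'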

One-page comparison: Docker sandbox vs SSH/OpenShell vs no sandbox

Mode | What it is for | What you pay | Common mistake
Docker sandbox | Hard-isolate untrusted command lines or scripts; reproduce a “clean” world | Image size, pull or build time, UID/GID and volume alignment | Assuming host tools carry over automatically
SSH or remote shell | Run directly on a trusted remote Mac (a dedicated build box) with a full toolchain | Larger attack surface and a heavier account model | Treating the laptop as the sandbox and skipping audit
No sandbox / all local | Internal network, low risk, fast triage | High risk whenever model-generated code actually executes | Leaving it off for good in CI and production

Where to configure: agents.defaults.sandbox.docker and sandbox-setup.sh

Conceptually there are two layers: which sandbox you use (image and runtime) and where the image comes from. The repo usually ships a scripts/sandbox-setup.sh (the exact name varies by clone) to pre-build or pull an image the host docker CLI can see. In config, agents.defaults.sandbox.docker.image is the image that runs the session. When the two do not line up, the typical errors are manifest not found or pull access denied, even though the Gateway believes the sandbox is on.
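
A quick alignment check, sketched with placeholder names (jq is assumed to be installed; your config file and key nesting may differ):

# Build or pull the image the way the repo expects
./scripts/sandbox-setup.sh

# What the host docker CLI can actually see
docker image ls --format '{{.Repository}}:{{.Tag}}  {{.ID}}'

# What the config asks for; compare the two by eye
jq '.agents.defaults.sandbox.docker.image' openclaw.json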

info

How does this differ from our other posts? Stuck on OPENCLAW_IMAGE or the Control UI? Start with the official one-click Docker and GHCR image runbook. Stuck on the data volume, permissions, Skills paths, or persistence? Use the Docker volume and permissions checklist. This post focuses only on the sandbox subsystem and OOM/137.

docker.user and the OPENCLAW_HOME mount: match symptoms, not random chmod 777

The recommended order (commands are sketched after the list):

  • On the host, check the workspace you mount (owner and group) and the identity the Gateway process actually uses.
  • Set docker.user in config to a UID:GID that can write the volume (per your docs; in some versions the key is nested under sandbox.docker).
  • If policy forbids chown on the whole repo, prefer adjusting the one subdirectory that is mounted in rather than the whole disk.
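
A sketch of that order (paths and the container name are placeholders; note the GNU vs BSD stat flags):

# Host side: numeric owner of the mounted workspace
stat -c '%u:%g %n' /srv/openclaw/workspace      # GNU/Linux
# stat -f '%u:%g %N' /srv/openclaw/workspace    # macOS/BSD

# Container side: the identity the sandboxed process really runs as
docker exec my-openclaw-sandbox id

# If the two disagree, point docker.user at the volume owner instead of reaching for chmod 777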

On Linux and remote Macs, named vs bind mounts and SELinux labels (on some distros) amplify the “looks right in Compose, wrong in the container” class of bug. Grab the first I/O error line from the Gateway and sandbox start logs, not the random retries you see in the chat.
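
A hedged Compose fragment showing the two knobs above (service, image, and path names are invented; the :z label applies only on SELinux distros and should be dropped elsewhere):

services:
  openclaw-sandbox:
    image: openclaw-sandbox:local      # hypothetical image name
    user: "1000:1000"                  # match the volume owner, per docker.user above
    volumes:
      - /srv/openclaw/workspace:/workspace:z   # :z relabels the bind for SELinux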

Shortest triage for “tool on host, missing in sandbox”: four questions

  1. Is the binary under a directory on the image PATH? docker exec into the sandbox and run command -v; that beats a verbal guess (see the sketch after this list).
  2. Does it need a shared library or an interpreter at a specific path (a particular Node or Python minor version) that the image does not ship?
  3. Does it need a read-only bind of a host path such as ~/.cache to work? If so, prefer baking the cache into the image or a shared layer rather than threading ad hoc host paths through the boundary.
  4. Is the tool subject to the same network policy as the Gateway, and does it need an extra HTTP(S)_PROXY? Network failures inside wrapper scripts are often surfaced as command not found.
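
The four questions as one runnable pass (tool and container names are placeholders; ldd may be absent on musl-based images):

TOOL=rg
C=my-openclaw-sandbox

docker exec "$C" sh -c "command -v $TOOL || echo 'not on PATH'"                            # Q1
docker exec "$C" sh -c "ldd \$(command -v $TOOL) 2>/dev/null | grep 'not found' || true"   # Q2
docker exec "$C" sh -c 'ls ~/.cache 2>/dev/null || echo "no host cache mounted"'           # Q3
docker exec "$C" sh -c 'env | grep -i _proxy || echo "no proxy variables set"'             # Q4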

OOM and exit 137: “killed” and “out of memory” in Docker

Exit 137 = 128 + 9 (SIGKILL). It often points to the OOM killer, but it can also be a human-issued docker kill or a cgroup memory cap. At minimum the triage should note: whether the failure is in build or run (docker build vs docker run); on the engine, whether concurrent builds are filling Docker Desktop or the VM; on a remote Mac, real free memory and swap, not just a single container’s limit. A related class is exit 1 or 2 with lots of Cannot allocate memory in the output; track that in the same “memory or temp disk” bucket instead of endlessly retuning the model.
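
A sketch for separating those causes (the container name is a placeholder; docker events streams, so Ctrl-C when done):

C=my-openclaw-sandbox

# Did the kernel or cgroup OOM-kill this container?
docker inspect --format 'OOMKilled={{.State.OOMKilled}} ExitCode={{.State.ExitCode}}' "$C"

# Engine view: oom, kill, and die events in the incident window
docker events --since 30m --filter "container=$C" --filter event=oom --filter event=kill --filter event=die

# On a Linux host, the kernel log names the process it killed
dmesg | grep -i 'out of memory' | tail -5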

For OOM during build, a common mitigation order: reduce build parallelism (for example, drop --parallel or lower -j), use multi-stage builds to lower compile peaks, raise the Docker VM memory allocation, or move to a larger-memory dedicated remote Mac tier as a stable base. That is a different class of event from “I got lucky on a laptop running several IDEs, a local model, and Docker at once.” If the team already uses shared storage, also watch for metadata or small-file storms that feel like OOM: split volumes or change the cleanup policy.
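A minimal multi-stage sketch of the “lower the compile peak” idea (base images and the -j value are illustrative only):

# Stage 1: the heavy compile, with bounded parallelism to cap peak memory
FROM rust:1.79 AS build
WORKDIR /src
COPY . .
RUN cargo build --release -j 2      # -j 2 caps concurrent compile jobs

# Stage 2: slim runtime; the compiler and its memory footprint never ship
FROM debian:bookworm-slim
COPY --from=build /src/target/release/mytool /usr/local/bin/mytool
ENTRYPOINT ["mytool"]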

Six steps: from zero to a reliable sandboxed tool run

  1. Confirm Docker and Compose are healthy and the Gateway can start; see the Gateway and model triage and how-to-use-doctor runbook.
  2. Run the project’s sandbox-setup to build or pull, and log the image name and ID so they can be matched against the config.
  3. On a test workspace, turn the sandbox on and run one “write a small file” and one “read a large tree” probe, capturing any permission error (see the sketch after this list).
  4. A/B on docker.user and mounts: change one variable at a time and keep a diff, so you can roll back instead of chasing three moving parts.
  5. Run a mid-weight build while watching host and container memory, and log whether 137 returns.
  6. Post the runbook on the team wiki with a minimal repro, a log clip, and rules for when the sandbox may be off and when it must stay on.
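
A sketch of the step-3 probes (container and mount paths are placeholders):

C=my-openclaw-sandbox

# Write probe: should succeed once docker.user matches the volume owner
docker exec "$C" sh -c 'echo ok > /workspace/.write-test && cat /workspace/.write-test'

# Read probe over a large tree: surfaces Permission denied or EROFS early
docker exec "$C" sh -c 'find /workspace -type f | head -1000 > /dev/null'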
config snippet (illustrative; match openclaw.json and docs in your build)
{
  "agents": {
    "defaults": {
      "sandbox": { "mode": "docker" },
      "docker": { "user": "1000:1000" }
    }
  }
}
// Your project may nest this under sandbox.docker; run openclaw doctor and check upstream docs

Three items for review or a capacity one-pager

  • 137 and other 128+signal exits: in the ticket, also capture docker events or the engine log plus host memory pressure, so a business-workload OOM is not conflated with a parallel docker-build OOM.
  • User namespaces and UID mapping on a bind mount: chown -R 1000:1000 on one project directory and “re-home the whole tree” are different risk levels; either change deserves a ticket and a rollback point.
  • Cold-start latency of the sandbox: a large image extraction, the first pip index hit, and repeated downloads with no durable layer make “passed once” very different from “passes a hundred times”; measure first start for capacity sign-off (a timing sketch follows).
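
A rough cold-start timing sketch (the image name is a placeholder; run it on a node with a cold cache to price the pull honestly):

IMG=openclaw-sandbox:local

time docker pull "$IMG"              # network plus extract cost (if the image lives in a registry)
time docker run --rm "$IMG" true     # bare container start overhead; repeat to compare warm vs cold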

Why a laptop with Docker, a big model, and Xcode often does not keep up with enterprise sandbox cadence

That topology usually loses on concurrent memory and I/O peaks and on a repeatable environment boundary. Yesterday it built; today a concurrent peak triggers 137 and write amplification at the same time. Teams that need a 7×24 Gateway and sandbox get more control from a dedicated Apple Silicon cloud, a clear disk and log policy, and predictable terms. MACCOME remote Mac cloud fits the long-running OpenClaw base: you spend less time fighting a random local setup and more on the toolchain and delivery.

Frequently asked questions

The sandbox cannot find a command I installed on the host. What should I do?

Install the tool in the sandbox image or add an image layer; or, where policy allows, mount additional directories and extend PATH. This is similar in spirit to MCP and Skills verification at the Gateway, except the boundary is the container.
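
A one-layer sketch (the base image tag is invented; pin your real sandbox image):

FROM openclaw-sandbox:local
RUN apt-get update \
 && apt-get install -y --no-install-recommends ripgrep \
 && rm -rf /var/lib/apt/lists/*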

If I see exit 137, is it always OOM?

Not always. Distinguish a cgroup memory cap, system-wide OOM, and a manual kill, and capture a slice of engine and system memory from the same window. For the day-to-day workflow, the official doctor and no-reply runbook shows how to read the logs first.
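
A sketch for telling a per-container cap from host pressure (the container name is a placeholder):

C=my-openclaw-sandbox

# The container's own memory cap; 0 means uncapped
docker inspect --format 'Memory={{.HostConfig.Memory}} MemorySwap={{.HostConfig.MemorySwap}}' "$C"

# Host-level pressure in the same window
free -h        # Linux
# vm_stat      # macOS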

Should I turn on the sandbox or upgrade to a larger-memory Mac cloud first?

First prove with a minimal repro whether you need isolation or raw capacity. If 137 goes away when you give Docker or the host more memory, it is resources; if you must always run untrusted code, keep sandboxing and policy. For pricing, use the public Mac cloud rental rate page.