You already run OpenClaw Gateway on Docker or Compose, but turning on the Agent sandbox brings failing containers, a read-only or unwritable workspace, tools missing inside the sandbox, and builds that die with OOM exit code 137. This article condenses 2026 mainline practice into an order you can actually run: turn it off, turn it on, verify, fix. It covers the key settings, docker.user and volume permissions, a build/OOM triage table, and links back to our existing Docker, volume, and doctor articles. After reading, you should be able to tell apart a bad image, a mount UID problem, and a memory cap by their log signatures.
Usually the root cause is the process boundary, not a broken install:
- Tools added with brew install on the host do not appear in the sandbox PATH unless they are in the image or you add extra mounts and layers.
- If the mounted volume is owned by, say, 501:20 on the host but the process in the container runs as 1000:1000, you often get read or write denials, logged as Permission denied or EROFS.

| Mode | What it is for | What you pay | Common mistake |
|---|---|---|---|
| Docker sandbox | Hard isolate untrusted command lines or scripts and reproduce a “clean” world | Image size, pull or build, UID or GID and volume alignment | Assuming host tools inherit automatically |
| SSH or remote shell | Run directly on a trusted remote Mac (dedicated build box) with a full tool chain | Larger surface area and a heavier account model | Treating the laptop as the sandbox, skipping audit |
| No sandbox / all local | Internal network, low risk, fast triage | High risk whenever model-generated code is executed | Leaving it off in CI and production permanently |
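The first common mistake in the table, assuming host tools inherit automatically, is cheap to check directly. A minimal sketch; the image name and the `rg` tool are assumptions, and the docker call is guarded for hosts without an engine:

```shell
#!/bin/sh
# Hypothetical image name -- substitute your agents.defaults.sandbox.docker.image.
IMAGE="${SANDBOX_IMAGE:-openclaw-sandbox:latest}"
TOOL="${TOOL:-rg}"   # a brew-installed host tool you expect inside the sandbox

# Check the host side first...
if command -v "$TOOL" >/dev/null 2>&1; then
  echo "$TOOL found on host PATH at $(command -v "$TOOL")"
else
  echo "$TOOL not even on host PATH"
fi

# ...then the same check inside a throwaway sandbox container.
if command -v docker >/dev/null 2>&1; then
  docker run --rm "$IMAGE" sh -c "command -v $TOOL || echo '$TOOL missing in sandbox PATH'"
fi
```

If the host finds the tool and the container does not, the fix lives in the image or the mounts, not in the Gateway config.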
agents.defaults.sandbox.docker and sandbox-setup.sh

Conceptually there are two layers: which sandbox you use (image and runtime) and where the image comes from. The repo usually ships a scripts/sandbox-setup.sh (check the exact name in your clone) to pre-build or pull an image the host docker CLI can see. In config, agents.defaults.sandbox.docker.image is the image that runs the session. When the two do not line up, the typical errors are manifest not found or pull access denied, even though the Gateway believes the sandbox is on.
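A quick way to catch the two layers disagreeing is to parse the configured image name and look for it in the local image list. A sketch under loud assumptions: the config filename `openclaw.json`, the flat `"image"` key, and the script path are placeholders for whatever your clone actually uses:

```shell
#!/bin/sh
# Extract the configured image name with a crude grep (use jq if available).
CONFIG_FILE="${CONFIG_FILE:-openclaw.json}"   # assumption: your real config file
CONFIG_IMAGE=$(grep -o '"image"[[:space:]]*:[[:space:]]*"[^"]*"' "$CONFIG_FILE" 2>/dev/null \
               | head -1 | cut -d'"' -f4)
echo "configured image: ${CONFIG_IMAGE:-<none found>}"

# Compare against what the host docker CLI can actually see.
if command -v docker >/dev/null 2>&1 && [ -n "$CONFIG_IMAGE" ]; then
  docker image ls --format '{{.Repository}}:{{.Tag}} {{.ID}}' | grep -F "${CONFIG_IMAGE%%:*}" \
    || echo "image not present locally: run scripts/sandbox-setup.sh, then compare again"
fi
```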
How does this differ from our other posts? Stuck on OPENCLAW_IMAGE or the Control UI? Start with the official one-click Docker and GHCR image runbook. Stuck on the data volume, permissions, Skills paths, or persistence? Use the Docker volume and permissions checklist. This post focuses only on the sandbox subsystem and OOM/137.
docker.user and the OPENCLAW_HOME mount: match symptoms, not random chmod 777

The recommended order:
1. Set docker.user in config to a UID:GID that can write the volume (per your docs; in some versions the key is nested under sandbox.docker).
2. Instead of a chown of the whole repo, prefer adjusting the one subdirectory that is mounted in, not the whole disk.

On Linux and remote Macs, named vs bind mounts and SELinux labels (on some distros) amplify the "looks right in Compose, wrong in the container" class of bugs. Grab the first I/O error line from the Gateway and sandbox start logs, not random retries in the chat model.
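The ownership mismatch is easiest to see with a direct write probe: run a throwaway container as the configured UID:GID against the real mount and try to touch a file. The path and 1000:1000 below are examples; substitute your docker.user and OPENCLAW_HOME values:

```shell
#!/bin/sh
# Hypothetical mount path; use your real OPENCLAW_HOME bind-mount directory.
MOUNT_DIR="${OPENCLAW_HOME:-$PWD/openclaw-home}"
mkdir -p "$MOUNT_DIR"
ls -ldn "$MOUNT_DIR"   # numeric UID:GID as the host sees it

if command -v docker >/dev/null 2>&1; then
  # Run as the configured user and attempt a write from inside the container.
  docker run --rm --user 1000:1000 -v "$MOUNT_DIR:/home/agent" alpine:3 \
    sh -c 'id; touch /home/agent/.probe && echo "writable" || echo "write denied: fix docker.user or directory ownership"'
fi
```

If `ls -ldn` shows one UID and `id` inside the container shows another, you have found the Permission denied source without touching chmod.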
A quick verification checklist:

- Missing from PATH? Use docker exec into the sandbox and run command -v; that beats a verbal guess.
- Does the tool need ~/.cache to work? If so, prefer baking the cache into the image or a shared layer, not ad hoc host paths punched through the boundary.
- Behind HTTP(S)_PROXY? Network errors are often reported as command not found.

exit 137: "killed" and "out of memory" in Docker

137 = 128 + 9 (SIGKILL) often points to the OOM killer, but it can also be a human docker kill or a cgroup memory cap. In triage, at minimum: note whether the failure is in build or run (docker build vs docker run); on the engine, check whether concurrent builds are filling Docker Desktop or the VM; on a remote Mac, look at real free memory and swap, not a single container's limit. Another class is exit 1 or 2 with lots of Cannot allocate memory in the output; track that in the same bucket as "memory or temp disk," not as endless model retuning.
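The 128 + 9 arithmetic and the disambiguation can be sketched in a few lines. The `classify_exit` helper and the `$CONTAINER` placeholder are illustrative, not part of OpenClaw; the inspect fields (`OOMKilled`, `ExitCode`) are standard Docker engine state:

```shell
#!/bin/sh
# 137 = 128 + 9: the process died from SIGKILL. That alone does not say who
# sent it -- the OOM killer, a cgroup memory cap, or a human docker kill.
classify_exit() {
  case "$1" in
    137) echo "SIGKILL (128+9): check OOMKilled, memory caps, or a manual kill" ;;
    139) echo "SIGSEGV (128+11): a crash, not memory pressure" ;;
    1|2) echo "normal failure: grep logs for 'Cannot allocate memory'" ;;
    *)   echo "exit $1: read the failing tool's own docs" ;;
  esac
}
classify_exit 137

if command -v docker >/dev/null 2>&1 && [ -n "${CONTAINER:-}" ]; then
  # CONTAINER is a placeholder for the failed sandbox container's name or id.
  docker inspect --format 'OOMKilled={{.State.OOMKilled}} ExitCode={{.State.ExitCode}}' \
    "$CONTAINER" 2>/dev/null
fi
```

`OOMKilled=true` points at the cgroup or system OOM killer; `OOMKilled=false` with exit 137 suggests a manual or orchestrator-issued kill.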
For "OOM during build," a common mitigation order: reduce build parallelism (for example a --parallel flag), use multi-stage builds to lower compile peaks, raise the Docker VM memory allocation, or move to a larger-memory dedicated remote Mac tier as a stable base. That is a different class of event from "it happened to work once on a laptop running several IDEs, a local model, and Docker at the same time." If the team already uses shared storage, also watch for metadata or small-file storms that feel like OOM: split volumes or change the cleanup policy.
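Before turning any of those knobs, it helps to know how much memory the engine or its VM actually has; `docker info` reports `MemTotal` in bytes. A small sketch (the mitigation commands in the comments are generic examples, not OpenClaw-specific):

```shell
#!/bin/sh
# Convert docker info's byte count to MiB for a human-readable ceiling.
to_mib() { echo $(( $1 / 1024 / 1024 )); }

if command -v docker >/dev/null 2>&1; then
  bytes=$(docker info --format '{{.MemTotal}}' 2>/dev/null)
  [ -n "$bytes" ] && echo "engine memory: $(to_mib "$bytes") MiB"
fi

# Example mitigation knobs (exact flags depend on your build tool and builder):
#   cmake --build . --parallel 2    # cap compile parallelism
#   docker build --memory 4g .      # per-build cap on the legacy builder
```

If the reported ceiling is well below your build's peak, raise the VM allocation first; flag tuning cannot rescue a 2 GiB VM from an 8 GiB link step.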
Operational guardrails:

- Use sandbox-setup to build or pull, and log the image name and id so they can be matched against the config.
- docker.user and mounts: change only one variable at a time and keep a diff, so you can roll back instead of chasing three moving parts at once.

A minimal config sketch:

```json
{
  "agents": {
    "defaults": {
      "sandbox": { "mode": "docker" },
      "docker": { "user": "1000:1000" }
    }
  }
}
```

Your project may nest this under sandbox.docker; run openclaw doctor and check the upstream docs.
Also worth institutionalizing:

- Capture docker events or the engine log together with host memory pressure, so a business OOM is not conflated with a parallel docker build OOM.
- chown -R 1000:1000 on one project directory and "re-home the whole tree" are different risk levels; either change should have a ticket and a rollback point.
- An unstable pip index and repeated downloads with no durable cache layer make "passed once" very different from "passes a hundred times"; size the first cold start for capacity sign-off.

That laptop topology usually loses on concurrent memory and I/O peaks and on a repeatable environment boundary. Yesterday it built; today a traffic peak hits 137 and write amplification at the same time. Teams that need a 7×24 Gateway and sandbox get more control from a dedicated Apple Silicon cloud, a clear disk and log policy, and predictable terms. MACCOME remote Mac cloud fits the long-running OpenClaw base: you spend less time fighting a random local setup and more on the toolchain and delivery.
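The evidence-capture guardrail above can be scripted: pull engine events from the failure window and count OOM mentions, so the sandbox's OOM is not conflated with a parallel build's. The 30-minute window and the log path are arbitrary choices, and the `count_oom` helper is a hypothetical convenience:

```shell
#!/bin/sh
# Count OOM-killer mentions in piped log text (case-insensitive).
count_oom() { grep -ci 'oom' || true; }

if command -v docker >/dev/null 2>&1; then
  # Events from the last 30 minutes; the command exits on its own because
  # --until is in the past, so this is safe in a cron or incident script.
  docker events --since 30m --until 1s --filter 'event=oom' --filter 'event=die' \
    | tee /tmp/engine-events.log | count_oom
fi
# Pair with host memory from the same window, e.g. on Linux:
#   journalctl -k --since "-30min" | count_oom
```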
Frequently asked questions
The sandbox cannot find a command I installed on the host. What should I do?
Install the tool in the sandbox image or add image layers; alternatively, mount more directories and refresh PATH by policy. This is similar in spirit to MCP and Skills verification at the Gateway, but here the boundary is the container.
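The "bake it into the image" option can look like the sketch below: a derived image with the tool installed, tagged so the config can point at it. The base image name, Debian-style package manager, and ripgrep example are all assumptions; match your image's actual distro:

```shell
#!/bin/sh
# Write a derived Dockerfile (base image and package manager are assumptions).
cat > Dockerfile.sandbox <<'EOF'
FROM openclaw-sandbox:latest
RUN apt-get update \
 && apt-get install -y --no-install-recommends ripgrep \
 && rm -rf /var/lib/apt/lists/*
EOF

if command -v docker >/dev/null 2>&1; then
  docker build -f Dockerfile.sandbox -t openclaw-sandbox:with-rg .
fi
# Then set agents.defaults.sandbox.docker.image to openclaw-sandbox:with-rg.
```

A derived image survives container restarts, which ad hoc mounts and PATH patches do not.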
If I see exit 137, is it always OOM?
Not always. Distinguish cgroup limits, system OOM, and a manual kill. Capture a slice of engine and system memory in the same window. For a daily workflow, the official doctor and no-reply runbook shows how to read the logs first.
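The three killers leave three different traces, which a short script can check in one pass. A sketch assuming a Linux host for the kernel log; `$CONTAINER` is a placeholder and `saw_system_oom` a hypothetical helper:

```shell
#!/bin/sh
# cgroup cap   -> the container's OOMKilled flag is true
# system OOM   -> kernel log contains "Out of memory: Killed process"
# manual kill  -> a kill event in docker events, with no kernel OOM line
saw_system_oom() { grep -qi 'out of memory' && echo yes || echo no; }

if command -v docker >/dev/null 2>&1 && [ -n "${CONTAINER:-}" ]; then
  docker inspect --format 'OOMKilled={{.State.OOMKilled}}' "$CONTAINER" 2>/dev/null
fi
# Kernel-side check (dmesg may need privileges; errors are suppressed).
command -v dmesg >/dev/null 2>&1 && dmesg 2>/dev/null | saw_system_oom
```

`OOMKilled=false` plus a kernel OOM line points at the system-wide killer; both false/absent with exit 137 points at a manual or orchestrator kill.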
Should I turn on the sandbox or upgrade to a larger-memory Mac cloud first?
First prove with a minimal repro whether you need isolation or raw capacity. If 137 goes away when you give Docker or the host more memory, it is resources; if you must always run untrusted code, keep sandboxing and policy. For pricing, use the public Mac cloud rental rate page.