How does this differ from the install and Docker production guides?

Install guides bootstrap OpenClaw and Gateway; the Docker production guide covers Compose, probes, and release cadence. This runbook assumes processes already start and focuses on MCP registration, ClawHub/local Skills placement, and separating model, tool, and channel failures in logs.

Should I check volumes before MCP config?

Yes. If Skills or state directories live on ephemeral container layers or read-only mounts, ClawHub downloads appear to succeed but disappear on the next restart. Fix bind mounts and permissions first, then return here for tool registration.

How do we minimize MCP privileges?

Split read-only query servers from write/external-call servers, use separate tokens, and keep an allowlisted tool manifest in the same change record—avoid one MCP profile that exposes an entire internal API surface.

2026 OpenClaw: MCP Tooling, ClawHub Skills Install/Verify & Gateway Triage Runbook

About 13 min read · MACCOME

Audience: Gateway runs, but MCP tools never appear, calls time out, or Skills vanish after restart.

Outcome: Keep bootstrap in the install guide and Docker production runbook; keep persistence in the volumes & Skills permissions article. This runbook covers declare → process visibility → Gateway registration → model/tool/channel triage.

Layout: six pitfalls, two matrices, config sketch, six steps, three KPIs, closing guidance.

Why does Gateway act like MCP is missing?

MCP is a JSON-RPC session between Gateway and a child process or remote endpoint. A config entry existing does not mean the child started, and a started child does not mean it returned tool schemas. Six frequent misreads follow.

Environment only in interactive shells: daemons, systemd, launchd, or Compose never see PATH or API keys from ~/.zshrc.
ClawHub Skills on read-only or anonymous volumes: downloads look fine until the container is recreated; see the volumes article.
Stale tool caches: configs changed but UI/CLI lists stay old; reload per docs instead of assuming failure.
Timeouts too tight: a cold first call over a high-RTT link needs a looser threshold than steady-state calls.
Overlapping AGENTS.md / bootstrap text: duplicate instructions across MCP and Skills inflate context; split boundaries per the Skills tuning checklist.
Channel issues mistaken for MCP: fix Slack/Telegram OAuth paths before blaming tools.
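The first pitfall above can be checked mechanically. The following is a minimal sketch, not an OpenClaw feature: `daemon_env_gaps` is a hypothetical helper, and it assumes you can obtain the environment the service manager actually passes to Gateway (for example from the unit file, the Compose service definition, or `/proc/<pid>/environ` on Linux).

```python
import os
import shutil


def daemon_env_gaps(command, required_env, daemon_env):
    """List what a daemon-launched MCP child would be missing.

    daemon_env is the environment the service manager passes to Gateway,
    not os.environ from an interactive shell.
    """
    gaps = []
    if os.path.isabs(command):
        # Absolute commands bypass PATH; only the executable bit matters.
        if not os.access(command, os.X_OK):
            gaps.append(f"not executable: {command}")
    elif shutil.which(command, path=daemon_env.get("PATH", "")) is None:
        # Bare names must resolve on the daemon's PATH, not the shell's.
        gaps.append(f"not on daemon PATH: {command}")
    for key in required_env:
        if not daemon_env.get(key):
            gaps.append(f"env not set for daemon: {key}")
    return gaps
```

An empty list means the command resolves and every required variable is present in the daemon's view; anything else is worth fixing before looking at Gateway logs.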

Run openclaw doctor using the order in the post-install doctor guide; this article adds the tool-registration evidence chain, not another install tutorial.

Keep a one-page “minimum repro card” per MCP server: one read query, one negative test that must be denied, and three expected log tokens. On-call staff can compare cards to spot config regressions without rereading giant prompts. Note allowed egress and data classification on the card so that token scopes are never widened during an incident without a record.
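One possible shape for such a card, as a sketch only: the field names, the `ReproCard` class, and the example log tokens are all hypothetical and should be replaced with whatever your server actually logs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReproCard:
    """One-page minimum repro card for a single MCP server (hypothetical schema)."""
    server: str
    read_query: str              # one read query that must succeed
    denied_query: str            # one negative test that must be denied
    expected_log_tokens: tuple   # three tokens expected in the logs
    allowed_egress: tuple = ()
    data_classification: str = "unclassified"


def missing_log_tokens(card, log_text):
    """Return the expected tokens that do not appear in the captured log text."""
    return [token for token in card.expected_log_tokens if token not in log_text]
```

Comparing `missing_log_tokens` output before and after a config change gives on-call a quick regression signal without reading prompts.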

Table 1: MCP symptom → evidence → action

Field names vary by OpenClaw version; this table fixes the order in which to collect evidence and act.

Symptom	Collect first	Likely root	Auditable action
Empty/partial tool list	Gateway logs, child exit codes	Missing binary, cwd, permission denied	Use absolute `command/args/cwd`; run the child as the same user as Gateway
First call slow, then OK	Cold-start timing, package fetch logs	`npx -y` or runtime JIT	Prewarm jobs; pin versions in images; relax first-call timeout
Steady timeouts	Child alive, CPU, FD usage	Deadlock, blocking IO	Sample/trace where allowed; A/B with a read-only tool
“Tool not registered”	Schema logs, protocol version	Implementation mismatch	Align MCP versions; pin minors; read upstream changelog

Table 2: ClawHub Skills vs in-repo Skills vs MCP

Publish a capability matrix so one workflow is not described three different ways.

Source	Best for	Versioning	Risk
ClawHub / marketplace	Rapid experiments	Pin commit or semver range; weekly diff	Upstream drift—needs regression tests
Repo `SKILL.md` / private packs	Compliance-heavy flows	Ship with mainline via PR	Maintenance load; align with MCP scope
MCP (system of record)	DBs, tickets, internal HTTP APIs	Independent release cadence	Over-broad tokens—maintain allowlists

config

# Structural sketch only—real keys, nesting, and hot reload follow current OpenClaw docs.
# Goal: Gateway launches an MCP server over stdio as a fixed user.
#
# mcpServers:
#   internal-readonly-lookup:
#     command: /usr/local/bin/node
#     args: ["/opt/mcp-servers/lookup/dist/index.js"]
#     env:
#       LOOKUP_API_TOKEN: "${LOOKUP_TOKEN_READONLY}"
#
# ClawHub Skill: extract/clone into the team skills directory, then refresh the
# skill index or run the documented reload command for your version.
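Before reloading, a static lint of the declared servers can catch relative commands and unresolved placeholders. This is a sketch only: it assumes the config has already been parsed into a dict shaped like the structural sketch above (`command`, `args`, `env`, optional `cwd`), which may not match the real OpenClaw schema, and `config_problems` is a hypothetical helper.

```python
import os
import re

# Matches ${VAR} placeholders such as ${LOOKUP_TOKEN_READONLY}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def config_problems(servers, environ):
    """Return human-readable problems for a dict of MCP server entries.

    servers maps a server name to its entry; environ is the environment
    that will be used to expand ${VAR} placeholders at launch time.
    """
    problems = []
    for name, spec in servers.items():
        command = spec.get("command", "")
        if not os.path.isabs(command):
            problems.append(f"{name}: command is not absolute: {command!r}")
        cwd = spec.get("cwd")
        if cwd is not None and not os.path.isabs(cwd):
            problems.append(f"{name}: cwd is not absolute: {cwd!r}")
        for key, value in spec.get("env", {}).items():
            for var in PLACEHOLDER.findall(str(value)):
                if not environ.get(var):
                    problems.append(f"{name}: env {key} references unset {var}")
    return problems
```

Pass the daemon's environment rather than your shell's, for the same reason as in the first pitfall.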

Warning: MCP connects assistants to production data. Least privilege and audit trails beat “just make it work.” Split read vs write servers, split tokens, and attach allowlist snippets to the change ticket.
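The allowlist snippet attached to the change ticket can double as a check. A minimal sketch, assuming you have the tool names a server actually lists and the names approved in the ticket; both helpers and the tool names in the usage note are hypothetical.

```python
def tools_outside_allowlist(listed_tools, allowlist):
    """Tools the server exposes that the change ticket never approved."""
    return sorted(set(listed_tools) - set(allowlist))


def stale_allowlist_entries(listed_tools, allowlist):
    """Approved names the server no longer exposes (candidates for cleanup)."""
    return sorted(set(allowlist) - set(listed_tools))
```

For example, a read-only server that lists `lookup_customer` and `delete_customer` against an allowlist of only `lookup_customer` would report `delete_customer` as outside the allowlist, which is a reason to block the change rather than widen the token.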

Six steps: from “chat works” to “tool calls are replayable”

Freeze topology: bare metal, remote Mac, or container—document user, PATH, cwd, and bind mounts.
Register MCP: fill command/args/env per docs; manually launch the child as the same identity as Gateway and confirm handshake logs.
Install ClawHub Skills: land them on persistent storage and record version and checksum; never leave them only on the ephemeral container layer.
Trim overlapping Skills text: move long retrieval to memory_search or doc tools to curb context growth.
Automate three checks: cold start, steady call, and a deliberate failure (e.g., disconnect) to validate timeouts and degradation.
Update the ops guide: reload order, rollback (remove server + restart Gateway), owners—same page as on-call.
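The three automated checks from step five can be sketched as a small harness. It is deliberately transport-agnostic: `call` and `failing_call` are hypothetical callables you supply (for instance wrappers around a real tool call and a call made with the server disconnected), and the budgets are your own thresholds, not OpenClaw defaults.

```python
import time


def run_three_checks(call, failing_call, cold_budget_s, steady_budget_s, steady_runs=5):
    """Run cold start, steady calls, and a deliberate failure; report each result.

    call: performs one tool call and returns normally on success.
    failing_call: performs a call that is expected to raise (e.g. server down).
    """
    def timed(fn):
        start = time.monotonic()
        fn()
        return time.monotonic() - start

    cold = timed(call)                                  # first call, includes warm-up
    steady = [timed(call) for _ in range(steady_runs)]  # subsequent calls
    try:
        failing_call()
        failure_surfaced = False   # the failure was swallowed somewhere
    except Exception:
        failure_surfaced = True    # the failure propagated as expected
    return {
        "cold_ok": cold <= cold_budget_s,
        "steady_ok": max(steady) <= steady_budget_s,
        "failure_surfaced": failure_surfaced,
    }
```

Keeping separate budgets for the cold and steady phases matches the earlier point that first calls need different thresholds than steady state.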

Three KPIs worth weekly review

Registration coverage: declared MCP servers vs tools actually listed, sliced by release.
First-call P95 vs steady P95: treat warm-up separately from steady state.
Duplicate capability count: actions described in MCP, ClawHub, and AGENTS.md—anything >1 needs a signed waiver.
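The three KPIs can be computed from data you likely already collect. A sketch under assumptions: inputs are plain Python structures you assemble yourself (declared server names, tools listed per server, latency samples, and a capability matrix keyed by source), and the function names are hypothetical.

```python
from collections import Counter


def registration_coverage(declared, listed_by_server):
    """Fraction of declared servers that list at least one tool."""
    if not declared:
        return 1.0
    registered = [name for name in declared if listed_by_server.get(name)]
    return len(registered) / len(declared)


def p95(samples):
    """Nearest-rank 95th percentile; samples must be non-empty."""
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def duplicated_capabilities(matrix):
    """Capabilities described by more than one source.

    matrix maps a source name (e.g. "mcp", "clawhub", "agents_md")
    to the capability names it describes.
    """
    counts = Counter(cap for caps in matrix.values() for cap in set(caps))
    return sorted(cap for cap, n in counts.items() if n > 1)
```

Compute `p95` separately over first-call and steady-call samples so that warm-up does not distort the steady-state figure.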

On remote Macs or cloud hosts, disk and log rotation affect MCP children that spill temp files to small system volumes—timeouts may look random though the model config is unchanged. Review host ops alongside tool config.
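A quick headroom check on the temp volume can rule this out early. A minimal sketch using only the standard library; `temp_volume_headroom` is a hypothetical helper and the threshold is yours to choose.

```python
import shutil
import tempfile


def temp_volume_headroom(min_free_bytes, path=None):
    """Return (ok, free_bytes) for the volume holding the temp directory.

    path defaults to the temp directory of the current process; run this
    as the Gateway user so it resolves the same location the child uses.
    """
    target = path or tempfile.gettempdir()
    usage = shutil.disk_usage(target)
    return usage.free >= min_free_bytes, usage.free
```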

For MCP servers fronted over HTTP/SSE, also check reverse-proxy idle timeouts, Upgrade handling, and TLS termination: Gateway may log a successful handshake while the edge proxy returns 499/504. Cross-check the Nginx/Caddy reverse-proxy guide before simply raising OpenClaw timeouts.

Directional community note (not a benchmark): three heavy MCP servers plus wide retrieval often produce minute-scale queue jitter. For meeting an SLA, a capability matrix and allowlists do more than adding plugins without limit.

Why laptops and ad-hoc hosts struggle with long-lived tool governance

Sleep, VPN flaps, and path drift make child processes and skill indexes unpredictable. Connecting real business data demands 24/7 uptime, persistent paths, and auditable permissions.

Self-managed boxes without multi-region choice or flexible terms encourage shared hosts where cold starts and log IO contend. Placing Gateway on dedicated Apple Silicon with predictable disks and egress—typical of a professional Mac cloud—usually makes MCP and Skills policies enforceable in contracts. MACCOME offers multi-region Mac Mini M4 / M4 Pro with flexible rental terms as a stable base for Gateway and build farms; confirm public rates and help-center SLAs before ordering.

Pilot the three checks from this runbook on a remote Mac before promoting one image fleet-wide—avoid “works locally, times out in prod” loops. If Gateway is internet-facing, ship TLS, rate limits, and IP allowlists in the same change, not as a later patch.

FAQ

How does this pair with channel onboarding?

Channel guides cover Slack/Discord/Telegram OAuth; this article covers tool discovery. If messages reach Gateway but tools fail, gather evidence from Table 1 before revisiting channel “connected but silent” cases.

What should rollback include?

Remove MCP entries, document restart order, run a read-only verification query, and confirm tool counts return to baseline on dashboards. Align billing using rental rates.

Container vs bare-metal paths differ—now what?

Maintain an absolute-path matrix per runtime; never let the model guess paths in chat. Cross-check the help center with the Docker volumes article.
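One way to keep such a matrix checkable, as a sketch: the runtime names, keys, and the skills paths below are hypothetical placeholders, and `path_matrix_problems` is not part of OpenClaw.

```python
import posixpath

# Hypothetical matrix: one entry per runtime, same keys in every entry.
PATH_MATRIX = {
    "bare-metal": {"skills_dir": "/opt/openclaw/skills", "mcp_root": "/opt/mcp-servers"},
    "container": {"skills_dir": "/data/skills", "mcp_root": "/opt/mcp-servers"},
}


def path_matrix_problems(matrix):
    """Report keys missing from a runtime and any path that is not absolute."""
    problems = []
    all_keys = set().union(*(entry.keys() for entry in matrix.values()))
    for runtime, entry in matrix.items():
        for key in sorted(all_keys - set(entry)):
            problems.append(f"{runtime}: missing {key}")
        for key, path in entry.items():
            if not posixpath.isabs(path):
                problems.append(f"{runtime}: {key} is not absolute: {path}")
    return problems
```

Checking the matrix in CI keeps the per-runtime paths in the change record instead of in chat history.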