2026 OpenClaw multi-Agent scheduling: `sessions_spawn` runtime=acp vs subagent. Decision matrix, streamTo misconfig triage, and ACP 1008 handshake runbook

~18 min · MACCOME

You call sessions_spawn from the main Agent and get ACP_TURN_FAILED, invalid handshake 1008, or queue owner unavailable, while direct chat on the same Gateway still works. This article covers: ① when to use runtime=acp vs runtime=subagent; ② why streamTo/resumeSessionId are legal only on acp and how to triage their misuse on subagent; ③ subagent fallback when the ACP handshake fails, plus OPENCLAW_ACPX_RUNTIME_STARTUP_PROBE on Windows; ④ streamTo autofill under Completions vs Responses. It complements the upgrade escort ACP triage and Docker subagent 1008 pairing articles; the focus here is only runtime selection for multi-Agent spawn.

Six typical misreads (before you change the runtime)

  1. Default runtime=acp without an ACP bridge: the main channel is fine, but spawn returns ACP_TURN_FAILED. The cause is that acpx is not registered or the queue owner is offline, not model quota.
  2. streamTo/resumeSessionId with runtime=subagent: these fields exist only for ACP session resume. subagent is an embedded Gateway RPC, so you get invalid params or a silent drop, and the symptom is "sub-agent has no context".
  3. "1008 always means Docker pairing": for Compose, follow the trustedProxies runbook; a local acp 1008 is more often a version-skew handshake or a bridge race.
  4. Spawn test without a reload after upgrade: the CLI is new, the Gateway is still the old process, and the acp/subagent stacks are split. Run the acceptance ladder first.
  5. Ignoring the acpx startup probe on Windows: provider extensions slow the cold start, the handshake runs before the bridge is ready, and the logs show invalid handshake. See OPENCLAW_ACPX_RUNTIME_STARTUP_PROBE.
  6. Mixing Completions and Responses without checking streamTo autofill: the 2026 router infers streamTo on Responses but not on Completions. It looks like "spawn is broken", but it is really an API shape + runtime mismatch.

In 2026 upstream, sessions_spawn has two mutually exclusive runtimes. runtime=acp goes through the ACP bridge to acpx and uses streamTo for backflow into the main UI. runtime=subagent is a lightweight sub-agent inside the Gateway: lower latency, no acpx, and no ACP resume fields. Wrong runtime plus wrong fields is the most expensive on-call "fake complexity": align the path first, then look at model/allowlist (tools.profile triage).

Treat every spawn as a scheduling contract recorded in the ticket: runtime, whether UI backflow is needed, Completions vs Responses on the main session, and the fallback (subagent or rollback). Without that contract you cannot explain "spawn worked yesterday": the diff is usually in template fields or API shape, not in the Gateway PID.
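One way to make such a contract diffable is to store it as a small record. The sketch below is illustrative only: `SpawnContract` and its field names are hypothetical, not an OpenClaw schema. It shows how comparing yesterday's and today's contract surfaces an API-shape change directly.

```python
from dataclasses import dataclass
from typing import Literal

Runtime = Literal["acp", "subagent"]
ApiShape = Literal["completions", "responses"]


@dataclass(frozen=True)
class SpawnContract:
    """Hypothetical ticket record; field names are illustrative only."""
    runtime: Runtime
    needs_ui_backflow: bool
    main_session_api: ApiShape
    fallback: Literal["subagent", "rollback", "none"]

    def diff(self, other: "SpawnContract") -> dict:
        """Return the fields that differ as {name: (old, new)}."""
        return {
            name: (getattr(self, name), getattr(other, name))
            for name in self.__dataclass_fields__
            if getattr(self, name) != getattr(other, name)
        }


yesterday = SpawnContract("acp", True, "responses", "subagent")
today = SpawnContract("acp", True, "completions", "subagent")
print(yesterday.diff(today))
```

Here the only difference is `main_session_api`, which points at the Completions vs Responses shape rather than at the Gateway process.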

| Already on the site | This article | Not duplicated here |
| --- | --- | --- |
| Upgrade escort ACP triage | acp vs subagent at spawn + fallback | backup create, full probe ladder |
| Docker subagent 1008 | Boundary between subagent and pairing 1008 | Compose trustedProxies step-by-step |
| tools.profile triage | "Spawn OK, sub-agent has no tools" | allowlist layering runbook |
| SSH dedicated Gateway | acpx + subagent on a remote Mac | port-forward, launchd details |

runtime=acp vs runtime=subagent: which stack when

Rule of thumb: if you need UI backflow, ACP session resume, or alignment with Cursor/IDE acpx, use acp. If the task is a closed loop inside the Gateway, should have few dependencies, or acp is down, use subagent. Four prod shapes cover about 80% of tickets:

| Task shape | Runtime | Key params | Avoid |
| --- | --- | --- | --- |
| Sub-agent output streamed into main chat | acp | streamTo → main session; optional resumeSessionId | subagent + streamTo |
| Background batch, no UI | subagent | task + timeout; no streamTo | Forcing acp → bridge blast radius |
| queue owner unavailable | subagent | Ticket the fallback; fix acpx registration in parallel | Retrying acp → longer MTTR |
| Multi-container Docker, RPC OK, spawn 1008 | Fix pairing/net, then subagent | trustedProxies; Docker runbook | Switching runtime before fixing bind |
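The first two rows of the matrix can be encoded in a template helper so that ACP-only fields never leak into a subagent call. This is a minimal sketch: `build_spawn_call` is a hypothetical helper, the field names are taken from the JSON examples in this article, and the timeout parameter is left out because its field name is not given here.

```python
from typing import Optional


def build_spawn_call(task: str, ui_backflow: bool = False,
                     resume_session_id: Optional[str] = None) -> dict:
    """Build a sessions_spawn payload whose fields match its runtime.

    Hypothetical helper: UI backflow or session resume selects acp,
    everything else goes to subagent with no ACP-only fields.
    """
    call = {"tool": "sessions_spawn", "task": task}
    if ui_backflow or resume_session_id is not None:
        call["runtime"] = "acp"
        if ui_backflow:
            call["streamTo"] = "main"
        if resume_session_id is not None:
            call["resumeSessionId"] = resume_session_id
    else:
        call["runtime"] = "subagent"
    return call


print(build_spawn_call("Batch rename files in logs/"))
print(build_spawn_call("Research competitor pricing", ui_backflow=True))
```

Because the runtime is derived from the requested features, the "subagent + streamTo" combination in the Avoid column cannot be produced by this helper.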

streamTo / resumeSessionId: acp-only misconfig triage

"Params look fine, sub-agent is empty" is almost always field/runtime cross-contamination. On the subagent path the Gateway strips or rejects ACP fields; a template that combines Completions, streamTo:"main", and runtime=subagent yields a generic RPC error, not "invalid field", so inspect the call JSON.

On runtime=acp, resumeSessionId resumes an acpx session and streamTo directs the token stream into the Control UI/channel. The 2026 Responses router autofills streamTo when a Responses session is bound; Completions does not. Migrating Responses→Completions without changing the template gives "backflow used to work, now the sub-agent runs silently in the background". That is not an upgrade bug; it is the API + runtime combination.
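The behaviour described above can be written down as a small truth function, which is handy when reviewing templates during a migration. This models only what the article states, not the router's actual code; `effective_stream_to` is a hypothetical name, and the autofilled target is assumed to be "main".

```python
from typing import Optional


def effective_stream_to(runtime: str, main_api: str,
                        explicit_stream_to: Optional[str]) -> Optional[str]:
    """Return where sub-agent output would flow, or None for no backflow.

    A model of the described behaviour, not the real router.
    """
    if runtime != "acp":
        return None  # streamTo is acp-only
    if explicit_stream_to is not None:
        return explicit_stream_to
    # Assumption: the autofilled target is the bound main session ("main").
    return "main" if main_api == "responses" else None


# Same template, no explicit streamTo, before and after a migration:
print(effective_stream_to("acp", "responses", None))
print(effective_stream_to("acp", "completions", None))
```

Under these assumptions, the fix after moving to Completions is to set streamTo explicitly in the template rather than to look for an upgrade regression.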

Field-strip experiment: copy the failing JSON, drop streamTo/resumeSessionId, and set runtime=subagent. If it succeeds immediately, the problem is on the ACP path or a misconfig, not the task or model. Otherwise look at the pairing/Token/tools surface. This is step 4 of the runbook; do not spend hours scrolling acp logs.
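The field-strip experiment is mechanical enough to script. The sketch below only builds the probe payload; how you submit it (Control UI or CLI) is outside its scope, and `to_subagent_probe` is a hypothetical helper name.

```python
import json

ACP_ONLY_FIELDS = ("streamTo", "resumeSessionId")


def to_subagent_probe(failing_call: dict) -> dict:
    """Copy a failing call, drop ACP-only fields, and force runtime=subagent."""
    probe = {k: v for k, v in failing_call.items() if k not in ACP_ONLY_FIELDS}
    probe["runtime"] = "subagent"
    return probe


failing = json.loads("""
{
  "tool": "sessions_spawn",
  "runtime": "acp",
  "task": "Research competitor pricing, output table",
  "streamTo": "main",
  "resumeSessionId": "acp-sess-abc123"
}
""")
probe = to_subagent_probe(failing)
print(json.dumps(probe, indent=2))
```

The original dict is left untouched, so the failing call can be attached to the ticket alongside the probe.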

json
// ✅ acp: UI backflow + optional resume
{
  "tool": "sessions_spawn",
  "runtime": "acp",
  "task": "Research competitor pricing, output table",
  "streamTo": "main",
  "resumeSessionId": "acp-sess-abc123"
}

// ✅ subagent: Gateway loop, no streamTo
{
  "tool": "sessions_spawn",
  "runtime": "subagent",
  "task": "Batch rename files in logs/"
}

// ❌ classic misconfig
{
  "runtime": "subagent",
  "streamTo": "main"
}

ACP handshake triage: ACP_TURN_FAILED, 1008, queue owner

When runtime=acp fails while the main chat works, route by log fingerprint. Do not mix this up with "Gateway dead" (see the channel/model runbooks).

| Log / symptom | Suspect first | Action |
| --- | --- | --- |
| ACP_TURN_FAILED | acpx not ready; turn timeout | Windows OPENCLAW_ACPX_RUNTIME_STARTUP_PROBE=1; bridge registration |
| invalid handshake / WS 1008 | CLI/Gateway/acpx version split | Pin versions; single reload; pin matrix |
| queue owner unavailable | ACP bridge registration lost (2026.3.x window) | host acpx; temp runtime=subagent |
| subagent also returns 1008 | pairing/token/net (not ACP-only) | Docker 1008 runbook |
| spawn OK, sub-agent has no tools | tools.profile / agent override | tools.profile triage |
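The first four rows of the table translate into a first-match lookup over log lines. This is a sketch under assumptions: the exact log wording of a given OpenClaw version may differ, so the patterns below match only the fingerprint strings quoted in this article, and `triage` is a hypothetical helper.

```python
import re

# (pattern, suspect, action) transcribed from the table; first match wins.
FINGERPRINTS = [
    (re.compile(r"queue owner unavailable", re.I),
     "ACP bridge registration lost", "host acpx; temporary runtime=subagent"),
    (re.compile(r"invalid handshake|\b1008\b", re.I),
     "CLI/Gateway/acpx version split", "pin versions; single reload"),
    (re.compile(r"ACP_TURN_FAILED"),
     "acpx not ready or turn timeout", "startup probe; check bridge registration"),
]


def triage(log_line: str, runtime: str = "acp"):
    """Return (suspect, action) for a log line, or None if nothing matches."""
    # A 1008 on the subagent path is not ACP-specific (fourth table row).
    if runtime == "subagent" and re.search(r"\b1008\b", log_line):
        return ("pairing/token/network", "Docker 1008 runbook")
    for pattern, suspect, action in FINGERPRINTS:
        if pattern.search(log_line):
            return (suspect, action)
    return None


print(triage("spawn failed: queue owner unavailable"))
print(triage("ws closed 1008", runtime="subagent"))
```

Passing the runtime explicitly keeps the same 1008 code from being routed to the ACP lane when it came from a subagent call.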

Fallback policy: if acp fails two turns in a row (retested after a reload) and the minimal subagent probe passes, record "temp runtime=subagent" in the ticket and restrict UI-backflow tasks until acpx is fixed. This is orthogonal to digest rollback: fallback protects the SLA, rollback addresses a known regression.
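The policy reduces to a two-condition check. A minimal sketch, assuming you record each post-reload acp attempt as a boolean; `should_fallback` is a hypothetical helper.

```python
def should_fallback(acp_results_after_reload: list, subagent_probe_ok: bool) -> bool:
    """True when the last two post-reload acp attempts failed
    and the minimal subagent probe passed."""
    last_two = acp_results_after_reload[-2:]
    two_consecutive_failures = len(last_two) == 2 and not any(last_two)
    return two_consecutive_failures and subagent_probe_ok


print(should_fallback([True, False, False], subagent_probe_ok=True))
print(should_fallback([False], subagent_probe_ok=True))
```

A single failure does not trigger the fallback, and a failing subagent probe blocks it entirely, because that case points away from an ACP-only problem.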

Windows + OPENCLAW_ACPX_RUNTIME_STARTUP_PROBE

On Windows, provider extensions plus Defender stretch the acpx cold start; a spawn issued before the bridge is ready yields ACP_TURN_FAILED or invalid handshake. OPENCLAW_ACPX_RUNTIME_STARTUP_PROBE=1 (check the docs for your version for exact semantics) adds an extra health wait before the first spawn. If it still fails, a minimal subagent task on the same box validates the scheduling stack and isolates the acpx install/ACL. Do not confuse a slow Windows boot with a mandatory OpenClaw downgrade.

The upgrade escort article covers post-migration probe/ACP regression; this article covers a stable version where the spawn layer is mis-picked or the handshake is flaky. Pass the ladder first, otherwise split-brain masquerades as a streamTo misconfig.

powershell
# Windows: extend acpx startup probe
$env:OPENCLAW_ACPX_RUNTIME_STARTUP_PROBE = "1"
openclaw gateway status
openclaw gateway probe

# Minimal subagent probe (no streamTo)
# Control UI or CLI: equivalent sessions_spawn, runtime=subagent

acp broken → subagent first: fix vs pin vs fallback matrix

| Blast radius | Keep fixing acp | Temp subagent | Pin/rollback |
| --- | --- | --- | --- |
| Only spawn/acp red; main + subagent probe OK | acpx registration, startup probe | Default: no-UI tasks → subagent | If known regression window |
| acp + subagent fail; probe red | Only after backup restore | Not first choice | First backup or digest rollback |
| Must have UI backflow; acp dead | Business hurt before bridge fix | Can't replace streamTo | Rollback or remote Mac dedicated Gateway |
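Read as a decision procedure, the matrix checks the widest blast radius first. The sketch below is one possible encoding; `next_move` and its return strings are illustrative, and treating "subagent fails or probe is red" as the second row is a simplifying assumption.

```python
def next_move(acp_ok: bool, subagent_ok: bool, probe_ok: bool,
              needs_ui_backflow: bool) -> str:
    """Map observed health onto one row of the matrix (illustrative)."""
    if acp_ok:
        return "no action"
    if not subagent_ok or not probe_ok:
        # Both paths are suspect: protect state before anything else.
        return "backup first, then digest rollback"
    if needs_ui_backflow:
        # subagent cannot replace streamTo.
        return "rollback or move to a dedicated remote Gateway"
    return "temp subagent for no-UI tasks; fix acpx registration in parallel"


print(next_move(acp_ok=False, subagent_ok=True, probe_ok=True, needs_ui_backflow=False))
```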

7-step runbook: pick, probe, spawn, triage, fallback, metrics

  1. Freeze call JSON: runtime, streamTo/resumeSessionId, Completions vs Responses.
  2. Gateway baseline: status + probe; post-upgrade single reload (escort).
  3. Self-check pick: UI backflow → acp+streamTo; background → subagent, strip ACP fields.
  4. Minimal spawn probe: subagent read-only one-liner first.
  5. acp lane: acpx + bridge; Windows probe env; capture 1008/queue owner.
  6. Fallback/rollback: acp 2 fails → ticket subagent; both paths dead → digest rollback.
  7. Close metrics: spawn success by runtime, fallback %, MTTR; update internal runtime memo.

Three ticket metrics

  • spawn success (by runtime bucket): acp < 95% while the subagent bucket is healthy → fix the bridge, do not swap the model.
  • acp→subagent fallback %: UI-backflow tasks should not live on fallback; >30% in a week → pin the version or move topology.
  • streamTo misconfig events: streamTo/resumeSessionId with runtime=subagent; the prod target is 0 (enforce via lint/templates).
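All three metrics can be computed from a flat list of spawn records. This is a sketch with a hypothetical event shape (`runtime`, `ok`, `fallback`, plus any call fields that were sent); adapt the keys to whatever your ticketing or log export provides.

```python
from collections import Counter


def summarize(events: list) -> dict:
    """Compute per-runtime success, fallback rate, and misconfig count."""
    total, ok = Counter(), Counter()
    fallbacks = misconfig = 0
    for e in events:
        total[e["runtime"]] += 1
        ok[e["runtime"]] += bool(e["ok"])
        fallbacks += bool(e.get("fallback"))
        # ACP-only fields on a subagent call count as a misconfig event.
        if e["runtime"] == "subagent" and ("streamTo" in e or "resumeSessionId" in e):
            misconfig += 1
    n = len(events)
    return {
        "success_by_runtime": {r: ok[r] / total[r] for r in total},
        "fallback_rate": fallbacks / n if n else 0.0,
        "streamto_misconfig_events": misconfig,
    }


events = [
    {"runtime": "acp", "ok": True},
    {"runtime": "acp", "ok": False},
    {"runtime": "subagent", "ok": True, "fallback": True},
    {"runtime": "subagent", "ok": True, "streamTo": "main"},
]
print(summarize(events))
```

Compare `success_by_runtime["acp"]` against the 95% line and `fallback_rate` against the 30% weekly line from the list above; any non-zero misconfig count is a template or lint gap.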

Running Gateway + acpx on a notebook that sleeps and loads multi-provider plugins is how you arrive at "OpenClaw is unstable". Better: an authoritative Gateway + acpx on an always-on remote Mac, with the local machine only SSH-forwarding the Control UI, so that spawn and probe run on the same node with an aligned log timeline.

Summary: runtime is a scheduling contract, not magic tuning

Tuning the prompt without checking that runtime and streamTo match inflates an ACP handshake problem into "multi-Agent is down". With the decision matrix, misconfig triage, and subagent fallback in the runbook, on-call shrinks from a night of blind tries to probed, fallback-covered, metered incidents that take minutes.

Hard-pushing the acp bridge on a Windows laptop or a split Docker topology has three hidden costs: race-induced fake 1008s, Completions/Responses streamTo drift, and overlapping failure domains. For a 24/7 prod Gateway with ticketed spawn and acp/subagent switching, a MACCOME Mac mini M4/M4 Pro with a six-region lease is usually cheaper in total cost than chasing queue owner on a clamshell laptop. See the regional guide and SSH runbook for topology.

FAQ

Can streamTo and resumeSessionId be used with runtime=subagent?

No, they are valid only with runtime=acp. subagent takes task and embedded fields. Template review: rental pricing.

ACP 1008 or queue owner unavailable: is a rollback mandatory?

Not necessarily. Follow the symptom table; try temp runtime=subagent and the Windows probe. If both paths fail, do a digest rollback; for access, see the help center.