2026 OpenClaw Repository Context & Skills
AGENTS.md, Bootstrap Injection & memory_search Tuning Checklist

About 24 min read · MACCOME

Teams already running a Gateway usually hit the wall not because "the model is asleep," but because AGENTS.md, Skills, and bootstrap injection bloat the context, and memory_search plus daily memory/*.md files then mix unpredictably, making it hard to see which layer is burning budget. This article delivers six engineering friction points, a promptMode-versus-bootstrap matrix, three log-aligned context metrics, a minimal AGENTS.md sample, and a six-step tuning runbook. Read it alongside the multi-platform install guide, post-install doctor triage, provider routing and failover, Docker networking triage, and channel setup; those cover "won't start" and "won't connect," while this one covers "runs, but context is uncontrolled."

Six ways a “repo agent” goes off the rails within two weeks

OpenClaw in 2026 typically injects a bundle of repository files at startup (for example AGENTS.md, Skills, identity and tool boundaries) and splits memory into bootstrap versus on-demand search. Without an agreement on what belongs in bootstrap versus memory_search, teams quickly see every turn carrying a huge static prefix, longer tool chains, and rising cost and latency. Track the six items below beside weekly token estimates, tool failure rates, and average turns.

  1. AGENTS.md keeps growing: product history, roadmaps, and ops manuals in one file inject at bootstrap and tax every turn; split “always needed” from “retrieve on demand.”
  2. Skills folder lacks indexing strategy: overlapping Skills make models hesitate between tools; maintain Skill granularity and mutual exclusion notes.
  3. promptMode stays on full forever: sub-agents and batch jobs still carry the maximal system prompt; move eligible workloads to minimal or none per docs.
  4. Treating memory files like a database: pasting huge logs into memory/*.md then scanning with memory_search raises IO and embedding cost; use summaries in-repo, raw text externally.
  5. Confusing context with provider routing: when context explodes, teams swap models first; tighten bootstrap and tool loops, then use the provider article for failover.
  6. Shared repos without ownership fields: who edits AGENTS, who approves Skill PRs, who prunes memory; without answers, laptops and remote Macs drift.
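Friction point 1 is the easiest to automate away. The sketch below walks hypothetical injection sources and tags files that look too heavy to live in bootstrap; the candidate paths and per-file budget are assumptions to adjust for your repo, not OpenClaw defaults.

```python
from pathlib import Path

# Hypothetical layout and budget: adjust to your repo; the per-file cap is a
# team choice, not an OpenClaw default.
BOOTSTRAP_CANDIDATES = ["AGENTS.md", ".openclaw/skills"]
PER_FILE_BUDGET = 4_000  # characters


def inventory(root="."):
    """List each bootstrap candidate with its character weight, tagging files
    that look better served by memory_search than by startup injection."""
    report = []
    for entry in BOOTSTRAP_CANDIDATES:
        path = Path(root) / entry
        if path.is_file():
            files = [path]
        elif path.is_dir():
            files = sorted(path.rglob("*"))
        else:
            continue
        for f in files:
            if not f.is_file():
                continue
            size = len(f.read_text(encoding="utf-8", errors="replace"))
            tag = "move-to-memory?" if size > PER_FILE_BUDGET else "ok"
            report.append((str(f), size, tag))
    return report
```

Running this weekly alongside token estimates gives the "split always-needed from retrieve-on-demand" decision a concrete input instead of a hunch.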

Align these six with the Docker article's split between a healthy Gateway and model-layer failures to see faster whether you have a context-policy issue or a network/provider issue.

Matrix: promptMode, bootstrap, and on-demand memory

Use the table in reviews: the goal is a predictable token ceiling per task shape, not prettier prose. Exact keys follow your OpenClaw version.

| Dimension | Bootstrap (startup) | memory_search / memory_get (on demand) |
| --- | --- | --- |
| Typical content | AGENTS.md, core Skill summaries, identity and tool boundaries | Dated notes, decision logs, long appendices |
| Failure look | Every turn is slow, with an expensive, noisy prefix | Missed hits, overly wide hits, repeated reads |
| Tuning levers | Total character ceiling, staged injection, promptMode | Index granularity, naming, summarization policy |
| promptMode | full injects the most; minimal suits sub-agents that can omit sections | Does not replace bootstrap; avoid hiding critical boundaries only in memory |
| Collaboration | Needs code review and versioning | Needs audit fields (author, expiry) |
| Remote Mac | Large clones in the cloud: watch sync lag and permissions | High-churn writes can race laptops |
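The promptMode row of the matrix can be encoded as a tiny routing rule. A sketch, assuming promptMode takes the values full, minimal, and none as described above; the task-shape names are illustrative, not an OpenClaw API.

```python
def choose_prompt_mode(task_shape: str) -> str:
    """Map an illustrative task shape to a promptMode per the matrix:
    interactive main sessions keep the full system prompt, sub-agents
    drop sections they never use, eligible batch jobs drop the static
    prefix entirely."""
    table = {
        "interactive": "full",   # primary session: full boundaries
        "sub-agent": "minimal",  # can omit optional sections
        "batch": "none",         # eligible batch jobs per the docs
    }
    return table.get(task_shape, "full")  # default to the safe maximum
```

The point is not the three-line dict but the review artifact: when the mapping lives in code, a new workload has to argue for "full" rather than inheriting it.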

Three context metrics worth a Grafana panel or on-call note

Collect from logs and coarse token estimates; numbers are placeholders—replace with your baselines.

  1. Bootstrap effective load (BEL): equivalent character weight of static prefixes before the first user message each session; if BEL stays above your threshold, split files and tighten ceilings such as agents.defaults.bootstrapTotalMaxChars (name varies by release—follow official docs).
  2. Tool loop index (TLI): consecutive no-progress tool calls inside one task; rising TLI usually signals unclear Skills or noisy context, not HTTP 429.
  3. Memory retrieval hit rate (MRH): fraction of memory_search results the model actually uses (spot-check or secondary confirmation). Low MRH means index or summary debt; high MRH with high BEL often means duplicated content.
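The three metrics can be computed from whatever structured logs you already emit. A minimal sketch, assuming a hypothetical event shape; OpenClaw's actual log format will differ, so treat the field names as placeholders.

```python
def context_metrics(events):
    """Compute BEL, TLI, and MRH from a list of hypothetical log events:
      {"type": "bootstrap", "chars": int}    static prefix injected at start
      {"type": "tool_call", "progress": bool}
      {"type": "memory_hit", "used": bool}   result actually used by the model
    """
    # BEL: equivalent character weight of static prefixes this session
    bel = sum(e["chars"] for e in events if e["type"] == "bootstrap")

    # TLI: longest run of consecutive no-progress tool calls
    tli = run = 0
    for e in events:
        if e["type"] != "tool_call":
            continue
        run = 0 if e["progress"] else run + 1
        tli = max(tli, run)

    # MRH: fraction of memory_search results the model actually used
    hits = [e for e in events if e["type"] == "memory_hit"]
    mrh = sum(e["used"] for e in hits) / len(hits) if hits else 0.0
    return {"BEL": bel, "TLI": tli, "MRH": mrh}
```

Even a coarse version of this, run over one day of logs, is enough to pick the baselines the list above says to substitute for the placeholder numbers.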

Multi-model routing and dynamic context estimation will keep evolving through 2025–2026, but repository-side noise still caps quality—keep BEL, TLI, and MRH on the board.

When explaining to non-developers, BEL is like “pages from the employee handbook you reread before every conversation,” and TLI is “how many times the same question gets escalated without progress”—growing model size without fixing those only prints the same thick booklet on more expensive paper.

The AGENTS.md sample below keeps only boundaries that are always true; everything else lives in Skills or memory.

```markdown
# AGENTS.md (example: only eternal boundaries; details live in Skills or memory)
## Repository role
- This repo is the xxx service; default branch main; release cadence in docs/release.md.

## Tooling boundaries
- Do not change CI secrets or production configs without review; DB migrations need two-person sign-off.

## OpenClaw conventions
- Skills directory: .openclaw/skills/ (example—use your layout)
- Long decisions: memory/YYYY-MM-decisions.md with a one-line summary at the top.
```

> **Warning:** Do not place customer names, raw secrets, or full unredacted logs inside bootstrap text; if retention is mandatory, use on-demand retrieval with scoped visibility.
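That warning can be enforced rather than remembered. A sketch of a pre-injection gate with two illustrative patterns; a real scanner (gitleaks, trufflehog) covers far more and should back any production check.

```python
import re

# Illustrative patterns only; do not treat this list as exhaustive.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(r"(?i)\b(?:api[_-]?key|secret|token)\s*[:=]\s*\S{16,}"),
]


def scan_bootstrap_text(text: str) -> list[str]:
    """Return matched fragments so a CI gate can fail the build before
    the text ever reaches bootstrap injection."""
    findings = []
    for pat in SECRET_PATTERNS:
        findings += [m.group(0) for m in pat.finditer(text)]
    return findings
```

Wiring this into the same review that gates AGENTS.md edits keeps the "no raw secrets in bootstrap" rule from depending on individual diligence.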

Six-step runbook from “it runs” to “we can maintain it”

  1. Freeze versions and docs: record OpenClaw and Gateway versions; open upstream docs for system prompt, bootstrap, and memory—avoid folklore.
  2. Inventory injection sources: list bootstrap files and Skills, tagging must-have, optional, or should-move-to-memory.
  3. Set a BEL ceiling: agree on a hard cap and rollback; when over budget delete repetition before adding models.
  4. Calibrate promptMode: move sub-agents and batch jobs to minimal or equivalent and log before/after latency, success, and cost.
  5. Normalize memory: convert long text to summary-plus-pointer; archive raw logs to cold storage outside the hot search path.
  6. Layered review with provider/Docker: if TLI stays high, continue into the provider and Docker articles—do not expand the model pool before context is clean.
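Step 3's hard cap is the one most worth automating. A sketch of a BEL gate, assuming a team-chosen ceiling; the article's agents.defaults.bootstrapTotalMaxChars is only an example key whose exact name varies by release, so mirror whatever your version exposes.

```python
from pathlib import Path

# Team-chosen hard cap, not an OpenClaw default.
BEL_HARD_CAP = 20_000  # characters


def check_bel(paths):
    """Sum the character weight of the given bootstrap files and report
    whether the total fits under the hard cap; a CI job can turn the
    boolean into its exit code to block over-budget merges."""
    total = sum(len(Path(p).read_text(encoding="utf-8", errors="replace"))
                for p in paths)
    return total, total <= BEL_HARD_CAP
```

Pair the gate with the agreed rollback: when it trips, the merge request deletes repetition instead of raising the cap.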

Gateway, models, and repo context: assign blame in order

Intermittent timeouts tempt teams to swap models or add GPUs; without BEL and TLI you will confuse noisy prompts with unstable infrastructure. Follow the Docker article: Gateway and channels first, then provider, then repository prompts—same layering as the channel article’s triage, with context budget at the top.

Add a minimal repro quartet to each incident ticket: (1) active promptMode and BEL band, (2) link to the last AGENTS/Skills merge request, (3) memory_search query and hit count, (4) Gateway log snippets adjacent to tool calls. With those four fields most “mystery slowdowns” resolve in under thirty minutes. For sub-agents or parallel jobs, log per-branch ceilings in the same workbook—aggregates that only watch the main session mislead.
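The repro quartet is easy to standardize as a ticket template. A sketch with illustrative field names; nothing here is an OpenClaw schema.

```python
def repro_quartet(prompt_mode, bel_band, agents_mr_url,
                  mem_query, mem_hit_count, log_snippets):
    """Assemble the four incident-ticket fields described above.
    All keys and argument names are illustrative."""
    return {
        "prompt_mode_and_bel": {"promptMode": prompt_mode, "bel_band": bel_band},
        "last_agents_skills_mr": agents_mr_url,
        "memory_search": {"query": mem_query, "hit_count": mem_hit_count},
        "gateway_log_snippets": log_snippets,
    }
```

Dropping this into the ticket bot means the four fields arrive filled in, not requested mid-incident.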

Why “it worked on my laptop” is not team-scale maintenance

Personal machines mix giant prompts and ad hoc secrets inside global config—hard to audit. Moving to CI or shared remote Macs introduces permission and sync lag, so AGENTS and memory fork invisibly. Writing explicit context policy and pairing it with dedicated remote environments turns agent workflows into reviewable, handoff-ready assets.

Ephemeral cloud desktops can run OpenClaw, but long uptime, fixed paths, and low toolchain drift favor dedicated physical remote Macs—especially when a Gateway shares a host or region with a large monorepo and disk IO joins context as a bottleneck. MACCOME operates Mac mini M4 and M4 Pro nodes across Singapore, Japan, Korea, Hong Kong, and US coasts with flexible rental terms suited to always-on Gateway plus large-repo clones; align public rate pages with your BEL row, freeze directory policy, then iterate Skills.

Pilot: drive BEL into the team target band for one week before buying larger models or more channels—bigger models rarely fix repository noise.

FAQ

Which article should I read first for install issues?

Follow the multi-platform install guide, then the doctor triage; this article does not repeat port and dependency checks.

Where are networking and model topics?

See Docker networking triage for CLI reachability and provider routing for model chains.

Rental rates and help?

Open rental rates and the Help Center.