2026 Hermes Agent Skills Advanced Guide: SKILL.md, Bundles, GEPA Evolution & Tap Publishing

~25 min read · MACCOME

If you run Hermes Agent on a Gateway host and keep re-teaching the same deploy, research, or publishing workflows every session, this guide is the 2026 production playbook for skill bundles, GEPA evolution, and tap publishing, not another generic SKILL.md tutorial. It covers how Skills act as compounding procedural memory for an agent with 160k+ GitHub stars, YAML bundle design, conditional activation, team tap distribution, and a ten-step rollout path. Structure: six pain points, a Skills vs Memory vs Prompts matrix, L0/L1/L2 loading, bundles and hub repos, GEPA guardrails, plugin skills, a blog workflow case study, and an FAQ.

Six pain points: why Hermes Skills fail to compound on a sleeping laptop

Hermes Agent (Nous Research) passed 160k GitHub stars in mid-2026 with a positioning line teams quote constantly: an agent that grows with you. Skills are its procedural memory layer: markdown runbooks the agent writes, loads on demand, and evolves across sessions. Yet operators still hit these walls:

  1. Treating Skills like chat prompts: pasting multi-step SOPs into Telegram instead of saving them as SKILL.md files under ~/.hermes/skills/, so nothing survives session restarts.
  2. Confusing Memory with Skills: MEMORY.md stores facts and preferences; Skills store repeatable procedures. Mixing them bloats context and breaks routing.
  3. Ignoring bundle precedence: naming a bundle research when a standalone skill already uses that slug—bundles win collisions by design, which surprises teams who did not read the YAML rules.
  4. Installing hub skills without audit: community taps run security scans, but --force overrides caution findings; dangerous verdicts still block install.
  5. Skipping GEPA guardrails: running evolve_skill against production trees without pytest gates or human PR review risks semantic drift in live Gateway sessions.
  6. Skills without a 24/7 host: Gateway, cron, and GEPA batch jobs need uptime. Once a MacBook lid closes, procedural memory stops compounding, the same constraint covered in our Hermes install guide and persistent memory architecture post.

One-line definition: Hermes Skills are versioned, searchable procedure documents—compatible with agentskills.io—that load only when a task matches, while Memory answers "what do I know about you?" and prompts answer "what do I say right now?"

Skills vs Memory vs Prompts: decision matrix and mnemonics

Use this table before you author another 400-line system message. Hermes exposes all three surfaces; picking the wrong layer is the most common source of token waste on Gateway hosts.

| Dimension | Memory (MEMORY.md / providers) | Skills (SKILL.md) | Prompts (per-turn instructions) |
| --- | --- | --- | --- |
| Stores | Facts, preferences, project context | Multi-step procedures, pitfalls, verification | Immediate task framing for this message |
| Load pattern | Session-long or provider-managed recall | Progressive disclosure L0/L1/L2 | Injected every turn if kept in system prompt |
| Who writes it | Agent + user edits via memory tools | Human author, hub install, or skill_manage | User or developer per conversation |
| Token cost | Grows with remembered history | ~3k tokens for full index at L0; body on demand | Fixed overhead if always attached |
| Mnemonic | "What happened" | "How we do it" | "Do this now" |

Pair Skills with MCP rather than replacing it. MCP supplies live tool access; Skills tell Hermes when to call web_search, terminal, or a custom plugin tool. For Cursor-side authoring of the same SKILL.md standard, see our Agent Skills guide.

SKILL.md format, directory layout, and Progressive Disclosure (L0 / L1 / L2)

Hermes keeps a single source of truth at ~/.hermes/skills/. Category folders (for example mlops/axolotl/) are organizational; routing keys live in frontmatter.

text
~/.hermes/skills/
├── mlops/
│   ├── axolotl/
│   │   ├── SKILL.md
│   │   ├── references/
│   │   ├── scripts/
│   │   └── assets/
│   └── vllm/
│       └── SKILL.md
├── devops/deploy-k8s/
│   ├── SKILL.md
│   └── references/
└── .hub/
    ├── lock.json
    └── taps.json

Progressive disclosure maps cleanly to three API levels:

  • L0 — Index (~3k tokens): skills_list() returns name, description, and category for every discoverable skill. The agent decides relevance before loading bodies.
  • L1 — Activation: skill_view(name) returns full SKILL.md plus metadata when a task matches.
  • L2 — Deep pull: skill_view(name, path) fetches a specific file under references/, templates/, or scripts/ only when a step requires it.
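The three levels can be sketched with plain files. This is a minimal illustration, not Hermes's implementation: `parse_frontmatter`, `list_index`, and `view` are hypothetical helpers standing in for skills_list() and skill_view(), and the frontmatter reader assumes flat `key: value` lines only.

```python
import tempfile
from pathlib import Path


def parse_frontmatter(text: str) -> dict:
    """Read top-level `key: value` pairs between the first two '---' lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        # Only unindented scalar keys; nested blocks such as `metadata:` are skipped.
        if line and not line[0].isspace() and ":" in line:
            key, _, value = line.partition(":")
            if value.strip():
                meta[key.strip()] = value.strip()
    return meta


def list_index(root: Path) -> list:
    """L0: name and description only; skill bodies stay on disk."""
    index = []
    for skill_md in sorted(root.rglob("SKILL.md")):
        meta = parse_frontmatter(skill_md.read_text(encoding="utf-8"))
        index.append({
            "name": meta.get("name", skill_md.parent.name),
            "description": meta.get("description", ""),
            "dir": skill_md.parent,
        })
    return index


def view(index: list, name: str, path: str = "SKILL.md") -> str:
    """L1 when path is omitted (full SKILL.md); L2 when path names a supporting file."""
    entry = next(e for e in index if e["name"] == name)
    return (entry["dir"] / path).read_text(encoding="utf-8")


# Demo tree (hypothetical content) built in a temporary directory.
root = Path(tempfile.mkdtemp())
skill_dir = root / "devops" / "deploy-runbook"
(skill_dir / "references").mkdir(parents=True)
(skill_dir / "SKILL.md").write_text(
    "---\nname: deploy-runbook\ndescription: Use when deploying services.\n---\n\n# Deploy Runbook\n",
    encoding="utf-8",
)
(skill_dir / "references" / "rollback.md").write_text("Rollback steps\n", encoding="utf-8")

index = list_index(root)                                      # L0
body = view(index, "deploy-runbook")                          # L1
detail = view(index, "deploy-runbook", "references/rollback.md")  # L2
```

The point of the split is that only `index` needs to sit in context up front; `body` and `detail` are read when a task calls for them.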

Every installed skill is also a slash command: /github-pr-workflow, /plan, or /axolotl in CLI and Telegram alike.

markdown
---
name: deploy-runbook
description: Use when deploying services, rolling back releases, or posting to #deploy Slack.
version: 1.2.0
platforms: [macos, linux]
metadata:
  hermes:
    tags: [deployment, runbook]
    category: devops
---

# Deploy Runbook

## When to Use
User mentions deploy, rollback, production push, or staging validation.

## Procedure
1. Run `scripts/preflight.sh` and capture exit code.
2. Execute deploy script with explicit environment argument.
3. Verify health endpoint; on failure follow Rollback section.

## Pitfalls
- Missing env vars cause silent partial deploys.
- Never skip staging confirmation for production.

## Verification
Health URL returns 200 and error rate stays below baseline for 5 minutes.
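A small linter can keep hand-written skills close to the shape of the example above. The sketch below is an assumption-laden team convention, not a Hermes requirement: `lint_skill` is a hypothetical helper, and the required keys and section names are simply the ones this example uses.

```python
import re

REQUIRED_KEYS = ("name", "description")
REQUIRED_SECTIONS = ("When to Use", "Procedure", "Pitfalls", "Verification")


def lint_skill(text: str) -> list:
    """Return human-readable problems; an empty list means the file passes."""
    match = re.match(r"---\n(.*?)\n---\n", text, re.DOTALL)
    if match is None:
        return ["missing YAML frontmatter delimited by '---' lines"]
    problems = []
    frontmatter = match.group(1)
    for key in REQUIRED_KEYS:
        # The key must start a line and carry a non-empty value on that line.
        if not re.search(rf"^{key}:[ \t]*\S", frontmatter, re.MULTILINE):
            problems.append(f"frontmatter lacks a non-empty '{key}'")
    headings = set(re.findall(r"^##[ \t]+(.+?)[ \t]*$", text, re.MULTILINE))
    for section in REQUIRED_SECTIONS:
        if section not in headings:
            problems.append(f"missing section '## {section}'")
    return problems


draft = """---
name: deploy-runbook
description: Use when deploying services.
---

# Deploy Runbook

## When to Use
User mentions deploy.

## Procedure
1. Run preflight.
"""

for problem in lint_skill(draft):
    print(problem)
```

Run against the truncated draft, the linter reports the two sections that are still missing (Pitfalls and Verification).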

Skill bundles: YAML schema, examples, and precedence rules

Bundles group skills under one slash command. They live in ~/.hermes/skill-bundles/*.yaml and do not install missing skills—they alias skills that must already exist locally or in external_dirs.

yaml
# ~/.hermes/skill-bundles/research-session.yaml
name: research-session
description: Deep research — search, summarize, cite, and file notes.
skills:
  - duckduckgo-search
  - web-design-audit
  - excalidraw
instruction: |
  Start with source discovery, produce a cited summary,
  then sketch findings in Excalidraw if visuals help.

yaml
# ~/.hermes/skill-bundles/mlops-deploy.yaml
name: mlops-deploy
description: Train, serve, and monitor — axolotl + vLLM + deploy-k8s.
skills:
  - mlops/axolotl
  - mlops/vllm
  - devops/deploy-k8s
instruction: |
  Confirm dataset path and GPU quota before fine-tune.
  Serve with vLLM only after eval metrics pass threshold.

Precedence: if a bundle slug collides with a standalone skill name, /research-session invokes the bundle. Missing skills in the list are skipped with a note—non-fatal by design. Bundles do not invalidate the prompt cache; they inject a fresh user message like any /skill-name invocation.
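The precedence and skip rules can be modeled in a few lines. This is a toy resolver under stated assumptions, not Hermes's code: `resolve_slash_command` and its dictionary inputs are hypothetical, and it only reproduces the two behaviors described above (bundles win collisions, missing skills are skipped with a note).

```python
def resolve_slash_command(slug: str, bundles: dict, skills: set):
    """Return (kind, loaded_skills, notes) for a slash command slug."""
    if slug in bundles:  # checked first, so a bundle wins a name collision
        loaded, notes = [], []
        for name in bundles[slug]["skills"]:
            if name in skills:
                loaded.append(name)
            else:
                notes.append(f"skipped missing skill: {name}")  # non-fatal
        return "bundle", loaded, notes
    if slug in skills:
        return "skill", [slug], []
    return "unknown", [], [f"no bundle or skill named {slug}"]


# Hypothetical local state: a skill and a bundle both named "research".
skills = {"research", "duckduckgo-search", "excalidraw"}
bundles = {
    "research": {"skills": ["duckduckgo-search", "web-design-audit", "excalidraw"]},
}

kind, loaded, notes = resolve_slash_command("research", bundles, skills)
print(kind, loaded, notes)
```

Here /research resolves to the bundle even though a standalone skill shares the slug, and web-design-audit is reported as skipped because it is not installed in this toy state.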

bash
hermes bundles create research-session \
  --skill duckduckgo-search \
  --skill excalidraw \
  -d "Deep research bundle"

hermes bundles create mlops-deploy \
  --skill mlops/axolotl \
  --skill mlops/vllm \
  --skill devops/deploy-k8s \
  --force

hermes bundles list
hermes bundles show research-session
# Inside a chat session (not the shell), run /bundles to list them.

Conditional activation: four rule types and platform-aware skills

Hermes can hide or show skills based on available toolsets and tools—critical for fallback skills that should appear only when premium APIs are absent.

| Frontmatter field | Behavior |
| --- | --- |
| fallback_for_toolsets | Hidden when listed toolsets are available; shown when they are missing |
| fallback_for_tools | Same logic at individual tool granularity |
| requires_toolsets | Hidden until listed toolsets are present |
| requires_tools | Hidden until listed tools are present |

The built-in duckduckgo-search skill sets fallback_for_toolsets: [web]. When FIRECRAWL_API_KEY is configured, the web toolset is available and Hermes prefers web_search; without it, DuckDuckGo appears automatically. Add platforms: [macos] to hide macOS-only skills (iMessage, Apple Reminders) from Linux Gateway nodes, and list [macos, linux] only for skills that run on both.
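The four frontmatter fields and the platforms filter combine into a single visibility check. The sketch below is one plausible reading, not the Hermes source: `is_visible` is a hypothetical function, and it assumes a fallback skill is hidden as soon as any listed toolset or tool is available, while a requires_* skill needs all of its listed items.

```python
def is_visible(skill: dict, toolsets: set, tools: set, platform: str) -> bool:
    """Decide whether a skill should appear in the index on this host."""
    platforms = skill.get("platforms")
    if platforms and platform not in platforms:
        return False
    # requires_*: every listed item must be present (assumed "all" semantics).
    if not set(skill.get("requires_toolsets", [])) <= toolsets:
        return False
    if not set(skill.get("requires_tools", [])) <= tools:
        return False
    # fallback_for_*: hide when the preferred capability exists (assumed "any" semantics).
    if set(skill.get("fallback_for_toolsets", [])) & toolsets:
        return False
    if set(skill.get("fallback_for_tools", [])) & tools:
        return False
    return True


ddg = {"name": "duckduckgo-search", "fallback_for_toolsets": ["web"]}
imessage = {"name": "imessage", "platforms": ["macos"]}  # hypothetical frontmatter

print(is_visible(ddg, toolsets=set(), tools=set(), platform="linux"))
print(is_visible(ddg, toolsets={"web"}, tools={"web_search"}, platform="linux"))
print(is_visible(imessage, toolsets=set(), tools=set(), platform="linux"))
```

If your fleet needs different semantics (for example, hiding a fallback only when every listed toolset is present), the two set operators are the only lines that change.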

Skills Hub: install commands and four starter repos

The hub aggregates official optional skills, skills.sh, well-known endpoints, GitHub taps, and community marketplaces. All hub installs pass a security scanner; use --force only after manual review of non-dangerous findings.

bash
hermes skills browse
hermes skills search kubernetes
hermes skills inspect openai/skills/k8s
hermes skills install official/security/1password
hermes skills install skills-sh/vercel-labs/agent-skills/vercel-react-best-practices --force
hermes skills tap add myorg/hermes-skills
hermes skills check
hermes skills update

| Repo / source | Role | Typical install |
| --- | --- | --- |
| awesome-hermes-skills | Community-curated index of Hermes-compatible SKILL.md collections | hermes skills tap add <owner>/awesome-hermes-skills |
| hermeshub | Team-maintained tap with internal runbooks and skills.sh.json groupings | hermes skills tap add <org>/hermeshub |
| ai-agent-skills | Cross-tool skill packs (Hermes + Cursor + Claude Code) under skills/ | hermes skills install <owner>/ai-agent-skills/<skill-dir> |
| official (optional-skills/) | Hermes-shipped skills with built-in trust: security, migration, productivity | hermes skills install official/security/1password |

Publishing a Skill Tap: repo layout, skills.sh.json, and team deploy

A tap is a GitHub repository—no registry signup. Hermes lists subdirectories under the tap path (default skills/) and probes each for SKILL.md.

text
my-org/hermes-skills/
├── skills/
│   ├── deploy-runbook/
│   │   ├── SKILL.md
│   │   ├── references/
│   │   └── scripts/
│   └── incident-response/
│       └── SKILL.md
├── skills.sh.json
└── README.md

json
{
  "$schema": "https://skills.sh/schemas/skills.sh.schema.json",
  "groupings": [
    { "title": "Platform", "skills": ["deploy-runbook", "incident-response"] },
    { "title": "Research", "skills": ["literature-review", "citation-check"] }
  ]
}
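Before pushing a tap, it can help to check locally that every skill named in skills.sh.json actually exists under the tap path. The sketch below mimics the probing rule described above (subdirectories containing SKILL.md); `discover_skills` and `unknown_grouped_skills` are hypothetical helpers, not Hermes commands.

```python
import json
import tempfile
from pathlib import Path


def discover_skills(tap_root: Path, path: str = "skills") -> list:
    """List subdirectories of the tap path that contain a SKILL.md."""
    base = tap_root / path
    return sorted(
        d.name for d in base.iterdir()
        if d.is_dir() and (d / "SKILL.md").is_file()
    )


def unknown_grouped_skills(tap_root: Path, found: list) -> list:
    """Return names listed in skills.sh.json groupings with no matching directory."""
    manifest = json.loads((tap_root / "skills.sh.json").read_text(encoding="utf-8"))
    listed = [
        name
        for group in manifest.get("groupings", [])
        for name in group.get("skills", [])
    ]
    return sorted(set(listed) - set(found))


# Rebuild the example layout in a temporary directory.
tap = Path(tempfile.mkdtemp())
for name in ("deploy-runbook", "incident-response"):
    (tap / "skills" / name).mkdir(parents=True)
    (tap / "skills" / name / "SKILL.md").write_text(
        "---\nname: " + name + "\n---\n", encoding="utf-8"
    )
(tap / "skills" / "drafts").mkdir()  # no SKILL.md, so it is ignored
(tap / "skills.sh.json").write_text(json.dumps({
    "groupings": [
        {"title": "Platform", "skills": ["deploy-runbook", "incident-response"]},
        {"title": "Research", "skills": ["literature-review", "citation-check"]},
    ]
}), encoding="utf-8")

found = discover_skills(tap)
missing = unknown_grouped_skills(tap, found)
print(found)
print(missing)
```

With the example tree and manifest, the check flags literature-review and citation-check: the Research grouping names skills that the directory listing does not yet contain.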

Team rollout is deliberately boring—in a good way:

bash
# On each Gateway host (or golden AMI):
hermes skills tap add my-org/hermes-skills
hermes skills search deploy
hermes skills install my-org/hermes-skills/deploy-runbook
hermes skills audit

# Pin non-default tap paths in ~/.hermes/.hub/taps.json:
# {"repo": "my-org/platform-docs", "path": "internal/skills/"}

Tip: set GITHUB_TOKEN in ~/.hermes/.env to raise GitHub API limits from 60 to 5,000 requests per hour during bulk tap indexing.

GEPA + DSPy: five-stage flow, commands, guardrails, and roadmap

The hermes-agent-self-evolution repo applies GEPA (Genetic-Pareto Prompt Evolution, ICLR 2026 Oral) through DSPy to mutate SKILL.md text—not model weights. Budget roughly $2–10 per optimization run via API calls.
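The per-run figure turns into a monthly envelope with simple multiplication. The helper below is a back-of-the-envelope sketch: `monthly_budget` is a hypothetical name, the $2 and $10 bounds come from the estimate above, and the skill and run counts in the demo are made up.

```python
def monthly_budget(skills: int, runs_per_skill: int,
                   low_usd: float = 2.0, high_usd: float = 10.0):
    """Return (low, high) monthly API spend for scheduled optimization runs."""
    runs = skills * runs_per_skill
    return runs * low_usd, runs * high_usd


# Example: 5 skills, each evolved weekly (4 runs a month) gives 20 runs.
low, high = monthly_budget(skills=5, runs_per_skill=4)
print(f"${low:.0f} to ${high:.0f} per month")
```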

Five-stage flow:

  1. Read current skill / prompt / tool description from HERMES_AGENT_REPO.
  2. Generate evaluation dataset (synthetic or from session DB).
  3. GEPA proposes candidate variants using execution traces—diagnosing why failures happen.
  4. Constraint gates filter candidates (tests, size, semantics, cache rules).
  5. Best variant opens a human-reviewed PR, never a direct commit to production.

bash
git clone https://github.com/NousResearch/hermes-agent-self-evolution.git
cd hermes-agent-self-evolution
pip install -e ".[dev]"
export HERMES_AGENT_REPO=~/.hermes/hermes-agent

# Synthetic eval set
python -m evolution.skills.evolve_skill \
  --skill github-code-review \
  --iterations 10 \
  --eval-source synthetic

# Mixed traces: Hermes session DB + Claude Code + Copilot exports
python -m evolution.skills.evolve_skill \
  --skill github-code-review \
  --iterations 10 \
  --eval-source sessiondb

Four production guardrails (plus mandatory PR review):

  • Test suite: pytest tests/ -q must pass at 100% before merge.
  • Size limits: skills stay at or under 15 KB; tool descriptions at or under 500 characters.
  • Caching compatibility: no mid-conversation prompt mutations that break Hermes cache assumptions.
  • Semantic preservation: evolved text must not drift from the skill's original purpose.
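The size limits are the easiest guardrail to check mechanically before a candidate reaches review. The sketch below is an illustration, not the repo's gate: `size_gate` is a hypothetical function, and it assumes "15 KB" means 15 × 1024 bytes of UTF-8 and that the 500-character limit counts Unicode code points.

```python
MAX_SKILL_BYTES = 15 * 1024      # assumed reading of the 15 KB limit
MAX_TOOL_DESC_CHARS = 500


def size_gate(skill_text: str, tool_descriptions: dict) -> list:
    """Return a list of size violations; an empty list means the candidate passes."""
    failures = []
    size = len(skill_text.encode("utf-8"))
    if size > MAX_SKILL_BYTES:
        failures.append(f"skill is {size} bytes, over the {MAX_SKILL_BYTES}-byte cap")
    for name, description in tool_descriptions.items():
        if len(description) > MAX_TOOL_DESC_CHARS:
            failures.append(
                f"tool '{name}' description is {len(description)} chars, "
                f"over the {MAX_TOOL_DESC_CHARS}-char cap"
            )
    return failures


ascii_candidate = "x" * MAX_SKILL_BYTES   # exactly at the cap
cjk_candidate = "中" * 5121                # 3 bytes per character in UTF-8

print(size_gate(ascii_candidate, {"web_search": "Search the web."}))
print(size_gate(cjk_candidate, {}))
```

Measuring bytes rather than characters matters for CJK-heavy skills: the second candidate is only 5,121 characters long but weighs 15,363 bytes, so it fails under this reading of the limit.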

Five-phase roadmap in the self-evolution repo: Phase 1 Skills (implemented), Phase 2 tool descriptions, Phase 3 system prompt sections, Phase 4 tool implementation code via Darwinian Evolver, Phase 5 continuous automated pipeline.

Plugin skills: plugin:skill namespace and plugin.yaml

General plugins under ~/.hermes/plugins/ ship a plugin.yaml manifest and register bundled skills via ctx.register_skill(name, path). Loaded skills appear namespaced as plugin:skill and are retrieved with skill_view("plugin:skill").

yaml
# ~/.hermes/plugins/acme-tools/plugin.yaml
name: acme-tools
version: "1.0"
description: Internal CRM and billing tools for Acme ops
requires_env: [ACME_API_KEY]

Enable third-party plugins explicitly in config.yaml under plugins.enabled—discovery alone does not execute arbitrary code. Project-local plugins in ./.hermes/plugins/ require HERMES_ENABLE_PROJECT_PLUGINS=true.
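The namespacing and the explicit-enable rule can be pictured with a toy registry. Everything here is a stand-in: `PluginContext` and `load_plugins` are hypothetical, only the ctx.register_skill(name, path) call shape comes from the text above, and the skill names and paths are invented for the demo.

```python
class PluginContext:
    """Toy stand-in for the ctx object a plugin receives; not the Hermes class."""

    def __init__(self, plugin_name: str, registry: dict):
        self.plugin_name = plugin_name
        self.registry = registry

    def register_skill(self, name: str, path: str) -> None:
        # Skills are stored under a "plugin:skill" key to avoid slug collisions.
        self.registry[f"{self.plugin_name}:{name}"] = path


def load_plugins(discovered: dict, enabled: list) -> dict:
    """Run setup only for plugins that are explicitly enabled."""
    registry = {}
    for plugin_name, setup in discovered.items():
        if plugin_name in enabled:  # discovery alone never executes plugin code
            setup(PluginContext(plugin_name, registry))
    return registry


def acme_setup(ctx: PluginContext) -> None:
    ctx.register_skill("crm-lookup", "skills/crm-lookup/SKILL.md")


def untrusted_setup(ctx: PluginContext) -> None:
    ctx.register_skill("scratch", "skills/scratch/SKILL.md")


discovered = {"acme-tools": acme_setup, "untrusted-tools": untrusted_setup}
registry = load_plugins(discovered, enabled=["acme-tools"])
print(sorted(registry))
```

Only the enabled plugin's skill is registered, and it is addressed by its qualified name, which is the string you would pass to skill_view.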

Advanced authoring: description, pitfalls, scripts, limits, skill_manage

  • Description is the router: write user phrases ("rollback production", "fine-tune Llama") not feature summaries.
  • Pitfalls section is mandatory discipline: document failure modes the agent already hit once—this is where procedural memory pays off.
  • Scripts belong in scripts/: deterministic commands reduce hallucinated flags; only stdout/stderr re-enters context.
  • GEPA enforces a 15 KB cap on skill bodies: move long schemas to references/ and link from short steps.
  • skill_manage actions: prefer patch over edit for token-efficient updates; gate writes with skills.write_approval: true and review via /skills pending.

Case study: MACCOME blog workflow bundle

Editorial teams publishing multilingual MACCOME posts can codify a repeatable bundle instead of re-pasting generate-blog.md rules into Telegram every sprint:

yaml
# ~/.hermes/skill-bundles/maccome-blog.yaml
name: maccome-blog
description: Draft MACCOME blog HTML — structure, EEAT data, FAQ, CTA.
skills:
  - plan
  - web-design-audit
  - github-pr-workflow
instruction: |
  Gather topic from user, produce outline matching six modules
  (lead, pain points, table, steps, hard data, conversion).
  Verify internal links with ls before writing hrefs.
  Stage HTML on branch; never commit without explicit user approval.

Invoke it with /maccome-blog followed by the topic (for example, /maccome-blog Hermes GEPA bundles tap) on a 24/7 Mac Mini M4 node. Field tests in our 30-day Hermes rental report showed skills growing from 3 to 19 with roughly 38% token savings on repeat tasks; bundles amplify that when task profiles stay stable.

Ten-step production rollout for Hermes Skills

  1. Install Hermes on an always-on macOS host (see install guide); confirm hermes doctor passes.
  2. Audit bundled skills: hermes skills list; opt out with hermes skills opt-out if you want a blank slate.
  3. Author or import one high-frequency SKILL.md with trigger-first description.
  4. Create a bundle for tasks that always need the same skill set (research-session, mlops-deploy).
  5. Add conditional fields for fallback tools and platforms if you run mixed macOS/Linux fleets.
  6. Subscribe to team tap with hermes skills tap add; ship skills.sh.json for hub categories.
  7. Enable write approval if Gateway is public-facing: skills.write_approval: true.
  8. Schedule GEPA runs against staging skills with --eval-source sessiondb for mixed traces.
  9. Regression-test slash commands on Telegram and CLI; tune descriptions from real operator phrases.
  10. Commit tap repo + bundle YAML to Git; symlink skill-bundles/ from dotfiles for fleet consistency.

Three cite-worthy data points (June 2026)

  • Star velocity: Hermes Agent surpassed 160k GitHub stars within months of its February 2026 launch—faster than most open-source agent frameworks, signaling production interest beyond IDE copilots.
  • GEPA sample efficiency: reflective evolution typically needs 100–500 evaluations versus 10,000+ for RL-style prompt tuning (per ICLR 2026 GEPA paper)—making nightly skill optimization feasible on API budgets.
  • Field token delta: MACCOME's 30-day rented Mac Mini M4 diary documented skills growing from 3 to 19, with ~38% lower token use on repeated workflows once procedural memory covered deploy and research paths.

Closing: procedural memory needs hardware that stays awake

Skill bundles, tap publishing, and GEPA evolution only compound when Hermes runs continuously. The limits of the alternatives are concrete: (a) laptop sleep breaks Gateway and nightly GEPA jobs; (b) a Linux VPS lacks macOS-only skills and Apple toolchain steps; (c) ad-hoc prompts never survive session rotation or team onboarding.

When you need SSH in minutes, predictable monthly cost, and a macOS host where ~/.hermes/skills/, taps, and launchd Gateway survive 24/7, a MACCOME dedicated Mac Mini M4 cloud node is usually the better production fit. Compare memory tiers on the Mac Mini rental rates page; operations questions go to the cloud Mac support center.

FAQ

What is the difference between Hermes Skills and MCP?

MCP connects external APIs, databases, and browsers at runtime. Hermes Skills are procedural documents describing when and how to complete workflows. Reference MCP tool names inside Skills; do not treat MCP as a substitute for runbooks.

Why is my skill not updating after hermes update?

Edited bundled skills are marked user-modified in .bundled_manifest and skipped on sync. Run hermes skills reset <name> to re-baseline, or hermes skills reset <name> --restore for the pristine upstream copy.

Is GEPA skill evolution safe for production?

Candidates must pass pytest, 15 KB size limits, caching rules, and semantic-preservation gates. Changes land via reviewed PRs in the self-evolution repo—never direct writes to live Gateway skill trees without approval.

Can I reuse Hermes skills in Claude Code or Cursor?

Yes. Copy skill directories to .agents/skills/ or .cursor/skills/. Tune description triggers per editor. The agentskills.io format is the portability layer Hermes adopted deliberately.

How do I reduce token burn on Chinese-heavy Hermes sessions?

Keep L0 metadata lean, offload long text to references/, avoid oversized bundles, and run GEPA with mixed trace sources so descriptions stay precise. For 24/7 Gateway hosts that run CJK editorial workflows, see MACCOME Mac Mini rental rates.