2026 Six-Region Remote Mac: Jenkins & Buildkite agent credentials (machine users, OIDC, PAT) coexisting with GitHub Actions/GitLab runner—mutex & lease FinOps checklist

About 17 min read · MACCOME

If you run Jenkins or Buildkite macOS agents on dedicated remote Mac hosts across Singapore, Japan, Korea, Hong Kong, US East, and US West, and you already operate GitHub Actions or GitLab self-hosted runners, the budget and audit pain is usually not vCPU. It is the credential blast radius introduced by a second controller: who holds a 90-day PAT, which user context owns the match decrypt keychain and the App Store Connect (ASC) session, and whether burst hosts are rebuilt from images with secrets frozen into them. This article gives three ledgers (credential topology, mutex rules, lease accounting) and complements the Jenkins/Buildkite staggered-lease runbook and the runner labels & secrets checklist: those cover queues; this one covers identities, OIDC, and revocation paths.

Six credential failure modes when a "second CI controller" lands on the same six-region host

  1. All agents share one interactive macOS user HOME: with Jenkins, buildkite-agent, and actions-runner sharing ~/.ssh under the same login, unattended CI turns into intermittently green builds whenever a git credential-osxkeychain prompt appears.
  2. Org-wide PATs baked into daily-burst images: weekly host churn with frozen org tokens widens blast radius beyond a single builder.
  3. Missing mutex on match decrypt and notary uploads: concurrent pipelines touching one match repo or one ASC API session produce intermittent 401/403 responses that get misread as "Apple-side noise".
  4. OIDC audience mismatch on the cloud control plane: Buildkite or GitHub trust policies that are not narrowed to repo/environment let agents exchange their identity for a token with a larger surface than the job needs.
  5. Undocumented keychain partitions and LaunchDaemon order: Jenkins starts before the runner after reboot, reading an empty partition until someone clicks retry.
  6. FinOps tracks cores but not "credential lifetime × lease": long-lived keys on daily hosts look cheap on the invoice and unacceptable on the security calendar.

The value of a six-region footprint is predictable colocation and dedicated IO; an unclear credential topology only scales the confusion across regions. If you also run Xcode Cloud in a hybrid CI setup, state explicitly which identity may touch ASC and which may touch internal registries; otherwise the hybrid matrix becomes a hybrid incident.

Introduce a credential RACI: platform owns machine users and OIDC bindings, repos own workflow scopes, security owns revoke drills. Any gap in that assignment surfaces as an incident once the second controller goes live.

| Dimension | Prefer long-lived PAT / deploy keys | Prefer OIDC / short tokens |
|---|---|---|
| Lease | Only on monthly-or-longer baseline hosts with out-of-image secret injection | Eligible for daily/weekly burst if trust conditions are tight |
| Audit grain | Requires extra logging for "who placed PAT on which host"; rotation often tracks fiscal quarters | Cloud can bind repo, environment, pool; easier per-job attribution |
| Jenkins plugin ecosystem | Many plugins assume static credential files | Needs explicit pipeline refactors; higher one-time cost |
| Buildkite | Hooks may silently export secrets | Hooks should assemble only; secrets from OIDC-exchanged short tokens |
| GHA/GitLab runner | Self-hosted runners often keep long .credentials files | OIDC to cloud STS is mainstream; align first |

Red line: do not mint organization-wide long PATs or root deploy keys on daily burst hosts. If the business insists, bind secret lifetime to lease caps in the same approval ticket.
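One way to make the red line mechanical is a small gate in the approval or provisioning script. This is a minimal sketch, assuming the lease tiers named in this article and illustrative day caps for burst tiers (1 day for daily, 7 for weekly); the function name and the caps are assumptions to replace with your own policy.

```bash
#!/usr/bin/env bash
# credential_fits_lease TIER TTL_DAYS
# Exit 0 if a secret with the given lifetime may live on a host of that lease
# tier, 1 if it must be rejected, 2 on an unknown tier.
# The day caps for burst tiers are illustrative assumptions, not vendor limits.
credential_fits_lease() {
  local tier=$1 ttl_days=$2 cap
  case "$tier" in
    monthly|quarterly) return 0 ;;  # baseline: long-lived allowed if injected out of image
    weekly) cap=7 ;;
    daily)  cap=1 ;;
    *) echo "unknown lease tier: $tier" >&2; return 2 ;;
  esac
  if [ "$ttl_days" -le "$cap" ]; then
    return 0
  fi
  echo "reject: ${ttl_days}-day secret exceeds the ${cap}-day cap on a $tier host" >&2
  return 1
}
```

Called as `credential_fits_lease daily 90`, the gate rejects a 90-day PAT on a daily host, while `credential_fits_lease monthly 90` passes. It does not verify that the secret is injected out of image; that check stays with the reviewer.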

Six-step runbook: from "it runs" to "we can revoke and ledger it"

  1. Inventory secret surfaces across Jenkinsfile, Buildkite pipelines, and GitHub workflows: env vars and files; label org-wide versus repo-scoped.
  2. Assign distinct machine users and HOME trees—for example jenkins, buildkite, runner; forbid shared interactive logins for CI.
  3. Put a mutex on exclusive resources: match decrypt, notary upload, and ASC browser sessions must serialize; record lock names in ROUTING.md and in pipeline comments.
  4. Write OIDC trust conditions as reviewable text: audience, repository, ref prefix, environment name; cross-check IAM or internal STS.
  5. Quarterly revoke drill: randomly revoke one token class and verify all three agents degrade as expected instead of silently continuing.
  6. Put credential lifetime into the lease FinOps sheet: long-lived only on baseline; burst only short STS; same page as the staggered lease table.
bash
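# Example mutex (replace the flock backend with your coordination service).
# flock(1) is not bundled with macOS, so this assumes it was installed separately;
# /var/lock may also be missing, so point the lock at a directory every CI user can write.
exec 9>/var/lock/match-decrypt.lock
flock -n 9 || { echo "match decrypt busy" >&2; exit 42; }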

# Example split users (LaunchDaemon sketch—do not copy paths blindly)
# UserName=buildkite vs UserName=runner — each HOME keeps its own git credential helper config
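Because flock(1) is not part of a stock macOS install, a dependency-free alternative for step 3 is an atomic mkdir lock. The sketch below is one possible shape, not a replacement for a real coordination service: `with_mutex`, the `holder` file, and the one-second polling interval are all illustrative choices, and a job killed while holding the lock leaves a stale directory that someone must clean up.

```bash
#!/usr/bin/env bash
# with_mutex LOCK_DIR TIMEOUT_SECONDS COMMAND [ARGS...]
# Serializes COMMAND behind an atomic mkdir lock: mkdir either creates the
# directory or fails, so only one caller holds the lock at a time.
# Returns 42 if the lock is still busy after TIMEOUT_SECONDS, otherwise the
# exit status of COMMAND.
with_mutex() {
  local lock_dir=$1 timeout=$2 waited=0 status=0
  shift 2
  until mkdir "$lock_dir" 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "mutex busy after ${waited}s: $lock_dir" >&2
      return 42
    fi
    sleep 1
    waited=$((waited + 1))
  done
  # Record the holder for audit. A killed job leaves this directory behind,
  # so stale locks need supervised cleanup.
  echo "$$ $(date -u +%Y-%m-%dT%H:%M:%SZ)" > "$lock_dir/holder"
  "$@" || status=$?
  rm -rf "$lock_dir"
  return "$status"
}
```

A pipeline step would wrap the exclusive command, for example `with_mutex /path/to/locks/match-decrypt.lock 600 <your match decrypt command>`, where the lock path is a directory all three machine users can write to.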

Three metrics to annotate in Grafana or review notes (tune to your baselines)

  • Jobs on burst hosts still reading 30+ day PATs: target 0%; if weekly sampling shows >3%, open a security exception ticket.
  • Mutex wait P95 (seconds) for match/notary steps alone; if P95 stays >600s while queue depth rises, add a serial export host before adding compile parallelism (threshold is illustrative).
  • OIDC token exchange failure rate split by control plane and region; if STS RTT rises for one region of the six-region footprint, fix network topology before raising runner concurrency.
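The mutex wait metric can be computed from raw samples before it ever reaches a dashboard. This sketch assumes a hypothetical log of integer wait times in seconds, one per line, and uses the nearest-rank method for P95; the 600-second default mirrors the illustrative threshold above.

```bash
#!/usr/bin/env bash
# p95: reads one integer wait time in seconds per line on stdin and prints the
# nearest-rank 95th percentile. Exits 1 on empty input.
p95() {
  sort -n | awk '
    { value[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((NR * 95 + 99) / 100)   # ceil(0.95 * NR) in integer arithmetic
      print value[rank]
    }'
}

# mutex_wait_too_long [THRESHOLD_SECONDS]
# Exit 0 when the P95 of the samples on stdin exceeds the threshold
# (default 600, illustrative), 1 when it does not, 2 on empty input.
mutex_wait_too_long() {
  local threshold=${1:-600} p
  p=$(p95) || return 2
  [ "$p" -gt "$threshold" ]
}
```

Feeding it the match/notary wait samples for the review window, for example `mutex_wait_too_long 600 < waits.txt` with `waits.txt` as a placeholder file name, gives a yes/no signal for the "add a serial export host" decision.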

Why "SSH in and fix keychain by hand" or a universal frozen .env in images is worse in 2026 than skipping the second controller

Manual keychain work is unauditable: nobody can show who last unlocked the keychain or whether an interactive session stayed open, so it cannot ship as SOC2 evidence. A universal .env widens the blast radius from a single repo to anyone who can start a container, which is the opposite of OIDC least privilege per job.
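A pre-bake scan is one way to keep frozen secrets out of burst images. The sketch below walks an image root and lists files whose names suggest static credentials; the file-name patterns are assumptions drawn from the examples in this article (.env, .credentials) plus .netrc, and will miss secrets stored under other names.

```bash
#!/usr/bin/env bash
# find_frozen_secrets ROOT
# Prints files under ROOT whose names suggest static credentials and returns 1
# if any are found, 0 if the tree looks clean.
# The name patterns are illustrative; extend them for your own stack.
find_frozen_secrets() {
  local root=$1 hits
  hits=$(find "$root" -type f \
    \( -name '.env' -o -name '.env.*' -o -name '.credentials' -o -name '.netrc' \) \
    -print | sort)
  if [ -n "$hits" ]; then
    printf '%s\n' "$hits"
    return 1
  fi
}
```

Run against the staging directory of an image build, a non-zero exit can fail the bake step so the secret never reaches a daily or weekly host.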

When you need Jenkins/Buildkite and runners coexisting on dedicated Apple Silicon with stable regional egress and separable baseline versus burst leases, MACCOME cloud Mac mini is usually the better physical anchor: nodes across Singapore, Japan, Korea, Hong Kong, US East and US West with flexible daily/weekly/monthly/quarterly leases. Ledger who may live on which host for how long before chasing compile throughput—never freeze a 90-day PAT on a daily host while running five simulators and three notary uploads.

Close: make CREDENTIAL_ROUTING.md the sibling of CLONE_POLICY

Ship three tables: machine user ↔ HOME map, mutex resources ↔ lock names, OIDC trust ↔ STS scopes. A new hire on day one should answer which identity a job uses, which token class to revoke on failure, and why no 90-day PAT exists on burst hosts.
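The OIDC trust ↔ STS scopes table can also live as reviewable text next to a small pre-check. This is a sketch only: enforcement belongs in IAM or your internal STS, the four policy values are hypothetical placeholders, and the function expects claim values that something else has already decoded and verified.

```bash
#!/usr/bin/env bash
# Hypothetical policy values; replace with the conditions in your own trust policy.
ALLOWED_AUDIENCE="sts.example.internal"
ALLOWED_REPOSITORY="example-org/ios-app"
ALLOWED_REF_PREFIX="refs/heads/release/"
ALLOWED_ENVIRONMENT="signing"

# oidc_claims_allowed AUD REPOSITORY REF ENVIRONMENT
# Takes already-decoded claim values and returns 0 only if audience,
# repository, and environment match exactly and the ref starts with the
# allowed prefix. It does not verify the token signature.
oidc_claims_allowed() {
  local aud=$1 repository=$2 ref=$3 environment=$4
  [ "$aud" = "$ALLOWED_AUDIENCE" ] || { echo "deny: audience $aud" >&2; return 1; }
  [ "$repository" = "$ALLOWED_REPOSITORY" ] || { echo "deny: repository $repository" >&2; return 1; }
  case "$ref" in
    "$ALLOWED_REF_PREFIX"*) ;;
    *) echo "deny: ref $ref" >&2; return 1 ;;
  esac
  [ "$environment" = "$ALLOWED_ENVIRONMENT" ] || { echo "deny: environment $environment" >&2; return 1; }
}
```

Keeping the four values in one reviewed file gives the new hire a single place to read which audience, repository, ref prefix, and environment a job identity is bound to.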

When pairing with the runner secrets checklist, merge "GitHub-side OIDC" and "macOS keychain partitions" into the same change ticket—otherwise trust policies look perfect in cloud while partitions are empty on metal.

FAQ

Can Jenkins, Buildkite, and GitHub Actions runner coexist on one host?

Yes, with split users, keychain partitions, and serialized exclusive steps; avoid sharing one long-lived PAT across stacks. Queueing details live in the staggered-lease runbook. Public lease tiers: rental rates.

Why anchor long-lived secrets on monthly baseline hosts?

Daily hosts churn; static secrets leak into images or backups. Baseline hosts anchor audit identity. Ops context also lives in the support center.