Pain points that look like hardware faults
Apple Silicon stays fast while NFS attribute caches hide latency. Storms show up when Xcode, SwiftPM caches, and artifact rsync hit the same mount during promotions.
- Attribute cache drift: long actimeo serves stale metadata when Git tags move until you remount or tighten caches on compile roots.
- Soft mount timeouts: soft plus short timeo masks brownouts but risks partial writes on scratch trees unless OpenClaw excludes them.
- Unbounded rsync: parallel transfers without bwlimit starve SSH and webhooks; mirror caps from the artifact rsync matrix.
Decision matrix: mounts, rsync, and disk gates
| Dimension | Conservative | Aggressive | Pick when |
|---|---|---|---|
| actimeo | Low on hot prefixes | Higher on read-mostly trees | Tighten anywhere checksum gates are weak. |
| Hard versus soft | hard,bg with monitored SLO | soft,intr for disposable scratch | Hard for release binaries; soft only for throwaway caches. |
| rsync bandwidth | One stream near thirty-two megabytes per second | Two streams near forty-eight after a week green | Scale only when ionice keeps compile IO responsive. |
| Concurrency | One job per node | Two staggered jobs with jitter | Add the second lane after merged webhooks stay green. |
| One terabyte | Pause promotions above seventy percent used | Seventy-five percent briefly with on-call | Protect small APFS volumes from snapshot spikes. |
| Two terabyte | Yellow at seventy-eight percent | Red freeze at eighty-eight percent | Still audit inodes weekly even with deeper staging. |
Version the matrix in GitOps. Pair changes with the fragment merge canary workflow so OpenClaw merges health only after a canary accepts new mount fragments.
ionice thresholds for NFS clients
Wrap rsync with ionice -c2 -n4 on build hosts so bulk IO stays best effort. Use ionice -c3 only on dedicated promotion nodes without UI tests.
- Compile hosts: bulk copy at nice nineteen plus ionice class three during tests.
- Promotion hosts: ionice class two near priority four for rsync, while respecting bandwidth caps.
- Signals: emit rolling median read latency in OpenClaw JSON and block merges on doubled latency.
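As a rough illustration of the per-host bullets above, the sketch below builds (but does not run) a wrapped rsync command. It assumes `ionice` is installed on the host (it is a util-linux tool), that `--bwlimit` is read in KiB per second, and that the `wrap_rsync` helper, the role names, and the default cap are hypothetical names chosen for this example.

```python
import shlex

# Hypothetical role names; the priorities follow the bullets above.
PREFIXES = {
    # Promotion hosts: best-effort class two, priority four.
    "promotion": ["ionice", "-c2", "-n4"],
    # Compile hosts: nice nineteen plus idle class three during tests.
    "compile": ["nice", "-n", "19", "ionice", "-c3"],
}


def wrap_rsync(src, dest, host_role, bwlimit_kib=32768):
    """Return an argv list for rsync wrapped in nice/ionice.

    bwlimit_kib is an assumed cap of roughly thirty-two megabytes per
    second, expressed in KiB/s.
    """
    if host_role not in PREFIXES:
        raise ValueError(f"unknown host role: {host_role}")
    rsync = ["rsync", "-a", f"--bwlimit={bwlimit_kib}", src, dest]
    return PREFIXES[host_role] + rsync


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(shlex.join(wrap_rsync("build/", "mac-01:/srv/artifacts/", "promotion")))
```

Returning an argv list keeps the wrapper easy to audit in a promotion script and lets the caller decide whether to pass it to `subprocess.run`.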
rsync backoff after webhook failures
Pause new rsync waves with exponential sleep from five to one hundred twenty seconds across three attempts. Reset counters only after merged digests succeed from every node.
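One way to read the backoff rule is as a geometric schedule whose first sleep is five seconds and whose third is one hundred twenty. The sketch below is a minimal version under that assumption; the `RsyncBackoff` class and its method names are hypothetical and not part of OpenClaw.

```python
def backoff_delays(first=5.0, last=120.0, attempts=3):
    """Geometric delays from first to last across the given attempts."""
    if attempts < 2:
        return [first]
    ratio = (last / first) ** (1 / (attempts - 1))
    return [first * ratio ** i for i in range(attempts)]


class RsyncBackoff:
    """Tracks webhook failures and hands out the next sleep interval."""

    def __init__(self, first=5.0, last=120.0, attempts=3):
        self.delays = backoff_delays(first, last, attempts)
        self.failures = 0

    def on_webhook_failure(self):
        """Return seconds to sleep before the next wave, or None if exhausted."""
        if self.failures >= len(self.delays):
            return None
        delay = self.delays[self.failures]
        self.failures += 1
        return delay

    def on_digests(self, ok_by_node):
        """Reset the counter only when every node reports a good digest."""
        if ok_by_node and all(ok_by_node.values()):
            self.failures = 0
```

With the defaults the middle sleep lands near twenty-four and a half seconds. A `None` return is the signal to stop scheduling waves and escalate.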
Minimal reproducible rollout steps
- Step 1: Map tenants so each Mac mounts one hot NFS subtree for compile and keeps scratch on local APFS.
- Step 2: Install the OpenClaw gateway, issue primary bootstrap and read-only observer tokens, and rotate them every ninety days.
- Step 3: Label node fragments by region and pool, and enable merge webhooks so probe payloads collapse before decisions.
- Step 4: Apply matrix rows during maintenance, remount, then run a short smoke compile across metadata-heavy and large-binary paths.
- Step 5: Add ionice and backoff to promotion scripts; confirm help center SSH guidance stays responsive under load.
- Step 6: Emit one merged digest per patrol with disk and NFS counters; block merges on red thresholds.
One terabyte and two terabyte APFS gates
A nearly full APFS volume slows mmap-heavy builds, which then hammer NFS; keep OpenClaw aware of both signals.
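The one and two terabyte rows of the decision matrix can be turned into a small gate function. In the sketch below the two-terabyte thresholds are the yellow and red values from the matrix. Treating the one-terabyte conservative and aggressive values as yellow and red is an assumption of this example, and the `gate_state` and `used_fraction` helpers are hypothetical.

```python
import shutil

# Thresholds as fractions of used space, keyed by volume size in terabytes.
# Mapping the 1 TB conservative/aggressive values to yellow/red is an assumption.
GATES = {
    1: {"yellow": 0.70, "red": 0.75},
    2: {"yellow": 0.78, "red": 0.88},
}


def used_fraction(path="/"):
    """Fraction of the volume holding path that is currently used."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def gate_state(fraction_used, volume_tb):
    """Classify a used fraction as green, yellow, or red for a volume size."""
    gates = GATES[volume_tb]
    if fraction_used > gates["red"]:
        return "red"
    if fraction_used > gates["yellow"]:
        return "yellow"
    return "green"


if __name__ == "__main__":
    print(gate_state(used_fraction("/"), 1))
```

A patrol job could attach the resulting state to each node fragment so that the disk and NFS signals reach the merge step together.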
Patrol merge webhooks and health probes
Per-node JSON should list free gigabytes, NFSv4.1 retransmit counts, and rsync state. Merge webhooks dedupe fragments and require quorum before reporting healthy. Reuse bounded retries from the Flux webhook canary guide.
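The dedupe and quorum behaviour might look like the sketch below. The fragment field names (`node`, `ts`, `free_gb`, `nfs_retrans`, `rsync_state`), the health limits, and the strict-majority quorum are all assumptions made for illustration; they are not a documented OpenClaw schema.

```python
def merge_fragments(fragments, expected_nodes, min_free_gb=100, max_retrans=10):
    """Dedupe per-node fragments and decide health by strict majority."""
    latest = {}
    for frag in fragments:
        node = frag["node"]
        # Keep only the newest fragment for each node.
        if node not in latest or frag["ts"] > latest[node]["ts"]:
            latest[node] = frag

    healthy = sorted(
        node
        for node, frag in latest.items()
        if node in expected_nodes
        and frag["free_gb"] >= min_free_gb
        and frag["nfs_retrans"] <= max_retrans
    )
    needed = len(expected_nodes) // 2 + 1  # strict majority of expected nodes
    return {
        "nodes": sorted(latest),
        "healthy_nodes": healthy,
        "healthy": len(healthy) >= needed,
    }


if __name__ == "__main__":
    sample = [
        {"node": "mac-01", "ts": 1, "free_gb": 400, "nfs_retrans": 0, "rsync_state": "idle"},
        {"node": "mac-01", "ts": 2, "free_gb": 390, "nfs_retrans": 1, "rsync_state": "running"},
        {"node": "mac-02", "ts": 2, "free_gb": 50, "nfs_retrans": 0, "rsync_state": "idle"},
    ]
    print(merge_fragments(sample, {"mac-01", "mac-02", "mac-03"}))
```

Counting quorum against the expected node set, not only the nodes that reported, means a silent node counts against health instead of being ignored.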
FAQ: stale file handle on NFSv4.1
Stale file handle means the server invalidated a fileid while clients still hold descriptors, often after export migration or rollback.
- Fix: stop writers, unmount, remount, and restart mmap-heavy daemons.
- Prevention: avoid soft mounts with long rsync deletes; prefer hard mounts plus health checks.
- Automation: repeated counters fail the merge webhook and trigger backoff.
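A probe can tell a stale handle apart from other I/O errors by checking for the `ESTALE` errno. The sketch below shows one possible shape; the `should_fail_merge` helper and its threshold of two repeats are hypothetical choices, not values from the runbook.

```python
import errno
import os


def is_estale(exc):
    """True when an OSError carries the stale file handle errno."""
    return isinstance(exc, OSError) and exc.errno == errno.ESTALE


def probe_path(path):
    """Stat a path and report 'ok', 'stale', or 'error'."""
    try:
        os.stat(path)
    except OSError as exc:
        return "stale" if is_estale(exc) else "error"
    return "ok"


def should_fail_merge(stale_counts, threshold=2):
    """Fail the merge when any node repeats stale handles (assumed threshold)."""
    return any(count >= threshold for count in stale_counts.values())
```

Separating `stale` from `error` keeps a missing path or a permission problem from triggering the remount procedure by mistake.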
Citable guardrails
- Mount contract: every actimeo and hard versus soft choice lives in Git with the approving OpenClaw fragment id.
- IO fairness: keep compile latency stable within eight percent week over week after ionice edits.
- Disk contract: skip overlapping rsync deletes when any node crosses yellow watermark.
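The IO fairness guardrail can be checked mechanically. The sketch below compares the median compile latency of two weeks against the eight percent limit; using the median as the weekly summary, and the `within_drift` name, are assumptions of this example.

```python
from statistics import median


def within_drift(last_week, this_week, limit=0.08):
    """True when this week's median latency is within limit of last week's."""
    baseline = median(last_week)
    current = median(this_week)
    if baseline <= 0:
        raise ValueError("baseline latency must be positive")
    return abs(current - baseline) / baseline <= limit


if __name__ == "__main__":
    # Latency samples in seconds, purely illustrative.
    print(within_drift([100, 102, 98], [105, 107, 104]))
```

Running this after each ionice change gives a pass or fail value that can be recorded next to the approving fragment id.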
Provision nodes for OpenClaw plus NFS heavy CI
Rent additional Mac mini M4 lanes when merge webhooks show sustained pressure, keep purchase and pricing pages open without login, and reuse this matrix as your default runbook appendix.