Multi-tenant OpenClaw on clustervps stops being a science project when every Mac node agrees on where fragments live, which build is running, and what “healthy” means. The smallest reproducible cluster is a directory contract plus one merged probe that already understands Doctor output and webhook digest back-pressure.

This guide targets three collaborating Mac nodes: two tenant-facing gateways and one notifier that batches webhook failures. Installation is intentionally boring—bootstrap identical users, copy launchd plist templates from your golden image, and only then overlay tenant fragments. Orchestration means every promotion is a git pull plus a tagged reload, never a midnight copy-paste. When something misbehaves, triage in this order: directory drift, version skew, probe JSON, token overlap. For a complementary take on multi-AZ gateways, read the multi-AZ gateway playbook and keep capacity math beside the Mac plan catalog.

Directory conventions

Treat each tenant as a slice, not a fork. On every clustervps host, mirror /etc/openclaw/tenants/<tenant>/{gateway,doctor,webhook}.d/ so partial files compose deterministically. Gateway fragments hold listeners and upstream weights; Doctor fragments describe the invariants you expect launchd to prove every five minutes; webhook fragments declare only endpoints and retry budgets. Never put secrets in fragments. Mount /var/db/openclaw/secrets/<tenant> separately and restrict it with filesystem ACLs (macOS uses NFSv4-style ACLs set with chmod +a, not POSIX.1e ACLs) so operators can diff YAML without seeing bearer tokens.

  • Include order: Number files 10-, 20- so merges are predictable when two teams ship the same tenant.
  • Node parity: rsync fragments from a build Mac to gateways before reload; keep SQLite queues off the sync path.
  • Rollback: Keep the previous directory reachable through a .prev symlink so a bad merge is one ln -nfs away from recovery.
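
The include-order rule is easier to reason about with a concrete merge in hand. The sketch below is a minimal illustration, not OpenClaw's actual merge logic: it assumes fragments have already been parsed into dictionaries, that later filenames override earlier ones, and that nested mappings merge key by key. The function names compose and deep_merge and the sample keys are hypothetical.

```python
from typing import Any


def deep_merge(base: dict[str, Any], overlay: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict where overlay wins on conflicts and nested dicts merge."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def compose(fragments: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Merge parsed fragments in filename order, so 10-* applies before 20-*."""
    result: dict[str, Any] = {}
    for name in sorted(fragments):
        result = deep_merge(result, fragments[name])
    return result


# Hypothetical fragments from two teams shipping the same tenant.
fragments = {
    "20-team-b.yaml": {"listeners": {"https": {"port": 8443}}},
    "10-team-a.yaml": {"listeners": {"https": {"port": 443, "tls": True}},
                       "weights": {"a": 1}},
}
print(compose(fragments))
```

Sorting is lexicographic, so 100-foo would sort before 20-bar. Zero-pad prefixes (010-, 020-) if you expect more than two digits.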

Document the tree next to your help center screenshots so new hires stop inventing alternate paths. If you need another dedicated seat for a staging tenant, the purchase page stays public—no console login wall.

Version lock

OpenClaw’s CLI and daemon move quickly; mixed builds across gateways are the fastest way to get incompatible fragment schemas. Write openclaw.lock beside your infrastructure repo with the daemon hash, Swift runtime build, and companion agent version. Your deploy script should refuse to reload if the lockfile does not match openclaw version --json output on the target Mac.
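
The refuse-to-reload rule can be as small as a key-by-key comparison. This is a sketch under stated assumptions: it treats openclaw.lock as JSON and invents the key names daemon_hash, swift_runtime, and agent_version, so substitute whatever your lockfile and openclaw version --json actually contain.

```python
import json

# Hypothetical key names; replace with the fields your build really reports.
PINNED_KEYS = ("daemon_hash", "swift_runtime", "agent_version")
_MISSING = object()


def lock_mismatches(lock: dict, reported: dict) -> dict:
    """Return {key: (expected, actual)} for every pinned key that is absent or differs."""
    mismatches = {}
    for key in PINNED_KEYS:
        expected = lock.get(key, _MISSING)
        actual = reported.get(key, _MISSING)
        if expected is _MISSING or actual is _MISSING or expected != actual:
            mismatches[key] = (
                None if expected is _MISSING else expected,
                None if actual is _MISSING else actual,
            )
    return mismatches


def may_reload(lock_text: str, version_json_text: str) -> bool:
    """True only when every pinned key is present on both sides and identical."""
    return not lock_mismatches(json.loads(lock_text), json.loads(version_json_text))
```

A key missing from either side counts as a mismatch, so a lockfile that forgot to pin the Swift runtime blocks the reload instead of passing silently.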

Pin Swift toolchains and Xcode command-line tools the same day you bump OpenClaw. Silent macOS updates that jump Swift versions have killed more clusters than traffic spikes.

Promotion flow: bump the lock on a canary gateway, run Doctor with --tenant acme, replay recorded webhook fixtures, then fan the lockfile to peers using your configuration management tool of choice. If any node drifts, block traffic with your load balancer rather than hoping fragments still parse.

Health probe merge

Load balancers, Kubernetes agents, and humans should curl exactly one readiness URL per gateway. That handler must fuse three signals: Doctor exit codes translated into JSON, disk and queue pressure, and the notifier’s last webhook digest block so you see systemic partner outages instead of five hundred raw retries. The digest is not a second probe—it is a read-only view maintained by the notifier and consumed here.

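#!/usr/bin/env bash
set -euo pipefail
: "${TENANT:?TENANT must be set}"
work="$(/usr/bin/mktemp -d)"; trap '/bin/rm -rf "$work"' EXIT  # private scratch dir instead of fixed /tmp names
# Doctor may exit non-zero when unhealthy; capture the code so set -e does not abort before JSON is emitted.
doctor_rc=0
/usr/local/bin/openclaw doctor --tenant "${TENANT}" --json >"$work/doctor.json" || doctor_rc=$?
/usr/bin/curl -fsS --max-time 3 "http://127.0.0.1:9099/v1/webhook-digest" -o "$work/digest.json"
# grep without -q reads all input, so diskutil cannot die of SIGPIPE and trip pipefail.
/usr/sbin/diskutil apfs list | /usr/bin/grep "Container" >/dev/null
DOCTOR_RC="$doctor_rc" WORK="$work" /usr/bin/python3 - <<'PY'
import json, os
w = os.environ["WORK"]
d = json.load(open(f"{w}/doctor.json")); g = json.load(open(f"{w}/digest.json"))
print(json.dumps({"doctor": d.get("ok"), "doctor_rc": int(os.environ["DOCTOR_RC"]), "digest_age_sec": g.get("age_sec"), "failures": g.get("top", [])}))
PY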
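
The probe script prints one merged JSON object; something still has to turn it into an HTTP status. A minimal sketch of that mapping follows. It assumes the doctor field carries "green", "yellow", or "red" (your build may report a boolean instead), and the 600-second staleness limit is a placeholder for your own SLO. The yellow-but-clean case maps to HTTP 200 with "status":"degraded" as this section recommends; treating yellow plus webhook failures as unhealthy is a choice made here, not something the guide prescribes.

```python
def readiness(probe: dict, max_digest_age_sec: float = 600.0) -> tuple[int, dict]:
    """Map the merged probe JSON to (http_status, response_body)."""
    doctor = probe.get("doctor")
    age = probe.get("digest_age_sec")
    failures = probe.get("failures") or []
    # A missing or non-numeric age means the notifier view cannot be trusted.
    digest_fresh = isinstance(age, (int, float)) and age <= max_digest_age_sec

    if doctor not in ("green", "yellow") or not digest_fresh:
        status, code = "unhealthy", 503
    elif doctor == "yellow" and failures:
        status, code = "unhealthy", 503  # assumption: yellow plus failures blocks traffic
    elif doctor == "yellow":
        status, code = "degraded", 200   # keep serving while dashboards catch attention
    else:
        status, code = "ok", 200
    body = {"status": status, "doctor": doctor,
            "digest_age_sec": age, "failures": failures}
    return code, body
```

Keeping the decision in one pure function makes it easy to replay recorded probe output against it before touching the load balancer.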

Gateways emit structured webhook failures to the notifier over a local UNIX domain socket; the notifier deduplicates by HTTP status family, rolls them into five-minute windows, and exposes the digest API consumed above. Broadcast summaries to Slack or email from the notifier only, so gateways stay focused on TLS and scheduling. When Doctor reports yellow but webhooks are clean, return HTTP 200 with "status":"degraded" so traffic continues while dashboards catch attention.
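
To make the deduplication concrete, here is a minimal sketch of a notifier-side rollup. The event shape (ts, endpoint, status) and the output row fields are assumptions for illustration; the real digest schema may differ.

```python
from collections import Counter

WINDOW_SEC = 300  # five-minute windows


def status_family(status: int) -> str:
    """Collapse 502, 503, 504 and friends into a single '5xx' bucket."""
    return f"{status // 100}xx"


def rollup(failures: list[dict]) -> list[dict]:
    """Group raw failures by (window start, endpoint, status family) and count them.

    Each input item is assumed to look like
    {"ts": <epoch seconds>, "endpoint": <str>, "status": <int>}.
    """
    counts: Counter = Counter()
    for event in failures:
        window_start = int(event["ts"]) // WINDOW_SEC * WINDOW_SEC
        key = (window_start, event["endpoint"], status_family(event["status"]))
        counts[key] += 1
    return [
        {"window_start": w, "endpoint": e, "family": fam, "count": n}
        for (w, e, fam), n in sorted(counts.items())
    ]
```

Five hundred retries against one partner endpoint inside a single window collapse to one row with count 500, which is the "systemic outage" signal the composite probe wants to see.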

Token rotation

Rotation is a data-plane exercise: mint shadow bearer tokens in your vault, materialize them as token.next beside token.current, and have each launchd job read both until validation finishes. launchd has no systemd-style EnvironmentFile directive, so the usual approach is a small wrapper script that reads the files before launching the daemon. Canary gateways pick up token.next first, send synthetic webhook deliveries, then promote by atomically renaming files so no process reads a half-written secret.
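
The atomic-rename step is worth spelling out, because writing token.current in place is exactly how a reader ends up with half a secret. The sketch below assumes both files live in the same directory (a rename is only atomic within one filesystem); the helper names are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def write_secret_atomically(path: Path, secret: str) -> None:
    """Write to a temp file in the same directory, then rename it into place."""
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=".tmp-")  # created with mode 0600
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(secret)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the bytes are on disk before the rename
        os.replace(tmp, path)          # atomic on POSIX within one filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


def promote(secret_dir: Path) -> None:
    """Swap token.next into token.current in a single rename."""
    os.replace(secret_dir / "token.next", secret_dir / "token.current")
```

Readers that open token.current see either the old secret or the new one, never a mix. The legacy token still has to stay valid in the vault for the overlap window; this sketch only covers the file swap.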

Keep overlap for at least twelve hours across all Mac nodes, log actor plus ticket ID on every swap, and revoke legacy secrets only after two composite probe cycles report green. Shorten the calendar cadence to thirty days whenever hooks traverse the public internet or a vendor discloses cross-tenant risk. Pair the schedule with a spreadsheet that records which node last acknowledged success so silent failures do not masquerade as completed rotations.
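
The revoke rule above can be encoded as a gate so nobody has to eyeball the spreadsheet. This sketch assumes you track, per node, when it promoted the new token and how many consecutive green composite probe cycles it has reported since; both inputs and the function name are hypothetical.

```python
from datetime import datetime, timedelta, timezone

MIN_OVERLAP = timedelta(hours=12)
REQUIRED_GREEN_CYCLES = 2


def may_revoke(nodes: list[str],
               promoted_at: dict[str, datetime],
               green_cycles: dict[str, int],
               now: datetime) -> bool:
    """Allow revocation only when every expected node has cleared both conditions."""
    if not nodes:
        return False
    for node in nodes:
        when = promoted_at.get(node)
        # A node that never acknowledged promotion blocks revocation outright.
        if when is None or now - when < MIN_OVERLAP:
            return False
        if green_cycles.get(node, 0) < REQUIRED_GREEN_CYCLES:
            return False
    return True
```

Passing the expected node list explicitly means a node that silently failed to report blocks the revoke instead of being skipped.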

Step checklist

  1. Install baseline. Provision three clustervps Macs with identical OS builds, install OpenClaw from the tarball referenced in openclaw.lock, and verify openclaw version --json matches on each host.
  2. Materialize directories. Create tenant fragment paths, sync numbered YAML from git, and validate includes with openclaw config lint --tenant <id> before touching launchd.
  3. Orchestrate reload. Load plist units, start the notifier first, then gateways; confirm the digest API returns age_sec under your SLO.
  4. Merge probes. Point the load balancer at the composite readiness route, fail a canary webhook on purpose, and confirm the digest shows a single summarized row instead of spam.
  5. Rehearse rotation. Stage shadow tokens, run synthetic deliveries, promote atomically, and revoke legacy secrets after green probes plus audit log entries.
  6. Troubleshoot fast. If probes flap, diff directories between nodes, re-check lockfiles, inspect digest staleness, then verify token overlap windows—only then open vendor tickets.
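
For the first triage step in the checklist, comparing one fingerprint per node is quicker than diffing whole trees over SSH. A minimal sketch, assuming each node can run the function against its own tenant directory and report the hex digest back; the function names are hypothetical.

```python
import hashlib
from collections import Counter
from pathlib import Path


def tree_fingerprint(root: Path) -> str:
    """Hash relative paths and file contents in a stable order."""
    digest = hashlib.sha256()
    files = [p for p in root.rglob("*") if p.is_file()]
    for path in sorted(files, key=lambda p: p.relative_to(root).as_posix()):
        digest.update(path.relative_to(root).as_posix().encode())
        digest.update(b"\0")
        digest.update(hashlib.sha256(path.read_bytes()).digest())
    return digest.hexdigest()


def drifted(fingerprints: dict[str, str]) -> list[str]:
    """Return the nodes whose fingerprint differs from the most common one."""
    if not fingerprints:
        return []
    majority, _ = Counter(fingerprints.values()).most_common(1)[0]
    return sorted(node for node, fp in fingerprints.items() if fp != majority)
```

With three nodes that all disagree there is no real majority, so treat the result as a hint about where to start diffing rather than a verdict.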

Rapid answers: Fragments belong in git, secrets never do. Doctor failures should block readiness unless you explicitly mark a tenant maintenance window. Interns can rehearse rotations on staging tenants with shadow tokens and recorded webhook fixtures without touching production queues.

Operational guidance only. OpenClaw deployment details vary by release; validate flags against your installed build. Network timings depend on ISP paths and are engineering targets, not contractual guarantees.