feat: Add macOS darwin-vz minimal harness#504

Open
lox wants to merge 17 commits into main from lox/macos-cleanrooms-harness

lox commented May 31, 2026

Cleanroom's macOS host backend still boots Linux guests. Replacing Tart needs a separate macOS VZ path that can create a local macOS VM bundle, install a guest agent, and prove host-to-guest exec before production adapter work starts.

This PR adds standalone benchmark tooling under benchmarks/darwin-vz/macos-minimal. The runner consumes a local bundle manifest, validates the disk, auxiliary storage, hardware model, machine identifier, display, and guest-agent metadata, then builds a macOS VZVirtualMachineConfiguration. It sends exec requests over the same newline-delimited vsock stream shape used by the existing guest exec path and streams guest stdout/stderr back to the host.

The bundle tooling can create a base VM from a local Apple Silicon IPSW, prepare a rootless bootstrap clone from that base, and finalize a LaunchDaemon-backed clone without Tart, SSH, Packer, host sudo, or mutating the base bundle. The default headless profile boots a temporary cron-only bootstrap once, runs sudo inside the guest to install the agent and LaunchDaemon as root:wheel, removes the temporary bootstrap dslocal record and crontab offline while the VM is stopped, then boots again to prove exec is served by the system LaunchDaemon as uid 0.
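The ordering in the headless profile can be summarized as a small phase table. The Go sketch below is only an illustration: the phase names and the `Phase` type are hypothetical labels for the steps described above, not identifiers from finalize-agent-bundle.sh. It encodes two constraints from the description: the host edits the clone's disk only while the VM is stopped, and the flow ends with a boot that proves exec.

```go
package main

import (
	"errors"
	"fmt"
)

// Phase is a hypothetical label for one step of the headless finalize flow.
type Phase struct {
	Name          string
	VMRunning     bool // the guest is booted during this phase
	HostEditsDisk bool // the host writes to the clone's disk during this phase
}

// HeadlessPhases lists the steps in the order the PR description gives them.
func HeadlessPhases() []Phase {
	return []Phase{
		{Name: "prepare-bootstrap-clone", HostEditsDisk: true},
		{Name: "boot-bootstrap-and-install-agent", VMRunning: true},
		{Name: "remove-bootstrap-record-and-crontab", HostEditsDisk: true},
		{Name: "boot-and-prove-root-exec", VMRunning: true},
	}
}

// Validate rejects host-side disk edits while the VM runs and requires the
// final phase to boot the VM so exec can be proven.
func Validate(phases []Phase) error {
	if len(phases) == 0 {
		return errors.New("no phases")
	}
	for _, p := range phases {
		if p.HostEditsDisk && p.VMRunning {
			return fmt.Errorf("phase %q edits the disk while the VM is running", p.Name)
		}
	}
	if !phases[len(phases)-1].VMRunning {
		return errors.New("final phase must boot the VM to prove exec")
	}
	return nil
}

func main() {
	fmt.Println(Validate(HeadlessPhases()))
}
```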

The finalizer also has a local GUI profile for proving app/session mechanics before production backend work. --profile gui keeps a non-admin autologin user, leaves the root LaunchDaemon on agent.port, rewrites the user's LaunchAgent to serve exec on user_agent.port, removes the bootstrap crontab offline, and then proves both root and user agents can serve commands. The GUI smoke launches TextEdit through the user agent. Guest-side screencapture is attempted but not required because the headless runner does not attach a VZ view.

That gives us repeatable local flows without Tart:

benchmarks/darwin-vz/macos-minimal/build-create-bundle.sh
benchmarks/darwin-vz/macos-minimal/build-guest-agent.sh
benchmarks/darwin-vz/macos-minimal/build-runner.sh

dist/darwin-vz-macos-create-bundle \
  --ipsw /path/to/UniversalMac.ipsw \
  --out /path/to/cleanroom-macos-base \
  --disk-size-gib 120

benchmarks/darwin-vz/macos-minimal/finalize-agent-bundle.sh \
  --base /path/to/cleanroom-macos-base \
  --out /path/to/cleanroom-macos-finalized \
  --metrics-dir /tmp/cleanroom-macos-finalize \
  --force

benchmarks/darwin-vz/macos-minimal/finalize-agent-bundle.sh \
  --base /path/to/cleanroom-macos-base \
  --out /path/to/cleanroom-macos-gui \
  --profile gui \
  --metrics-dir /tmp/cleanroom-macos-gui-finalize \
  --force

dist/darwin-vz-macos-minimal \
  --bundle /path/to/cleanroom-macos-gui \
  --agent user \
  -- /usr/bin/open -a TextEdit

A fresh local macOS 26.5 build 25F71 bundle finalized with the headless profile runs sw_vers over the root LaunchDaemon and returns exit code 0. A GUI-profile bundle now has agent.port=10700, user_agent.port=10701, validates both endpoints, runs a direct user-agent command as cleanroom, and passes the TextEdit launch smoke.
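To show how a caller might pick between the two endpoints, here is a minimal Go sketch. It assumes the manifest is JSON with `agent.port` and `user_agent.port` nested as objects. The `Manifest` type, the `ResolvePort` helper, and the `"root"` selector name are hypothetical; only `--agent user` and the two port keys appear in this PR.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Endpoint and Manifest are a hypothetical view of the bundle manifest.
type Endpoint struct {
	Port uint32 `json:"port"`
}

type Manifest struct {
	Agent     *Endpoint `json:"agent"`
	UserAgent *Endpoint `json:"user_agent"`
}

// ResolvePort returns the vsock port for the requested agent ("root" or "user").
func ResolvePort(raw []byte, agent string) (uint32, error) {
	var m Manifest
	if err := json.Unmarshal(raw, &m); err != nil {
		return 0, fmt.Errorf("parse manifest: %w", err)
	}
	// Two agents on one port would make the exec target ambiguous.
	if m.Agent != nil && m.UserAgent != nil && m.Agent.Port == m.UserAgent.Port {
		return 0, errors.New("agent and user_agent share a port")
	}
	var ep *Endpoint
	switch agent {
	case "root":
		ep = m.Agent
	case "user":
		ep = m.UserAgent
	default:
		return 0, fmt.Errorf("unknown agent %q", agent)
	}
	if ep == nil || ep.Port == 0 {
		return 0, fmt.Errorf("manifest has no usable port for agent %q", agent)
	}
	return ep.Port, nil
}

func main() {
	raw := []byte(`{"agent":{"port":10700},"user_agent":{"port":10701}}`)
	for _, name := range []string{"root", "user"} {
		port, err := ResolvePort(raw, name)
		fmt.Println(name, port, err)
	}
}
```

A headless bundle would carry only the root endpoint in this shape, so asking for the user agent fails early instead of dialing a port nothing listens on.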

The standalone harness remains separate from backend integration. It gives us measured image-prep and boot-and-exec paths first, so the production adapter can consume either a root-agent bundle or a GUI-session bundle without baking Tart assumptions into the backend.

lox force-pushed the lox/macos-cleanrooms-harness branch from 79665cf to 2046e32 on May 31, 2026 20:09
lox marked this pull request as ready for review on June 1, 2026 07:04

lox commented Jun 1, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e592beeb0a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread cmd/cleanroom-macos-guest-agent/vsock_darwin.go Outdated
Comment thread benchmarks/darwin-vz/macos-minimal/runner.swift
Comment thread benchmarks/darwin-vz/macos-minimal/runner.swift
Comment thread benchmarks/darwin-vz/macos-minimal/prepare-agent-bundle.sh Outdated

lox commented Jun 1, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Keep them coming!


lox commented Jun 1, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 4d41a536c6

Comment thread cmd/cleanroom-macos-guest-agent/vsock_darwin.go Outdated
Comment thread benchmarks/darwin-vz/macos-minimal/finalize-agent-bundle.sh Outdated
Comment thread .buildkite/pipeline.yml Outdated

lox commented Jun 2, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d06457bce8

Comment thread cmd/cleanroom-macos-guest-agent/main.go Outdated
Comment thread benchmarks/darwin-vz/macos-minimal/runner.swift Outdated
Comment thread benchmarks/darwin-vz/macos-minimal/runner.swift Outdated

lox commented Jun 2, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 98b960f532

Comment thread benchmarks/darwin-vz/macos-minimal/finalize-agent-bundle.sh
Comment thread cmd/cleanroom-macos-guest-agent/vsock_darwin.go Outdated

lox commented Jun 2, 2026

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 405bfbf9df

Comment thread benchmarks/darwin-vz/macos-minimal/viewer.swift Outdated

lox commented Jun 2, 2026

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. 🎉

lox force-pushed the lox/macos-cleanrooms-harness branch from 032180a to 8f8cb46 on June 2, 2026 09:05