[metal] arm64_32 (ILP32/watchOS) support by matthargett · Pull Request #9411 · gfx-rs/wgpu

matthargett · 2026-04-11T00:33:26Z

Connections

Description

Three fixes to make the Metal backend work on arm64_32 (Apple Watch watchOS, ILP32 ABI where sizeof(long) == sizeof(void*) == 4).

1. Use `MTLStorageMode::Shared` for textures on arm64_32

The AGXMetalS4 driver (A13/S6 GPU) crashes with KERN_INVALID_ADDRESS at 0x50 during copyFromTexture:toBuffer: on MTLStorageMode::Private textures on ILP32. The native Swift Metal implementation that works on the same hardware uses Shared storage for render textures. Apple's unified memory architecture makes Shared equally performant for GPU access.

Gated behind cfg!(target_pointer_width = "32") — zero effect on 64-bit.
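As a minimal sketch of the gating described above (not the actual wgpu-hal code): `StorageMode` is a local stand-in for Metal's `MTLStorageMode`, and both helper functions are hypothetical names introduced only for illustration. It also assumes `Private` is what 64-bit targets keep using.

```rust
/// Local stand-in for Metal's `MTLStorageMode` (assumption: the real enum
/// comes from the Metal bindings and has more variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StorageMode {
    Shared,
    Private,
}

/// Hypothetical helper: pick a texture storage mode from the pointer width.
/// The flag is a parameter so both branches can be exercised on any host.
pub fn texture_storage_mode_for(pointer_width_is_32: bool) -> StorageMode {
    if pointer_width_is_32 {
        // ILP32 (arm64_32): avoid Private textures, which crash the driver
        // during texture-to-buffer copies according to the description above.
        StorageMode::Shared
    } else {
        // 64-bit targets are unaffected (assumed here to use Private).
        StorageMode::Private
    }
}

/// Hypothetical helper: resolve the choice at compile time for the current target.
pub fn texture_storage_mode() -> StorageMode {
    texture_storage_mode_for(cfg!(target_pointer_width = "32"))
}
```

Because `cfg!` expands to a compile-time boolean, the untaken branch should be optimized away on each target, which is consistent with the "zero effect on 64-bit" claim.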

2. Disable buffer mutability hints on arm64_32

The AGXMetalS4 driver exhibits instability when MTLMutability hints are combined with Shared storage mode. Conservative disable on 32-bit targets. Can be re-enabled per-device with broader test coverage.
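The conservative disable could look roughly like the sketch below. The function names and the `os_supports_hints` parameter are hypothetical; the real capability detection in wgpu-hal may be structured differently.

```rust
/// Hypothetical helper: decide whether buffer mutability hints may be used.
/// `os_supports_hints` stands for whatever OS/device check already exists;
/// `pointer_width_is_32` is a parameter so the logic is testable on any host.
pub fn supports_mutability_for(os_supports_hints: bool, pointer_width_is_32: bool) -> bool {
    // Hints stay off on 32-bit targets regardless of what the OS reports.
    os_supports_hints && !pointer_width_is_32
}

/// Hypothetical helper: the same decision, resolved for the current target.
pub fn supports_mutability(os_supports_hints: bool) -> bool {
    supports_mutability_for(os_supports_hints, cfg!(target_pointer_width = "32"))
}
```

Keeping the gate as a single boolean term makes it straightforward to swap in a per-device check later, as the description suggests.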

3. Fix `CGFloat` type in `surface.rs`

CGFloat is f64 on LP64 but f32 on ILP32. CGSize::new() was hardcoded to as f64, which fails to compile on arm64_32. Changed to as _ to infer the correct type.
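To illustrate why `as _` compiles on both ABIs, here is a self-contained sketch. The `CGFloat` alias and `CGSize` struct are local stand-ins for the CoreGraphics types, and `drawable_size` is a hypothetical function, not the code in `surface.rs`.

```rust
/// Stand-in for CoreGraphics' `CGFloat`: f64 on 64-bit targets, f32 otherwise.
#[cfg(target_pointer_width = "64")]
pub type CGFloat = f64;
#[cfg(not(target_pointer_width = "64"))]
pub type CGFloat = f32;

/// Stand-in for CoreGraphics' `CGSize`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct CGSize {
    pub width: CGFloat,
    pub height: CGFloat,
}

impl CGSize {
    pub fn new(width: CGFloat, height: CGFloat) -> Self {
        Self { width, height }
    }
}

/// Hypothetical caller: `as _` lets the compiler infer the cast target from
/// the parameter type, so the same line works whether CGFloat is f32 or f64.
/// Writing `as f64` here would be a type mismatch when CGFloat is f32.
pub fn drawable_size(width_px: u32, height_px: u32) -> CGSize {
    CGSize::new(width_px as _, height_px as _)
}
```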

Testing

Validated on physical hardware with wgpu-native v29:

Apple Watch SE2 (arm64_32, S6/A13, watchOS 11.6): full pipeline passes — WGSL→MSL via naga, compute dispatch, render pipeline, 8 indirect draws, texture-to-buffer readback ✅
Apple Watch Series 10 (arm64, S9/A15, watchOS 26.4): no regressions ✅

The AGXMetalS4 driver (A13/S6 GPU, used in Apple Watch Series 6-9 and SE2) crashes with KERN_INVALID_ADDRESS at offset 0x50 during copyFromTexture:toBuffer: on MTLStorageMode::Private textures when called via objc_msgSend on the ILP32 (arm64_32) ABI. The native Swift Metal implementation that works on the same hardware uses MTLStorageMode::Shared for render textures. Apple's unified memory architecture makes Shared equally performant for GPU access while enabling the blit DMA path that the driver expects on ILP32.

This change is gated behind cfg!(target_pointer_width = "32") and has zero effect on 64-bit platforms.

Tested on:
- Apple Watch SE2 (arm64_32, S6/A13, watchOS 11.6)
- Apple Watch Series 10 (arm64, S9/A15, watchOS 26.4): no regression

After the Shared texture storage mode fix, the AGXMetalS4 driver (A13/S6 GPU on watchOS arm64_32) still exhibits instability when MTLMutability hints are set on pipeline buffer descriptors. Conservatively disable supports_mutability on 32-bit targets. Can be re-enabled per-device once broader watchOS test coverage confirms stability. Gated behind cfg!(target_pointer_width = "32") — no effect on 64-bit platforms.

CGFloat is f64 on LP64 but f32 on ILP32 (arm64_32, used by watchOS). CGSize::new() expects CGFloat, so use `as _` to let the compiler infer the correct type instead of hardcoding `as f64`.

matthargett · 2026-04-11T02:10:03Z

The CI failure looks unrelated:
error: failed to load source for dependency libtest-mimic

Caused by:
Unable to update https://github.com/cwfitzgerald/libtest-mimic.git?rev=9979b3c

Caused by:
revspec '9979b3c' not found; class=Reference (4); code=NotFound (-3)

inner-daemons · 2026-04-11T03:46:26Z

Related article since I was interested in this:
https://www.phoronix.com/news/GCC-May-Deprecate-ARM64-ILP32

At the end of that it mentions that GCC later deprecated this.

Also, it looks like this is an architecture used exclusively for Apple Watches, where the registers are 64-bit but the pointers are 32-bit. I also think that newer Apple Watches moved away from this, since there is an aarch64 watchOS target triple, and that target is tier 2 whereas arm64_32 is tier 3.

matthargett · 2026-04-11T05:05:20Z

> Related article since I was interested in this: https://www.phoronix.com/news/GCC-May-Deprecate-ARM64-ILP32
>
> At the end of that it mentions that GCC later deprecated this.

I can understand why: the Apple ecosystem almost exclusively uses LLVM/clang/Swift, so it would make sense for GCC to drop the maintenance overhead.

> Also, it looks like this is an architecture used exclusively for Apple Watches, where the registers are 64-bit but the pointers are 32-bit. I also think that newer Apple Watches moved away from this, since there is an aarch64 watchOS target triple, and that target is tier 2 whereas arm64_32 is tier 3.

Yes, Apple Watch 9 (Ultra 2, SE 3) and newer are now on pure arm64, while Apple Watch 6/7/8/SE2 use the arm64_32 ABI and are still supported in watchOS 26. As I mentioned in the issue text, I was testing on Apple Watch SE2 (arm64_32 / A12) and Apple Watch 10 (arm64 / A15). That's still ~100 million devices, and some of the issues I've supplied patches for should affect any Apple 32-bit platform. I personally want my WebGPU-oriented app to reach more of the ~170M active devices, including users who have hand-me-down devices.

FWIW, the A12-derived GPU (and its Metal) in the arm64_32 watchOS 26 watches is surprisingly capable. Happy to include a demo video, if that's helpful.

inner-daemons · 2026-04-11T06:03:20Z

Oh, I'm not at all trying to argue that this is bad. Just throwing this out there. And yeah, I figured people were using clang anyway.

inner-daemons

LGTM, nothing too controversial here.

inner-daemons · 2026-04-13T18:39:45Z

wgpu-hal/src/metal/device.rs

+                // On arm64_32 (watchOS ILP32), the AGXMetalS4 driver (A13/S6 GPU)
+                // crashes in copyFromTexture:toBuffer: on Private textures — null
+                // deref at offset 0x50 in the driver's internal texture state. Use
+                // Shared storage which works correctly on Apple's unified memory
+                // architecture and matches what native Swift Metal code uses on
+                // these devices.
+                MTLStorageMode::Shared


This is an interesting bug, is it documented anywhere?

Not that I'm aware of. There are a few reasons the capabilities of these Apple Watch devices are so obscure:

- The feature/capability flags will say a feature isn't available, but the MSL will actually just work. Before I ported wgpu, I tested Metal exhaustively to figure out what actually works. This was kicked off by seeing the Memoji app on my child's Apple Watch 9 and realizing the functionality it implied.
- If you try to use some of these MSL features in the simulator, you'll get a hard abort or it just won't work. I'm guessing most people (very reasonably) give up.
- Some features, like ASTC HDR textures, work intermittently (sometimes showing magenta) on Apple Watch 6/SE2, but work fine on Apple Watch 9 and later. I haven't pinned down why, but again I'm assuming this would ward off most mildly curious app/3D developers.

This would probably make a good conference/meetup talk to share this hard-won knowledge. Happy to apply if anyone has suggestions for a good venue!

matthargett added 3 commits April 9, 2026 23:02

[metal] Fix CGFloat type for ILP32 (arm64_32)

20edfc1

CGFloat is f64 on LP64 but f32 on ILP32 (arm64_32, used by watchOS). CGSize::new() expects CGFloat, so use `as _` to let the compiler infer the correct type instead of hardcoding `as f64`.

matthargett mentioned this pull request Apr 11, 2026

Support arm64_32 (ILP32) target — watchOS on Apple Watch6/7/8/SE2 #9406

Open

Merge branch 'trunk' into fix/metal-arm64_32-ilp32

e4204a2

inner-daemons self-requested a review April 11, 2026 03:17

Merge branch 'trunk' into fix/metal-arm64_32-ilp32

5c66b20

inner-daemons approved these changes Apr 13, 2026


Merge branch 'trunk' into fix/metal-arm64_32-ilp32

665cf68

matthargett wants to merge 6 commits into gfx-rs:trunk from rebeckerspecialties:fix/metal-arm64_32-ilp32


2 participants
