[Tile] Use unpacked vector field for Tile16x16/Tile32x32 register storage by hughperkins · Pull Request #722 · Genesis-Embodied-AI/quadrants

hughperkins · 2026-06-05T16:46:33Z

Summary

Replace the hand-rolled r0..rN-1: dtype field declarations and their matching cascades in Tile16x16 / Tile32x32 with a single

r: qd.types.vector(_TILE, dtype, unpacked=True)

field, accessed as self.r[k]. With Python-int / qd.static-resolved indices the unpacked vector still maps to one independent register slot per use, so the generated PTX/LLVM IR is unchanged, while the source shrinks substantially (net -870 lines).

Also drops the now-redundant private helpers (_get_col, _set_col, _r) and the _REGS field-name table. These were all _-prefixed and only used internally to the two tile modules.
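The shape of the change can be sketched in plain Python. This is an analogy only, not quadrants code: `TileNamedFields`, `TileVectorField`, and the tiny width `N = 4` are hypothetical names chosen for illustration, and a Python list stands in for the unpacked vector field.

```python
from dataclasses import dataclass, field

N = 4  # hypothetical tiny tile width; the PR itself uses 16 and 32


@dataclass
class TileNamedFields:
    """Old shape: one named field per slot, reached through if-cascades."""

    r0: float = 0.0
    r1: float = 0.0
    r2: float = 0.0
    r3: float = 0.0

    def get(self, k: int) -> float:
        # Every accessor repeats one branch per slot.
        if k == 0:
            return self.r0
        if k == 1:
            return self.r1
        if k == 2:
            return self.r2
        if k == 3:
            return self.r3
        raise IndexError(k)

    def set(self, k: int, val: float) -> None:
        if k == 0:
            self.r0 = val
        elif k == 1:
            self.r1 = val
        elif k == 2:
            self.r2 = val
        elif k == 3:
            self.r3 = val
        else:
            raise IndexError(k)


@dataclass
class TileVectorField:
    """New shape: a single indexed field replaces the named slots."""

    r: list = field(default_factory=lambda: [0.0] * N)

    def get(self, k: int) -> float:
        return self.r[k]

    def set(self, k: int, val: float) -> None:
        self.r[k] = val


old, new = TileNamedFields(), TileVectorField()
for k in range(N):
    old.set(k, k * 1.5)
    new.set(k, k * 1.5)

values_old = [old.get(k) for k in range(N)]
values_new = [new.get(k) for k in range(N)]
```

Both classes hold the same values; only the amount of accessor code differs, and the cascade version grows with the tile width while the indexed version does not.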

Test plan

pre-commit run -a (black, ruff, pylint): clean
pyright python/quadrants/lang/simt/_tile16.py python/quadrants/lang/simt/_tile32.py: 0 errors
python tests/run_tests.py -v -t1 test_tile on an RTX PRO 6000 cluster node: 732 passed, 182 skipped, 0 failed (~10 min); covers cuda+vulkan, f32+f64, ndarray+field for both tile sizes, including cholesky_, solve_triangular_, qd.outer(...) rank-1 updates, slice load/store, and the blocked-Cholesky demo
The 68 PURE.VIOLATION warnings on TILE / SIZE test globals are pre-existing and unrelated

Made with Cursor

…rage

Replace hand-rolled ``r0..rN-1: dtype`` field declarations and their matching ``if k == 0: self.r0 = val; ...`` cascades with a single ``r: qd.types.vector(_TILE, dtype, unpacked=True)`` field accessed via ``self.r[k]``. This shrinks the surface area significantly (net -870 lines) without changing the generated PTX/LLVM IR: with python-int / qd.static-resolved indices the unpacked field still maps to one register slot per use, matching what the explicit cascade produced.

Also removes the now-redundant private helpers ``_get_col``, ``_set_col``, ``_r`` and the ``_REGS`` field-name table.

github-actions · 2026-06-05T17:21:50Z

Total: 2 file(s) changed, +44 -878 code lines.

github-actions · 2026-06-05T18:05:54Z

Diff coverage: 5% · 44 lines, 42 missing

…ed on N

The two factory bodies were structurally identical except for ``_TILE = 16`` vs ``_TILE = 32``. Replace them with a single ``_make_tile_class(N, dtype)`` factory and a single ``_TileProxy(N)`` proxy class, then instantiate ``Tile16x16Proxy = _TileProxy(16)`` and ``Tile32x32Proxy = _TileProxy(32)``. Net diff for this commit: -343 lines. Same generated IR.

Updates the few internal consumers (``simt/__init__.py``, ``tile_slicing.py``, ``quadrants/__init__.py``, ``tests/python/test_tile.py``) and a couple of stale ``test_tile16`` references in the docs.
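A minimal plain-Python sketch of the factory-plus-proxy pattern described in this commit follows. Only the names ``_make_tile_class``, ``_TileProxy``, ``Tile16x16Proxy`` and ``Tile32x32Proxy`` come from the commit message. The bodies, the per-dtype cache, and the call signature of the proxy are assumptions made for illustration and may not match the real modules.

```python
def _make_tile_class(n: int, dtype: type) -> type:
    """Build a tile class whose slot count is fixed at n (assumed behaviour)."""

    class Tile:
        N = n
        DTYPE = dtype

        def __init__(self) -> None:
            # One slot per column; a list stands in for the unpacked vector.
            self.r = [dtype(0)] * n

    Tile.__name__ = f"Tile{n}x{n}"
    return Tile


class _TileProxy:
    """Hands out tiles of one size, building one class per dtype (assumed)."""

    def __init__(self, n: int) -> None:
        self._n = n
        self._classes: dict = {}  # hypothetical cache: dtype -> generated class

    def __call__(self, dtype: type = float):
        if dtype not in self._classes:
            self._classes[dtype] = _make_tile_class(self._n, dtype)
        return self._classes[dtype]()


# The two public proxies differ only in the size passed to the shared class.
Tile16x16Proxy = _TileProxy(16)
Tile32x32Proxy = _TileProxy(32)

t16 = Tile16x16Proxy(float)
t32 = Tile32x32Proxy(float)
```

With the size passed as a parameter, the 16 and 32 variants cannot drift apart, which is the duplication this commit removes.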

github-actions · 2026-06-05T20:44:15Z

Total: 7 file(s) changed, +309 -1364 code lines.

github-actions · 2026-06-05T21:39:20Z

Diff coverage: 59% · 258 lines, 106 missing

github-actions · 2026-06-08T13:24:33Z

Total: 7 file(s) changed, +309 -1364 code lines.

github-actions · 2026-06-08T14:17:13Z

Diff coverage: 59% · 258 lines, 106 missing

hughperkins · 2026-06-08T16:35:08Z

Benchmarks on genesis:

dex_hand regression seems concerning 🤔

hughperkins · 2026-06-08T16:45:35Z

tests pass at least:

…-vector

Co-authored-by: Cursor <cursoragent@cursor.com>

# Conflicts:
#	python/quadrants/lang/simt/_tile16.py
#	python/quadrants/lang/simt/_tile32.py
#	tests/python/test_tile.py

hughperkins · 2026-06-29T15:49:59Z

Genesis benchmarks:

github-actions · 2026-06-29T15:52:46Z

Total: 7 file(s) changed, +314 -1368 code lines.

github-actions · 2026-06-29T16:35:29Z

Diff coverage: 55% · 263 lines, 118 missing

…acked-vector + fusion refactor

1. ``_trsm`` had been changed from a runtime ``for c in range(_TILE)`` loop to a fully-unrolled ``qd.static(range(N))`` so that ``self.r[j]`` (an unpacked vector field) could be accessed with a python-int index. Forcing full unrolling spikes the live set: the resulting PTX for the blocked Cholesky kernel jumps from 653 to 894 .b32 registers (+37%) and the shuffle count from 174 to 304, producing a measurable ~9.4% slowdown on ``misc/demos/cholesky_blocked.py`` (N=92, 4096 envs) and (per genesis benchmarking) a 4-7% regression on contact-heavy Newton scenarios (box_pyramid, dex_hand, double_smplx).

   Restore the runtime ``range(N)`` outer/inner loops and introduce ``_get_col`` / ``_set_col`` helpers that emit explicit static-unrolled cascades over the unpacked vector slots -- functionally equivalent to the hand-rolled ``r0..rN-1`` cascade ``_tile16.py`` / ``_tile32.py`` used to carry, but driven by the new ``self.r[kk]`` access. Post-fix the PTX for the demo kernel is byte-identical to main (modulo the session nonce).

2. ``_resolve_vec2d`` / ``_resolve_vec3d`` had been seeded with ``v = dtype(0.0)``. This trips the identity-keyed type-construction path in the AST transformer when the kernel's ``dtype`` is a ``copy.deepcopy`` of a primitive (which is what ``qd.init(default_fp=...)`` produces). Swap to the identity-independent ``qd.cast(0.0, dtype)``, matching the pre-fusion fix (#738) that I lost during the merge. Restores ``test_vec_proxy_non_identity_dtype`` to passing.

Co-authored-by: Cursor <cursoragent@cursor.com>
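The failure mode in item 2 can be illustrated with a small plain-Python sketch. Nothing here is quadrants code: ``PrimitiveType``, ``construct_by_identity`` and ``cast_by_name`` are hypothetical stand-ins, assuming only what the commit message states, namely that one path recognises a dtype by object identity and the other does not.

```python
import copy


class PrimitiveType:
    """Hypothetical stand-in for a primitive dtype object."""

    def __init__(self, name: str) -> None:
        self.name = name


f32 = PrimitiveType("f32")

# Identity-keyed table: only the original object is recognised.
_BY_IDENTITY = {id(f32): float}
# Name-keyed table: any object describing the same type is recognised.
_BY_NAME = {"f32": float}


def construct_by_identity(dtype: PrimitiveType, value: float) -> float:
    """Analogue of ``dtype(0.0)`` going through an identity-keyed lookup."""
    ctor = _BY_IDENTITY.get(id(dtype))
    if ctor is None:
        raise TypeError(f"unrecognised dtype object for {dtype.name}")
    return ctor(value)


def cast_by_name(value: float, dtype: PrimitiveType) -> float:
    """Analogue of an identity-independent cast: looks at what dtype describes."""
    return _BY_NAME[dtype.name](value)


# A deep copy describes the same type but is a different object.
copied = copy.deepcopy(f32)

original_ok = construct_by_identity(f32, 0.0)
try:
    construct_by_identity(copied, 0.0)
    copy_failed = False
except TypeError:
    copy_failed = True
cast_ok = cast_by_name(0.0, copied)
```

In this sketch the identity-keyed path works for the original object and fails for its deep copy, while the name-keyed path accepts both, which mirrors why the commit swaps the seed expression for a cast.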

github-actions · 2026-06-29T17:21:16Z

Total: 7 file(s) changed, +329 -1368 code lines.

github-actions · 2026-06-29T18:16:54Z

Diff coverage: 58% · 276 lines, 115 missing

Merge branch 'main' into hp/tiles-use-unpacked-vector

fbde88e

Merge remote-tracking branch 'origin/main' into hp/tiles-use-unpacked…

a59a936
