Narrow blanket SPIR-V legalization work in optimizer recipes #6612

Open
AnastaZIuk wants to merge 9 commits into KhronosGroup:main from Devsh-Graphics-Programming:unroll

Conversation

@AnastaZIuk AnastaZIuk commented Mar 20, 2026

Summary

  • keep the default optimizer entry points unchanged
  • add dedicated fast compile performance and legalization entry points for the opt-in -O1experimental profile
  • make legalization-time full loop unroll explicit instead of inheriting blanket generic behavior
  • make legalization-time SSA rewrite explicit and mode-driven instead of applying the old blanket cleanup to every module
  • preserve valid OpImageTexelPointer image operands in LocalSingleStoreElim
  • keep AliasedPointer and RestrictPointer only on legal replacement memory object declarations after scalar replacement
  • remove dead function-local pointer temporaries and their dead stores after the narrowed cleanup collapses dead uses
  • extend dead-store collection through OpCopyObject so the late cleanup can remove copied dead pointer chains as well
  • trim stale VariablePointers / VariablePointersStorageBuffer declarations only in the -O1experimental fast path, without changing the default trim path
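
The dead-store collection through OpCopyObject described in the last bullets can be illustrated with a minimal sketch. It uses a hypothetical, simplified instruction model (`Inst`, `Op`, and `CollectDeadPointerChains` are illustrative names, not the SPIRV-Tools IR classes or the pass added by this branch), and it assumes that definitions appear before uses and that pointers have no uses other than the ones modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified instruction model (not the SPIRV-Tools IR).
enum class Op { Variable, CopyObject, Store, Load };

struct Inst {
  Op op;
  uint32_t result;   // result id; 0 for Store
  uint32_t pointer;  // pointer operand; 0 for Variable
};

// Returns indices of instructions belonging to function-local variables that
// are written (directly or through OpCopyObject chains) but never loaded.
// Assumes definitions precede uses; any unmodeled pointer use would have to
// be treated as keeping the variable live.
std::vector<std::size_t> CollectDeadPointerChains(const std::vector<Inst>& body) {
  std::unordered_map<uint32_t, uint32_t> root;  // pointer id -> variable id
  std::unordered_set<uint32_t> loaded;          // variables with a load

  for (const Inst& inst : body) {
    if (inst.op == Op::Variable) {
      root[inst.result] = inst.result;
      continue;
    }
    assert(inst.pointer != 0 && "non-variable instructions need a pointer");
    auto it = root.find(inst.pointer);
    if (it == root.end()) continue;  // not a tracked local pointer
    if (inst.op == Op::CopyObject) root[inst.result] = it->second;
    if (inst.op == Op::Load) loaded.insert(it->second);
  }

  std::vector<std::size_t> dead;
  for (std::size_t n = 0; n < body.size(); ++n) {
    const Inst& inst = body[n];
    if (inst.op == Op::Load) continue;
    // Variables and copies are keyed by their result, stores by their pointer.
    const uint32_t id = (inst.op == Op::Store) ? inst.pointer : inst.result;
    auto it = root.find(id);
    if (it != root.end() && loaded.count(it->second) == 0) dead.push_back(n);
  }
  return dead;
}
```

In this model a variable that is only copied and stored to is reported together with its copy and its store, while a variable that is also loaded keeps all of its instructions.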

Why each change is needed

This branch is the optimizer side of the opt-in -O1experimental profile.

The goal is not to change the default optimizer behavior. The goal is to keep the default entry points intact and move the more aggressive recipe into explicit fast compile entry points.

That split needs three kinds of changes:

  1. A separate fast recipe.

    The generic default recipes should stay unchanged. The fast profile therefore gets its own performance and legalization entry points.

  2. Explicit control over the transforms that materially change IR shape.

    Legalization-time full loop unroll and legalization-time SSA rewrite are no longer implicit blanket work in the generic path. They are explicit parts of the fast path and are driven by narrow producer-side signals from the companion DXC branch.

  3. Minimal legality cleanup for the IR shapes that the fast path can expose.

    The fast path keeps the recipe win and the explicit unroll behavior, but it also needs a few narrow legality-oriented cleanups:

    • LocalSingleStoreElim can otherwise rewrite through copied image values and leave OpImageTexelPointer with a non-pointer image operand.
    • Scalar replacement can otherwise propagate AliasedPointer or RestrictPointer from a struct member onto replacements that are not legal memory object declarations for those decorations.
    • After the narrowed cleanup removes dead uses, dead function-local pointer temporaries can otherwise survive long enough to trip logical-addressing validation unless they are removed late together with their dead store chains.
    • The fast path can otherwise leave explicit VariablePointers / VariablePointersStorageBuffer declarations in the final module even after the final IR no longer contains the pointer forms that require them. A dedicated follow-up pass trims only those stale declarations and stays off the shared default trim path.

These are the only extra legality cleanups kept here. The branch does not restore the old blanket generic cleanup.
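
As a rough illustration of the split described above, the sketch below keeps a default recipe that takes no inputs and adds a separate opt-in entry point whose shape-changing passes appear only on request. Everything in it is an assumption made for illustration: the pass labels are placeholder strings, and `FastLegalizationSignals`, `DefaultLegalizationRecipe`, and `FastLegalizationRecipe` are hypothetical names, not the entry points added by this branch.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Placeholder pass labels only; not the pass lists registered by SPIRV-Tools.
using Recipe = std::vector<std::string>;

// Hypothetical producer-side signals; the field names are assumptions.
struct FastLegalizationSignals {
  bool needs_full_loop_unroll = false;
  bool needs_ssa_rewrite = false;
};

// Default entry point: no inputs, so its behavior cannot change.
Recipe DefaultLegalizationRecipe() {
  return {"inline", "blanket-cleanup", "loop-unroll", "dead-code-elim"};
}

// Separate opt-in entry point: transforms that change IR shape are added
// only when the producer explicitly asks for them.
Recipe FastLegalizationRecipe(const FastLegalizationSignals& signals) {
  Recipe recipe = {"inline"};
  if (signals.needs_ssa_rewrite) recipe.push_back("ssa-rewrite");
  if (signals.needs_full_loop_unroll) recipe.push_back("loop-unroll");
  recipe.push_back("dead-code-elim");
  assert(!recipe.empty());
  return recipe;
}

// Small helper used to inspect a recipe.
bool Contains(const Recipe& recipe, const std::string& pass) {
  return std::find(recipe.begin(), recipe.end(), pass) != recipe.end();
}
```

The design point is that the default function has no parameters at all, so callers of the existing entry points cannot be affected by the new signals.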

Spec basis

Loop control is only a hint in core SPIR-V:

Unroll - Performance hint. Strong request to unroll or unwind this loop.

DontUnroll - Performance hint. Strong request to keep this loop as a loop, without unrolling.

Spec:

Khronos guidance for offline transforms is aligned with that:

general loop unwinding or unrolling

should be avoided in off-line transforms of SPIR-V meant to be portable across devices.

Such controls should be respected by target devices.

Whitepaper:

The image-operand rule for OpImageTexelPointer is explicit:

Image must have a type of OpTypePointer with Type OpTypeImage.

Spec:

The aliasing decorations are also explicit:

RestrictPointer ... Apply only to a memory object declaration

AliasedPointer ... Apply only to a memory object declaration

Spec:

And the aliasing section narrows that further for physical-storage-buffer pointers:

For variables holding PhysicalStorageBuffer pointers, applying the AliasedPointer decoration on the OpVariable indicates that the PhysicalStorageBuffer pointers are potentially aliased.

Applying RestrictPointer is allowed, but has no effect.

Spec:
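
Read together with the scalar replacement change, the quoted rule amounts to a filter on which member decorations may be carried over to a replacement. The following is a minimal sketch under that reading only; `Replacement` and `DecorationsToPropagate` are hypothetical names, `Other` stands in for any decoration unrelated to aliasing, and the sketch does not model any further restrictions the specification places on these decorations.

```cpp
#include <cassert>
#include <vector>

// Only the two aliasing decorations matter here; Other is a stand-in.
enum class Decoration { AliasedPointer, RestrictPointer, Other };

// Hypothetical description of one replacement produced by scalar replacement.
struct Replacement {
  bool is_memory_object_declaration;
};

bool IsAliasingDecoration(Decoration d) {
  return d == Decoration::AliasedPointer || d == Decoration::RestrictPointer;
}

// Returns the member decorations that may be carried over to the target:
// aliasing decorations are dropped unless the target is a memory object
// declaration, everything else is passed through unchanged.
std::vector<Decoration> DecorationsToPropagate(
    const std::vector<Decoration>& member_decorations,
    const Replacement& target) {
  std::vector<Decoration> kept;
  for (Decoration d : member_decorations) {
    if (IsAliasingDecoration(d) && !target.is_memory_object_declaration) {
      continue;  // would be illegal on this replacement
    }
    kept.push_back(d);
  }
  assert(kept.size() <= member_decorations.size());
  return kept;
}
```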

The late dead-local cleanup is justified by the logical-pointer rules:

If neither the VariablePointers nor VariablePointersStorageBuffer capabilities are declared ... OpVariable must not allocate an object whose type is or contains a logical pointer type.

Spec:

Validation

  • fresh local CodeGenSPIRV validation on the companion DXC / SPIRV-Tools branch state passed with 1438 expected passes, 2 expected failures, and 0 unexpected failures
  • additional local end-to-end validation covered HLSL shaders with storage image arrays, image atomics, and PhysicalStorageBuffer pointer wrappers in a downstream consumer tree, including https://github.com/Devsh-Graphics-Programming/Nabla-Examples-and-Tests

Companion DXC PR:

CLAassistant commented Mar 20, 2026

CLA assistant check
All committers have signed the CLA.

@AnastaZIuk AnastaZIuk marked this pull request as ready for review March 20, 2026 18:18
@AnastaZIuk AnastaZIuk changed the title from "Narrow blanket SPIR-V loop unroll in optimizer recipes" to "Narrow blanket SPIR-V legalization work in optimizer recipes" Mar 20, 2026
Collaborator

@s-perron s-perron left a comment

I have responded on the corresponding DXC pr: microsoft/DirectXShaderCompiler#8283 (review).

Comment thread source/opt/optimizer.cpp
Comment on lines 133 to 134
// Make sure uses and definitions are in the same function.
.RegisterPass(CreateInlineExhaustivePass())

What's the purpose here?

Collaborator

Inlining enables many other optimizations. We do not implement interprocedural optimizations. If you are going to copy-propagate something that is written in one function and used in another, they have to be inlined.

Oh, I just hacked an NBL_REF_ARG by expanding vk::ext_reference on regular function inout parameters, so we have fewer copies, but yes, that makes sense.

The -O1experimental fast performance path can leave explicit
VariablePointers / VariablePointersStorageBuffer declarations in the final
module even after the final IR no longer contains the pointer forms that
require them.

In our EX37 sampler workload the resulting SPIR-V remained legal and the
failing shader contained only scalar OpSelect %float instructions, with no
pointer OpSelect or pointer OpPhi. Removing only the stale capability lines
fixed the downstream runtime corruption.

Keep the shared TrimCapabilitiesPass and the default optimizer paths
untouched by adding a dedicated TrimVariablePointersCapabilitiesPass and
invoking it only at the end of the fast performance recipe. Preserve real
Workgroup and StorageBuffer variable-pointer cases with focused tests.

VariablePointers is needed when the final module still uses pointer values as first-class SSA objects. A plain OpStore stores through a pointer. It does not by itself prove that the module still needs VariablePointers.

The SPIR-V OpStore definition distinguishes the pointer to store through from the object being stored. The variable pointer rules separately constrain cases where a pointer is the Object operand of OpStore or the result of OpLoad. An ordinary StorageBuffer store of a non-pointer object should therefore not keep VariablePointers alive on its own.

Stop treating every OpStore as a capability requirement and add a regression test for a normal StorageBuffer store that should trim the stale capability.

Spec references:
https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#OpStore
https://registry.khronos.org/SPIR-V/specs/unified1/SPIRV.html#VariablePointers
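
The distinction between storing through a pointer and storing a pointer can be sketched as a small predicate. This is an illustration under simplifying assumptions, not the logic of the pass itself: `Inst`, `UsesPointerAsValue`, and `StillNeedsVariablePointers` are hypothetical names, and the model covers only the instruction forms mentioned in this thread (OpStore, OpLoad, OpSelect, OpPhi), whereas the real capability rules cover more.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Simplified opcode set: only the forms discussed in this thread.
enum class Op { Store, Load, Select, Phi, Other };

// Hypothetical instruction summary; not the SPIRV-Tools IR.
struct Inst {
  Op op;
  bool result_is_pointer;  // result type is a pointer (Load, Select, Phi)
  bool object_is_pointer;  // Object operand is a pointer (Store only)
};

// True if the instruction treats a pointer as a first-class value.
// A store through a pointer of a non-pointer object does not count.
bool UsesPointerAsValue(const Inst& inst) {
  switch (inst.op) {
    case Op::Store:
      return inst.object_is_pointer;
    case Op::Load:
    case Op::Select:
    case Op::Phi:
      return inst.result_is_pointer;
    case Op::Other:
      return false;
  }
  return false;
}

// The capability is considered stale when no instruction needs it.
bool StillNeedsVariablePointers(const std::vector<Inst>& module) {
  return std::any_of(module.begin(), module.end(), UsesPointerAsValue);
}
```

Under this model a module containing only an ordinary store and a scalar OpSelect would have the capability trimmed, while a single pointer-typed OpSelect or a store whose Object is a pointer would keep it.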

4 participants