Suballocate DX12 buffer creation by Elabajaba · Pull Request #3163 · gfx-rs/wgpu

Elabajaba · 2022-11-02T06:50:55Z

Checklist

~~[ ] Blocked on Migrate to Windows-rs from winapi #3207~~ Worked around by feature gating it behind the windows_rs feature
Run cargo clippy.
Add change to CHANGELOG.md. See simple instructions inside file.
Fix the buffer and texture usage flags.
Don't panic if deallocation fails when destroying a texture or resource.
- Is panic() on deallocation failure even a real issue?
- What should happen on deallocation failure, considering it can't (currently) return an error?
Figure out what to do with heap_created_not_zeroed.
Fix the error handling so the gpu-allocator -> wgpu error implementation is not just a bunch of todo!().
Consider waiting until "Add support for presser" (Traverse-Research/gpu-allocator#138) lands, and see whether migrating to presser is needed
- More safety!
- Might break a lot of user code...
- That user code is probably UB anyways?
Investigate if changing gpu-alloc to support dx12 is feasible, and would be accepted
- Thoughts on potential dx12 support? zakarumych/gpu-alloc#66
- imo probably not, as gpu-alloc seems tightly coupled to the vulkan way, and dx12's memory management is different enough from vulkan's to make it painful to try and fit into the vulkan way
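
The error-handling and deallocation items in the checklist above can be sketched roughly as follows. This is only an illustration under stated assumptions: `AllocError`, `DeviceError`, and `free_or_report` are hypothetical stand-ins, not the actual gpu-allocator or wgpu-hal types, and which device error a non-OOM failure should map to is a guess.

```rust
/// Hypothetical stand-in for the allocator's error type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AllocError {
    OutOfMemory,
    FailedToMap(String),
    Internal(String),
}

/// Hypothetical stand-in for the backend's device error type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DeviceError {
    OutOfMemory,
    Lost,
}

impl From<AllocError> for DeviceError {
    fn from(err: AllocError) -> Self {
        match err {
            AllocError::OutOfMemory => DeviceError::OutOfMemory,
            // Assumption: every other failure is treated as unrecoverable.
            AllocError::FailedToMap(_) | AllocError::Internal(_) => DeviceError::Lost,
        }
    }
}

/// Destruction cannot return an error, so report the failure and carry on
/// instead of panicking. Returns whether the free succeeded.
pub fn free_or_report(result: Result<(), AllocError>) -> bool {
    match result {
        Ok(()) => true,
        Err(err) => {
            eprintln!("failed to free suballocation: {err:?}");
            false
        }
    }
}
```

With a `From` impl in place, creation paths can use `?` to convert allocator errors, while destruction paths call something like `free_or_report` and continue.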

Connections
~~Blocked on #3207~~ Worked around by feature gating it behind the windows_rs feature
closes #2720

Description
DX12 is currently quite slow in wgpu. This PR uses gpu-allocator to batch allocations together into heaps, and creates buffers and textures with CreatePlacedResource instead of CreateCommittedResource. This leads to large performance gains: ~30-50% in "normal" scenarios, and significantly more in write_buffer-heavy scenarios (~250x+ in an unrealistic scenario that calls write_buffer 1000 times in a loop, going from ~1 fps to ~250 fps). In my testing there were no performance decreases.
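
The idea behind suballocation can be sketched without any D3D12 calls: create one large heap up front and place each resource at an aligned offset inside it, rather than asking the driver for a dedicated allocation per resource. The `Heap` and `Placement` types below are hypothetical and use a simple bump strategy. In the PR itself the bookkeeping is done by gpu-allocator and the placement by CreatePlacedResource. The 64 KiB constant mirrors D3D12's default resource placement alignment.

```rust
/// Default placement alignment for D3D12 resources (64 KiB).
const PLACEMENT_ALIGNMENT: u64 = 64 * 1024;

/// Hypothetical stand-in for one large heap that resources are placed into.
#[derive(Debug)]
pub struct Heap {
    size: u64,
    next_free: u64,
}

/// Where a resource lives inside the heap.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Placement {
    pub offset: u64,
    pub size: u64,
}

impl Heap {
    pub fn new(size: u64) -> Self {
        Heap { size, next_free: 0 }
    }

    /// Bump-allocates `size` bytes at the next aligned offset.
    /// Returns `None` when the heap is full; a real allocator would
    /// then create another heap (and also support freeing).
    pub fn place(&mut self, size: u64) -> Option<Placement> {
        let offset = align_up(self.next_free, PLACEMENT_ALIGNMENT);
        let end = offset.checked_add(size)?;
        if end > self.size {
            return None;
        }
        self.next_free = end;
        Some(Placement { offset, size })
    }
}

/// Rounds `value` up to the next multiple of `alignment`.
fn align_up(value: u64, alignment: u64) -> u64 {
    (value + alignment - 1) / alignment * alignment
}
```

Each `place` call here is plain arithmetic on an existing heap, which is where the savings over one driver allocation per resource come from.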

Testing
Tested the examples, ran cargo test, backported the change to 0.14 and tested it against bevy+bistro, and tested against a modified water example that loops the render write_buffer call 1000 times on the main thread, 500 times each on 2 scoped threads, or 100 times each on 10 scoped threads, to make sure multithreading wouldn't panic.

It was quite a bit faster in all of these scenarios, except for bevy+bistro at 4k, where it was heavily GPU-limited and ran about the same.
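
The shape of the multithreaded stress test described above might look like the sketch below. It is not the modified water example itself: `MockQueue` is a hypothetical stand-in that only counts calls, so the sketch shows how the writes are split across scoped threads and nothing about GPU behavior.

```rust
use std::sync::Mutex;
use std::thread;

/// Hypothetical stand-in for a queue; counts writes instead of touching a GPU.
pub struct MockQueue {
    writes: Mutex<usize>,
}

impl MockQueue {
    pub fn new() -> Self {
        MockQueue { writes: Mutex::new(0) }
    }

    /// Records one write; the data itself is ignored in this mock.
    pub fn write_buffer(&self, _data: &[u8]) {
        *self.writes.lock().unwrap() += 1;
    }

    pub fn writes(&self) -> usize {
        *self.writes.lock().unwrap()
    }
}

/// Calls `write_buffer` `per_thread` times on each of `threads` scoped threads.
/// `thread::scope` joins every thread before returning and propagates panics.
pub fn hammer(queue: &MockQueue, threads: usize, per_thread: usize) {
    let data = [0u8; 64];
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..per_thread {
                    queue.write_buffer(&data);
                }
            });
        }
    });
}
```

Calling `hammer(&queue, 2, 500)` or `hammer(&queue, 10, 100)` reproduces the two threaded splits mentioned above, each totalling 1000 writes.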

Potential Future Improvements

Make these into issues
Consider if adding a way to get the vulkan/dx12/etc allocator is worthwhile
Consider actually suballocating resources instead of just suballocating heaps. DX12 allows for multiple bindings to the same resource through subresources.
- This would probably require changing the unmap_buffer trait to pass in the subresource id for unmapping a buffer
Figure out why dx12 is still slower than Vulkan when calling write_buffer a lot
- Basic profiling seems to point to windows syscalls taking the majority of the time, with ntdllZwAllocateLocallyUniqueId taking almost 40% of the time in the 1000x looped write_buffer test
Most dx12 things are already thread safe; see if it's possible to avoid wrapping them in Rust sync primitives, to avoid multiple levels of locks and/or reference counting

Jasper-Bekkers · 2022-11-14T13:11:30Z

Nice to see these changes @Elabajaba, if there are any features that wgpu would like to see in our allocator please just file an issue on the repo.

Elabajaba · 2022-11-15T08:32:10Z

Blocked on #3207 until Mozilla gets around to vendoring windows-rs.

codecov-commenter · 2022-11-21T23:24:43Z

Codecov Report

Merging #3163 (719d26c) into master (052bd17) will increase coverage by 0.05%.
The diff coverage is 85.71%.

@@            Coverage Diff             @@
##           master    #3163      +/-   ##
==========================================
+ Coverage   64.30%   64.36%   +0.05%     
==========================================
  Files          83       85       +2     
  Lines       42270    42397     +127     
==========================================
+ Hits        27181    27287     +106     
- Misses      15089    15110      +21

| Impacted Files | Coverage Δ | |
|---|---|---|
| wgpu-hal/src/dx12/mod.rs | `27.27% <0.00%> (-0.09%)` | ⬇️ |
| wgpu-types/src/lib.rs | `88.30% <ø> (ø)` | |
| wgpu-types/src/assertions.rs | `50.00% <50.00%> (ø)` | |
| wgpu-hal/src/dx12/suballocation.rs | `85.49% <85.49%> (ø)` | |
| wgpu-hal/src/dx12/device.rs | `88.45% <97.14%> (+0.25%)` | ⬆️ |
| wgpu-hal/src/auxil/dxgi/result.rs | `65.38% <0.00%> (-11.54%)` | ⬇️ |
| wgpu-core/src/validation.rs | `58.87% <0.00%> (-0.14%)` | ⬇️ |
| wgpu-core/src/device/mod.rs | `66.72% <0.00%> (+0.04%)` | ⬆️ |

MarijnS95 · 2022-11-23T11:00:18Z

Just stumbling upon this PR, I'll see to accelerating Traverse-Research/gpu-allocator#138 so that you're unblocked in that regard!

cwfitzgerald · 2022-11-26T04:20:52Z

As #3207 is basically perma-blocked until further notice, I think we should work around the situation by having a feature flag and falling back to the old behavior when it's disabled. This will let us continue to innovate, and also not force the issue with moz.
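
A feature-gated fallback of this kind could be sketched as below. The `windows_rs` feature name comes from this PR, but `CreationPath` and `create_buffer` are hypothetical and only report which path would be taken. The real code would call CreatePlacedResource or CreateCommittedResource respectively.

```rust
/// Which path a resource would be created through (illustrative only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CreationPath {
    /// Fast path: suballocated from a shared heap.
    Placed,
    /// Slow path: one dedicated allocation per resource (the old behavior).
    Committed,
}

/// Compiled only when the `windows_rs` feature is enabled.
#[cfg(feature = "windows_rs")]
pub fn create_buffer(_size: u64) -> CreationPath {
    CreationPath::Placed
}

/// Fallback compiled when the feature is disabled.
#[cfg(not(feature = "windows_rs"))]
pub fn create_buffer(_size: u64) -> CreationPath {
    CreationPath::Committed
}
```

Because both functions share one signature, callers do not need to know which path was compiled in.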

cwfitzgerald

Thank you so much for all this work! Looks great!

Elabajaba added 6 commits November 1, 2022 02:47

Suballocate buffers

e9192d9

Error handling stub

a9d5189

Suballocate Textures

1868f26

cleanup

baea831

cargo.toml workspace

f9e44f2

Appease CI, cast ptr instead of as

34c42b4

Elabajaba changed the title ~~Temp dx12alloc~~ Suballocate DX12 buffer creation Nov 2, 2022

Elabajaba added 3 commits November 21, 2022 16:39

Merge remote-tracking branch 'origin/master' into temp-dx12alloc

ffce250

unsafe_op_in_unsafe_fn

029318c

dx12 handle gpu-allocator errors

a782dfc

Elabajaba added 2 commits November 26, 2022 02:42

Stick gpu-allocator behind a feature until gfx-rs#3207 lands

2ee1a41

gpu-allocator 0.21

c24ecf8

cwfitzgerald reviewed Nov 26, 2022

Comment thread wgpu-hal/src/dx12/device.rs Outdated

Elabajaba added 5 commits December 7, 2022 20:55

move gpu-allocator stuff into it's own file

6d3d10f

clippy

b574495

Merge remote-tracking branch 'origin/master' into temp-dx12alloc

89b13fc

Merge remote-tracking branch 'origin/master' into temp-dx12alloc

ef83d0f

cargo update so it builds

edb3b49

Elabajaba mentioned this pull request Dec 8, 2022

Update to ash 0.37.1 to replace deprecated function call #3273

Merged

3 tasks

Merge remote-tracking branch 'origin/master' into temp-dx12alloc

fdff34a

cwfitzgerald reviewed Dec 9, 2022

Elabajaba added 4 commits December 8, 2022 21:27

cleanup

2dace15

unwrap_unchecked when getting the allocator

757b841

changelog

079a343

Merge remote-tracking branch 'origin/master' into temp-dx12alloc

40e4912

can't use workspace inheritance :(

1afd89f

Elabajaba marked this pull request as ready for review December 19, 2022 08:04

cwfitzgerald requested changes Dec 20, 2022

Elabajaba added 6 commits December 19, 2022 20:34

Merge remote-tracking branch 'origin/master' into temp-dx12alloc

53b4309

Implement strict_assert for unwrap_unchecked

35c5830

split suballocation into 2 inline modules for readability

1481854

fmt

744bc0a

comments on why suballocation.rs exists, point out which is the fast and which is the slow path

10ef37c

default windows_rs enabled for wgpu-hal

719d26c

cwfitzgerald approved these changes Dec 20, 2022

cwfitzgerald merged commit f3c5091 into gfx-rs:master Dec 20, 2022

cwfitzgerald mentioned this pull request Dec 20, 2022

Fix up strict-assert usage #3320

Merged

2 tasks

xiaopengli89 mentioned this pull request Mar 1, 2023

Add suballocation feature #3544

Closed
