Optimize unsigned LEB128 decoding more by d-e-s-o · Pull Request #875 · gimli-rs/gimli

d-e-s-o · 2026-04-14T22:29:17Z

Optimize LEB128 decoding some more (after #795). Improvements are more significant as per my testing. Please refer to individual commits.

Override the default Reader::read_u8() implementation for EndianSlice with a direct split_first() call. The default implementation goes through a rather long call chain (read_u8 -> read_u8_array::<[u8;1]> -> read_slice(&mut [u8;1]) -> inherent read_slice(1) + copy_from_slice), which some compiler/flag combinations fail to inline. When that happens, each byte read in the LEB128 hot loop compiles to an indirect function call (with duplicated bounds check etc.).

On compilers that already inlined properly, this change should have no negative effect, but on those that didn't the improvement can be quite significant:

  leb128 unsigned small: -72%
  leb128 unsigned large: -77%
  leb128 u16 small: -66%
  parse .debug_info expressions: -44%
  evaluate .debug_info expressions: -10%

Signed-off-by: Daniel Müller <deso@posteo.net>
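To illustrate the difference between the two read paths described in this commit, here is a minimal sketch. `SliceReader`, `UnexpectedEof`, and `read_u8_via_slice` are hypothetical names invented for this example; they are not gimli's `EndianSlice` or `Reader` API, and the "generic" path only approximates the call chain named above.

```rust
/// Hypothetical stand-in for a slice-backed reader (not gimli's `EndianSlice`).
#[derive(Debug, Clone, Copy)]
pub struct SliceReader<'a> {
    bytes: &'a [u8],
}

/// Hypothetical error type for running out of input.
#[derive(Debug, PartialEq, Eq)]
pub struct UnexpectedEof;

impl<'a> SliceReader<'a> {
    pub fn new(bytes: &'a [u8]) -> Self {
        SliceReader { bytes }
    }

    /// Generic-style path: carve off a sub-slice, then copy it into a
    /// one-byte buffer. If the helpers involved are not inlined, this costs
    /// extra calls and length checks per byte.
    pub fn read_u8_via_slice(&mut self) -> Result<u8, UnexpectedEof> {
        let mut buf = [0u8; 1];
        if self.bytes.len() < buf.len() {
            return Err(UnexpectedEof);
        }
        let (head, rest) = self.bytes.split_at(buf.len());
        buf.copy_from_slice(head);
        self.bytes = rest;
        Ok(buf[0])
    }

    /// Direct path: a single `split_first()` call, so one emptiness check
    /// and no intermediate buffer.
    #[inline]
    pub fn read_u8(&mut self) -> Result<u8, UnexpectedEof> {
        let (&first, rest) = self.bytes.split_first().ok_or(UnexpectedEof)?;
        self.bytes = rest;
        Ok(first)
    }
}
```

Both methods return the same results for any input; the direct version simply gives the optimiser less to see through.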

In read::unsigned(), the shift == 63 overflow check ran on every iteration of the loop despite only being relevant on the 10th (and final) byte. Move it out by capping the loop with shift >= 63 and handling the last byte separately after the loop exits.

This has two effects: First, it removes a comparison and conditional branch from each iteration of the hot loop. Second -- and more impactful -- it gives the compiler a known upper bound on the iteration count, which can enable LLVM to fully unroll the loop.

I checked results on two systems, both showing positive outcomes:

System 1:
> leb128 unsigned small
>   time:   [459.40 ns 460.70 ns 462.23 ns]
>   change: [−4.9535% −4.4800% −4.0365%] (p = 0.00 < 0.05)
>   Performance has improved.
> leb128 unsigned large
>   time:   [104.40 ns 104.57 ns 104.82 ns]
>   change: [−15.018% −14.476% −13.628%] (p = 0.00 < 0.05)
>   Performance has improved.
> leb128 u16 small
>   time:   [461.00 ns 462.73 ns 464.39 ns]
>   change: [−22.716% −22.316% −21.947%] (p = 0.00 < 0.05)
>   Performance has improved.
> parse .debug_info expressions
>   time:   [63.729 µs 63.913 µs 64.141 µs]
>   change: [−0.8179% +0.5747% +1.9911%] (p = 0.43 > 0.05)
>   No change in performance detected.
> evaluate .debug_info expressions
>   time:   [517.71 µs 519.14 µs 520.82 µs]
>   change: [−1.5918% −0.8249% +0.0344%] (p = 0.04 < 0.05)

System 2:
> leb128 unsigned small
>   time:   [896.75 ns 902.94 ns 911.08 ns]
>   change: [−9.8227% −8.8646% −7.9089%] (p = 0.00 < 0.05)
>   Performance has improved.
> leb128 unsigned large
>   time:   [164.77 ns 166.96 ns 170.68 ns]
>   change: [−44.354% −43.307% −42.114%] (p = 0.00 < 0.05)
>   Performance has improved.
> leb128 u16 small
>   time:   [890.62 ns 898.04 ns 907.43 ns]
>   change: [−9.3899% −8.1392% −6.8999%] (p = 0.00 < 0.05)
>   Performance has improved.
> parse .debug_info expressions
>   time:   [128.24 µs 129.25 µs 130.30 µs]
>   change: [−13.582% −10.412% −7.0530%] (p = 0.00 < 0.05)
>   Performance has improved.
> evaluate .debug_info expressions
>   time:   [849.52 µs 854.60 µs 860.50 µs]
>   change: [−3.3028% +0.4156% +3.8783%] (p = 0.82 > 0.05)
>   No change in performance detected.

Signed-off-by: Daniel Müller <deso@posteo.net>
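The loop restructuring described in this commit can be sketched roughly as follows. This is a standalone illustration, not the code from the PR: `read_uleb128`, `DecodeError`, and the `&mut &[u8]` input type are assumptions made for the example, whereas gimli's actual function is generic over its `Reader` trait.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum DecodeError {
    UnexpectedEof,
    Overflow,
}

const CONTINUATION_BIT: u8 = 0x80;

/// Decodes an unsigned LEB128 value from the front of `input` and advances
/// `input` past the bytes that were consumed.
pub fn read_uleb128(input: &mut &[u8]) -> Result<u64, DecodeError> {
    let mut result = 0u64;
    let mut shift = 0u32;

    // Bytes 1 through 9 contribute 7 bits each (shift = 0, 7, ..., 56) and
    // cannot overflow a u64, so the loop body needs no overflow check and
    // has a fixed upper bound of nine iterations.
    while shift < 63 {
        let (&byte, rest) = input.split_first().ok_or(DecodeError::UnexpectedEof)?;
        *input = rest;
        result |= u64::from(byte & 0x7f) << shift;
        if byte & CONTINUATION_BIT == 0 {
            return Ok(result);
        }
        shift += 7;
    }

    // 10th byte: only bit 63 is still free, so the byte must be 0x00 or 0x01.
    // Anything else (including a set continuation bit) is an overflow.
    let (&byte, rest) = input.split_first().ok_or(DecodeError::UnexpectedEof)?;
    *input = rest;
    if byte > 0x01 {
        return Err(DecodeError::Overflow);
    }
    Ok(result | (u64::from(byte) << 63))
}
```

For example, the three bytes `0xe5 0x8e 0x26` decode to 624485, and nine `0xff` bytes followed by `0x01` decode to `u64::MAX`; replacing that final byte with `0x02` yields `DecodeError::Overflow`.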

philipc

Thanks! Yeah, those traits rely a lot on the optimiser; there might be more improvements to be found.

d-e-s-o added 2 commits April 14, 2026 11:47

philipc approved these changes Apr 15, 2026


philipc merged commit 3d74b4c into gimli-rs:main Apr 15, 2026
19 checks passed

d-e-s-o deleted the topic/optimize-leb128 branch April 15, 2026 00:41
