This implements fast constant-time modular inversion.
Preliminary benchmarks, without Assembly
On BLS12-381, this is almost 8x faster than both Niels Möller's algorithm (the constant-time inversion used in GMP) and Fermat's Little Theorem inversion with addition chains.
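For reference, the Fermat baseline mentioned above computes a⁻¹ = a^(p-2) mod p. The following is a minimal sketch using plain square-and-multiply rather than a tuned addition chain. The function name `inv_fermat` is hypothetical, and Python integers are not constant-time, so this only illustrates the arithmetic.

```python
def inv_fermat(a, p):
    """Return a^(p-2) mod p, which equals 1/a mod p when p is prime.

    Illustrative only: left-to-right square-and-multiply, not an addition chain.
    Branching on the exponent bits is harmless because p - 2 is public.
    """
    result = 1
    for bit in bin(p - 2)[2:]:
        result = result * result % p      # square on every exponent bit
        if bit == "1":
            result = result * a % p       # multiply when the bit is set
    return result


if __name__ == "__main__":
    p = 2**255 - 19                       # the Ed25519 field prime
    a = 123456789
    assert a * inv_fermat(a, p) % p == 1
```

An addition chain reduces the number of multiplications for a fixed exponent, but the cost still grows with the bit length of p, which is why GCD-based inversion can be faster.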
Correctly and efficiently implementing Pornin's method for generic primes is actually tricky:
L22: (u, v) ← (uf₀ + vg₀ mod m, uf₁ + vg₁ mod m)
This requires efficient modular reduction, which is available for Generalized Mersenne Primes
like those of secp256k1 or Ed25519 but not for BLS12-381.
Given that Pornin's approach processes 31 divsteps per batch instead of the 62 of Bernstein-Yang (on 64-bit),
a slow reduction will have twice the impact.
an extra bit in the high word for negative integers, making it unsuitable for secp256k1 or P256
when using a saturated representation.
In particular, the inner loop needs to be as streamlined as possible. Pure Nim/C cannot guarantee that cmov is emitted, and lzcount is platform-dependent, which makes the inner loop slow.
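To illustrate what cmov buys in the inner loop, here is a minimal sketch of the mask-based select and swap that portable code has to fall back on. The names `ct_select`, `ct_swap` and `MASK64` are hypothetical, and Python itself gives no timing guarantees; the point is only the masking arithmetic that a compiler may or may not lower to a conditional move.

```python
MASK64 = (1 << 64) - 1  # emulate a 64-bit machine word


def ct_select(cond, x, y):
    """Return x if cond == 1 else y, without branching on cond."""
    mask = -cond & MASK64                  # all ones if cond == 1, zero if cond == 0
    return (x & mask) | (y & ~mask & MASK64)


def ct_swap(cond, x, y):
    """Return (y, x) if cond == 1 else (x, y), without branching on cond."""
    mask = -cond & MASK64
    t = (x ^ y) & mask                     # x ^ y when swapping, 0 otherwise
    return x ^ t, y ^ t


if __name__ == "__main__":
    assert ct_select(1, 5, 9) == 5
    assert ct_swap(1, 5, 9) == (9, 5)
```

Each select costs several logic operations per word instead of a single instruction, and that overhead is paid on every iteration of the inner loop.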
Regarding point 2, delayed/batched modular reduction alone can be done. However, Pornin's method relies on an approximation of the inputs that needs to be corrected at regular intervals and at the end of the computation. Given the edge cases that popped up in BLST, delaying modular reduction AND correcting the approximation AND doing all of that in constant time seems fraught with peril.
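The role of the update (u, v) ← (uf₀ + vg₀ mod m, uf₁ + vg₁ mod m) can be sketched as follows. This is a simplified illustration, not the implementation in this PR: it runs the inner loop on the full-width values instead of Pornin's approximations (so no correction step is needed), it is not constant-time, and the name `inv_mod_batched` is hypothetical. It assumes an odd modulus and Python 3.8+ for `pow` with a negative exponent.

```python
def inv_mod_batched(y, m, batch=31):
    """Return 1/y mod m for odd m with gcd(y, m) == 1 (illustrative only)."""
    assert m % 2 == 1 and 0 < y < m
    a, b, u, v = y, m, 1, 0
    k = m.bit_length()
    num_batches = -(-(2 * k - 1) // batch)     # ceil((2k - 1) / batch)
    for _ in range(num_batches):
        # Update factors for this batch; after i inner steps:
        #   a * 2^i = f0*a_start + g0*b_start,  b * 2^i = f1*a_start + g1*b_start
        f0, g0, f1, g1 = 1, 0, 0, 1
        for _ in range(batch):
            if a & 1:
                if a < b:
                    a, b = b, a
                    f0, g0, f1, g1 = f1, g1, f0, g0
                a -= b
                f0 -= f1
                g0 -= g1
            a >>= 1
            f1 <<= 1
            g1 <<= 1
        # One modular reduction per batch: this is the costly step for generic primes.
        u, v = (u * f0 + v * g0) % m, (u * f1 + v * g1) % m
    assert b == 1, "y is not invertible modulo m"
    # Each inner step introduced a factor of 2; remove them all at the end.
    return v * pow(2, -num_batches * batch, m) % m


if __name__ == "__main__":
    m = 2**255 - 19
    y = 123456789
    assert y * inv_mod_batched(y, m) % m == 1
```

With a batch size of 31 instead of 62, the reduction on the `u, v` line runs twice as often for the same modulus size, which is why a slow generic reduction hurts more here.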