
optimize additive pubkey tweaking with vartime generator point multiplication (>80% speedup)#1843

Open
theStack wants to merge 4 commits intobitcoin-core:masterfrom
theStack:pubkey_tweak_add-boost

Conversation

@theStack
Contributor

@theStack theStack commented Apr 5, 2026

Additive public key tweaking (i.e. given a public key $P$ and a tweak scalar $t$, calculating $P' = P + t \cdot G$) is currently implemented using the "double multiply" algorithm secp256k1_ecmult ($R = na \cdot A + ng \cdot G$), with $na=1$ and $ng=t$:

secp256k1/src/ecmult.h

Lines 43 to 47 in 7262adb

/** Double multiply: R = na*A + ng*G
*
* Passing NULL as ng is equivalent to the zero scalar but a tiny bit faster.
*/
static void secp256k1_ecmult(secp256k1_gej *r, const secp256k1_gej *a, const secp256k1_scalar *na, const secp256k1_scalar *ng);

static int secp256k1_eckey_pubkey_tweak_add(secp256k1_ge *key, const secp256k1_scalar *tweak) {
    secp256k1_gej pt;
    secp256k1_gej_set_ge(&pt, key);
    secp256k1_ecmult(&pt, &pt, &secp256k1_scalar_one, tweak);

After some tinkering I found that this could be sped up quite a bit. To take full advantage of the fact that the tweak scalar is typically not considered sensitive data (as opposed to secret keys or nonces), this PR first introduces a variable-time generator point multiplication routine secp256k1_ecmult_gen_var, which is essentially a copy of secp256k1_ecmult_gen [1] with all the side-channel resistance mitigations stripped out. This function is significantly faster than its constant-time original in all COMB table size configurations: over 66% faster in all but the smallest 2KB setting (benchmarked via $ ./build/bin/bench_ecmult):

| ECMULT_GEN_KB | ecmult_gen | ecmult_gen_var | speedup |
|---|---|---|---|
| 2 | 11.6 us | 8.7 us | ~33.3% |
| 22 | 10.3 us | 6.18 us | ~66.6% |
| 86 (default) | 9.85 us | 5.86 us | ~68.1% |

Plugging that new routine into the internal function secp256k1_eckey_pubkey_tweak_add (which is easy, as the vartime variant is stateless and thus doesn't need a context object anymore) yields the following speedups for the API function secp256k1_ec_pubkey_tweak_add (benchmarked via $ ./build/bin/bench tweak on the second vs. third commit):

| ECMULT_GEN_KB | ec_pubkey_tweak_add (using ecmult) | ec_pubkey_tweak_add (using new ecmult_gen_var) | speedup |
|---|---|---|---|
| 2 | 16.3 us | 11.6 us | ~40.5% |
| 22 | 16.3 us | 8.98 us | ~81.5% |
| 86 (default) | 16.3 us | 8.65 us | ~88.4% |

Note that the improved code path (function secp256k1_eckey_pubkey_tweak_add) is also used in the following API functions, so they should all benefit from it:

- secp256k1_xonly_pubkey_tweak_add
- secp256k1_xonly_pubkey_tweak_add_check
- secp256k1_keypair_xonly_tweak_add
- secp256k1_musig_pubkey_ec_tweak_add
- secp256k1_musig_pubkey_xonly_tweak_add

Improving Silent Payments scanning performance was the original motivation to look into this (see also https://github.com/craigraw/bench_bip352/), and in fact the "common case" benchmarks show a 15-20% speedup if this branch is applied.

[1] I've also experimented with reintroducing the much simpler ECMULT_GEN_PREC_BITS precomputation table (which was used prior to SDMC in versions earlier than 0.5.0) and that performs even better with PREC_BITS=8: theStack@8f7fc93 (needing only 31 point additions rather than 42). However, given that it's unclear where to store this table data (having the choice of either bloating up the library size further or again needing some form of context, if it's calculated at runtime) I decided to stick with a solution that uses the already available table data first.

@sipa
Contributor

sipa commented Apr 6, 2026

Neat.

My guess would be that even bigger SDMC tables would be even better, because the cost of the constant-time table lookups disappears (each lookup needs to read through the entire table).

theStack added 4 commits April 8, 2026 18:28
Having this available as a global constant makes it possible to introduce
an alternative `ecmult_gen` function that doesn't need access to
a context, see next commit.

Note that the precomputed constant takes the name of the function
that previously generated it at run-time (`secp256k1_ecmult_gen_scalar_diff`),
while the function is now renamed to include the "compute" verb
(`secp256k1_ecmult_gen_compute_scalar_diff`), to match the naming
of the table generation function.
Add a faster variable-time variant for generator point multiplication.
This is essentially `ecmult_gen` without side-channel mitigations and
without requiring a context object. Intended for use cases where the
scalar does not represent sensitive data (e.g. pubkey tweaking).

On arm64, this is ~66% faster than the constant-time variant
(with the default build table size, i.e. ECMULT_GEN_KB=86):

$ ./build/bin/bench_ecmult
Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)
ecmult_gen                    ,     9.94      ,    10.0       ,    10.1
ecmult_gen_var                ,     5.96      ,     5.99      ,     6.02
.....
Rather than using the generic double multiply (with na=1), use the
newly introduced fast variable-time generator point multiplication
and add it up to the base public key manually.

On arm64, this improves the performance of the
`secp256k1_ec_pubkey_tweak_add` API function by about 80%:

----- Before (prior to this commit): -----
```
$ ./build/bin/bench tweak
Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

ec_pk_tweak_add               ,    16.1       ,    16.2       ,    16.6
```

----- After (this commit): -----
```
$ ./build/bin/bench tweak
Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

ec_pk_tweak_add               ,     8.94      ,     8.98      ,     9.29
```

Note that the following API functions also benefit from the improved code path:
- secp256k1_xonly_pubkey_tweak_add
- secp256k1_xonly_pubkey_tweak_add_check
  (this one is consensus-critical for P2TR script path spends, see BIP341)
- secp256k1_keypair_xonly_tweak_add
- secp256k1_musig_pubkey_ec_tweak_add
- secp256k1_musig_pubkey_xonly_tweak_add
@theStack
Contributor Author

theStack commented Apr 8, 2026

Added a preparatory commit that moves the ecmult_gen_scalar_diff constant precomputation from run-time to compile-time with the precompute_ecmult_gen binary (for the exhaustive tests though, it's generated at run-time, similar to how it's done with the table), and addressed #1843 (comment).

My guess would be that even bigger SDMC tables would be even better, because the cost of the constant-time table lookups disappears (each lookup needs to read through the entire table).

Ah, indeed! I've tried with table sizes up to ~18 MB, and at least on my arm64 machine, the ~5MB one (GEN_KB=5120) yields the best results (see theStack@aa1a4e7 and $ ./bench_gen_var_with_all_tables.sh):

| ECMULT_GEN_KB | ecmult_gen | ecmult_gen_var | ec_pubkey_tweak_add | speedup vs. unoptimized ec_pubkey_tweak_add (16.3 us) |
|---|---|---|---|---|
| 2 | 11.7 us | 8.81 us | 11.7 us | 39.3% |
| 22 | 10.5 us | 6.31 us | 9.05 us | 80.1% |
| 86 | 9.96 us | 5.94 us | 8.67 us | 88.0% |
| 148 | 11.4 us | 5.18 us | 7.91 us | 106.1% |
| 256 | 14.3 us | 4.46 us | 7.22 us | 125.8% |
| 464 | 21.1 us | 4.04 us | 6.85 us | 138.0% |
| 832 | 33.1 us | 3.66 us | 6.46 us | 152.3% |
| 1536 | 61.2 us | 3.46 us | 6.21 us | 162.5% |
| 2816 | 103.0 us | 3.13 us | 6.02 us | 170.8% |
| 5120 | 188.0 us | 2.85 us | 5.80 us | 181.0% |
| 9728 | 371.0 us | 2.95 us | 6.33 us | 157.5% |
| 18432 | 506.0 us | 4.02 us | 7.32 us | 122.7% |

So we would need to embed two different tables to get the most out of it (without degrading the signing/nonce generation speed). Not sure how performance-critical pubkey tweaking is in practice though; the only currently widespread scenario I could think of is verification of taproot commitments. For Silent Payments I still have to check how these larger table values would affect overall performance.

@w0xlt
Contributor

w0xlt commented Apr 9, 2026

Concept ACK

@sipa
Contributor

sipa commented Apr 10, 2026

@theStack Yeah, I don't think it's worth having two built-in tables just for this, but it's good to see my intuition confirmed.

Contributor

@w0xlt w0xlt left a comment


The code can be deduplicated.

diff
diff --git a/src/eckey_impl.h b/src/eckey_impl.h
index 9cee88d..566a318 100644
--- a/src/eckey_impl.h
+++ b/src/eckey_impl.h
@@ -61,6 +61,7 @@ static int secp256k1_eckey_privkey_tweak_add(secp256k1_scalar *key, const secp25
 
 static int secp256k1_eckey_pubkey_tweak_add(secp256k1_ge *key, const secp256k1_scalar *tweak) {
     secp256k1_gej pt;
+    /* `tweak` is public here, so the variable-time generator multiplication is safe. */
     secp256k1_ecmult_gen_var(&pt, tweak);
     secp256k1_gej_add_ge_var(&pt, &pt, key, NULL);
 
diff --git a/src/ecmult_gen.h b/src/ecmult_gen.h
index 74942ed..567cc48 100644
--- a/src/ecmult_gen.h
+++ b/src/ecmult_gen.h
@@ -138,6 +138,10 @@ static void secp256k1_ecmult_gen_context_clear(secp256k1_ecmult_gen_context* ctx
 
 /** Multiply with the generator: R = a*G */
 static void secp256k1_ecmult_gen(const secp256k1_ecmult_gen_context* ctx, secp256k1_gej *r, const secp256k1_scalar *a);
+/** Multiply with the generator: R = a*G.
+ *
+ * Variable-time implementation. Only call this for public/non-secret scalars.
+ */
 static void secp256k1_ecmult_gen_var(secp256k1_gej *r, const secp256k1_scalar *a);
 
 static void secp256k1_ecmult_gen_blind(secp256k1_ecmult_gen_context *ctx, const secp256k1_hash_ctx *hash_ctx, const unsigned char *seed32);
diff --git a/src/ecmult_gen_impl.h b/src/ecmult_gen_impl.h
index 8e4209f..3c4460e 100644
--- a/src/ecmult_gen_impl.h
+++ b/src/ecmult_gen_impl.h
@@ -30,17 +30,89 @@ static void secp256k1_ecmult_gen_context_clear(secp256k1_ecmult_gen_context *ctx
     secp256k1_fe_clear(&ctx->proj_blind);
 }
 
+/* Convert a scalar to a zero-padded array of 32-bit words. Only the first 8 words
+ * can contain non-zero data, but the padding avoids out-of-bounds reads from the
+ * scalar when COMB_BITS > 256. */
+static void secp256k1_ecmult_gen_scalar_to_recoded(uint32_t recoded[(COMB_BITS + 31) >> 5], const secp256k1_scalar *s) {
+    int i;
+
+    memset(recoded, 0, ((COMB_BITS + 31) >> 5) * sizeof(*recoded));
+    for (i = 0; i < 8 && i < ((COMB_BITS + 31) >> 5); ++i) {
+        recoded[i] = secp256k1_scalar_get_bits_limb32(s, 32 * i, 32);
+    }
+}
+
+/* Gather the mask(block)-selected bits of the recoded scalar into a packed value:
+ * bits[tooth] = d[(block*COMB_TEETH + tooth)*COMB_SPACING + comb_off].
+ *
+ * This constant-time variant mirrors the power side-channel hardening used by
+ * secp256k1_ecmult_gen. */
+static uint32_t secp256k1_ecmult_gen_lookup_bits(const uint32_t recoded[(COMB_BITS + 31) >> 5], uint32_t comb_off, uint32_t block) {
+    uint32_t bits = 0;
+    uint32_t bit_pos = comb_off + block * COMB_TEETH * COMB_SPACING;
+    uint32_t tooth;
+
+    /* Instead of reading individual bits here to construct the bits variable,
+     * build up the result by xoring rotated reads together. In every iteration,
+     * one additional bit is made correct, starting at the bottom. The bits
+     * above that contain junk. This reduces leakage by avoiding computations
+     * on variables that can have only a low number of possible values (e.g.,
+     * just two values when reading a single bit into a variable.) See:
+     * https://www.usenix.org/system/files/conference/usenixsecurity18/sec18-alam.pdf
+     */
+    for (tooth = 0; tooth < COMB_TEETH; ++tooth) {
+        /* Construct bitdata s.t. the bottom bit is the bit we'd like to read.
+         *
+         * We could just set bitdata = recoded[bit_pos >> 5] >> (bit_pos & 0x1f)
+         * but this would simply discard the bits that fall off at the bottom,
+         * and thus, for example, bitdata could still have only two values if we
+         * happen to shift by exactly 31 positions. We use a rotation instead,
+         * which ensures that bitdata doesn't lose entropy. This relies on the
+         * rotation being atomic, i.e., the compiler emitting an actual rot
+         * instruction. */
+        uint32_t bitdata = secp256k1_rotr32(recoded[bit_pos >> 5], bit_pos & 0x1f);
+
+        /* Clear the bit at position tooth, but sssh, don't tell clang. */
+        uint32_t volatile vmask = ~(1 << tooth);
+        bits &= vmask;
+
+        /* Write the bit into position tooth (and junk into higher bits). */
+        bits ^= bitdata << tooth;
+        bit_pos += COMB_SPACING;
+    }
+    return bits;
+}
+
+/* Same bit packing as secp256k1_ecmult_gen_lookup_bits(), but without the
+ * constant-time/power-analysis hardening as the scalar is public here. */
+static uint32_t secp256k1_ecmult_gen_lookup_bits_var(const uint32_t recoded[(COMB_BITS + 31) >> 5], uint32_t comb_off, uint32_t block) {
+    uint32_t bits = 0;
+    uint32_t bit_pos = comb_off + block * COMB_TEETH * COMB_SPACING;
+    uint32_t tooth;
+
+    for (tooth = 0; tooth < COMB_TEETH; ++tooth) {
+        uint32_t bit = (recoded[bit_pos >> 5] >> (bit_pos & 0x1f)) & 1;
+        bits |= bit << tooth;
+        bit_pos += COMB_SPACING;
+    }
+    return bits;
+}
+
+static void secp256k1_ecmult_gen_lookup_table_index(uint32_t bits, uint32_t *sign, uint32_t *abs) {
+    *sign = (bits >> (COMB_TEETH - 1)) & 1;
+    *abs = (bits ^ -*sign) & (COMB_POINTS - 1);
+    VERIFY_CHECK(*sign == 0 || *sign == 1);
+    VERIFY_CHECK(*abs < COMB_POINTS);
+}
+
 static void secp256k1_ecmult_gen(const secp256k1_ecmult_gen_context *ctx, secp256k1_gej *r, const secp256k1_scalar *gn) {
     uint32_t comb_off;
     secp256k1_ge add;
     secp256k1_fe neg;
     secp256k1_ge_storage adds;
     secp256k1_scalar d;
-    /* Array of uint32_t values large enough to store COMB_BITS bits. Only the bottom
-     * 8 are ever nonzero, but having the zero padding at the end if COMB_BITS>256
-     * avoids the need to deal with out-of-bounds reads from a scalar. */
-    uint32_t recoded[(COMB_BITS + 31) >> 5] = {0};
-    int first = 1, i;
+    uint32_t recoded[(COMB_BITS + 31) >> 5];
+    int first = 1;
 
     memset(&adds, 0, sizeof(adds));
 
@@ -88,9 +160,7 @@ static void secp256k1_ecmult_gen(const secp256k1_ecmult_gen_context *ctx, secp25
     /* Compute the scalar d = (gn + ctx->scalar_offset). */
     secp256k1_scalar_add(&d, &ctx->scalar_offset, gn);
     /* Convert to recoded array. */
-    for (i = 0; i < 8 && i < ((COMB_BITS + 31) >> 5); ++i) {
-        recoded[i] = secp256k1_scalar_get_bits_limb32(&d, 32 * i, 32);
-    }
+    secp256k1_ecmult_gen_scalar_to_recoded(recoded, &d);
     secp256k1_scalar_clear(&d);
 
     /* In secp256k1_ecmult_gen_prec_table we have precomputed sums of the
@@ -171,47 +241,11 @@ static void secp256k1_ecmult_gen(const secp256k1_ecmult_gen_context *ctx, secp25
     comb_off = COMB_SPACING - 1;
     while (1) {
         uint32_t block;
-        uint32_t bit_pos = comb_off;
         /* Inner loop: for each block, add table entries to the result. */
         for (block = 0; block < COMB_BLOCKS; ++block) {
-            /* Gather the mask(block)-selected bits of d into bits. They're packed:
-             * bits[tooth] = d[(block*COMB_TEETH + tooth)*COMB_SPACING + comb_off]. */
-            uint32_t bits = 0, sign, abs, index, tooth;
-            /* Instead of reading individual bits here to construct the bits variable,
-             * build up the result by xoring rotated reads together. In every iteration,
-             * one additional bit is made correct, starting at the bottom. The bits
-             * above that contain junk. This reduces leakage by avoiding computations
-             * on variables that can have only a low number of possible values (e.g.,
-             * just two values when reading a single bit into a variable.) See:
-             * https://www.usenix.org/system/files/conference/usenixsecurity18/sec18-alam.pdf
-             */
-            for (tooth = 0; tooth < COMB_TEETH; ++tooth) {
-                /* Construct bitdata s.t. the bottom bit is the bit we'd like to read.
-                 *
-                 * We could just set bitdata = recoded[bit_pos >> 5] >> (bit_pos & 0x1f)
-                 * but this would simply discard the bits that fall off at the bottom,
-                 * and thus, for example, bitdata could still have only two values if we
-                 * happen to shift by exactly 31 positions. We use a rotation instead,
-                 * which ensures that bitdata doesn't lose entropy. This relies on the
-                 * rotation being atomic, i.e., the compiler emitting an actual rot
-                 * instruction. */
-                uint32_t bitdata = secp256k1_rotr32(recoded[bit_pos >> 5], bit_pos & 0x1f);
-
-                /* Clear the bit at position tooth, but sssh, don't tell clang. */
-                uint32_t volatile vmask = ~(1 << tooth);
-                bits &= vmask;
-
-                /* Write the bit into position tooth (and junk into higher bits). */
-                bits ^= bitdata << tooth;
-                bit_pos += COMB_SPACING;
-            }
-
-            /* If the top bit of bits is 1, flip them all (corresponding to looking up
-             * the negated table value), and remember to negate the result in sign. */
-            sign = (bits >> (COMB_TEETH - 1)) & 1;
-            abs = (bits ^ -sign) & (COMB_POINTS - 1);
-            VERIFY_CHECK(sign == 0 || sign == 1);
-            VERIFY_CHECK(abs < COMB_POINTS);
+            uint32_t bits, sign, abs, index;
+            bits = secp256k1_ecmult_gen_lookup_bits(recoded, comb_off, block);
+            secp256k1_ecmult_gen_lookup_table_index(bits, &sign, &abs);
 
             /** This uses a conditional move to avoid any secret data in array indexes.
              *   _Any_ use of secret indexes has been demonstrated to result in timing
@@ -275,38 +309,22 @@ static void secp256k1_ecmult_gen_var(secp256k1_gej *r, const secp256k1_scalar *g
     uint32_t comb_off;
     secp256k1_ge add;
     secp256k1_scalar d;
-    uint32_t recoded[(COMB_BITS + 31) >> 5] = {0};
-    int i;
+    uint32_t recoded[(COMB_BITS + 31) >> 5];
 
     /* Adjust input scalar for difference and convert to recoded array. */
     secp256k1_scalar_add(&d, &secp256k1_ecmult_gen_scalar_diff, gn);
-    for (i = 0; i < 8 && i < ((COMB_BITS + 31) >> 5); ++i) {
-        recoded[i] = secp256k1_scalar_get_bits_limb32(&d, 32 * i, 32);
-    }
+    secp256k1_ecmult_gen_scalar_to_recoded(recoded, &d);
 
     /* Outer loop: iterate over comb_off from COMB_SPACING - 1 down to 0. */
     secp256k1_gej_set_infinity(r);
     comb_off = COMB_SPACING - 1;
     while (1) {
         uint32_t block;
-        uint32_t bit_pos = comb_off;
         /* Inner loop: for each block, add table entries to the result. */
         for (block = 0; block < COMB_BLOCKS; ++block) {
-            /* Gather the mask(block)-selected bits of d into bits. They're packed:
-             * bits[tooth] = d[(block*COMB_TEETH + tooth)*COMB_SPACING + comb_off]. */
-            uint32_t bits = 0, sign, abs, tooth;
-            for (tooth = 0; tooth < COMB_TEETH; ++tooth) {
-                uint32_t bit = (recoded[bit_pos >> 5] >> (bit_pos & 0x1f)) & 1;
-                bits |= bit << tooth;
-                bit_pos += COMB_SPACING;
-            }
-
-            /* If the top bit of bits is 1, flip them all (corresponding to looking up
-             * the negated table value), and remember to negate the result in sign. */
-            sign = (bits >> (COMB_TEETH - 1)) & 1;
-            abs = (bits ^ -sign) & (COMB_POINTS - 1);
-            VERIFY_CHECK(sign == 0 || sign == 1);
-            VERIFY_CHECK(abs < COMB_POINTS);
+            uint32_t bits, sign, abs;
+            bits = secp256k1_ecmult_gen_lookup_bits_var(recoded, comb_off, block);
+            secp256k1_ecmult_gen_lookup_table_index(bits, &sign, &abs);
 
             /* Perform lookup, negate if necessary and add to r. */
             secp256k1_ge_from_storage(&add, &secp256k1_ecmult_gen_prec_table[block][abs]);

Contributor

@w0xlt w0xlt left a comment


ACK 0ebb1aa mod above nit

@real-or-random
Contributor

to take full advantage of the fact that the tweak scalar is typically not considered sensitive data

The word "typically" is what worries me here. We don't really know what people use these functions for. What if they do have secret tweaks?

And isn't the tweak secret even in known applications, e.g., Taproot tweaking, say Q = P + tG? One security feature of Taproot is that it hides the Merkle root until a script-path spend. This only works if the tweak t is secret. If t is public, then the attacker can get P = Q - tG and search for a Merkle root m such that t = hash(P, m).

@theStack
Contributor Author

to take full advantage of the fact that the tweak scalar is typically not considered sensitive data

The word "typically" is what worries me here. We don't really know what people use these functions for. What if they do have secret tweaks?

Hm good point, the "if something sounds too good to be true, it probably is" proverb comes to my mind 😅
Fwiw, what made me believe that the change is fine (in the sense that it doesn't make things worse w.r.t. security compared to what we have now) is that public key tweaking currently isn't side-channel resistant either: variable-time scalar multiplication (ecmult) is used, and the tweak scalar isn't cleared out anywhere. Seems like it would be better to go in the other direction and use ecmult_gen then? (IIRC, I experimented with that a while ago, and it was still a few microseconds faster than the current code.)

And isn't the tweak secret even in known applications, e.g., Taproot tweaking, say Q = P + tG? One security feature of Taproot is that it hides the Merkle root until a script-path spend. This only works if the tweak t is secret. If t is public, then the attacker can get P = Q - tG and search for a Merkle root m such that t = hash(P, m).

Makes sense. For verifying taproot tweaks, i.e. secp256k1_xonly_pubkey_tweak_add_check, treating the tweak as public seems fine though, as that one is called after the merkle root has been revealed? Maybe it's even worth having two different eckey_pubkey_tweak_add variants internally, one secure and one fast? The former could be used for creating tweaks, the latter for verifying them, or for special use cases like Silent Payments.

@sipa
Contributor

sipa commented Apr 13, 2026

One security feature of Taproot is that it hides the Merkle root until a script-path spend.

Hmm, I would consider that a privacy feature, and not a security feature. Generally, taproot script trees will involve multiple distrusted participants, and the leaf/script structure will be shared with them. This doesn't work if the script tree is supposed to be secret from a security perspective.

And I think that we're generally not aiming to use constant time operations to protect privacy. Otherwise, it literally applies to everything. You may well want to keep your public keys private, so all functions involving public keys should be constant time?

It may make sense to have a mode/extension/module for "do everything constant time", but I don't think it aligns with how the library has operated so far.

sedited added a commit to bitcoin/bitcoin that referenced this pull request Apr 19, 2026
…t-path spends

fbffe8a bench: improve `VerifyNestedIfScript` benchmark precision (make stack clearing untimed) (David Gumberg)
616ee6f bench: add script verification benchmark for P2TR script-path spends (Sebastian Falbesoner)

Pull request description:

  Similarly as #34472 already did for key-path spends, this PR adds a benchmark for P2TR script-path spends. So far we don't have a benchmark on master yet that exercises the verification of taproot commitments ([`VerifyTaprootCommitment`](https://github.com/bitcoin/bitcoin/blob/141fbe4d530b51345e62dee1348e82d8a0406ffc/src/script/interpreter.cpp#L1903)).

  Note that the tapscript leaf intentionally includes a single OP_CHECKSIG as it likely reflects the real world best. Spending tapscript leaves without any signature checks doesn't make much sense (such spends could be trivially tampered with and thus stolen by miners), and doing more than one signature check seems the exception rather than the rule.

  The primary motivation for this PR is to evaluate how potential secp256k1 changes in pubkey tweaking (e.g. [#1843](bitcoin-core/secp256k1#1843)) may impact script verification performance.

ACKs for top commit:
  davidgumberg:
    reACK fbffe8a
  l0rinc:
    ACK fbffe8a
  sedited:
    ACK fbffe8a

Tree-SHA512: 6fa1f2c336d6332b4f2d22173279ee29ad3ec5e5431109913a6978fef32e22a34d3247729ac9092bfdbbcd8f9dfcad8da50bc4ede2cb7efc1f93d6a744ddf41b