enable optimizer in tests by alexpyattaev · Pull Request #5155 · anza-xyz/agave

alexpyattaev · 2025-03-05T17:34:40Z

Problem

Unit tests are very slow: optimization is not enabled for them at all, so we wait hours for CI to complete.
Unit tests do not catch optimizer-induced UB (as the optimizer is not enabled), so much of the unsound unsafe code gets through them.
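
As a generic illustration of the second point (this is not code from the PR), reading uninitialized memory is one kind of undefined behavior in Rust that an unoptimized build can appear to tolerate, while an optimizing compiler is free to assume it never happens. The sketch below shows the sound pattern and leaves the unsound variant as a comment; `encode_len` is a hypothetical helper invented for this example.

```rust
use std::mem::MaybeUninit;

/// Builds a 4-byte little-endian length header for `len`.
/// Hypothetical helper, for illustration only.
fn encode_len(len: u32) -> [u8; 4] {
    let mut buf = MaybeUninit::<[u8; 4]>::uninit();

    // Unsound variant (do NOT do this): calling `buf.assume_init()` at this
    // point, before any write, is undefined behavior. It may look fine in an
    // unoptimized test run and still misbehave once the optimizer is enabled.

    // Sound variant: fully initialize the buffer first.
    buf.write(len.to_le_bytes());

    // SAFETY: `buf` was fully initialized by the `write` call above.
    unsafe { buf.assume_init() }
}
```

Calling `encode_len(1)` yields `[1, 0, 0, 0]` at any optimization level, because the value is fully initialized before it is read.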

Summary of Changes

Enable the optimizer for unit tests.

alexpyattaev · 2025-03-05T18:15:57Z

alexpyattaev · 2025-03-10T19:10:30Z

Disabled optimizations for some crates until a more permanent solution can be found for the tests there. Now that #5212 has landed, accountsdb is good to go.

yihau

LGTM. However, I think we have some unresolved discussions 🤔 @ilya-bobyr do you still think we need to use a custom profile, or should we give this one a try? If things get worse, we can always revert.

steviez

Out of curiosity, do you have some numbers you can share for overall CI? No worries if not; I'm mostly curious, and faster is better (as long as we aren't compromising on flakiness).

alexpyattaev · 2025-03-11T18:56:44Z

To clarify about the speed improvements (this is for runtime, not compile time) you may refer to some timings I've collected below.

TL;DR - CPU bound tests become 4x faster, IO bound tests become 25% faster.

HOWEVER the actual time to complete all tests will not change much as it is dominated by the slowest test in any given set.

In the end this change will not cut CI time by 4x, but it will cut CPU usage in CI quite substantially, allowing us to run a larger volume of small tests in parallel.

As noted above, the compile time difference for a full rebuild of agave is 15 seconds (10% more); for an incremental build of one unit test the change is not perceptible.

time RUST_LOG="error" cargo nextest run -p solana-accounts-db
________________________optimized ________________________________
Executed in   30.38 secs    fish           external
   usr time   57.29 secs    0.00 micros   57.29 secs
   sys time   84.66 secs  895.00 micros   84.66 secs

______________________baseline__________________________________
Executed in   33.08 secs    fish           external
   usr time  255.77 secs    0.00 micros  255.77 secs
   sys time   89.46 secs  964.00 micros   89.46 secs

time RUST_LOG="error" cargo nextest run -p solana-core
_______________________optimized_________________________________
Executed in  411.47 secs    fish           external
   usr time   16.74 mins    0.00 micros   16.74 mins
   sys time    4.00 mins  906.00 micros    4.00 mins

_______________________baseline_________________________________
Executed in  411.63 secs    fish           external
   usr time   43.01 mins    0.00 micros   43.01 mins
   sys time    3.66 mins  896.00 micros    3.66 mins


time RUST_LOG="error" cargo nextest run -p solana-tpu-client-next
________________________optimized________________________________
Executed in    4.58 secs    fish           external
   usr time    1.09 secs  392.00 micros    1.09 secs
   sys time    0.71 secs  237.00 micros    0.71 secs

_____________________baseline___________________________________
Executed in    4.60 secs    fish           external
   usr time    1.25 secs  380.00 micros    1.25 secs
   sys time    0.82 secs  234.00 micros    0.82 secs
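
For reference, the ratios implied by these measurements can be worked out directly. The sketch below is an editorial aid rather than part of the PR: the `Timing` struct and the helper names are made up for illustration, and the numbers are copied from the output above, with minutes converted to seconds.

```rust
/// One `time` measurement: wall clock and user CPU time, in seconds.
#[derive(Debug, Clone, Copy)]
struct Timing {
    wall_secs: f64,
    usr_secs: f64,
}

/// Ratio baseline / optimized; values above 1.0 mean the optimized
/// build needed less time.
fn speedup(baseline: f64, optimized: f64) -> f64 {
    baseline / optimized
}

/// Returns (wall-clock speedup, user-CPU speedup) for one crate.
fn compare(baseline: Timing, optimized: Timing) -> (f64, f64) {
    (
        speedup(baseline.wall_secs, optimized.wall_secs),
        speedup(baseline.usr_secs, optimized.usr_secs),
    )
}

/// (crate, baseline, optimized) triples taken from the timings above.
fn measurements() -> Vec<(&'static str, Timing, Timing)> {
    vec![
        (
            "solana-accounts-db",
            Timing { wall_secs: 33.08, usr_secs: 255.77 },
            Timing { wall_secs: 30.38, usr_secs: 57.29 },
        ),
        (
            "solana-core",
            Timing { wall_secs: 411.63, usr_secs: 43.01 * 60.0 },
            Timing { wall_secs: 411.47, usr_secs: 16.74 * 60.0 },
        ),
        (
            "solana-tpu-client-next",
            Timing { wall_secs: 4.60, usr_secs: 1.25 },
            Timing { wall_secs: 4.58, usr_secs: 1.09 },
        ),
    ]
}
```

On these figures, user CPU time drops by roughly 4.5x for solana-accounts-db, 2.6x for solana-core and 1.15x for solana-tpu-client-next, while wall-clock time improves by at most about 9%.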

anza-team · 2025-03-11T18:57:46Z

😱 New commits were pushed while the automerge label was present.

illia-bobyr · 2025-03-11T19:22:07Z

Thank you for checking the incremental rebuild times.

I do not think our CI is running more than one job per build machine.
This is related to race conditions and time sensitivity in some tests.
This is my knowledge as of some time ago, when we actually tried to pack more jobs into our team's dedicated test cluster, and it didn't work.
Things might have changed since then.

I do support more efficient resource usage.
But just want to make sure this is still the right optimization.

The way I understand it right now is:

Wall clock for the CI execution will stay the same.
We are not using build agents for more than one job at a time, so the CPU savings will not be obtainable.
(But, it would be nice to double-check this).
You also said that we may potentially degrade the interactive debugging experience.
(Though, I do not know the extent of this degradation at the lower optimization levels).

Do you think it is still the right change to make?

alexpyattaev · 2025-03-11T19:46:12Z

> I do not think our CI is running more than one job per build machine. This is related to race conditions and time sensitivity in some tests. This is my knowledge as of some time ago, when we actually tried to pack more jobs into our team dedicated test cluster, and it didn't work. Things might have changed since then.

If CI is running tests sequentially, they will go 4x faster. If CI runs tests in parallel, the wall clock improvements will be marginal. From what I have seen so far, CI is running most tests in parallel.
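
The sequential-versus-parallel argument can be sketched with a toy model. Everything in it is an assumption made for illustration: the `TestCost` type, the split into CPU time and waiting time, the sample durations, and the idea that every parallel test gets its own core. None of it is measured from agave's CI.

```rust
/// Hypothetical per-test cost: time that shrinks with optimization (CPU work)
/// plus time that does not (sleeps, I/O waits).
#[derive(Debug, Clone, Copy)]
struct TestCost {
    cpu_secs: f64,
    wait_secs: f64,
}

impl TestCost {
    /// Duration of the test when CPU work runs `cpu_speedup` times faster.
    fn duration(&self, cpu_speedup: f64) -> f64 {
        self.cpu_secs / cpu_speedup + self.wait_secs
    }
}

/// Tests run one after another: wall clock is the sum of all durations.
fn sequential_wall(tests: &[TestCost], cpu_speedup: f64) -> f64 {
    tests.iter().map(|t| t.duration(cpu_speedup)).sum()
}

/// Tests run fully in parallel (assumes one core per test):
/// wall clock is the duration of the slowest test.
fn parallel_wall(tests: &[TestCost], cpu_speedup: f64) -> f64 {
    tests
        .iter()
        .map(|t| t.duration(cpu_speedup))
        .fold(0.0, f64::max)
}

/// Total CPU seconds consumed, regardless of how tests are scheduled.
fn cpu_total(tests: &[TestCost], cpu_speedup: f64) -> f64 {
    tests.iter().map(|t| t.cpu_secs / cpu_speedup).sum()
}

/// Made-up suite: two CPU-bound tests and one slow test that mostly waits.
fn sample_suite() -> Vec<TestCost> {
    vec![
        TestCost { cpu_secs: 8.0, wait_secs: 0.0 },
        TestCost { cpu_secs: 4.0, wait_secs: 0.0 },
        TestCost { cpu_secs: 1.0, wait_secs: 29.0 },
    ]
}
```

In this made-up suite, a 4x CPU speedup takes sequential wall clock from 42 s to 32.25 s and total CPU time from 13 s to 3.25 s, but parallel wall clock only moves from 30 s to 29.25 s, because the slowest test spends most of its time waiting.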

> Do you think it is still the right change to make?

It is 100% worth it just for the sake of not allowing compiler-induced UB to pass the test suites. "You test what you ship" is a thing for a reason. Arguably, a separate CI profile should be made with the same build settings as --release.

illia-bobyr · 2025-03-12T01:19:43Z

I think having a separate CI profile is the best option, so that developers run local tests with low (or no) optimization while CI runs the tests faster and at a different optimization level.
I am still somewhat skeptical that any given optimization level is considerably more likely to catch UB than any other level.
But if we run the code at two different levels, we do have a chance to notice at least the deterministic things.

At the same time, adding a CI specific profile is probably more work than just changing the existing profile.
So I can understand the desire to merge a simpler change.
If you are going to explore an alternative profile approach - maybe we could try that instead of this PR.
But if you are not going to do it, maybe we could merge this change as is.

codecov-commenter · 2026-02-25T06:40:01Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.1%. Comparing base (a94c35e) to head (2fddaec).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #5155   +/-   ##
=======================================
  Coverage    83.1%    83.1%           
=======================================
  Files         837      837           
  Lines      316869   316869           
=======================================
+ Hits       263476   263507   +31     
+ Misses      53393    53362   -31

yihau

Thanks for reviving this improvement. Let's do it!

steviez

I assume @yihau will be happy with the changes, but maybe we give him one more chance to look. Otherwise, I think we've addressed all known issues and this will make CI match "production" a bit more closely.

I wouldn't be surprised if some tests get more flaky as a result of this. But I'm inclined to think that any issues like that would come from the test being inherently flaky.

One minor nit: can you update the PR description to reflect the change in direction? Namely, the PR now only impacts CI and no longer addresses "Rebuilds are slow: optimizations for procmacros are not enabled either" for the regular dev flow.

alexpyattaev · 2026-03-13T11:32:16Z

> I wouldn't be surprised if some tests get more flaky as a result of this. But, I'm inclined to think that any issues like that would be from the test being inherently flaky

That is sort of the point. If the optimizer exposes UB, we will have a chance to see it.

> One minor nit - can you update the PR description to reflect the change in direction. Namely, the PR is now only impacting CI and no longer doing "Rebuilds are slow: optimizations for procmacros are not enabled either" for regular dev flow

Done!

yihau

🔥

alexpyattaev mentioned this pull request Mar 5, 2025

enable optimizer in tests and procmacros anza-xyz/solana-sdk#67

Merged

Lichtso mentioned this pull request Mar 5, 2025

Fix - test_syscall_get_sysvar #5161

Merged

alexpyattaev force-pushed the enable_optimizer_in_tests branch from cebe934 to 3a892cc Compare March 7, 2025 20:26

alexpyattaev commented Mar 7, 2025

Comment thread accounts-db/src/append_vec.rs Outdated

alexpyattaev commented Mar 7, 2025

Comment thread accounts-db/src/append_vec.rs Outdated

illia-bobyr reviewed Mar 7, 2025

Comment thread Cargo.toml Outdated

illia-bobyr reviewed Mar 7, 2025

Comment thread accounts-db/src/append_vec.rs Outdated

alexpyattaev force-pushed the enable_optimizer_in_tests branch 2 times, most recently from 0e9cab7 to cd25b7a Compare March 10, 2025 14:58

apfitzge reviewed Mar 10, 2025

Comment thread accounts-db/src/append_vec.rs Outdated

Comment thread accounts-db/src/append_vec.rs Outdated

alexpyattaev force-pushed the enable_optimizer_in_tests branch from cd25b7a to fb1a433 Compare March 10, 2025 16:41

alexpyattaev marked this pull request as ready for review March 10, 2025 18:48

alexpyattaev requested review from apfitzge and illia-bobyr March 10, 2025 18:49

alexpyattaev force-pushed the enable_optimizer_in_tests branch from fb1a433 to f218da0 Compare March 10, 2025 20:05

alexpyattaev requested review from roryharr and steviez March 10, 2025 22:05

alexpyattaev added the automerge label (Merge this Pull Request automatically once CI passes) Mar 10, 2025

alexpyattaev requested review from KirillLykov and yihau March 11, 2025 07:29

yihau reviewed Mar 11, 2025

Comment thread gossip/src/cluster_info.rs Outdated

steviez reviewed Mar 11, 2025

Comment thread Cargo.toml Outdated

Comment thread gossip/src/cluster_info.rs Outdated

anza-team removed the automerge label Mar 11, 2025

alexpyattaev force-pushed the enable_optimizer_in_tests branch from 09ac822 to 98b9da4 Compare April 14, 2025 07:58

alexpyattaev force-pushed the enable_optimizer_in_tests branch 2 times, most recently from 55884a8 to 439165a Compare April 24, 2025 14:23

alexpyattaev force-pushed the enable_optimizer_in_tests branch from 439165a to 45b08f9 Compare August 8, 2025 20:23

roryharr removed their request for review October 30, 2025 23:08

github-actions Bot added the stale label Feb 12, 2026

alexpyattaev removed the stale label Feb 12, 2026

alexpyattaev force-pushed the enable_optimizer_in_tests branch from 7740d7b to 4d6a474 Compare February 12, 2026 13:04

alexpyattaev force-pushed the enable_optimizer_in_tests branch 2 times, most recently from a0c49fa to e7ae94e Compare February 25, 2026 06:09

alexpyattaev force-pushed the enable_optimizer_in_tests branch from e7ae94e to c5e258f Compare February 25, 2026 14:17

anza-xyz deleted a comment from github-actions Bot Feb 25, 2026

alexpyattaev requested a review from steviez February 25, 2026 15:51

alexpyattaev requested a review from yihau March 7, 2026 21:53

yihau previously approved these changes Mar 9, 2026

steviez reviewed Mar 9, 2026

Comment thread Cargo.toml Outdated

steviez reviewed Mar 9, 2026

Comment thread Cargo.toml Outdated

enable optimizer in tests

2fddaec

alexpyattaev dismissed yihau’s stale review via 2fddaec March 9, 2026 22:08

alexpyattaev force-pushed the enable_optimizer_in_tests branch from c5e258f to 2fddaec Compare March 9, 2026 22:08

alexpyattaev changed the title from "enable optimizer in tests and procmacros" to "enable optimizer in tests" Mar 9, 2026

alexpyattaev requested a review from steviez March 9, 2026 22:09

steviez approved these changes Mar 9, 2026

yihau approved these changes Mar 13, 2026

alexpyattaev added this pull request to the merge queue Mar 13, 2026

Merged via the queue into anza-xyz:master with commit a436ddd Mar 13, 2026
63 checks passed

alexpyattaev deleted the enable_optimizer_in_tests branch March 13, 2026 13:53

9 participants
