Open
pkolaczk
commented
Mar 12, 2026
- CNDB-17012: Improve TrieMemoryIndex row count estimation
- CNDB-16394: Fix possible deadlock if flush fails
- CNDB-16394: Add tests for SAI that use a larger amount of random data
added 3 commits
March 10, 2026 15:26
Fixes a bug in TermsDistribution#toBigDecimal which didn't preserve order for some types and broke the assertion in TrieMemoryIndex#estimateNumRowsMatchingRange. Additionally, replaces the row count estimation algorithm with a better one:
- simpler
- more efficient (no need to search and iterate the trie)
- more accurate (especially for wider ranges)
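To illustrate what "preserving order" means for a term-to-number conversion, here is a minimal sketch. It is not the code from `TermsDistribution`, and the bug it shows (signed vs. unsigned interpretation of bytes) is only an assumed example, not necessarily the one fixed in this commit. The class name `OrderPreservingSketch` and both conversion methods are hypothetical.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.util.Arrays;

final class OrderPreservingSketch {
    /** Hypothetical order-breaking conversion: bytes are read as signed two's complement. */
    static BigDecimal signedConversion(byte[] term) {
        return new BigDecimal(new BigInteger(term));
    }

    /** Hypothetical order-preserving conversion (for equal-length inputs): bytes are read as an unsigned magnitude. */
    static BigDecimal unsignedConversion(byte[] term) {
        return new BigDecimal(new BigInteger(1, term));
    }

    public static void main(String[] args) {
        byte[] lower = {0x7F, 0x00};
        byte[] upper = {(byte) 0x80, 0x00};

        // Unsigned lexicographic comparison of the raw bytes: lower sorts before upper.
        System.out.println("bytes ordered:    " + (Arrays.compareUnsigned(lower, upper) < 0));
        // The signed conversion flips the order because 0x8000 becomes negative.
        System.out.println("signed ordered:   " + (signedConversion(lower).compareTo(signedConversion(upper)) < 0));
        // The unsigned conversion keeps the order of the raw bytes.
        System.out.println("unsigned ordered: " + (unsignedConversion(lower).compareTo(unsignedConversion(upper)) < 0));
    }
}
```

If range bounds are converted with a function like `signedConversion`, a lower bound can end up numerically above the upper bound, which is the kind of situation an assertion on the estimated range would catch.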
`JVMStabilityInspector.inspectThrowable(t)` can throw an exception. In that case the control flow never reaches `postFlush.latch.countDown()`, so threads waiting for the flush to finish are never released and never get a chance to propagate the exception up the call chain. The result is a lockup with no error logged anywhere: the system appears to have a flush pending, but no flush is actually running. This scenario has been observed in tests when the system hit the limit of open files. This commit moves `latch.countDown()` to a finally block so it can never be skipped.
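The pattern described above can be sketched as follows. This is a simplified illustration, not the actual flush code: `FlushLatchSketch`, `PostFlush`, `runFlush`, and the local `inspectThrowable` are hypothetical stand-ins, and the stand-in inspector always throws in order to reproduce the failure path.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class FlushLatchSketch {
    /** Hypothetical stand-in for the post-flush state; only the latch matters here. */
    static final class PostFlush {
        final CountDownLatch latch = new CountDownLatch(1);
    }

    /** Hypothetical stand-in for JVMStabilityInspector.inspectThrowable; here it always throws. */
    static void inspectThrowable(Throwable t) {
        throw new IllegalStateException("inspector failed while handling: " + t.getMessage(), t);
    }

    /** Runs the flush and releases the latch in finally, so waiters wake up on every path. */
    static void runFlush(Runnable flush, PostFlush postFlush) {
        try {
            flush.run();
        } catch (Throwable t) {
            inspectThrowable(t); // may throw; a countDown() placed after this call would be skipped
            throw t;
        } finally {
            postFlush.latch.countDown();
        }
    }

    /** Returns true if a waiter is released after a flush that fails. */
    static boolean waiterReleasedAfterFailedFlush() {
        PostFlush postFlush = new PostFlush();
        try {
            runFlush(() -> { throw new RuntimeException("too many open files"); }, postFlush);
        } catch (RuntimeException expected) {
            // The failure still propagates to the caller.
        }
        try {
            return postFlush.latch.await(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("waiter released: " + waiterReleasedAfterFailedFlush());
    }
}
```

With `countDown()` in the `finally` block, the waiter is released even though both the flush and the inspector throw. If the call sat after `inspectThrowable(t)` inside the `catch` block instead, `await` would time out in this sketch.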
Some issues cannot be found by testing SAI on the few manually inserted rows we have used until now. While adding this set of tests, we already found that:
- flushes may deadlock under specific failure scenarios
- SAI tests were not cleaning up their data, which caused the system to run out of file descriptors
- some query optimizer code was computing incorrect estimates for some query bounds (CNDB-17012)
Checklist before you submit for review
Author
The newly added tests passed after reducing row count to 1000: http://10.169.74.112:8081/job/ds-cassandra-build/2122/
❌ Build ds-cassandra-pr-gate/PR-2269 rejected by Butler
2 regressions found
Found 2 new test failures
Found 7 known test failures
Member
@pkolaczk Can you add a proper PR description and also explain the non-test changes? Can you also rebase and resolve the conflict in the non-test file?
Member
I think this PR needs to be split into separate PRs, e.g., CNDB-17012 and CNDB-16394 are not related to each other. I am not sure which of them gains from CNDB-16394.


