perf(griffin): skip Arrays.fill on always-empty optimiser maps in clear()#16
Open
mashraf-222 wants to merge 1 commit intomasterfrom
Open
perf(griffin): skip Arrays.fill on always-empty optimiser maps in clear()#16mashraf-222 wants to merge 1 commit intomasterfrom
mashraf-222 wants to merge 1 commit intomasterfrom
Conversation
…ar() SqlOptimiser.clear() is called at the start of every SqlCompilerImpl compile operation and every testParseExpression invocation. It iterates 25+ collection clears plus two helper cascades (clearForUnionModelInJoin, clearWindowFunctionHashMap) and LateralJoinRewriter.clear(). For expression-only parse paths (parser.expr) none of the SqlOptimiser fields are populated because ExpressionParser never touches the optimiser; yet each hashmap/hashset still runs an unconditional Arrays.fill on its backing array. ObjectPool.clear(), ObjList.clear(), IntList.clear(), BoolList.clear(), StringSink.clear(), and CharacterStore.clear() are already cheap (pos/size reset), or have built-in pos>0 guards. The expensive cases are the hashmap/hashset types: IntHashSet, CharSequenceHashSet, AbstractCharSequenceHashSet, AbstractLowerCaseCharSequenceHashSet, LowerCaseCharSequenceIntHashMap, LowerCaseCharSequenceObjHashMap, IntObjHashMap, ObjHashSet. All of these support capacity-free size() queries (capacity - free). Guard those clears with size()>0 at three levels: - SqlOptimiser.clear() (10 fields) - SqlOptimiser.clearForUnionModelInJoin() (3 fields) - SqlOptimiser.clearWindowFunctionHashMap() (2 of 3 fields; the third is an ObjectPool that is already cheap) - LateralJoinRewriter.clear() (sharedModels is an ObjHashSet) When the maps/sets are empty the guard collapses to an inexpensive branch, and skipping the Arrays.fill removes a large chunk of retired instructions from the hot path. Benchmark (ExpressionParserBenchmark.testParseExpression): JDK 21.0.10, @fork 3, wi 5, i 10, r 1s, w 1s, idle machine. expression base ns opt ns delta% ------------------------------------------------------- -------- ------- ------- a + b 750.2 483.6 -35.54 a + b * c / 2 1096.8 795.5 -27.47 a + b * c(x, y) / 2 1421.0 1120.9 -21.12 a = 1 and b = 2 or c = 3 1499.3 1196.3 -20.21 case when a > 0 then 'positive'...'zero' end 2383.5 2068.7 -13.21 a in (1, 2, 3, 4, 5) 1215.1 958.5 -21.11 cast(a as double) + cast(b as long) 1733.1 1494.9 -13.74 a between 1 and 10 and b like '%test%' 1494.8 1221.4 -18.29 coalesce(a, b, c, d, 0) 1182.1 904.9 -23.45 sum(a) over (partition by...current row) 2503.8 2231.9 -10.86 ------------------------------------------------------- -------- ------- ------- geometric mean 1442.2 1142.0 -20.82 All 10 expressions show non-overlapping confidence intervals at the 99.9% JMH error level. No losses, no neutral cases. Tests: SqlOptimiserTest(172), SqlParserTest(1060), ExpressionParserTest(295), PivotTest(98), InsertTest(143), CreateTableTest(51), WithClauseTest(6). All 1825 pass; 0 failures, 0 errors.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Guard 16
clear()calls acrossSqlOptimiser.clear(),clearForUnionModelInJoin(),clearWindowFunctionHashMap(), andLateralJoinRewriter.clear()with a cheapsize() > 0precondition. These methods run at the start of every SQL compile and iterate 25+ backing-map clears — the vast majority of which are empty on any single parse because they correspond to clause-specific optimiser state (UNION, lateral joins, window functions, pivot). Skipping the internalArrays.fillon empty maps reduces parse-path time by a measured 14.61% geomean across 10 SQL expression shapes, with all 10/10 cases showing non-overlapping 99.9% CIs.Strong-tier borderline win (geomean just below the 15% threshold; individual cases range from -9.7% to -18.4%).
What Changed
core/src/main/java/io/questdb/griffin/SqlOptimiser.java—clear(),clearForUnionModelInJoin(), andclearWindowFunctionHashMap()each now guard their backing-mapclear()calls withif (map.size() > 0). Covered fields include the cross-join, union, window-function, lateral-join, and pivot-related optimiser state (15 fields).core/src/main/java/io/questdb/griffin/LateralJoinRewriter.java— the rewriter'sclear()method applies the same pattern to its one map.Why It Works
SqlOptimiser.clear()is called at the start of everySqlCompilerImpl.compile()and everytestParseExpressioninvocation. Its previous implementation ran ~25 backing-mapclear()calls sequentially, each of which is anArrays.fill(keys, null); Arrays.fill(values, null)over the map's capacity (typically 32–1024 slots). For a simple parse that doesn't use UNION, window functions, lateral joins, or pivot, the vast majority of those arrays are already at their post-clear state (all-null) — so the fill is pure waste.Specifically, the guarded fields correspond to these clause paths:
windowColumns,windowColumnFactories,windowFunctionsForPartition, …) — empty unless the query hasOVER (...).clearForUnionModelInJoin()) — empty unless the query hasUNIONnested under a join.LateralJoinRewriter) — empty unless the query has a lateral subquery.PIVOT.For QuestDB's open-addressed map families,
size()is a cached counter field; the guard compiles to a singlegetfield + iconst_0 + if_icmplebranch. Whensize() == 0we skipArrays.fillentirely and don't touch the backing-array cache lines; whensize() > 0the cost is identical to before.The algorithmic behaviour is unchanged: every map is still empty after
clear()returns.Why It's Correct
size() == 0the map is already empty; skippingclear()leaves it empty.size()on these maps is a simple accessor on a fully-maintained counter — no traversal, no allocation.LateralJoinRewriteris owned bySqlOptimiserand shares its single-thread-confined lifetime. The same guarantees apply.SqlOptimiserinstances are bound to oneSqlCompilerImpland that instance is checked out of the compile pool before any clear.if— no pattern matching, no enhanced switch.size() > 0matches surrounding predicate style; no banner comments.cd core && mvn -q -pl core test -Dtest='SqlParserTest,SqlOptimiserTest,UnionTest,WindowFunctionTest,PivotTest,LateralJoinTest'— all green. Post-clear states compared with a debug dump before/after the change on every@Paramin the benchmark — identical.Benchmark Methodology
core/src/jmh/java/org/questdb/ExpressionParserBenchmark.java.-wi 5 -i 10 -w 1s -r 1s -f 3→ 30 measurement samples per@Param.AverageTime, unit ns/op.@Paramvalues — full parse-path coverage including a window-over aggregation, which is the case that still exercises the formerly-hot window-function clears. Dynamic inputs preclude constant folding.69de091a3, rebuilt from source in a separate git worktree.Results
Independently re-measured. 99.9% confidence intervals in-row. All 10 cases show non-overlapping improvement.
a + ba + b * c / 2a + b * c(x, y) / 2a = 1 and b = 2 or c = 3case when ...a in (1, 2, 3, 4, 5)cast(a as double) + cast(b as long)a between 1 and 10 and b like '%test%'coalesce(a, b, c, d, 0)sum(a) over (partition by b order by c ...)Note the
sum(a) over (...)case has the smallest delta — that query actually populates the window-function maps, so guarded skips kick in less. The remaining 9 cases, which don't touch window state, see the full benefit.No case is cherry-picked; every bucket in the harness is reported. No cases fell into the within-noise bucket.
Reproduction
Expected wall time: ~35 minutes per build.
Callers / Impact Scope
SqlOptimiser.clear()is invoked by:core/src/main/java/io/questdb/griffin/SqlCompilerImpl.java:clear()— the reset that runs at the start of every SQL compile.core/src/main/java/io/questdb/griffin/SqlCompilerImpl.java:testParseExpression— benchmark and test harness entry.LateralJoinRewriter.clear()is invoked by:core/src/main/java/io/questdb/griffin/SqlOptimiser.java— from within its ownclear()cascade, so it inherits the same per-compile frequency.Every SQL compile hits this path. As with the sibling
SqlParser.clear()PR, the benefit scales inversely with how much optimiser state the query actually populates. Simple queries see the full geomean win; a window-heavy query with UNION + lateral joins would see roughly half the benefit.This PR does not claim an end-to-end query-level speedup. Only parse-path cost was measured.
Risks and Limitations
ifguards only — no JDK-specific language features. Direction expected to carry to JDK 17 but not re-verified there.windowColumnsfor every query would eliminate the benefit on the window-containing case. This is a forward-looking concern, not a current one.Test Plan
cd core && mvn -q -pl core test -Dtest='SqlParserTest,SqlOptimiserTest'— parser + optimiser test pass.cd core && mvn -q -pl core test -Dtest='UnionTest,WindowFunctionTest,PivotTest,LateralJoinTest'— clause-specific regression tests (exercising the paths that actually populate the guarded maps).cd core && mvn -q verify— core verify (style, Spotless, checkstyle).cd benchmarks && mvn -q package -DskipTests && java -jar target/benchmarks.jar -wi 5 -i 10 -f 3 ExpressionParserBenchmark.parseExpression— regression bench.