feat: allow indefinite-row functions with selection functions#35185
feat: allow indefinite-row functions with selection functions#35185
Conversation
There was a problem hiding this comment.
Code Review
This pull request enables the coexistence of indefinite-row functions with selection functions (such as first, last, min, and max) by introducing a hasNonSelectAggFuncs flag in SSelectStmt and implementing pass-through column logic in the planner. The changes span node definitions, cloning, JSON serialization, and various stages of the query planner and optimizer. Review feedback highlights several critical issues, including potential memory leaks when list append operations fail and a type mismatch in the optimizer where column nodes are incorrectly added to a list expected to contain only function nodes.
There was a problem hiding this comment.
Pull request overview
This PR updates the SQL parser + planner pipeline to allow indefinite-row functions (e.g., diff, csum, unique, tail, statecount, stateduration, etc.) to coexist with selection aggregate functions (e.g., first/last/min/max/mode/last_row) in the same SELECT, and expands tests to cover the newly-allowed combinations.
Changes:
- Loosen parser restrictions by introducing
hasNonSelectAggFuncsso selection aggs can be mixed with indefinite-row funcs while still blocking non-selection aggregates and multi-row functions. - Update planner/optimizer logic to pass through selection-agg outputs into IndefRows nodes and harden optimizer passes against non-function entries in
pFuncs. - Add/adjust tests so previously-erroring indef+select combinations now execute successfully.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
source/libs/parser/src/parTranslater.c |
Allows indef+selection agg, adds hasNonSelectAggFuncs, and adds rewriting so Agg can output column params used by indef funcs. |
source/libs/planner/src/planLogicCreater.c |
Allows IndefRows logic node creation even when selection aggs are present, and passes through Agg output columns. |
source/libs/planner/src/planOptimizer.c |
Avoids crashes when pFuncs contains non-function nodes (pass-through cols), and attempts to carry these through UNIQUE/Tail rewrites. |
source/libs/planner/src/planPhysiCreater.c |
Improves slot-id error logging for debugging. |
include/libs/nodes/querynodes.h |
Adds hasNonSelectAggFuncs to SSelectStmt. |
source/libs/nodes/src/nodesCloneFuncs.c |
Ensures hasNonSelectAggFuncs is copied during node cloning. |
source/libs/nodes/src/nodesCodeFuncs.c |
Serializes/deserializes hasNonSelectAggFuncs to/from JSON. |
test/cases/11-Functions/04-Timeseries/test_fun_ts_indef_with_select.py |
New test coverage for indef+selection combinations (plus some blocked cases). |
Various existing tests under test/cases/11-Functions/** |
Update expected behavior: indef+selection now allowed in several suites. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
4988c35 to
f128cb9
Compare
f128cb9 to
421c645
Compare
There was a problem hiding this comment.
Pull request overview
Enables indefinite-rows (vector) functions (e.g., diff, csum, unique, tail, statecount, stateduration, etc.) to coexist with selection aggregate functions (e.g., first/last/min/max/mode/last_row) by loosening parser restrictions, plumbing a new hasNonSelectAggFuncs flag through planning, and hardening optimizer rewrites to handle non-function pass-through nodes.
Changes:
- Relax parser validation to allow
indef-rows + select-aggwhile continuing to blockindef-rows + non-select agg(e.g.,count/sum/avg). - Update planner/optimizer to pass through select-agg output columns into IndefRows nodes and avoid crashes when
pFuncscontains non-function nodes. - Add new test coverage (golden-file + negative cases) and update existing tests to reflect the new allowed combinations.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
source/libs/parser/src/parTranslater.c |
Adjusts function co-existence rules; adds rewrite to wrap indef-func column params with _select_value() when a select-agg Agg node sits below IndefRows. |
source/libs/planner/src/planLogicCreater.c |
Allows IndefRows logic node creation when only select-agg funcs are present; adds pass-through columns into IndefRows outputs. |
source/libs/planner/src/planOptimizer.c |
Hardens tail/unique optimizer rewrites to tolerate non-function nodes in pFuncs and preserve pass-through outputs. |
source/libs/planner/src/planPhysiCreater.c |
Improves doSetSlotId error logging with additional column context. |
include/libs/nodes/querynodes.h |
Adds hasNonSelectAggFuncs to SSelectStmt. |
source/libs/nodes/src/nodesCloneFuncs.c |
Copies hasNonSelectAggFuncs during select statement cloning. |
source/libs/nodes/src/nodesCodeFuncs.c |
Serializes/deserializes hasNonSelectAggFuncs in JSON. |
test/cases/11-Functions/04-Timeseries/test_fun_ts_indef_with_select.py |
New comprehensive test for indef-rows + select-agg combinations plus negative cases. |
test/cases/11-Functions/04-Timeseries/in/test_fun_ts_indef_with_select.in |
Input SQL for golden-file comparison. |
test/cases/11-Functions/04-Timeseries/ans/test_fun_ts_indef_with_select.ans |
Expected output for golden-file comparison. |
test/cases/11-Functions/04-Timeseries/test_fun_ts_diff.py |
Updates diff test expectations: diff + min now allowed. |
test/cases/11-Functions/04-Timeseries/test_fun_ts_csum.py |
Updates csum test expectations: csum + min now allowed. |
test/cases/11-Functions/04-Timeseries/test_fun_ts_statecount.py |
Updates statecount expectations: statecount + max now allowed. |
test/cases/11-Functions/04-Timeseries/test_fun_ts_stateduration.py |
Updates stateduration expectations: stateduration + max now allowed. |
test/cases/11-Functions/03-Selection/test_fun_select_unique.py |
Updates unique expectations: unique + max now allowed. |
test/cases/11-Functions/03-Selection/test_fun_select_tail.py |
Updates tail expectations: tail + max now allowed. |
test/cases/11-Functions/01-Scalar/test_fun_sca_stateduration.py |
Updates scalar stateduration expectations: stateduration + max now allowed. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
421c645 to
2c95e1f
Compare
2c95e1f to
31e98ef
Compare
There was a problem hiding this comment.
Pull request overview
This PR relaxes the SQL parser/planner restriction that previously prevented indefinite-row functions (e.g., diff, csum, statecount, stateduration, unique, tail) from being used together with selection aggregate functions (e.g., first, last, min, max, mode, last_row). It introduces a new hasNonSelectAggFuncs flag to preserve the existing prohibition for non-selection aggregates and updates planner/optimizer logic to avoid crashes and correctly pass through required columns. It also adds broad test coverage for allowed/blocked combinations.
Changes:
- Parser: allow selection aggregates to coexist with indefinite-row functions; keep blocking non-selection aggregates and multi-row funcs.
- Planner/optimizer: introduce
hasNonSelectAggFuncs, fix pass-through column handling, and harden optimizer rewrites against non-function nodes inpFuncs. - Tests: update existing negative tests and add a comprehensive
.in/.ans+ runner for indefinite+selection combinations.
Reviewed changes
Copilot reviewed 17 out of 17 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| test/cases/11-Functions/04-Timeseries/test_fun_ts_stateduration.py | Updates expected errors: stateduration can now mix with selection aggs. |
| test/cases/11-Functions/04-Timeseries/test_fun_ts_statecount.py | Updates expected errors: statecount can now mix with selection aggs. |
| test/cases/11-Functions/04-Timeseries/test_fun_ts_diff.py | Updates expected errors: diff + selection agg now allowed; top still blocked. |
| test/cases/11-Functions/04-Timeseries/test_fun_ts_csum.py | Updates expected errors: csum + selection agg now allowed. |
| test/cases/11-Functions/04-Timeseries/test_fun_ts_indef_with_select.py | New test driver covering many positive/negative combinations. |
| test/cases/11-Functions/04-Timeseries/in/test_fun_ts_indef_with_select.in | New SQL input for result-comparison testing. |
| test/cases/11-Functions/04-Timeseries/ans/test_fun_ts_indef_with_select.ans | New golden output for result-comparison testing. |
| test/cases/11-Functions/03-Selection/test_fun_select_unique.py | Updates expected errors: unique can now mix with selection aggs. |
| test/cases/11-Functions/03-Selection/test_fun_select_tail.py | Updates expected errors: tail can now mix with selection aggs. |
| test/cases/11-Functions/01-Scalar/test_fun_sca_stateduration.py | Updates expected errors: scalar stateduration + selection aggs now allowed. |
| source/libs/parser/src/parTranslater.c | Implements the new mixing rules and adds rewrite pass to wrap needed inputs with _select_value(). |
| source/libs/planner/src/planLogicCreater.c | Allows IndefRows node creation with selection aggs and collects pass-through columns. |
| source/libs/planner/src/planOptimizer.c | Hardens tail/unique rewrites for non-function nodes; adds pass-through handling. |
| source/libs/planner/src/planPhysiCreater.c | Improves slot-not-found debug logging for column nodes. |
| source/libs/nodes/src/nodesCodeFuncs.c | JSON (de)serialization for hasNonSelectAggFuncs. |
| source/libs/nodes/src/nodesCloneFuncs.c | Clone support for hasNonSelectAggFuncs. |
| include/libs/nodes/querynodes.h | Adds hasNonSelectAggFuncs to SSelectStmt. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Remove parser restriction blocking indef-row funcs (diff, csum, etc.) from coexisting with selection funcs (first, last, min, max, etc.). Add hasNonSelectAggFuncs flag, fix planner pass-through columns, add nodeType safety checks and memory leak fixes in optimizer. Comprehensive .in/.ans test coverage with extended scenarios.
31e98ef to
8e5954f
Compare
facetosea
left a comment
There was a problem hiding this comment.
Re: #12 (pAggFuncs function-only concern in rewriteUniqueOptCreateAgg)
Fixed. Pass-through column nodes are now wrapped with _select_value() via new helper rewriteUniqueOptCreateSelectValueFunc() instead of being added as raw COLUMN nodes. This ensures pAgg->pAggFuncs always contains only SFunctionNode entries, safe for all downstream walkers (partTagsOpt, TSMA rewrite, splitCacheLast, etc.).
Commit: 8e5954f
There was a problem hiding this comment.
Pull request overview
This PR updates the SQL parser and planner to allow indefinite-row functions (e.g. diff, csum, statecount, stateduration, fill_forward, etc.) to coexist with selection aggregate functions (e.g. first, last, min, max, mode, last_row) while still blocking incompatible aggregate/multi-row combinations. It also adds targeted test coverage for the newly-allowed combinations.
Changes:
- Parser: relax “agg mixed with indef/multi-row” restriction to permit select-only agg funcs with indefinite-row funcs; introduce
hasNonSelectAggFuncsfor correct gating. - Planner/optimizer: adjust IndefRows creation to pass through Agg output columns; harden optimizer rewrites against non-function nodes in
pFuncs; extend SelectStmt clone/JSON support for the new flag. - Tests: update existing negative tests to reflect the new allowance and add a new dedicated
.in/.ans+ python test covering many combinations and edge cases.
Reviewed changes
Copilot reviewed 18 out of 18 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| test/cases/11-Functions/04-Timeseries/test_fun_ts_stateduration.py | Updates expected error/query behavior for stateduration mixed with select agg funcs. |
| test/cases/11-Functions/04-Timeseries/test_fun_ts_statecount.py | Updates expected error/query behavior for statecount mixed with select agg funcs. |
| test/cases/11-Functions/04-Timeseries/test_fun_ts_mavg.py | Removes prior “select func should error” expectation for mavg + min. |
| test/cases/11-Functions/04-Timeseries/test_fun_ts_diff.py | Makes diff + min a positive case; keeps multi-row select funcs (e.g. top) blocked. |
| test/cases/11-Functions/04-Timeseries/test_fun_ts_csum.py | Makes csum + min a positive case. |
| test/cases/11-Functions/04-Timeseries/test_fun_ts_indef_with_select.py | New test class covering allowed/blocked combinations and some shape checks. |
| test/cases/11-Functions/04-Timeseries/in/test_fun_ts_indef_with_select.in | New SQL input for compare-based test coverage of allowed combinations. |
| test/cases/11-Functions/04-Timeseries/ans/test_fun_ts_indef_with_select.ans | Expected outputs for the new compare-based test. |
| test/cases/11-Functions/03-Selection/test_fun_select_unique.py | Updates expected error/query behavior for unique mixed with select agg funcs. |
| test/cases/11-Functions/03-Selection/test_fun_select_tail.py | Updates expected error/query behavior for tail mixed with select agg funcs. |
| test/cases/11-Functions/01-Scalar/test_fun_sca_stateduration.py | Updates expected error/query behavior for scalar stateduration with select agg funcs. |
| source/libs/planner/src/planPhysiCreater.c | Extends slot-id error logging with column debug details. |
| source/libs/planner/src/planOptimizer.c | Avoids optimizer crashes with non-function nodes in pFuncs; adds _select_value wrapping in unique rewrite paths. |
| source/libs/planner/src/planLogicCreater.c | Allows IndefRows node creation with select agg funcs; adds pass-through Agg output columns into IndefRows. |
| source/libs/parser/src/parTranslater.c | Adds hasNonSelectAggFuncs; relaxes/updates function coexistence rules; rewrites indef params for select-agg coexistence. |
| source/libs/nodes/src/nodesCodeFuncs.c | Adds JSON serialization/deserialization for hasNonSelectAggFuncs. |
| source/libs/nodes/src/nodesCloneFuncs.c | Adds cloning support for hasNonSelectAggFuncs. |
| include/libs/nodes/querynodes.h | Adds hasNonSelectAggFuncs field to SSelectStmt. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (TSDB_CODE_SUCCESS != code) { | ||
| nodesDestroyNode(pNew); | ||
| } | ||
| } | ||
| } else { | ||
| code = rewriteTailOptCreateProjectExpr((SFunctionNode*)pFunc, &pNew); | ||
| if (TSDB_CODE_SUCCESS == code) { | ||
| code = nodesListMakeStrictAppend(&pProject->pProjections, pNew); | ||
| if (TSDB_CODE_SUCCESS != code) { | ||
| nodesDestroyNode(pNew); | ||
| } |
| if (TSDB_CODE_SUCCESS != code) { | ||
| nodesDestroyNode(pNew); | ||
| } | ||
| } | ||
| } else { | ||
| code = rewriteTailOptCreateProjectExpr((SFunctionNode*)pFunc, &pNew); | ||
| if (TSDB_CODE_SUCCESS == code) { | ||
| code = nodesListMakeStrictAppend(&pProject->pProjections, pNew); | ||
| if (TSDB_CODE_SUCCESS != code) { | ||
| nodesDestroyNode(pNew); | ||
| } |
| if (TSDB_CODE_SUCCESS != code) { | ||
| nodesDestroyNode(pNew); | ||
| } |
| if (TSDB_CODE_SUCCESS != code) { | ||
| nodesDestroyNode(pNew); | ||
| } |
| if (TSDB_CODE_SUCCESS != code) break; | ||
| code = nodesListMakeStrictAppend(&pPassThroughCols, pClone); | ||
| if (TSDB_CODE_SUCCESS != code) { | ||
| nodesDestroyNode(pClone); |
| if (TSDB_CODE_SUCCESS != code) break; | ||
| code = nodesListMakeStrictAppend(&pIdfRowsFunc->pFuncs, pClone); | ||
| if (TSDB_CODE_SUCCESS != code) { | ||
| nodesDestroyNode(pClone); |
Add hasMultiSelectAggFuncs/selectAggFuncNum to SSelectStmt to track when more than one selection-agg function (first, last, min, max, etc.) coexists with indefinite-row functions. This combination is ambiguous and now returns an error at parse time. Update .in/.ans tests and add negative test cases.
2a45d1d to
b6572e8
Compare
Remove parser restriction blocking indef-row funcs (diff, csum, etc.) from coexisting with selection funcs (first, last, min, max, etc.). Add hasNonSelectAggFuncs flag, fix planner pass-through columns, fix optimizer crashes with non-function nodes in pFuncs. Add comprehensive test coverage for all combinations.
Description
Issue(s)
Checklist
Please check the items in the checklist if applicable.