feat(parser): [6659773700] relax window query SELECT list restrictions#35226
feat(parser): [6659773700] relax window query SELECT list restrictions#35226
Conversation
Allow raw columns, tags, tbname, scalar expressions and scalar functions in window queries when no aggregate functions present. - doCheckExprForGroupBy: skip columns and scan-pseudo-cols - translateSelectFrom: set hasIndefiniteRowsFunc for projection mode - HAVING/ORDER BY: support tag/pseudo-col NOT in SELECT list - child table: use _group_key for tag columns without PARTITION BY
Detect projection mode when nodesCollectFuncs returns only pseudo-cols. Clone pProjectionList into pWindow->pProjs, merge pFuncs, and collect HAVING columns into target list. - planLogicCreater: projection detection, pProjs build, HAVING support - planSpliter: fix multi-vgroup PARTITION BY + INTERVAL in projection
There was a problem hiding this comment.
Code Review
This pull request implements raw-column and scalar expression projections for window queries, refactoring the executor to support direct evaluation of these expressions and updating the planner to handle projection-only modes and subquery pruning. The review feedback identifies a critical logic error where unaliased functions and expressions are incorrectly treated as degenerate placeholders, potentially causing evaluation failures. Additionally, several instances of potential memory leaks were found in the planner where cloned nodes or lists are not properly cleaned up during error handling or early returns.
| bool hasNamedExpr = false; | ||
| SNode* pTmp = NULL; | ||
| FOREACH(pTmp, pSelect->pProjectionList) { | ||
| if (nodeType(pTmp) == QUERY_NODE_COLUMN || ((SExprNode*)pTmp)->aliasName[0] != '\0') { |
There was a problem hiding this comment.
The check for hasNamedExpr is too restrictive. It currently only considers raw columns or aliased expressions as "named". This will cause unaliased scalar functions (e.g., SELECT sin(current) FROM ...) or scalar expressions (e.g., SELECT current + 1 FROM ...) to be incorrectly treated as degenerate placeholders, leading to them being replaced by a minimal projection (primary key) and causing subsequent evaluation failures in parent nodes.
if (nodeType(pTmp) == QUERY_NODE_COLUMN || nodeType(pTmp) == QUERY_NODE_FUNCTION || nodeType(pTmp) == QUERY_NODE_EXPRESSION || ((SExprNode*)pTmp)->aliasName[0] != '\0') {| PLAN_ERR_JRET(nodesCloneNode(pFirstTarget, &pClone)); | ||
| PLAN_ERR_JRET(nodesListMakeStrictAppend(&pWindow->pProjs, pClone)); |
There was a problem hiding this comment.
Potential memory leak: if nodesListMakeStrictAppend fails, the cloned node pClone is not destroyed before returning.
PLAN_ERR_JRET(nodesCloneNode(pFirstTarget, &pClone));
int32_t code = nodesListMakeStrictAppend(&pWindow->pProjs, pClone);
if (code != TSDB_CODE_SUCCESS) {
nodesDestroyNode(pClone);
return code;
}| PLAN_ERR_JRET(nodesCloneNode(pFuncNode, &pClone)); | ||
| PLAN_ERR_JRET(nodesListMakeStrictAppend(&pWindow->pProjs, pClone)); |
There was a problem hiding this comment.
Potential memory leak: if nodesListMakeStrictAppend fails, the cloned node pClone is not destroyed before returning.
PLAN_ERR_JRET(nodesCloneNode(pFuncNode, &pClone));
int32_t code = nodesListMakeStrictAppend(&pWindow->pProjs, pClone);
if (code != TSDB_CODE_SUCCESS) {
nodesDestroyNode(pClone);
return code;
}| PLAN_ERR_JRET(nodesCollectColumnsFromNode(pSelect->pHaving, NULL, COLLECT_COL_TYPE_ALL, &pHavingCols)); | ||
| SNode* pColNode = NULL; | ||
| FOREACH(pColNode, pHavingCols) { | ||
| bool found = false; | ||
| SNode* pProjNode = NULL; | ||
| FOREACH(pProjNode, pWindow->pProjs) { | ||
| if (nodesEqualNode(pProjNode, pColNode)) { | ||
| found = true; | ||
| break; | ||
| } | ||
| } | ||
| if (!found) { | ||
| SNode* pClone = NULL; | ||
| PLAN_ERR_JRET(nodesCloneNode(pColNode, &pClone)); | ||
| PLAN_ERR_JRET(nodesListMakeStrictAppend(&pWindow->pProjs, pClone)); | ||
| } | ||
| } | ||
| nodesDestroyList(pHavingCols); |
| PLAN_ERR_JRET(nodesCloneNode(pColNode, &pClone)); | ||
| PLAN_ERR_JRET(nodesListMakeStrictAppend(&pWindow->pProjs, pClone)); |
There was a problem hiding this comment.
Potential memory leak: if nodesListMakeStrictAppend fails, the cloned node pClone is not destroyed before returning.
PLAN_ERR_JRET(nodesCloneNode(pColNode, &pClone));
int32_t code = nodesListMakeStrictAppend(&pWindow->pProjs, pClone);
if (code != TSDB_CODE_SUCCESS) {
nodesDestroyNode(pClone);
nodesDestroyList(pHavingCols);
return code;
}There was a problem hiding this comment.
Pull request overview
This PR relaxes the parser/planner/executor restrictions for window queries so that window queries without aggregate functions can project raw columns/tags/tbname/scalar expressions, and expands HAVING/ORDER BY handling to allow tag/pseudo-column references outside the SELECT list.
Changes:
- Add “projection-only” window-query planning/execution paths (parser flags, planner window targets/projections, executor indef-rows runtime support).
- Adjust stable window split behavior for projection-only interval/external windows.
- Add new CI pytest entry and a new window-projection test suite with in/ans cases.
Reviewed changes
Copilot reviewed 17 out of 20 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| test/ci/cases.task | Registers the new window projection pytest in CI task list |
| test/cases/query/test_window_projection.py | Adds new pytest suite covering projection-mode window queries (and regressions) |
| test/cases/query/in/test_window_projection_small.in | Adds small in/ans query set for projection-mode window queries |
| test/cases/query/in/test_window_projection_large.in | Adds large in/ans query set (expected-output file appears missing in this PR) |
| test/cases/query/in/test_window_projection_edge.in | Adds edge-case in/ans query set |
| test/cases/query/in/test_window_projection_largedata.in | Adds cross-block in/ans query set (expected-output file appears missing in this PR) |
| test/cases/query/in/test_window_projection_partition.in | Adds partition-focused in/ans query set (expected-output file appears missing in this PR) |
| test/cases/query/ans/test_window_projection_small.ans | Adds expected output for small in/ans set |
| test/cases/query/ans/test_window_projection_edge.ans | Adds expected output for edge-case in/ans set |
| source/libs/parser/src/parTranslater.c | Relaxes GROUP BY rewrite behavior for window projection mode; sets projection-mode flags and fill restrictions |
| source/libs/planner/src/planLogicCreater.c | Builds projection lists/targets for projection-mode windows; merges HAVING/ORDER BY needs into projections |
| source/libs/planner/src/planSpliter.c | Adjusts stable window split strategy for projection-mode interval/external windows |
| source/libs/executor/src/executorInt.c | Extends indef-rows runtime to support projection expressions; routes pseudo-col filling and projection evaluation |
| source/libs/executor/inc/executorInt.h | Updates indef-rows runtime API and runtime struct to include projection support |
| source/libs/executor/src/timewindowoperator.c | Passes projection list + function store into indef-rows runtime init |
| source/libs/executor/src/eventwindowoperator.c | Passes projection list + function store into indef-rows runtime init |
| source/libs/executor/src/countwindowoperator.c | Passes projection list + function store into indef-rows runtime init |
Comments suppressed due to low confidence (1)
source/libs/executor/src/executorInt.c:1343
- In initIndefRowsRuntime() error path, pRuntime->pPseudoColInfo is allocated by setRowTsColumnOutputInfo() but never freed before returning. This can leak memory when init fails; free/destroy pPseudoColInfo (and null it) in the _return cleanup block.
_return:
qError("%s failed at line %d since %s", __func__, lino, tstrerror(code));
cleanupExprSupp(&pRuntime->projSupp);
pRuntime->pReadyBlocks = tdListFree(pRuntime->pReadyBlocks);
tSimpleHashCleanup(pRuntime->pOpenStatesMap);
pRuntime->pOpenStatesMap = NULL;
return code;
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| sqlFile = os.path.join( | ||
| os.path.dirname(__file__), "in", | ||
| "test_window_projection_partition.in" | ||
| ) | ||
| ansFile = os.path.join( | ||
| os.path.dirname(__file__), "ans", | ||
| "test_window_projection_partition.ans" | ||
| ) | ||
| tdCom.compare_testcase_result( | ||
| sqlFile, ansFile, "test_window_projection_partition" | ||
| ) |
| select _wstart, ts, csum(current) from d1001 interval(3s) having _wstart >= '2024-01-01 00:00:09'; | ||
| select _wstart, ts, csum(current) from d1001 interval(3s) having current > 1.5; | ||
| select _wstart, ts, csum(current) from meters partition by tbname interval(3s) having location = 'Beijing' order by _wstart, ts; | ||
| select _wstart, ts, csum(current) from meters partition by tbname interval(3s) having location = 'Beijing' order by _wstart, ts; |
| select _wstart, ts, current from d1001 where current > 1.0 interval(3s) having current > 1.5; | ||
| select _wstart, ts, current from d1001 where current > 1.0 interval(3s) order by _wstart limit 5; |
| sqlFile = os.path.join( | ||
| os.path.dirname(__file__), "in", | ||
| "test_window_projection_large.in" | ||
| ) | ||
| ansFile = os.path.join( | ||
| os.path.dirname(__file__), "ans", | ||
| "test_window_projection_large.ans" | ||
| ) | ||
| tdCom.compare_testcase_result( | ||
| sqlFile, ansFile, "test_window_projection_large" | ||
| ) |
| sqlFile = os.path.join( | ||
| os.path.dirname(__file__), "in", | ||
| "test_window_projection_largedata.in" | ||
| ) | ||
| ansFile = os.path.join( | ||
| os.path.dirname(__file__), "ans", | ||
| "test_window_projection_largedata.ans" | ||
| ) | ||
| tdCom.compare_testcase_result( | ||
| sqlFile, ansFile, "test_window_projection_largedata" | ||
| ) |
Add SExprSupp projSupp to window operator runtime. When pProjs is present, use projectApplyFunctionsWithSelect for non-aggregate output. - executorInt.h/c: add projSupp init/destroy in initIndefRowsRuntime - timewindow/eventwindow/countwindow operators: pass projSupp
5 test suites with full data comparison (no count-only checks): - small: 82 queries on 4 tables (10 rows each) - large: 52 queries on 3 tables (5000 rows each), 214K-line ans - largedata: 7 queries on 5 tables (6000 rows each), 66K-line ans - partition: 3 queries on 20 tables (100 rows each) - edge: 5 self-contained edge case queries Plus 12 error-case tests in pytest.
c34defc to
35d7bbd
Compare
There was a problem hiding this comment.
Pull request overview
This PR relaxes window-query SELECT list restrictions so that window queries without aggregate functions can project raw columns/tags/tbname and scalar expressions, and extends HAVING/ORDER BY to reference tags/pseudo-columns not present in the SELECT list.
Changes:
- Parser/planner updates to treat non-aggregate window queries as “projection mode” and to rewrite/slot-bind tag & pseudo-column references used by HAVING/ORDER BY.
- Executor updates to support projection-mode indefinite-rows window execution by initializing a separate projection expression support (
projSupp) and evaluating projections viaprojectApplyFunctionsWithSelect. - Adds a new CI test suite with both pytest negative cases and in/ans result comparisons.
Reviewed changes
Copilot reviewed 17 out of 20 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| test/ci/cases.task | Adds the new window-projection test suite to CI task list. |
| test/cases/query/test_window_projection.py | New pytest coverage for projection-mode window queries (negative + in/ans comparisons). |
| test/cases/query/in/test_window_projection_*.in | New SQL inputs covering small/large/edge/partition scenarios. |
| test/cases/query/ans/test_window_projection_*.ans | Expected outputs for the in/ans comparisons. |
| source/libs/parser/src/parTranslater.c | Skips group-by rewriting for columns/pseudo-cols in projection-mode window queries; sets hasIndefiniteRowsFunc for projection mode; adds fill-mode restriction. |
| source/libs/planner/src/planLogicCreater.c | Adds projection-mode handling to build pProjs and ensure HAVING/ORDER BY columns have slots. |
| source/libs/planner/src/planSpliter.c | Adjusts stable window split behavior for interval/external windows in projection mode (pFuncs==NULL && pProjs!=NULL). |
| source/libs/executor/src/executorInt.c | Extends indefinite-rows runtime to optionally initialize projection expr support and to fill pseudo-cols for it. |
| source/libs/executor/inc/executorInt.h | Updates SIndefRowsRuntime and initIndefRowsRuntime signature. |
| source/libs/executor/src/timewindowoperator.c | Passes projection list and functionStore into initIndefRowsRuntime. |
| source/libs/executor/src/eventwindowoperator.c | Passes projection list and functionStore into initIndefRowsRuntime. |
| source/libs/executor/src/countwindowoperator.c | Passes projection list and functionStore into initIndefRowsRuntime. |
Comments suppressed due to low confidence (1)
source/libs/executor/src/executorInt.c:1343
- On the
_returnerror path,pRuntime->pPseudoColInfo(allocated bysetRowTsColumnOutputInfo) is not destroyed. If initialization fails after that call (e.g.,createExprInfo/initExprSuppfails), this leaks. Please addtaosArrayDestroy(pRuntime->pPseudoColInfo)(and null it) in the error cleanup path.
_return:
qError("%s failed at line %d since %s", __func__, lino, tstrerror(code));
cleanupExprSupp(&pRuntime->projSupp);
pRuntime->pReadyBlocks = tdListFree(pRuntime->pReadyBlocks);
tSimpleHashCleanup(pRuntime->pOpenStatesMap);
pRuntime->pOpenStatesMap = NULL;
return code;
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (pSelect->pHaving != NULL) { | ||
| SNodeList* pHavingCols = NULL; | ||
| PLAN_ERR_JRET(nodesCollectColumnsFromNode(pSelect->pHaving, NULL, COLLECT_COL_TYPE_ALL, &pHavingCols)); | ||
| SNode* pColNode = NULL; | ||
| FOREACH(pColNode, pHavingCols) { | ||
| bool found = false; | ||
| SNode* pProjNode = NULL; | ||
| FOREACH(pProjNode, pWindow->pProjs) { | ||
| if (nodesEqualNode(pProjNode, pColNode)) { | ||
| found = true; | ||
| break; | ||
| } | ||
| } | ||
| if (!found) { | ||
| SNode* pClone = NULL; | ||
| PLAN_ERR_JRET(nodesCloneNode(pColNode, &pClone)); | ||
| PLAN_ERR_JRET(nodesListMakeStrictAppend(&pWindow->pProjs, pClone)); | ||
| } | ||
| } | ||
| nodesDestroyList(pHavingCols); | ||
| } |
| select _wstart, ts, csum(current) from d1001 interval(3s) having _wstart >= '2024-01-01 00:00:09'; | ||
| select _wstart, ts, csum(current) from d1001 interval(3s) having current > 1.5; | ||
| select _wstart, ts, csum(current) from meters partition by tbname interval(3s) having location = 'Beijing' order by _wstart, ts; | ||
| select _wstart, ts, csum(current) from meters partition by tbname interval(3s) having location = 'Beijing' order by _wstart, ts; |
Allow raw columns, tags, tbname, scalar expressions and scalar
functions in window queries when no aggregate functions present.
Issue(s)
Checklist
Please check the items in the checklist if applicable.