Fix fn:not() predicate crashes on empty context and atomic sequences #6083

Closed
joewiz wants to merge 3 commits into eXist-db:develop from joewiz:worktree-issue-2308-2159-analysis

@joewiz joewiz commented Mar 3, 2026

Summary

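Fixes two long-standing bugs in fn:not() that cause unexpected errors when used in predicates:

  • #2159: not() inside a predicate over an empty context sequence fails with "cannot convert xs:boolean('true') to a node set"
  • #2308: not(.) on an atomic sequence, e.g. (true(), false())[not(.)], throws err:FORG0006 instead of returning (false())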

Both stem from optimizations in FunNot that assume node-set contexts but encounter empty sets or atomic values. These bugs have been open since 2018.

The FunNot.getDependencies() fix for #2308 also fixes #3289: @*[name() ! contains(., 'DateTime')] throws "the sequence cannot be converted into a node set. Item type is xs:boolean" on persistent (stored) documents. The fix ensures not(.) reports CONTEXT_ITEM dependency inside predicates, which prevents Predicate.recomputeExecutionMode() from pre-evaluating not(.) against the full context sequence.

Approach

Per @wolfgangmm's concern: "eXist handles fn:not in a particular way by trying to evaluate it as a set operation. A naive fix would likely result in many expressions becoming slower by an order of magnitude."

These fixes preserve the set-difference optimization for all performance-critical patterns ([not(child)], [not(@attr)], [not(descendant::x)], [not(self::element)]).

Changes

Fix #2159: FunNot.eval() empty-context guard

When fn:not() is used inside a predicate and the context sequence is empty, the set-difference path returned a BooleanValue which Predicate.selectByNodeSet() cannot consume. Fix: return EMPTY_SEQUENCE when in a predicate with empty context (filtering an empty set always yields empty). Outside predicates, the boolean path is preserved. This matches the intent of the original commented-out TODO code.
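The guard can be pictured with a small standalone model. This is only a sketch under simplifying assumptions: the class EmptyContextGuardSketch, the method evalNot, and the use of strings as stand-ins for nodes are all hypothetical and are not eXist-db's real classes or signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of the guard; these are NOT eXist-db's real classes or signatures.
final class EmptyContextGuardSketch {

    /**
     * Models not(arg) over a context of node names.
     * Returns either a List<String> (a node set) or a Boolean, mirroring the
     * two kinds of result discussed in the PR description.
     *
     * @param context     nodes being filtered (hypothetical string stand-ins)
     * @param matches     context nodes for which the argument selected something
     * @param inPredicate whether not() sits inside a predicate
     */
    static Object evalNot(List<String> context, List<String> matches, boolean inPredicate) {
        if (context.isEmpty()) {
            if (inPredicate) {
                // Filtering an empty set always yields an empty set.
                return new ArrayList<String>();
            }
            // Standalone call such as not(()): keep the boolean path.
            return Boolean.valueOf(matches.isEmpty());
        }
        // Set difference: keep the context nodes the argument did not match.
        List<String> result = new ArrayList<>(context);
        result.removeAll(matches);
        return result;
    }
}
```

In this model a predicate caller always receives a list, so it never has to convert a boolean into a node set, while a standalone not(()) still yields a boolean.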

Fix #2308: FunNot.getDependencies() targeted CONTEXT_ITEM

LocationStep.getDependencies() suppresses CONTEXT_ITEM for the self axis inside predicates to enable the set-difference optimization. This is correct for node-set predicates but causes not(.) on atomic sequences to be pre-evaluated against the full sequence instead of item-by-item, throwing FORG0006.
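Why whole-sequence evaluation fails while item-by-item evaluation succeeds follows from the XPath effective boolean value (EBV) rules. The sketch below models those rules for boolean items only; the class EbvSketch and its method names are hypothetical, and the exception is a plain Java stand-in for err:FORG0006.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model of EBV for sequences of booleans; not eXist-db code.
final class EbvSketch {

    /** Effective boolean value, restricted to Boolean items for brevity. */
    static boolean ebv(List<Boolean> seq) {
        if (seq.isEmpty()) {
            return false;
        }
        if (seq.size() == 1) {
            return seq.get(0);
        }
        // Two or more atomic values have no effective boolean value.
        throw new IllegalStateException("err:FORG0006");
    }

    /** Item-by-item evaluation of seq[not(.)]: "." is bound to one item at a time. */
    static List<Boolean> filterNotPerItem(List<Boolean> seq) {
        List<Boolean> kept = new ArrayList<>();
        for (Boolean item : seq) {
            if (!ebv(List.of(item))) {
                kept.add(item);
            }
        }
        return kept;
    }

    /** Pre-evaluation of not(.) with "." bound to the whole sequence at once. */
    static boolean notWholeSequence(List<Boolean> seq) {
        return !ebv(seq);
    }
}
```

Under this model, filterNotPerItem on (true, false) keeps only false, whereas notWholeSequence on the same input raises the error, which is the behaviour the bug report describes.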

Fix: FunNot.getDependencies() adds CONTEXT_ITEM to the dependency flags when the argument is the context item expression . — specifically, a LocationStep with SELF_AXIS and a node() type test — and the function is inside a predicate. This is targeted and narrow: it only affects . (self::node()), not typed self-axis steps like self::element, forcing Predicate to use per-item boolean evaluation for that specific case. All other predicates (not(child::bar), not(@attr), not(descendant::x), not(self::element)) are unaffected because their arguments use different axes, different type tests, or are not LocationStep instances.
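The targeting rule described above can be sketched as a bit-flag computation. Everything here is an assumption for illustration: the class DependencySketch, the enums, and the flag values are hypothetical and may not match eXist-db's Dependency constants or the actual FunNot code.

```java
// Illustrative model of the targeted dependency rule; not eXist-db code.
final class DependencySketch {

    // Hypothetical flag values; the real constants may differ.
    static final int NO_DEPENDENCY = 0;
    static final int CONTEXT_SET = 1;
    static final int CONTEXT_ITEM = 2;

    enum Axis { SELF, CHILD, ATTRIBUTE, DESCENDANT }

    enum NodeTest { ANY_NODE, NAMED_ELEMENT }

    /**
     * Dependencies reported by not(arg), given the argument's own dependencies.
     * CONTEXT_ITEM is added only when the argument is "." (self::node())
     * and the call sits inside a predicate.
     */
    static int notDependencies(int argDeps, Axis axis, NodeTest test, boolean inPredicate) {
        int deps = argDeps;
        boolean isContextItemExpr = axis == Axis.SELF && test == NodeTest.ANY_NODE;
        if (inPredicate && isContextItemExpr) {
            deps |= CONTEXT_ITEM;
        }
        return deps;
    }
}
```

In this sketch not(self::abc) and not(child::bar) leave the flags untouched, so only the not(.) case is pushed onto per-item evaluation.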

FunNot.returnsType() and LocationStep.getDependencies() are unchanged from develop.

Why the set-difference optimization is preserved

The optimization in FunNot.eval() runs when the argument returns NODE, context is a persistent set, and argument has no CONTEXT_ITEM dependency. After these changes:

| Pattern | Arg depends on CONTEXT_ITEM? | Set-difference? | Notes |
|---|---|---|---|
| [not(child::bar)] | No | Yes | Unchanged |
| [not(@attr)] | No | Yes | Unchanged |
| [not(descendant::x)] | No | Yes | Unchanged |
| [not(self::abc)] | No (element type test, not node()) | Yes | Unchanged |
| [not(.)] on atomics | Yes (self::node(), inPredicate) | No (boolean path) | Correct, was erroring before |
| [not(.)] on nodes | Yes (self::node(), inPredicate) | No (boolean path) | Trivial pattern, not perf-sensitive |

The promotion path via Predicate.recomputeExecutionMode() is unchanged.
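The three conditions that gate the set-difference path can be written as a single predicate. This is a sketch of the rule as stated in this description, not the actual FunNot.eval() code; the class name, method name, and the CONTEXT_ITEM value are hypothetical.

```java
// Illustrative gate for the set-difference path; not eXist-db code.
final class SetDifferenceGateSketch {

    // Hypothetical flag value; the real constant may differ.
    static final int CONTEXT_ITEM = 2;

    /**
     * True when all three stated conditions hold: the argument returns nodes,
     * the context is a persistent node set, and no CONTEXT_ITEM dependency is reported.
     */
    static boolean usesSetDifference(boolean argReturnsNode,
                                     boolean contextIsPersistentSet,
                                     int dependencies) {
        return argReturnsNode
                && contextIsPersistentSet
                && (dependencies & CONTEXT_ITEM) == 0;
    }
}
```

Read against the table, the first four patterns satisfy the gate, and the two not(.) rows fail it because of the added CONTEXT_ITEM flag.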

Performance benchmark

FunNotBenchmark.java measures 5 query patterns (100 warmup + 500 measured iterations each) against a generated 200-item XML document with no indexes other than the structural index.

Run with:

mvn test -pl exist-core -Dtest=FunNotBenchmark \
    -Dexist.run.benchmarks=true -Ddependency-check.skip=true
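For readers unfamiliar with the warmup-then-measure pattern, a minimal sketch follows. It is not taken from FunNotBenchmark.java: the class WarmupTimingSketch and its methods are hypothetical, and the query is represented by an arbitrary Runnable.

```java
// Generic warmup-then-measure timing loop; not the PR's benchmark code.
final class WarmupTimingSketch {

    /** Runs the query warmup times untimed, then returns the average time in ms over measured runs. */
    static double averageMillis(Runnable query, int warmup, int measured) {
        for (int i = 0; i < warmup; i++) {
            query.run(); // let the JIT and caches settle before timing
        }
        long start = System.nanoTime();
        for (int i = 0; i < measured; i++) {
            query.run();
        }
        long elapsedNanos = System.nanoTime() - start;
        return elapsedNanos / 1_000_000.0 / measured;
    }

    /** Converts an average latency in milliseconds to operations per second. */
    static double opsPerSecond(double avgMillis) {
        return 1000.0 / avgMillis;
    }
}
```

With 100 warmup and 500 measured iterations, a call such as averageMillis(query, 100, 500) would run the query 600 times and time only the last 500.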

Results

| Query | develop avg (ms) | PR avg (ms) | develop ops/s | PR ops/s |
|---|---|---|---|---|
| //item[not(child)] | 0.277 | 0.270 | 3,609 | 3,698 |
| //item[not(@attr)] | 0.277 | 0.263 | 3,608 | 3,804 |
| //item[not(descendant::x)] | 0.266 | 0.262 | 3,756 | 3,821 |
| //item[not(.)] | 0.261 | 0.247 | 3,838 | 4,044 |
| //item[not(@id > 100)] | 0.724 | 0.720 | 1,382 | 1,388 |

Takeaway: All five queries are within noise. PR averages differ from develop by roughly 0.5% to 5%, and no query is slower. The set-difference optimization is fully preserved for child, attribute, and descendant axis patterns, and the not(.) pattern (boolean fallback path) shows no regression. Environment: macOS, Zulu JDK 21, single-threaded, in-process embedded server.

Test plan

  • New fn-not.xq XQSuite tests covering empty paths, persistent node set-difference, not(.) on integers/strings/booleans/nodes
  • Removed %test:pending from boolseq:countNegativesContextItem in boolean-sequences.xq
  • New simple-map-predicate.xq tests for #3289 ("Type error: eXist misreads an XPath node sequence with a predicate as xs:boolean"): @*[name() ! contains(., 'DateTime')] on persistent and in-memory documents
  • Performance benchmark (FunNotBenchmark.java): 5 queries × 500 iterations, no regression vs develop
  • Full CoreTests suite: 992 tests, 0 failures, 0 errors
  • OptimizerTest: 6 tests, 0 failures
  • DateTest: 46 tests, 0 failures

🤖 Generated with Claude Code

@joewiz joewiz requested a review from a team as a code owner March 3, 2026 05:31
@joewiz joewiz force-pushed the worktree-issue-2308-2159-analysis branch 3 times, most recently from c996bd5 to e5a0638 Compare March 3, 2026 14:44
@joewiz joewiz marked this pull request as draft March 4, 2026 15:54
@joewiz joewiz force-pushed the worktree-issue-2308-2159-analysis branch 4 times, most recently from bcfa6fd to aced6d7 Compare March 5, 2026 05:47
joewiz and others added 3 commits March 5, 2026 01:41
…Xist-db#2159)

When fn:not() is used inside a predicate (e.g. $doc/*[not(self::abc)])
and the context sequence is empty, the set-difference optimization in
eval() case 1 fell through to evalBoolean(), which returns a BooleanValue.
However, Predicate.selectByNodeSet() expects a node set and throws
"cannot convert xs:boolean('true') to a node set".

The fix returns EMPTY_SEQUENCE when inside a predicate with an empty
context — filtering an empty set always yields an empty set regardless
of the predicate. Outside predicates (e.g. standalone not(())), the
boolean evaluation path is preserved.

Tests cover:
- not(self::x) on empty derived path (the error scenario)
- not(self::x) on non-empty path (correct filtering)
- not(*) on empty path (empty result)
- Standalone not(()) (boolean path unaffected by the fix)
- not(child) on persistent nodes (set-difference optimization works)
- not(@type = 'a') on persistent nodes (general predicate path)

Closes eXist-db#2159

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
(true(), false())[not(.)] throws err:FORG0006 instead of returning
(false()). The root cause is that LocationStep.getDependencies()
suppresses CONTEXT_ITEM dependency for the self axis when
inPredicate=true (to enable the set-difference optimization).
This causes Predicate.recomputeExecutionMode() to pre-evaluate
fn:not(.) against the full sequence instead of item-by-item.

The fix adds CONTEXT_ITEM to FunNot.getDependencies() when the
argument is "." (self::node() LocationStep with node() type test)
and the function is inside a predicate. This is targeted and narrow:
it only affects the context item expression ".", not typed self-axis
steps like self::element, preserving the set-difference optimization
for all node-set predicates (not(child), not(@attr), not(descendant::x),
not(self::element)).

Tests cover fn:not(.) on integers, strings, booleans, and nodes.
Includes FunNotBenchmark.java (100 warmup + 500 measured iterations,
5 query patterns on 200-item XML) to verify the set-difference
optimization is preserved with no regression vs develop.

Closes eXist-db#2308

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The expression @*[name() ! contains(., 'DateTime')] on persistent
(stored) documents throws "Type error: the sequence cannot be
converted into a node set. Item type is xs:boolean". This is also
fixed by the FunNot.getDependencies() change in this PR.

Tests cover:
- The exact error pattern from the bug report (persistent doc)
- Workaround patterns that already worked: contains(name(.), ...)
  and @* ! name()[contains(., ...)]
- The same pattern on in-memory nodes (already worked, regression guard)

Closes eXist-db#3289

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

joewiz commented Apr 6, 2026

[This comment was co-authored with Claude Code. -Joe]

Closing — superseded by #6207 (v2/xq31-compliance-fixes).

This work has been consolidated into a clean v2/ branch as part of the eXist-db 7.0 PR reorganization. The new PR includes all commits from this PR plus additional related work, with reviewer feedback incorporated where applicable. See the reviewer guide for the full context.

@joewiz joewiz closed this Apr 6, 2026

Development

Successfully merging this pull request may close these issues.

  • not() function failing on boolean sequence
  • Weird "cannot convert xs:boolean('true') to a node set" message
