
[JEWEL-1148] Fix BasicLazyTree Memory Leak#3468

Open
DanielSouzaBertoldi wants to merge 1 commit into JetBrains:master from DanielSouzaBertoldi:dsb/JEWEL-1148

Conversation


@DanielSouzaBertoldi DanielSouzaBertoldi commented Mar 23, 2026

Context

We received a report of severe OutOfMemory (OOM) issues and CPU bottlenecks when using LazyTree with a dynamically recalculating tree (e.g., updating based on frequent background processes). Profiling revealed that after just 1,000 process spawns, the tree was retaining over 230k references to old nodes.

The workaround in the user's scenario was to pass the tree itself as a `remember` key, forcing a recomposition that cleared the references to old nodes.

Needless to say, this smells heavily like a memory leak in LazyTree, which demands a proper investigation.

Pick up your magnifying glass, choose the most appropriate trench coat and let's dissect this 😎

Investigation

Note

TL;DR: Fixed severe OOM issues by using `item.data::class` for `contentType` so Compose can properly recycle nodes, and by purging stale IDs from `TreeState` on rebuilds. We also eliminated hidden list allocations during tree traversal, dropping the retained heap by over 70% and smoothing out UI stutter.

Findings

The bug report included reproducible code we could use for profiling, so let's use it.

After adding about 7k entries to the tree:

image

Dumping the heap (by running `jcmd <PID> GC.heap_dump test-dump.hprof`) and opening it in Eclipse's Memory Analyzer (MAT), the Histogram tab showed that both `org.jetbrains.jewel.samples.showcase.components.TestTreeNode` and `org.jetbrains.jewel.foundation.lazy.tree.Tree$Element$Node` had a really big retained heap size:

| Class Name | Objects | Shallow Heap | Retained Heap (bytes) |
|---|---:|---:|---:|
| `TestTreeNode` | 14,336 | 344,064 | >= 1,088,904 |
| `Tree$Element$Node` | 7,168 | 344,064 | >= 1,194,256 |
If you'd rather see a screenshot, click here! image

Which is honestly insane. A little over 1MB of retained objects for just ~7k lines of text. Something is really wrong here. Let's go a bit deeper and check the path to GC roots for some of the `Tree$Element$Node` objects (by the way, some of these objects retain only 88 bytes and others 168 bytes. The thing, though, is that there are 7,168 of them 😬).

Alright, let's take a closer look at what each object is retaining and why GC is not cleaning them up:

image

There's a lot going on here, but my main point is: all classes that retain most of the heap (all selected rows besides the first three) are from Compose itself, which is holding our data hostage:

  1. RootNodeOwner$OwnerImpl (The root of the Compose UI tree)
  2. ...holds onto a LayoutNode (A generic Compose UI element)
  3. ...which captures a Lambda (BasicLazyTreeKt$$Lambda...)
  4. ...which references Tree$Element$Node.

We can rule out any wrongdoing on Compose's side, since this is just its inner workings: it's the LazyColumn item reuse pool in action. If you aren't familiar: when you scroll an item off-screen, or rebuild the tree with new items, Compose doesn't want to destroy the expensive LayoutNodes just to recreate them a millisecond later. Instead, it detaches them and puts them in a cache (the reuse pool) so they can be recycled for the next item that scrolls into view.

The main problem, though, is that Compose is unable to reuse those LayoutNodes. It should NOT have a pool of 7,168 entries at all. This means each and every entry in the list is being treated as a completely unique UI layout.

Also, there's something shady going on with our `org.jetbrains.jewel.foundation.lazy.SelectableLazyColumnKt$SelectableLazyColumn$notifyingPointerEventActions` class. It's holding about 63 kB of heap, which isn't a lot, but it still shouldn't be doing that. Maybe they share the same root cause (foreshadowing intensifies).

Finding the Culprit

Given our findings, the only way for Compose to be treating all entries as unique layouts is due to a sneaky line of code:

```kotlin
// BasicLazyTree.kt L206
itemsIndexed(
    items = flattenedTree,
    key = { _, item -> item.id },
    contentType = { _, item -> item.data }, // <--- this guy here!
)
```

Given that each data has unique text in this reproducer (e.g., "bar , foo foofoo"), every single entry has a 100% unique content type. Because the content types are unique, LazyColumn creates a brand new reuse bucket for every single item that is ever removed. It caches exactly one node per bucket, and since there are hundreds of thousands of unique buckets, it holds onto every single old `Tree.Element` forever 😬

The Fix

By simply using the data's class as the content type, we get rid of the OOM. This way, Compose caps the reuse pools at exactly two buckets (`Tree.Element.Node::class` and `Tree.Element.Leaf::class`, the only two subtypes of `Tree.Element<T>` in our Tree.kt). Now Compose will only ever recycle an old Node layout for a new Node, and an old Leaf layout for a new Leaf.
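To see why the class makes such a good content type, here's a minimal sketch with illustrative names (a stand-in sealed hierarchy, not the Jewel API) showing how many distinct buckets each `contentType` strategy produces:

```kotlin
// Stand-in for Tree.Element's two subtypes; names are illustrative only.
sealed class Element {
    data class Node(val text: String) : Element()
    data class Leaf(val text: String) : Element()
}

fun main() {
    val items: List<Element> =
        (1..1000).map { i -> if (i % 2 == 0) Element.Node("node $i") else Element.Leaf("leaf $i") }

    // contentType = { _, item -> item.data }: every unique value is its own reuse bucket.
    val bucketsByValue = items.toSet().size

    // contentType = { _, item -> item.data::class }: one bucket per concrete class.
    val bucketsByClass = items.map { it::class }.toSet().size

    println("by value: $bucketsByValue, by class: $bucketsByClass") // by value: 1000, by class: 2
}
```

With unique values the bucket count grows with the data; with classes it's capped at the number of subtypes, so the reuse pool stays tiny no matter how often the tree rebuilds.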

Let's fix the problematic line, add the same amount of rows to the tree and check the dump once again:

| Scenario | Screenshot |
|---|---|
| Histogram | image |
| Shortest Path to GC | image |

And a table for comparison:

| Class Name | Retained Heap (before) | Retained Heap (after) | Decrease |
|---|---:|---:|---:|
| `org.jetbrains.jewel.foundation.lazy.tree.Tree$Element$Node` | ~1,088,904 | ~344,272 | ~68% |
| `org.jetbrains.jewel.samples.showcase.components.TestTreeNode` | ~1,194,256 | ~753,824 | ~37% |

This by itself is a HUGE improvement over what we had before. This is the main fix for this PR.

Also, note that the path no longer references BasicLazyTree: since the UI side is no longer leaking, the shortest path to GC now goes through the State side of Compose (the SlotTable).

Instead of seeing the click listener lambda, you can now see SelectableLazyListScopeKt$$Lambda. This is the lambda generated by the itemsIndexed block inside the SelectableLazyListScope DSL. The SlotTable keeps this lambda around because it needs to know how to build the list structure, but it only keeps the active state, not thousands of dead copies.

However, you might be thinking "well, a 37% decrease in retained heap for TestTreeNode (our test Composable) is still not ideal", and I totally agree with you! But since this is not the main issue for this PR, I'll describe what we can do to decrease the retained heap for this case even further; just check the collapsed section below!

Optimizing the code even further

Take a look at the retained heap for the selected objects in the last screenshot. The second row has a retained heap of 101,024 bytes across 2,896 instances; the first row has 4,952 bytes across 954 instances. This is a good indication that we can improve things further.

Extra issue 1 (flattenTree):

There were actually two separate issues in the flattenTree code:

1. The Stale ID Leak (Memory)
In BasicLazyTree, TreeState (allNodes and openNodes) would accumulate IDs from older, discarded trees indefinitely. When users provide their own objects as node IDs (via addNode(data, id = myObject)), this caused those objects to be kept alive in memory long after the tree was replaced, leading to unbounded memory growth.

We fixed this with a per-flattening cleanup: allNodes is cleared before each traversal and repopulated only with nodes still present in the active tree. Afterward, openNodes is intersected with the surviving IDs, dropping stale references while preserving expansion state for nodes that still exist.
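The cleanup described above can be sketched like this. Note this is a hypothetical, simplified stand-in for `TreeState` (the real class in Jewel has a different shape; all names here are illustrative):

```kotlin
// Simplified stand-in for TreeState: a list of (id, metadata) pairs plus
// the set of expanded node IDs. Not the real Jewel API.
class FakeTreeState {
    val allNodes = mutableListOf<Pair<Any, Int>>()
    val openNodes = mutableSetOf<Any>()
}

// Per-flattening cleanup: drop IDs from discarded trees, keep expansion
// state only for nodes that still exist in the active tree.
fun rebuild(state: FakeTreeState, activeIds: List<Any>) {
    // 1. Clear everything accumulated from previous trees.
    state.allNodes.clear()
    // 2. Repopulate only with nodes present in the active tree.
    activeIds.forEachIndexed { depth, id -> state.allNodes.add(id to depth) }
    // 3. Intersect openNodes with the surviving IDs, dropping stale references.
    state.openNodes.retainAll(state.allNodes.map { it.first }.toSet())
}
```

Running `rebuild` with the new tree's IDs means user-provided ID objects from old trees can finally be garbage collected, while nodes that survived the rebuild keep their expanded state.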

2. The .map Bottleneck (CPU/GC Churn)

Take a look at this sneaky line inside flattenTree:

```kotlin
if (id !in state.allNodes.map { it.first })
```

Because this runs for every single node during flattening, calling .map creates a brand new temporary ArrayList in memory every single iteration. For 7,000 nodes, we were allocating 7,000 lists, causing massive GC churn and possibly UI frame drops.

We fixed this by changing it to `.none { it.first == id }`, which performs the exact same check with zero allocations.
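The two forms are logically equivalent; the difference is only in allocations. A small sketch (where `allNodes` stands in for `TreeState.allNodes` as a list of ID/metadata pairs):

```kotlin
// Before: builds a temporary ArrayList of all first elements on every call.
fun isNewIdAllocating(allNodes: List<Pair<Any, Int>>, id: Any): Boolean =
    id !in allNodes.map { it.first }

// After: walks the pairs in place, short-circuiting on the first match.
fun isNewIdZeroAlloc(allNodes: List<Pair<Any, Int>>, id: Any): Boolean =
    allNodes.none { it.first == id }
```

Both return the same answer for every input, but the second never materializes an intermediate list, which matters when the check runs once per node during flattening.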

Extra issue 2:

```kotlin
private infix fun MutableSet<Any>.getAllSubNodes(node: Tree.Element.Node<*>) {
    node.children?.filterIsInstance<Tree.Element.Node<*>>()?.forEach {
        add(it.id)
        this@getAllSubNodes getAllSubNodes (it)
    }
}
```

The problem with this function is that filterIsInstance always creates a brand new ArrayList under the hood to hold the filtered results. Because this function is recursive, we are allocating a new, temporary list in memory for every single node in the tree that has children. If you close a node with 1,000 descendants, you just created 1,000 temporary lists that immediately need to be garbage collected.

The fix to eliminate these allocations entirely is to just use a standard `forEach` and check the type inside the loop. This makes the function zero-allocation.
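A sketch of the allocation-free traversal, using a simplified stand-in for the node type (the real `Tree.Element.Node` lives in Jewel's Tree.kt; names here are illustrative):

```kotlin
// Simplified stand-ins for Tree.Element and Tree.Element.Node.
open class TreeElement(val id: Any)
class TreeNode(id: Any, val children: List<TreeElement>?) : TreeElement(id)

fun MutableSet<Any>.collectSubNodeIds(node: TreeNode) {
    // Plain forEach with an `is` check: no intermediate filtered list is
    // created, unlike filterIsInstance, which allocates one per recursion.
    node.children?.forEach { child ->
        if (child is TreeNode) {
            add(child.id)
            collectSubNodeIds(child)
        }
    }
}
```

The recursion shape is unchanged; only the per-level `ArrayList` that `filterIsInstance` builds is gone, so closing a node with 1,000 descendants no longer creates 1,000 throwaway lists.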

Result

| Scenario | Screenshot |
|---|---|
| Histogram | image |
| Shortest Path to GC | image |

Yeah, that's some pretty awesome stuff. The second selected object went down to only 80 bytes in the heap and 12 instances, while the first row kept its retained heap of 4,952 bytes but dropped to only 40 instances 🤯 With this, here's the new result for TestTreeNode:

| Class Name | Retained Heap (before) | Retained Heap (after) | Decrease |
|---|---:|---:|---:|
| `org.jetbrains.jewel.samples.showcase.components.TestTreeNode` | ~1,194,256 | ~318,800 | ~73%!!!! |

Release notes

Bug fixes

  • LazyTree no longer retains references to each and every entry in the list; it now properly reuses UI nodes instead of eating an insane amount of heap, preventing OOM crashes and CPU bottlenecks.

```diff
@@ -178,8 +179,19 @@ public fun <T> BasicLazyTree(
 ) {
     val scope = rememberCoroutineScope()

-    val flattenedTree =
-        remember(tree, treeState.openNodes, treeState.allNodes) { tree.roots.flatMap { it.flattenTree(treeState) } }
+    val flattenedTree = remember(tree, treeState.openNodes) {
```

Read the final collapsed section in the PR description. This is part of the Extra Issue 1.

Also, we don't need treeState.allNodes keyed here because it was a self-defeating loop:

openNodes changes → remember reruns → flattenTree adds new IDs to allNodes → allNodes changes → remember reruns again → flattenTree finds nothing new to add → stable

```diff
 itemsIndexed(
     items = flattenedTree,
     key = { _, item -> item.id },
-    contentType = { _, item -> item.data },
+    contentType = { _, item -> item.data?.let { it::class } },
```
The actual "real" fix

```diff
@@ -422,8 +436,10 @@ private fun Tree.Element<*>.flattenTree(state: TreeState): MutableList<Tree.Elem
 }

 private infix fun MutableSet<Any>.getAllSubNodes(node: Tree.Element.Node<*>) {
-    node.children?.filterIsInstance<Tree.Element.Node<*>>()?.forEach {
```
Using filterIsInstance here is part of the "Extra issue 2" described in the PR
