Skip to content

Use Codex patricia-tree library for base map domain#2002

Draft
sim642 wants to merge 11 commits intomasterfrom
patricia-tree
Draft

Use Codex patricia-tree library for base map domain#2002
sim642 wants to merge 11 commits intomasterfrom
patricia-tree

Conversation

@sim642
Copy link
Copy Markdown
Member

@sim642 sim642 commented Apr 15, 2026

Closes #1967.

During the AnalyzeThat 2026 workshop the Codex team told us that they have specifically published on opam the patricia-tree library which provides efficient merging with idempotent operators. This is an attempt to use it for the map domain in the base analysis.
I noticed that some part of Frama-C is also possibly switching to this library: https://git.frama-c.com/pub/frama-c/-/merge_requests/18.

Note that the library doesn't completely replace Stdlib.Map.Make: patricia-tree requires keys to have unique int values.
Luckily, base analysis map domain has varinfos which have vid, so it should work.
But we have plenty of other usages of MapDomain with more complex keys, so this PR tries to generalize MapDomain to allow both to be used.

Apparently this causes one test with int domain refinement and narrowing to fail again because the narrow is not idempotent.

TODO

  • rsync benchmark.
  • sv-benchmarks.
  • Look through code TODOs.
  • Figure out what to do with int domain refinement narrowing test. Using non-idempotent merging with meet and narrow due to possible int domain refinement issues for now.

@sim642 sim642 added this to the v2.8.0 Clumsy Clurichaun milestone Apr 15, 2026
@sim642 sim642 self-assigned this Apr 15, 2026
@sim642 sim642 added bug performance Analysis time, memory usage analyze-that labels Apr 15, 2026
@sim642
Copy link
Copy Markdown
Member Author

sim642 commented Apr 16, 2026

For flow-sensitive analysis of rsync, this reduces the analysis time from ~9h 23min to ~7h 17min, which is about -22% (goblint/bench@7f22f21).

Around 20% of the time is still spent on GC. A lot of time is still spent in the inner join, etc. operations which can be optimized separately. My guess is that they haven't been written to carefully preserve physical equality but rather allocate new copies for identical arguments. The patricia-tree library exploits physical equality extensively for efficiency.

I'm now also doing a run on sv-benchmarks although I don't expect much change there.

@michael-schwarz
Copy link
Copy Markdown
Member

michael-schwarz commented Apr 17, 2026

I was confused about what Codex had to do with it, but then realized it's the Lemerre et al. Codex. For anyone also confused: https://codex.top/.

@michael-schwarz
Copy link
Copy Markdown
Member

There's also one we lose that is unrelated to timing changes. Maybe worth taking a look, I guess this should behave exactly as the old one apart from runtime / memory concerns.

@sim642
Copy link
Copy Markdown
Member Author

sim642 commented Apr 17, 2026

There's also one we lose that is unrelated to timing changes. Maybe worth taking a look, I guess this should behave exactly as the old one apart from runtime / memory concerns.

I also noticed that. It's from #1984 (comment).

To fix tests where the inner operator is nonidempotent due to int domain refinement.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

analyze-that bug performance Analysis time, memory usage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Excessive allocations during map join in refinement by ambiguous pointer

2 participants