Switch to using byte offsets as LSNs #2541
Merged
Contributor
⚠️ Performance Alert ⚠️
Possible performance regression was detected for benchmark 'Rust Benchmark'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 2.
| Benchmark suite | Current: d8ed166 | Previous: 9823ef7 | Ratio |
|---|---|---|---|
| lotr_graph/num_edges | 5 ns/iter (± 0) | 0 ns/iter (± 0) | +∞ |
| lotr_graph/active edge | 443 ns/iter (± 6) | 216 ns/iter (± 10) | 2.05 |
| lotr_graph/num_nodes | 136 ns/iter (± 33) | 1 ns/iter (± 0) | 136 |
| lotr_graph/graph_latest | 3 ns/iter (± 0) | 0 ns/iter (± 0) | +∞ |
| lotr_graph_window_100/num_edges | 30 ns/iter (± 0) | 8 ns/iter (± 0) | 3.75 |
| lotr_graph_window_100/num_nodes | 181 ns/iter (± 44) | 5 ns/iter (± 0) | 36.20 |
| lotr_graph_window_100/has_node_existing | 45 ns/iter (± 0) | 22 ns/iter (± 0) | 2.05 |
| lotr_graph_window_10/has_node_existing | 158 ns/iter (± 8) | 62 ns/iter (± 11) | 2.55 |
| lotr_graph_window_10/iterate nodes | 36643 ns/iter (± 115) | 11339 ns/iter (± 40) | 3.23 |
| lotr_graph_window_10/iterate edges | 97960 ns/iter (± 833) | 48684 ns/iter (± 211) | 2.01 |
| lotr_graph_subgraph_10pc/has_edge_existing | 278 ns/iter (± 8) | 93 ns/iter (± 1) | 2.99 |
| lotr_graph_subgraph_10pc/active edge | 450 ns/iter (± 7) | 219 ns/iter (± 8) | 2.05 |
| lotr_graph_subgraph_10pc/num_nodes | 132 ns/iter (± 27) | 4 ns/iter (± 0) | 33 |
| lotr_graph_subgraph_10pc/has_node_existing | 130 ns/iter (± 1) | 34 ns/iter (± 0) | 3.82 |
| lotr_graph_subgraph_10pc/has_node_nonexisting | 5 ns/iter (± 0) | 2 ns/iter (± 0) | 2.50 |
| lotr_graph_subgraph_10pc/iterate nodes | 3250 ns/iter (± 70) | 839 ns/iter (± 5) | 3.87 |
| lotr_graph_subgraph_10pc_windowed/has_node_existing | 156 ns/iter (± 8) | 62 ns/iter (± 14) | 2.52 |
| lotr_graph_subgraph_10pc_windowed/iterate nodes | 5079 ns/iter (± 102) | 1365 ns/iter (± 3) | 3.72 |
| lotr_graph_window_50_layered/num_edges_temporal | 156622 ns/iter (± 3229) | 70121 ns/iter (± 7586) | 2.23 |
| lotr_graph_window_50_layered/num_nodes | 62748 ns/iter (± 886) | 21435 ns/iter (± 536) | 2.93 |
| lotr_graph_window_50_layered/has_node_existing | 947 ns/iter (± 82) | 129 ns/iter (± 12) | 7.34 |
| lotr_graph_window_50_layered/max_id | 69605 ns/iter (± 5011) | 25556 ns/iter (± 252) | 2.72 |
| lotr_graph_window_50_layered/iterate nodes | 139768 ns/iter (± 879) | 19308 ns/iter (± 47) | 7.24 |
| lotr_graph_window_50_layered/iterate edges | 197276 ns/iter (± 529) | 83616 ns/iter (± 1318) | 2.36 |
| lotr_graph_window_50_layered/graph_latest | 104382 ns/iter (± 983) | 36649 ns/iter (± 916) | 2.85 |
| lotr_graph_persistent_window_50_layered/num_edges_temporal | 726213 ns/iter (± 40255) | 192686 ns/iter (± 1569) | 3.77 |
| lotr_graph_persistent_window_50_layered/num_nodes | 81405 ns/iter (± 1762) | 31517 ns/iter (± 779) | 2.58 |
| lotr_graph_persistent_window_50_layered/has_node_existing | 1128 ns/iter (± 173) | 174 ns/iter (± 83) | 6.48 |
| lotr_graph_persistent_window_50_layered/max_id | 90452 ns/iter (± 763) | 38024 ns/iter (± 490) | 2.38 |
| lotr_graph_persistent_window_50_layered/iterate nodes | 189264 ns/iter (± 656) | 35886 ns/iter (± 191) | 5.27 |
| lotr_graph_persistent_window_50_layered/iterate edges | 200470 ns/iter (± 336) | 84161 ns/iter (± 596) | 2.38 |
| lotr_graph_persistent_window_50_layered/iterate_exploded_edges | 3448079 ns/iter (± 5968) | 1659940 ns/iter (± 19402) | 2.08 |
| lotr_graph_persistent_window_50_layered/graph_latest | 157766 ns/iter (± 2908) | 57549 ns/iter (± 4809) | 2.74 |
| lotr_graph_persistent_window_50_layered_materialise/materialize | 12811812 ns/iter (± 72386) | 5298035 ns/iter (± 147912) | 2.42 |
| lotr_graph/proto_encode | 5873552 ns/iter (± 53170) | 1157897 ns/iter (± 73709) | 5.07 |
This comment was automatically generated by workflow using github-action-benchmark.
ljeub-pometry (Collaborator) approved these changes on Apr 22, 2026 and left a comment:
Probably shouldn't panic in drop
    .log_checkpoint(latest_lsn_on_disk)
    .expect("Failed to log checkpoint in drop");
    .log_shutdown_checkpoint()
    .expect("Failed to log shutdown checkpoint in drop");
Collaborator
no panics in drop, this should just be logged
Contributor
Author
Removed the `expect` calls; errors are now logged and `drop` returns immediately.
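For readers following the thread, here is a minimal sketch of the pattern being discussed: a `Drop` implementation that logs a failed shutdown checkpoint and returns instead of panicking. `FakeWal`, `Storage`, and the `drop_errors` counter are hypothetical stand-ins, not the types in this PR; only the method name `log_shutdown_checkpoint` comes from the diff above, and its signature here is an assumption. `eprintln!` stands in for whatever logging facility the project uses.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for the WAL handle; `fail` simulates an I/O error.
pub struct FakeWal {
    fail: bool,
}

impl FakeWal {
    /// Assumed signature: returns an error instead of panicking.
    pub fn log_shutdown_checkpoint(&mut self) -> Result<(), String> {
        if self.fail {
            Err("simulated I/O failure".to_string())
        } else {
            Ok(())
        }
    }
}

/// Hypothetical owner of the WAL whose `Drop` performs the clean shutdown.
pub struct Storage {
    wal: FakeWal,
    /// Counts failed shutdown checkpoints so callers (and tests) can observe them.
    drop_errors: Arc<AtomicUsize>,
}

impl Storage {
    pub fn new(fail: bool, drop_errors: Arc<AtomicUsize>) -> Self {
        Self { wal: FakeWal { fail }, drop_errors }
    }
}

impl Drop for Storage {
    fn drop(&mut self) {
        // Panicking here could abort the process if we are already unwinding,
        // so the error is logged and the destructor returns early.
        if let Err(err) = self.wal.log_shutdown_checkpoint() {
            eprintln!("failed to log shutdown checkpoint in drop: {}", err);
            self.drop_errors.fetch_add(1, Ordering::Relaxed);
            return;
        }
        // Any cleanup that depends on a successful checkpoint would go here.
    }
}
```

A missed shutdown checkpoint should only cost a slower recovery on the next start, which is presumably why logging is preferred over aborting.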
What changes were proposed in this pull request?
Log Sequence Numbers (LSNs) used in the WAL are currently simple sequential counters.
Switching to byte offsets into the underlying WAL file allows for simpler and faster recovery.
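To illustrate the idea, here is a minimal sketch of a WAL whose LSNs are byte offsets. It assumes a simple length-prefixed record format (a 4-byte little-endian length followed by the payload); the `Wal` type, its methods, and the record layout are hypothetical and are not the implementation in this PR.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::path::Path;

/// Hypothetical LSN type: the byte offset at which a record starts in the WAL file.
pub type Lsn = u64;

/// Hypothetical WAL storing length-prefixed records in a single file.
pub struct Wal {
    file: File,
    /// Offset one past the last byte written; also the LSN of the next record.
    end: Lsn,
}

impl Wal {
    pub fn open(path: &Path) -> io::Result<Self> {
        let mut file = OpenOptions::new()
            .read(true)
            .write(true)
            .create(true)
            .truncate(false)
            .open(path)?;
        let end = file.seek(SeekFrom::End(0))?;
        Ok(Self { file, end })
    }

    /// Appends a record and returns its LSN (the offset of its length prefix).
    pub fn append(&mut self, payload: &[u8]) -> io::Result<Lsn> {
        let lsn = self.end;
        self.file.seek(SeekFrom::Start(lsn))?;
        self.file.write_all(&(payload.len() as u32).to_le_bytes())?;
        self.file.write_all(payload)?;
        self.end = lsn + 4 + payload.len() as u64;
        Ok(lsn)
    }

    /// Reads the record at `lsn` with one seek; no scan from the start of the file.
    pub fn read_at(&mut self, lsn: Lsn) -> io::Result<Vec<u8>> {
        self.file.seek(SeekFrom::Start(lsn))?;
        let mut len = [0u8; 4];
        self.file.read_exact(&mut len)?;
        let mut payload = vec![0u8; u32::from_le_bytes(len) as usize];
        self.file.read_exact(&mut payload)?;
        Ok(payload)
    }

    /// LSN that the next appended record will receive.
    pub fn end_lsn(&self) -> Lsn {
        self.end
    }
}
```

With a sequential counter, locating record N means parsing every record before it. With an offset, the LSN is itself the file position, so recovery can jump straight to it.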
Why are the changes needed?
Currently, we always clean up the WAL file on a clean shutdown so that there is no need to parse the full WAL file on reboot. This is unnecessary, as WAL files only need to be removed when they reach a particular size on disk.
This PR changes clean shutdowns to record the latest LSN in the WAL stream to disk and to seek to that position on reboot. Reboots stay fast without having to remove the WAL file on every clean shutdown.
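The shutdown and reboot flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: it stores the latest LSN in a small side file, whereas the PR may well record it as a checkpoint entry in the WAL itself. All function names and the 8-byte checkpoint format are hypothetical.

```rust
use std::fs::{self, File};
use std::io::{self, Seek, SeekFrom};
use std::path::Path;

/// Persists the latest LSN (a byte offset) on clean shutdown.
/// Assumption: a side file holding 8 little-endian bytes.
pub fn write_shutdown_checkpoint(checkpoint_path: &Path, latest_lsn: u64) -> io::Result<()> {
    let tmp = checkpoint_path.with_extension("tmp");
    fs::write(&tmp, latest_lsn.to_le_bytes())?;
    // Rename into place so readers never see a half-written checkpoint.
    // A real implementation would likely also fsync; omitted for brevity.
    fs::rename(&tmp, checkpoint_path)
}

/// Returns the recorded LSN, or `None` if no clean shutdown was recorded.
pub fn read_shutdown_checkpoint(checkpoint_path: &Path) -> io::Result<Option<u64>> {
    match fs::read(checkpoint_path) {
        Ok(bytes) => {
            if bytes.len() != 8 {
                return Err(io::Error::new(
                    io::ErrorKind::InvalidData,
                    "checkpoint file has unexpected length",
                ));
            }
            let mut raw = [0u8; 8];
            raw.copy_from_slice(&bytes);
            Ok(Some(u64::from_le_bytes(raw)))
        }
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}

/// Opens the WAL positioned where replay should start: at the checkpointed
/// offset after a clean shutdown, or at offset 0 otherwise.
pub fn open_for_replay(wal_path: &Path, checkpoint_path: &Path) -> io::Result<(File, u64)> {
    let mut wal = File::open(wal_path)?;
    let start = read_shutdown_checkpoint(checkpoint_path)?.unwrap_or(0);
    wal.seek(SeekFrom::Start(start))?;
    Ok((wal, start))
}
```

Because the LSN is a byte offset, the reboot path is a single seek. The WAL file itself can stay in place until it grows large enough to be worth removing.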
Does this PR introduce any user-facing change? If yes, is this documented?
No
How was this patch tested?
A mix of unit tests and proptests, most of which are in https://github.com/Pometry/pometry-storage/pull/223.
Are there any further changes required?
We will need a background thread that cleans up outdated WAL files.