LMDB based MerkleStore implementation#537
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds an LMDB-backed Merkle-node backend with workspace deps, module wiring, versioned on-disk formats, zero-copy mmap reads, buffered write sessions that commit atomically, reader/writer trait implementations, and supporting logging/tests/refactors. ChangesLMDB Backend Implementation
Sequence Diagram(s)sequenceDiagram
participant App as Application
participant Writer as MerkleWriter
participant Session as LmdbWriteSession
participant NodeSess as LmdbNodeWriteSession
participant Queue as Pending<br/>Queue
participant Txn as LMDB<br/>RwTxn
App->>Writer: begin()
activate Writer
Writer->>Queue: new Queue
Writer->>Session: LmdbWriteSession { Queue }
Writer-->>App: Session
deactivate Writer
App->>Session: begin_node(parent_id)
activate Session
Session->>NodeSess: new(parent_id)
Session-->>App: NodeSession
deactivate Session
App->>NodeSess: add_child(hash, node)
activate NodeSess
NodeSess->>NodeSess: serialize & buffer
deactivate NodeSess
App->>NodeSess: finish()
activate NodeSess
NodeSess->>Queue: enqueue PendingWrite
deactivate NodeSess
App->>Session: finish()
activate Session
Session->>Queue: drain all writes
Session->>Txn: RwTxn::new()
activate Txn
loop for each write
Session->>Txn: write node -> merkle_tree_nodes
Session->>Txn: write link -> merkle_links
end
Session->>Txn: commit()
deactivate Txn
Session-->>App: Result
deactivate Session
classDiagram
class LmdbBackend {
-repo_root: PathBuf
-env: Env
-merkle_tree_nodes: Database
-merkle_links: Database
+new(repo_root, options) Result
+full_exists(hash) Result~bool~
+full_get_node(hash) Result~Option~LmdbNode~~
+get_links(hash) Result~Option~LmdbLink~~
}
class LmdbNode {
-kind: MerkleTreeNodeType
-data: Vec~u8~
+kind() MerkleTreeNodeType
+data() &[u8]
+encode() Vec~u8~
+decode(bytes) Result~Self~
}
class LmdbLink {
-parent_id: Option~MerkleHash~
-children: Vec~MerkleHash~
+parent_id() Option~MerkleHash~
+children() &[MerkleHash]
+encode() Vec~u8~
+decode(bytes) Result~Self~
}
class MerkleReader {
+exists(hash) Result~bool~
+get_node(hash) Result~Option~MerkleEntry~~
+get_children(hash) Result~Vec~(MerkleHash, MerkleTreeNode)~~
}
class MerkleWriter {
+begin() Result~Box~dyn MerkleWriteSession~~
}
LmdbBackend --> LmdbNode: serializes/deserializes
LmdbBackend --> LmdbLink: serializes/deserializes
LmdbBackend ..|> MerkleReader: implements
LmdbBackend ..|> MerkleWriter: implements
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Comment |
|
NOTE: Stacked PR! #531 must be merged first! |
7dc9c8b to
d1ae637
Compare
344e76d to
022ba55
Compare
|
NOTE: Stacked PR! #531 must be merged first!! |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/lib/src/core/db/merkle_node/lmdb/writer.rs (1)
62-62: 💤 Low valueMinor typo in documentation.
Arc<Mutuex<.>>should beArc<Mutex<...>>.📝 Proposed fix
-/// multi-threading is needed one day, then this can be migrated to an [`Arc<Mutuex<.>>`]. +/// multi-threading is needed one day, then this can be migrated to an [`Arc<Mutex<...>>`].🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/lib/src/core/db/merkle_node/lmdb/writer.rs` at line 62, Fix the typo in the doc comment inside writer.rs where it reads "Arc<Mutuex<.>>": replace it with the correct "Arc<Mutex<...>>" (correcting "Mutuex" to "Mutex" and using "..." or "..."-style generics instead of "."). Update the comment near the multi-threading note so it reads e.g. "Arc<Mutex<...>>" to accurately reference the Rust Mutex type.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@crates/lib/src/core/db/merkle_node/lmdb/writer.rs`:
- Line 62: Fix the typo in the doc comment inside writer.rs where it reads
"Arc<Mutuex<.>>": replace it with the correct "Arc<Mutex<...>>" (correcting
"Mutuex" to "Mutex" and using "..." or "..."-style generics instead of ".").
Update the comment near the multi-threading note so it reads e.g.
"Arc<Mutex<...>>" to accurately reference the Rust Mutex type.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 34cff33f-7c5e-408e-bebc-7fc0a2ddc664
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (18)
.gitignoreCargo.tomlcrates/lib/Cargo.tomlcrates/lib/src/core/db/merkle_node.rscrates/lib/src/core/db/merkle_node/lmdb.rscrates/lib/src/core/db/merkle_node/lmdb/lmdb_backend.rscrates/lib/src/core/db/merkle_node/lmdb/reader.rscrates/lib/src/core/db/merkle_node/lmdb/value_structs.rscrates/lib/src/core/db/merkle_node/lmdb/writer.rscrates/lib/src/core/db/merkle_node/merkle_node_db.rscrates/lib/src/core/v_latest/index/commit_merkle_tree.rscrates/lib/src/error.rscrates/lib/src/model/merkle_tree/node/commit_node.rscrates/lib/src/model/merkle_tree/node/dir_node.rscrates/lib/src/model/merkle_tree/node/file_chunk_node.rscrates/lib/src/model/merkle_tree/node/file_node.rscrates/lib/src/model/merkle_tree/node/vnode.rscrates/lib/src/repositories/commits/commit_writer.rs
✅ Files skipped from review due to trivial changes (6)
- crates/lib/src/model/merkle_tree/node/file_chunk_node.rs
- crates/lib/src/model/merkle_tree/node/dir_node.rs
- .gitignore
- crates/lib/src/model/merkle_tree/node/file_node.rs
- crates/lib/src/core/v_latest/index/commit_merkle_tree.rs
- crates/lib/src/core/db/merkle_node/merkle_node_db.rs
🚧 Files skipped from review as they are similar to previous changes (8)
- crates/lib/src/model/merkle_tree/node/commit_node.rs
- Cargo.toml
- crates/lib/src/error.rs
- crates/lib/src/core/db/merkle_node.rs
- crates/lib/src/core/db/merkle_node/lmdb/reader.rs
- crates/lib/src/core/db/merkle_node/lmdb/value_structs.rs
- crates/lib/src/repositories/commits/commit_writer.rs
- crates/lib/src/core/db/merkle_node/lmdb/lmdb_backend.rs
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/lib/src/core/db/merkle_node/lmdb/writer.rs`:
- Around line 154-156: The doc comment on NodeWriteSession incorrectly asserts
that get_node will return file hashes; update the comment to reflect actual
behavior: clarify that although file nodes may be stored by this writer
(contrast with super::super::file_backend's NodeWriteSession),
MerkleReader::get_node intentionally returns None for File and FileChunk types,
so callers should not rely on get_node to retrieve file or file-chunk nodes.
Mention the specific symbols MerkleReader, get_node, File, and FileChunk to make
the behavior explicit and prevent misuse.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e87c0e5f-d92f-4444-b64b-1e70e2863426
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (18)
.gitignoreCargo.tomlcrates/lib/Cargo.tomlcrates/lib/src/core/db/merkle_node.rscrates/lib/src/core/db/merkle_node/lmdb.rscrates/lib/src/core/db/merkle_node/lmdb/lmdb_backend.rscrates/lib/src/core/db/merkle_node/lmdb/reader.rscrates/lib/src/core/db/merkle_node/lmdb/value_structs.rscrates/lib/src/core/db/merkle_node/lmdb/writer.rscrates/lib/src/core/db/merkle_node/merkle_node_db.rscrates/lib/src/core/v_latest/index/commit_merkle_tree.rscrates/lib/src/error.rscrates/lib/src/model/merkle_tree/node/commit_node.rscrates/lib/src/model/merkle_tree/node/dir_node.rscrates/lib/src/model/merkle_tree/node/file_chunk_node.rscrates/lib/src/model/merkle_tree/node/file_node.rscrates/lib/src/model/merkle_tree/node/vnode.rscrates/lib/src/repositories/commits/commit_writer.rs
✅ Files skipped from review due to trivial changes (7)
- crates/lib/src/model/merkle_tree/node/file_chunk_node.rs
- crates/lib/src/model/merkle_tree/node/vnode.rs
- .gitignore
- crates/lib/src/core/v_latest/index/commit_merkle_tree.rs
- Cargo.toml
- crates/lib/src/model/merkle_tree/node/commit_node.rs
- crates/lib/src/model/merkle_tree/node/file_node.rs
🚧 Files skipped from review as they are similar to previous changes (7)
- crates/lib/Cargo.toml
- crates/lib/src/core/db/merkle_node.rs
- crates/lib/src/core/db/merkle_node/merkle_node_db.rs
- crates/lib/src/core/db/merkle_node/lmdb.rs
- crates/lib/src/core/db/merkle_node/lmdb/value_structs.rs
- crates/lib/src/core/db/merkle_node/lmdb/lmdb_backend.rs
- crates/lib/src/repositories/commits/commit_writer.rs
c5fa4b6 to
c5d62fe
Compare
bb27866 to
1073eb7
Compare
c5d62fe to
b881249
Compare
1073eb7 to
c847871
Compare
b881249 to
f33799a
Compare
c847871 to
7148a67
Compare
f33799a to
6353feb
Compare
7148a67 to
cba726a
Compare
CleanCut
left a comment
There was a problem hiding this comment.
add_child
| } | ||
|
|
||
| /// Serialize this child and queue for writing. | ||
| fn add_child(&mut self, child: &dyn TMerkleTreeNode) -> Result<(), OxenError> { |
There was a problem hiding this comment.
There was a problem hiding this comment.
It's set by creating one of these sessions. It's created via session.create_node(parent) -> NodeWriteSession. It goes MerkleWriteSession -> (many) NodeWriteSessions. This add_child is on the NodeWriteSession. The finish() here gets the parent ID out of self as it pushes the PendingWrites into the queue.
| /// NOTE: to comply with [`MerkleReader::get_node`]'s semantics, this method | ||
| /// has to consider present file nodes as not existing. |
There was a problem hiding this comment.
I'm trying to wrap my head around this. What is the reason get_node returns None even when a file node exists? It's a bit confusing on first read. If the file implementation is the source of this behavior, would it be worth changing it in this PR so it is more intuitive?
There was a problem hiding this comment.
It's unintuitive! But it's here to fit how the existing oxen code expects this to work.
First, to address --
would it be worth changing it in this PR so it is more intuitive?
Yes. 100%. What this should do is have get_node return you a node full stop. This is why I put in a get_node_full method for LmdbBackend. When the time comes, we'll just delete and rename. (I also did this to document the "sane" way of doing this :) ).
Now, why is it this way? I asked myself this a ton while working on this.
Because the rest of oxen is somewhat hardocded under the premise that Merkle tree nodes are stored in these node and children files that live under a directory (the {hash prefix}/{hash suffix}/ dir).
If there was a node file for every actual file that's versioned, we'd have way too many files. Both in terms of managing (We might hit into some OS limits at the large end of repos too!) And in terms of access times.
So the original file backend for the Merkle tree made a design decision: FileNodes are actually only ever stored in children files. node files only store a DirNode, a CommitNode, or a VNode. There's only a create_node on these subtypes. Every time we see a file (or FileChunkNode) we are writing it in some add_children call. Similarly, the code only calls get_node on non-files. Files are looked up by calling get_children on its directory vnode.
Files are always accessed by doing:
- get the directory
- figure out the vnode for the file
- do
get_nodeon this - get the offset table (from the
nodefile) - seek to the position of the desired file in the
childrenfile - read & deserialize the file node from this position
What I really think the code should do is just call get_node. There shouldn't be any need to call get_children to get a file node IMO!
Doing this right now would be a really big ask though. I'd have to audit the entire codebase and make sure I track down all of these calls and change them correctly. That's a bit too much scope creep for this initial large refactor + LMDB implementation 😅 Something that's very much on my radar though!
| /// The filesystem location of the local repository. | ||
| pub(super) repo_root: PathBuf, | ||
| /// The LMDB environment that contains the [`Database`] fields. | ||
| pub(super) lmdb_env: Env<WithTls>, // note: WithTls makes this !Send. Use AnyTls if need to send between threads. |
There was a problem hiding this comment.
Nit: A small thought on the comment — Env<WithTls> doesn't seem to make LmdbBackend !Send, since the impl MerkleReader for LmdbBackend below compiles fine even with pub trait MerkleReader: Send + Sync. Could it be that you were thinking of the transactions created from Env<WithTls>? Those are !Send, though I'm not totally sure we need to mention it.
There was a problem hiding this comment.
Parameter defining that read transactions are opened with Thread Local Storage (TLS) and cannot be sent between threads
!Send. It is often faster to open TLS-backed transactions.
6353feb to
9924585
Compare
cba726a to
c8b75dd
Compare
9924585 to
d599856
Compare
c8b75dd to
a7c14ef
Compare
d599856 to
bb81b05
Compare
a7c14ef to
92f07ff
Compare
f22d8d3 to
158fd2a
Compare
92f07ff to
24d66c5
Compare
4804bb4 to
4935b29
Compare
Adds the `heed` crate to provide access LMDB (Lightning Memory-Mapped Database). Creates a new `MerkleStore` implementation using LMDB as `LmdbBackend` under the new `core::db::merkle_node::lmdb` package in `liboxen`. Extensive new tests have been added to ensure that the memory layout of LMDB values is consistent and that LMDB operations work as expected. **LMDB Store Design** The `LmdbBackend` uses two tables to store all Merkle tree nodes: 1. `merkle_tree_nodes`: u128 -> ~EMerkleTreeNode 2. `merkle_links`: u128 -> parent(u128) + []children(u128) (1) stores the actual Merkle tree node struct. It has the type and the msgpack serialized bytes for the `EMerkleTreeNode`. To maintain backwards compatability, the `EMerkleTreeNode`'s serialized representation is used as-is and _not_ modified. (Modification would require a migration). (2) stores the connections that dictate the structure of the Merkle tree. Each node maps to a `LmdbLink`, which is an optional parent connection and a list of the node's children. Each of these is are `MerkleHash`es: they're stored as 16 byte `u128` values. **`zerocopy` uses** The `zerocopy` dependency has been added as the `LmdbBackend` offers full zero-copy support for read operations. These are implemented using methods on the `LmdbBackend` struct itself. The `MerkleReader` operations require owned data, so these views have to be copied to comply with the trait design. However, this opens the door in the future to iterating on the trait to return borrows on the underlying data. Each internal table has its own zerocopy view: `LmdbNodeRef` for `LmdbNode` and `LmdbLinkRef` for `LmdbLink`. The borrows last as long as the lifetime for the read transaction because they are direct views into LMDB's internal memory-mapped pages. **`MerkleReader` implementation** The `LmdbBackend` actually stores `FileNode` and `FileChunkNode` Merkle tree nodes in its store directly. This diverges from the `FileBackend`, where, for better file access patterns and to reduce inode pressure, file nodes are only stored in the `children` file and require parsing the lookup table from the `node` file. To ensure that `LmdbBackend` adheres to the constraints of `MerkleReader`, the `get_node` and `exists` methods treat file nodes as not being present. However, the `LmdbBackend` struct provides `full_exists` & `full_get_node` which work correctly on actually stored file and file chunk nodes. **`MerkleWriter` implementation** LMDB encourages the use of short-lived transactions. Writing into LMDB directly buffers data in memory (via memory-mapped pages). Closing a transaction requires an fsync, which is an expensive syscall. The writer implementation explicitly buffers written nodes and children via a `Cell<Vec<.>>`. The enclosing write session's `finish` performs the actual write to LMDB. Node that the node write session _does not_ actually ensure that writes are persisted to LMDB, as this would incur a performance penalty via fsync.
4935b29 to
bc59215
Compare
Adds the
heedcrate to provide access LMDB (Lightning Memory-MappedDatabase). Creates a new
MerkleStoreimplementation using LMDB asLmdbBackendunder the newcore::db::merkle_node::lmdbpackage inliboxen. Extensive new tests have been added to ensure that thememory layout of LMDB values is consistent and that LMDB operations
work as expected.
LMDB Store Design
The
LmdbBackenduses two tables to store all Merkle tree nodes:merkle_tree_nodes: u128 -> ~EMerkleTreeNodemerkle_links: u128 -> parent(u128) + []children(u128)(1) stores the actual Merkle tree node struct. It has the type and
the msgpack serialized bytes for the
EMerkleTreeNode. To maintainbackwards compatability, the
EMerkleTreeNode's serialized representationis used as-is and not modified. (Modification would require a migration).
(2) stores the connections that dictate the structure of the Merkle tree.
Each node maps to a
LmdbLink, which is an optional parent connectionand a list of the node's children. Each of these is are
MerkleHashes:they're stored as 16 byte
u128values.zerocopyusesThe
zerocopydependency has been added as theLmdbBackendoffers fullzero-copy support for read operations. These are implemented using methods
on the
LmdbBackendstruct itself. TheMerkleReaderoperations requireowned data, so these views have to be copied to comply with the trait design.
However, this opens the door in the future to iterating on the trait to
return borrows on the underlying data.
Each internal table has its own zerocopy view:
LmdbNodeRefforLmdbNodeand
LmdbLinkRefforLmdbLink. The borrows last as long as the lifetimefor the read transaction because they are direct views into LMDB's internal
memory-mapped pages.
MerkleReaderimplementationThe
LmdbBackendactually storesFileNodeandFileChunkNodeMerkle treenodes in its store directly. This diverges from the
FileBackend, where,for better file access patterns and to reduce inode pressure, file nodes are
only stored in the
childrenfile and require parsing the lookup table fromthe
nodefile.To ensure that
LmdbBackendadheres to the constraints ofMerkleReader,the
get_nodeandexistsmethods treat file nodes as not being present.However, the
LmdbBackendstruct providesfull_exists&full_get_nodewhich work correctly on actually stored file and file chunk nodes.
MerkleWriterimplementationLMDB encourages the use of short-lived transactions. Writing into LMDB
directly buffers data in memory (via memory-mapped pages). Closing a transaction
requires an fsync, which is an expensive syscall. The writer implementation
explicitly buffers written nodes and children via a
Cell<Vec<.>>. The enclosingwrite session's
finishperforms the actual write to LMDB. Node that the nodewrite session does not actually ensure that writes are persisted to LMDB,
as this would incur a performance penalty via fsync.