LMDB based MerkleStore implementation #537

Merged: malcolmgreaves merged 1 commit into main from mg/merkle_lmdb_impl on May 15, 2026

@malcolmgreaves
Collaborator

@malcolmgreaves malcolmgreaves commented May 7, 2026

Adds the heed crate to provide access to LMDB (Lightning Memory-Mapped
Database). Creates a new MerkleStore implementation using LMDB as
LmdbBackend under the new core::db::merkle_node::lmdb package in
liboxen. Extensive new tests have been added to ensure that the
memory layout of LMDB values is consistent and that LMDB operations
work as expected.

LMDB Store Design
The LmdbBackend uses two tables to store all Merkle tree nodes:

  1. merkle_tree_nodes: u128 -> ~EMerkleTreeNode
  2. merkle_links: u128 -> parent(u128) + []children(u128)

(1) stores the actual Merkle tree node struct. It has the type and
the msgpack serialized bytes for the EMerkleTreeNode. To maintain
backwards compatibility, the EMerkleTreeNode's serialized representation
is used as-is and not modified. (Modification would require a migration.)

(2) stores the connections that dictate the structure of the Merkle tree.
Each node maps to a LmdbLink, which is an optional parent connection
and a list of the node's children. Each of these is a MerkleHash,
stored as a 16-byte u128 value.
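
To make the shape of a `merkle_links` value concrete, here is a minimal sketch of encoding and decoding a link record. The `Link` type, the one-byte parent flag, and the little-endian byte order are assumptions made for illustration; the real `LmdbLink` also carries a versioned header, which is left out here.

```rust
use std::convert::TryFrom;

/// Hypothetical, simplified link record: an optional parent plus children.
/// The real `LmdbLink` has a versioned header that this sketch omits.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Link {
    pub parent: Option<u128>,
    pub children: Vec<u128>,
}

/// Reads one 16-byte little-endian hash. Callers pass exactly 16 bytes.
fn read_hash(chunk: &[u8]) -> u128 {
    u128::from_le_bytes(<[u8; 16]>::try_from(chunk).expect("16-byte chunk"))
}

impl Link {
    /// Assumed layout for this sketch:
    /// [1 byte: has_parent][16 bytes: parent, zero if absent][16 bytes per child]
    pub fn encode(&self) -> Vec<u8> {
        let mut out = Vec::with_capacity(17 + 16 * self.children.len());
        out.push(self.parent.is_some() as u8);
        out.extend_from_slice(&self.parent.unwrap_or(0).to_le_bytes());
        for child in &self.children {
            out.extend_from_slice(&child.to_le_bytes());
        }
        out
    }

    /// Rejects records whose length or parent flag does not match the layout.
    pub fn decode(bytes: &[u8]) -> Result<Self, String> {
        if bytes.len() < 17 || (bytes.len() - 17) % 16 != 0 {
            return Err(format!("bad link record length: {}", bytes.len()));
        }
        let parent = match bytes[0] {
            0 => None,
            1 => Some(read_hash(&bytes[1..17])),
            other => return Err(format!("bad parent flag: {}", other)),
        };
        let children = bytes[17..].chunks_exact(16).map(read_hash).collect();
        Ok(Link { parent, children })
    }
}
```

Because every hash has a fixed width, the number of children follows from the record length alone, so no explicit count needs to be stored in this sketch.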

zerocopy uses
The zerocopy dependency has been added as the LmdbBackend offers full
zero-copy support for read operations. These are implemented using methods
on the LmdbBackend struct itself. The MerkleReader operations require
owned data, so these views have to be copied to comply with the trait design.
However, this opens the door in the future to iterating on the trait to
return borrows on the underlying data.

Each internal table has its own zerocopy view: LmdbNodeRef for LmdbNode
and LmdbLinkRef for LmdbLink. The borrows last as long as the lifetime
for the read transaction because they are direct views into LMDB's internal
memory-mapped pages.
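
As a rough illustration of why the borrowed views are tied to the read transaction, the sketch below uses only the standard library rather than the `zerocopy` crate. `HashesRef` is a hypothetical name; it holds nothing but a slice, so the compiler rejects any use after the backing bytes go away. Hashes are assumed to be little-endian and are read with `from_le_bytes` because a page offset is not guaranteed to satisfy `u128` alignment.

```rust
use std::convert::TryFrom;

/// Hypothetical borrowed view over packed 16-byte hashes.
/// It stores only a slice, so it cannot outlive the bytes it points into.
#[derive(Debug, Clone, Copy)]
pub struct HashesRef<'a> {
    bytes: &'a [u8],
}

impl<'a> HashesRef<'a> {
    /// Validates the length once; no bytes are copied.
    pub fn new(bytes: &'a [u8]) -> Result<Self, String> {
        if bytes.len() % 16 != 0 {
            return Err(format!("length {} is not a multiple of 16", bytes.len()));
        }
        Ok(HashesRef { bytes })
    }

    /// Number of hashes in the view.
    pub fn len(&self) -> usize {
        self.bytes.len() / 16
    }

    /// Decodes the hash at `index` straight out of the borrowed bytes.
    pub fn get(&self, index: usize) -> Option<u128> {
        let start = index.checked_mul(16)?;
        let end = start.checked_add(16)?;
        let chunk = self.bytes.get(start..end)?;
        let array = <[u8; 16]>::try_from(chunk).ok()?;
        Some(u128::from_le_bytes(array))
    }

    /// The copy that a trait returning owned data would force on callers.
    pub fn to_vec(&self) -> Vec<u128> {
        (0..self.len()).filter_map(|i| self.get(i)).collect()
    }
}
```

Only `to_vec` allocates; `get` and `len` work directly on the borrowed bytes, which is the property a future borrow-returning trait could build on.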

MerkleReader implementation
The LmdbBackend actually stores FileNode and FileChunkNode Merkle tree
nodes in its store directly. This diverges from the FileBackend, where,
for better file access patterns and to reduce inode pressure, file nodes are
only stored in the children file and require parsing the lookup table from
the node file.

To ensure that LmdbBackend adheres to the constraints of MerkleReader,
the get_node and exists methods treat file nodes as not being present.

However, the LmdbBackend struct provides full_exists & full_get_node
which work correctly on actually stored file and file chunk nodes.
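
The split between the trait-level and the "full" lookups can be sketched as a filter over the same underlying read. Everything below is a stand-in: `Store` is an in-memory map instead of LMDB, and `Kind` is a hypothetical substitute for the node type enum.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the Merkle tree node type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind {
    Commit,
    Dir,
    VNode,
    File,
    FileChunk,
}

impl Kind {
    fn is_file_like(self) -> bool {
        matches!(self, Kind::File | Kind::FileChunk)
    }
}

/// In-memory stand-in for the node table.
#[derive(Default)]
pub struct Store {
    nodes: HashMap<u128, Kind>,
}

impl Store {
    pub fn put(&mut self, hash: u128, kind: Kind) {
        self.nodes.insert(hash, kind);
    }

    /// Returns whatever is stored, including file and file chunk nodes.
    pub fn full_get_node(&self, hash: u128) -> Option<Kind> {
        self.nodes.get(&hash).copied()
    }

    pub fn full_exists(&self, hash: u128) -> bool {
        self.nodes.contains_key(&hash)
    }

    /// Trait-style lookup: file-like nodes are reported as absent.
    pub fn get_node(&self, hash: u128) -> Option<Kind> {
        self.full_get_node(hash).filter(|kind| !kind.is_file_like())
    }

    pub fn exists(&self, hash: u128) -> bool {
        self.get_node(hash).is_some()
    }
}
```

Expressing `get_node` as a filter over `full_get_node` keeps the two in step, and removing the filter later would be a one-line change in a sketch like this.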

MerkleWriter implementation
LMDB encourages the use of short-lived transactions. Writing into LMDB
directly buffers data in memory (via memory-mapped pages). Committing a transaction
requires an fsync, which is an expensive syscall. The writer implementation
explicitly buffers written nodes and children via a Cell<Vec<.>>. The enclosing
write session's finish performs the actual write to LMDB. Note that the node
write session does not actually ensure that writes are persisted to LMDB,
as this would incur a performance penalty via fsync.
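
A minimal sketch of the buffering idea follows, with a fake database that counts commits so the batching is observable. `FakeDb` and `WriteSession` are hypothetical names, and nothing here touches LMDB or heed; the `Cell<Vec<..>>` mirrors the buffer described above.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Stand-in for the database: counts commits so batching is observable.
#[derive(Default)]
pub struct FakeDb {
    pub rows: HashMap<u128, Vec<u8>>,
    pub commits: usize,
}

impl FakeDb {
    /// Applies every row in one "transaction" and counts a single commit.
    pub fn commit_batch(&mut self, batch: Vec<(u128, Vec<u8>)>) {
        self.rows.extend(batch);
        self.commits += 1;
    }
}

/// Hypothetical write session: queues rows and writes them all in `finish`.
pub struct WriteSession<'db> {
    db: &'db mut FakeDb,
    pending: Cell<Vec<(u128, Vec<u8>)>>,
}

impl<'db> WriteSession<'db> {
    pub fn begin(db: &'db mut FakeDb) -> Self {
        WriteSession { db, pending: Cell::new(Vec::new()) }
    }

    /// Buffers a row through `&self`; the database is not touched.
    pub fn queue(&self, hash: u128, bytes: Vec<u8>) {
        let mut pending = self.pending.take();
        pending.push((hash, bytes));
        self.pending.set(pending);
    }

    /// Number of rows waiting for `finish`.
    pub fn pending_len(&self) -> usize {
        let pending = self.pending.take();
        let len = pending.len();
        self.pending.set(pending);
        len
    }

    /// Drains the buffer and performs exactly one commit.
    pub fn finish(self) -> usize {
        let WriteSession { db, pending } = self;
        let batch = pending.into_inner();
        let written = batch.len();
        db.commit_batch(batch);
        written
    }
}
```

However many rows are queued, the expensive step runs once per session rather than once per node, which is the trade the description is making.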

@coderabbitai
Contributor

coderabbitai Bot commented May 7, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Adds an LMDB-backed Merkle-node backend with workspace deps, module wiring, versioned on-disk formats, zero-copy mmap reads, buffered write sessions that commit atomically, reader/writer trait implementations, and supporting logging/tests/refactors.

Changes

LMDB Backend Implementation

| Layer | File(s) | Summary |
|---|---|---|
| Dependencies & Error Handling | Cargo.toml, crates/lib/Cargo.toml, crates/lib/src/error.rs | Adds heed and zerocopy workspace dependencies and introduces OxenError::Lmdb variant for automatic conversion. |
| Module Wiring & Public API | crates/lib/src/core/db/merkle_node.rs, crates/lib/src/core/db/merkle_node/lmdb.rs | Declares LMDB submodules; publicly re-exports FileBackend and LmdbBackend; defines LmdbError enum covering header/link validation, version mismatches, LMDB/heed transport errors, and integrity violations; includes test helpers. |
| Versioned On-Disk Data Model | crates/lib/src/core/db/merkle_node/lmdb/value_structs.rs | Defines fixed-size headers (LmdbNodeHeaderV1, LmdbLinkHeaderV1) with magic and version tags; provides borrowed zero-copy readers (LmdbNodeRef, LmdbLinkRef) and owned write-side types (LmdbNode, LmdbLink) with encode/decode and extensive tests for layout invariants and error cases. |
| Backend Storage & Transactions | crates/lib/src/core/db/merkle_node/lmdb/lmdb_backend.rs | Implements LmdbBackend managing LMDB environment and two persisted tables; provides zero-copy retrieval via mmap borrowing, key existence checks, and serialized writes; exposes full_exists, full_get_node, and get_links. |
| Reader Implementation | crates/lib/src/core/db/merkle_node/lmdb/reader.rs | Implements MerkleReader for LmdbBackend with per-call short-lived read transactions and zero-copy reads; trait methods treat file nodes as absent while full APIs return file nodes; link records reconstruct parent/child relationships with integrity errors when references are missing. |
| Writer Implementation | crates/lib/src/core/db/merkle_node/lmdb/writer.rs | Implements MerkleWriter with LmdbWriteSession buffering writes until finish, which opens one heed::RwTxn to atomically write node and link records; LmdbNodeWriteSession buffers children and defers I/O; includes comprehensive tests covering queueing, ordering, idempotency, and reader compatibility. |
| Supporting Changes | .gitignore, crates/lib/src/core/db/merkle_node/merkle_node_db.rs, crates/lib/src/core/v_latest/index/commit_merkle_tree.rs, crates/lib/src/model/merkle_tree/node/*.rs, crates/lib/src/repositories/commits/commit_writer.rs | Adds .claude/*.lock ignore rule; enables trace-level logging for node writes and tree reads; adds #[inline(always)] hints to several deserialize methods; renames a regression-test helper and updates doc comments and small whitespace edits. |

Sequence Diagram(s)

sequenceDiagram
  participant App as Application
  participant Writer as MerkleWriter
  participant Session as LmdbWriteSession
  participant NodeSess as LmdbNodeWriteSession
  participant Queue as Pending<br/>Queue
  participant Txn as LMDB<br/>RwTxn
  
  App->>Writer: begin()
  activate Writer
  Writer->>Queue: new Queue
  Writer->>Session: LmdbWriteSession { Queue }
  Writer-->>App: Session
  deactivate Writer
  
  App->>Session: begin_node(parent_id)
  activate Session
  Session->>NodeSess: new(parent_id)
  Session-->>App: NodeSession
  deactivate Session
  
  App->>NodeSess: add_child(hash, node)
  activate NodeSess
  NodeSess->>NodeSess: serialize & buffer
  deactivate NodeSess
  
  App->>NodeSess: finish()
  activate NodeSess
  NodeSess->>Queue: enqueue PendingWrite
  deactivate NodeSess
  
  App->>Session: finish()
  activate Session
  Session->>Queue: drain all writes
  Session->>Txn: RwTxn::new()
  activate Txn
  loop for each write
    Session->>Txn: write node -> merkle_tree_nodes
    Session->>Txn: write link -> merkle_links
  end
  Session->>Txn: commit()
  deactivate Txn
  Session-->>App: Result
  deactivate Session
classDiagram
  class LmdbBackend {
    -repo_root: PathBuf
    -env: Env
    -merkle_tree_nodes: Database
    -merkle_links: Database
    +new(repo_root, options) Result
    +full_exists(hash) Result~bool~
    +full_get_node(hash) Result~Option~LmdbNode~~
    +get_links(hash) Result~Option~LmdbLink~~
  }
  
  class LmdbNode {
    -kind: MerkleTreeNodeType
    -data: Vec~u8~
    +kind() MerkleTreeNodeType
    +data() &[u8]
    +encode() Vec~u8~
    +decode(bytes) Result~Self~
  }
  
  class LmdbLink {
    -parent_id: Option~MerkleHash~
    -children: Vec~MerkleHash~
    +parent_id() Option~MerkleHash~
    +children() &[MerkleHash]
    +encode() Vec~u8~
    +decode(bytes) Result~Self~
  }
  
  class MerkleReader {
    +exists(hash) Result~bool~
    +get_node(hash) Result~Option~MerkleEntry~~
    +get_children(hash) Result~Vec~(MerkleHash, MerkleTreeNode)~~
  }
  
  class MerkleWriter {
    +begin() Result~Box~dyn MerkleWriteSession~~
  }
  
  LmdbBackend --> LmdbNode: serializes/deserializes
  LmdbBackend --> LmdbLink: serializes/deserializes
  LmdbBackend ..|> MerkleReader: implements
  LmdbBackend ..|> MerkleWriter: implements

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Oxen-AI/Oxen#510: Yes—main PR adds an LmdbBackend implementation of the MerkleReader/MerkleWriter interfaces and its LmdbError wiring, directly building on the merkle-tree reader/writer traits introduced in PR #510.
  • Oxen-AI/Oxen#406: The LMDB backend’s writer/reader use of &dyn TMerkleTreeNode and node serialization interacts with the object-safe TMerkleTreeNode changes in PR #406.

Suggested reviewers

  • CleanCut
  • gschoeni

Poem

A rabbit in a data glen, 🐇
mmap fields hum, headers penned,
queued writes nap in tidy rows,
roots awake where merkle grows,
I hop and guard each byte they send.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately and concisely describes the main change: implementing an LMDB-based MerkleStore backend. It directly corresponds to the substantial new code in lmdb.rs, lmdb_backend.rs, reader.rs, value_structs.rs, and writer.rs files. |
| Description check | ✅ Passed | The description is comprehensive and directly related to the changeset, explaining the LMDB design, table structure, zerocopy usage, and MerkleReader/MerkleWriter implementations. It accurately reflects the actual changes made. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@malcolmgreaves
Collaborator Author

NOTE: Stacked PR! #531 must be merged first!

@malcolmgreaves malcolmgreaves force-pushed the mg/reorder_merkle_write_ops branch 2 times, most recently from 7dc9c8b to d1ae637 Compare May 8, 2026 00:16
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_lmdb_impl branch 24 times, most recently from 344e76d to 022ba55 Compare May 11, 2026 22:51
@malcolmgreaves malcolmgreaves marked this pull request as ready for review May 11, 2026 22:53
@malcolmgreaves
Collaborator Author

NOTE: Stacked PR! #531 must be merged first!!

Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
crates/lib/src/core/db/merkle_node/lmdb/writer.rs (1)

62-62: 💤 Low value

Minor typo in documentation.

Arc<Mutuex<.>> should be Arc<Mutex<...>>.

📝 Proposed fix
-/// multi-threading is needed one day, then this can be migrated to an [`Arc<Mutuex<.>>`].
+/// multi-threading is needed one day, then this can be migrated to an [`Arc<Mutex<...>>`].
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/lib/src/core/db/merkle_node/lmdb/writer.rs` at line 62, Fix the typo
in the doc comment inside writer.rs where it reads "Arc<Mutuex<.>>": replace it
with the correct "Arc<Mutex<...>>" (correcting "Mutuex" to "Mutex" and using
"..." or "..."-style generics instead of "."). Update the comment near the
multi-threading note so it reads e.g. "Arc<Mutex<...>>" to accurately reference
the Rust Mutex type.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 34cff33f-7c5e-408e-bebc-7fc0a2ddc664

📥 Commits

Reviewing files that changed from the base of the PR and between 4dcda45 and caa630a.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (18)
  • .gitignore
  • Cargo.toml
  • crates/lib/Cargo.toml
  • crates/lib/src/core/db/merkle_node.rs
  • crates/lib/src/core/db/merkle_node/lmdb.rs
  • crates/lib/src/core/db/merkle_node/lmdb/lmdb_backend.rs
  • crates/lib/src/core/db/merkle_node/lmdb/reader.rs
  • crates/lib/src/core/db/merkle_node/lmdb/value_structs.rs
  • crates/lib/src/core/db/merkle_node/lmdb/writer.rs
  • crates/lib/src/core/db/merkle_node/merkle_node_db.rs
  • crates/lib/src/core/v_latest/index/commit_merkle_tree.rs
  • crates/lib/src/error.rs
  • crates/lib/src/model/merkle_tree/node/commit_node.rs
  • crates/lib/src/model/merkle_tree/node/dir_node.rs
  • crates/lib/src/model/merkle_tree/node/file_chunk_node.rs
  • crates/lib/src/model/merkle_tree/node/file_node.rs
  • crates/lib/src/model/merkle_tree/node/vnode.rs
  • crates/lib/src/repositories/commits/commit_writer.rs
✅ Files skipped from review due to trivial changes (6)
  • crates/lib/src/model/merkle_tree/node/file_chunk_node.rs
  • crates/lib/src/model/merkle_tree/node/dir_node.rs
  • .gitignore
  • crates/lib/src/model/merkle_tree/node/file_node.rs
  • crates/lib/src/core/v_latest/index/commit_merkle_tree.rs
  • crates/lib/src/core/db/merkle_node/merkle_node_db.rs
🚧 Files skipped from review as they are similar to previous changes (8)
  • crates/lib/src/model/merkle_tree/node/commit_node.rs
  • Cargo.toml
  • crates/lib/src/error.rs
  • crates/lib/src/core/db/merkle_node.rs
  • crates/lib/src/core/db/merkle_node/lmdb/reader.rs
  • crates/lib/src/core/db/merkle_node/lmdb/value_structs.rs
  • crates/lib/src/repositories/commits/commit_writer.rs
  • crates/lib/src/core/db/merkle_node/lmdb/lmdb_backend.rs

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/lib/src/core/db/merkle_node/lmdb/writer.rs`:
- Around line 154-156: The doc comment on NodeWriteSession incorrectly asserts
that get_node will return file hashes; update the comment to reflect actual
behavior: clarify that although file nodes may be stored by this writer
(contrast with super::super::file_backend's NodeWriteSession),
MerkleReader::get_node intentionally returns None for File and FileChunk types,
so callers should not rely on get_node to retrieve file or file-chunk nodes.
Mention the specific symbols MerkleReader, get_node, File, and FileChunk to make
the behavior explicit and prevent misuse.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e87c0e5f-d92f-4444-b64b-1e70e2863426

📥 Commits

Reviewing files that changed from the base of the PR and between caa630a and bb27866.

⛔ Files ignored due to path filters (1)
  • Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (18)
  • .gitignore
  • Cargo.toml
  • crates/lib/Cargo.toml
  • crates/lib/src/core/db/merkle_node.rs
  • crates/lib/src/core/db/merkle_node/lmdb.rs
  • crates/lib/src/core/db/merkle_node/lmdb/lmdb_backend.rs
  • crates/lib/src/core/db/merkle_node/lmdb/reader.rs
  • crates/lib/src/core/db/merkle_node/lmdb/value_structs.rs
  • crates/lib/src/core/db/merkle_node/lmdb/writer.rs
  • crates/lib/src/core/db/merkle_node/merkle_node_db.rs
  • crates/lib/src/core/v_latest/index/commit_merkle_tree.rs
  • crates/lib/src/error.rs
  • crates/lib/src/model/merkle_tree/node/commit_node.rs
  • crates/lib/src/model/merkle_tree/node/dir_node.rs
  • crates/lib/src/model/merkle_tree/node/file_chunk_node.rs
  • crates/lib/src/model/merkle_tree/node/file_node.rs
  • crates/lib/src/model/merkle_tree/node/vnode.rs
  • crates/lib/src/repositories/commits/commit_writer.rs
✅ Files skipped from review due to trivial changes (7)
  • crates/lib/src/model/merkle_tree/node/file_chunk_node.rs
  • crates/lib/src/model/merkle_tree/node/vnode.rs
  • .gitignore
  • crates/lib/src/core/v_latest/index/commit_merkle_tree.rs
  • Cargo.toml
  • crates/lib/src/model/merkle_tree/node/commit_node.rs
  • crates/lib/src/model/merkle_tree/node/file_node.rs
🚧 Files skipped from review as they are similar to previous changes (7)
  • crates/lib/Cargo.toml
  • crates/lib/src/core/db/merkle_node.rs
  • crates/lib/src/core/db/merkle_node/merkle_node_db.rs
  • crates/lib/src/core/db/merkle_node/lmdb.rs
  • crates/lib/src/core/db/merkle_node/lmdb/value_structs.rs
  • crates/lib/src/core/db/merkle_node/lmdb/lmdb_backend.rs
  • crates/lib/src/repositories/commits/commit_writer.rs

Comment thread crates/lib/src/core/db/merkle_node/lmdb/writer.rs Outdated
@malcolmgreaves malcolmgreaves force-pushed the mg/reorder_merkle_write_ops branch from c5fa4b6 to c5d62fe Compare May 13, 2026 20:54
@malcolmgreaves malcolmgreaves force-pushed the mg/reorder_merkle_write_ops branch from c5d62fe to b881249 Compare May 14, 2026 03:41
@malcolmgreaves malcolmgreaves force-pushed the mg/reorder_merkle_write_ops branch from b881249 to f33799a Compare May 14, 2026 19:20
@malcolmgreaves malcolmgreaves force-pushed the mg/reorder_merkle_write_ops branch from f33799a to 6353feb Compare May 14, 2026 22:05
Contributor

@CleanCut CleanCut left a comment

⚠️ Double-check that a child's parent gets set when you call add_child

}

/// Serialize this child and queue for writing.
fn add_child(&mut self, child: &dyn TMerkleTreeNode) -> Result<(), OxenError> {
Contributor

⚠️ What sets the child's parent?

Collaborator Author

It's set by creating one of these sessions. It's created via session.create_node(parent) -> NodeWriteSession. It goes MerkleWriteSession -> (many) NodeWriteSessions. This add_child is on the NodeWriteSession. The finish() here gets the parent ID out of self as it pushes the PendingWrites into the queue.
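
To illustrate the flow in this reply, here is a small sketch of how the parent id can travel from the node session into each queued write. `Session`, `NodeSession`, and `PendingChild` are hypothetical simplifications, not the types in writer.rs.

```rust
/// Hypothetical queued write: a child hash together with its parent.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct PendingChild {
    pub child: u128,
    pub parent: u128,
}

/// Hypothetical outer session that owns the queue.
#[derive(Default)]
pub struct Session {
    queue: Vec<PendingChild>,
}

/// Hypothetical per-node session: remembers which node its children belong to.
pub struct NodeSession<'s> {
    session: &'s mut Session,
    parent: u128,
    children: Vec<u128>,
}

impl Session {
    /// The parent id is fixed here, when the node session is created.
    pub fn create_node(&mut self, parent: u128) -> NodeSession<'_> {
        NodeSession { session: self, parent, children: Vec::new() }
    }

    pub fn queued(&self) -> &[PendingChild] {
        &self.queue
    }
}

impl NodeSession<'_> {
    /// Only buffers the child; it carries no parent information yet.
    pub fn add_child(&mut self, child: u128) {
        self.children.push(child);
    }

    /// Stamps every buffered child with this session's parent id.
    pub fn finish(self) {
        let NodeSession { session, parent, children } = self;
        for child in children {
            session.queue.push(PendingChild { child, parent });
        }
    }
}
```

In this sketch `add_child` never sees a parent argument because the node session already knows it; the association is made when `finish` moves the children into the queue.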

Comment on lines +39 to +40
/// NOTE: to comply with [`MerkleReader::get_node`]'s semantics, this method
/// has to consider present file nodes as not existing.
Contributor

I'm trying to wrap my head around this. What is the reason get_node returns None even when a file node exists? It's a bit confusing on first read. If the file implementation is the source of this behavior, would it be worth changing it in this PR so it is more intuitive?

Collaborator Author

It's unintuitive! But it's here to fit how the existing oxen code expects this to work.

First, to address --

would it be worth changing it in this PR so it is more intuitive?

Yes. 100%. What this should do is have get_node return you a node, full stop. This is why I put in a full_get_node method for LmdbBackend. When the time comes, we'll just delete and rename. (I also did this to document the "sane" way of doing this :) ).

Now, why is it this way? I asked myself this a ton while working on this.

Because the rest of oxen is somewhat hardcoded under the premise that Merkle tree nodes are stored in these node and children files that live under a directory (the {hash prefix}/{hash suffix}/ dir).

If there was a node file for every actual file that's versioned, we'd have way too many files, both in terms of managing them (we might hit some OS limits at the large end of repos too!) and in terms of access times.

So the original file backend for the Merkle tree made a design decision: FileNodes are actually only ever stored in children files. node files only store a DirNode, a CommitNode, or a VNode. There's only a create_node on these subtypes. Every time we see a file (or FileChunkNode) we are writing it in some add_children call. Similarly, the code only calls get_node on non-files. Files are looked up by calling get_children on their directory vnode.

Files are always accessed by doing:

  1. get the directory
  2. figure out the vnode for the file
  3. do get_node on this
  4. get the offset table (from the node file)
  5. seek to the position of the desired file in the children file
  6. read & deserialize the file node from this position
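
Steps 4 to 6 above can be sketched roughly as an offset-table lookup followed by a seek and a bounded read. The table shape (hash mapped to offset and length) and the function names are assumptions for illustration; the real node and children file formats are not reproduced here.

```rust
use std::collections::HashMap;
use std::io::{Read, Seek, SeekFrom};

/// Hypothetical offset table: child hash -> (offset, length) in the children blob.
pub type OffsetTable = HashMap<u128, (u64, u32)>;

/// Appends a serialized child and records where it landed.
pub fn append_child(
    children: &mut Vec<u8>,
    table: &mut OffsetTable,
    hash: u128,
    payload: &[u8],
) {
    let offset = children.len() as u64;
    children.extend_from_slice(payload);
    table.insert(hash, (offset, payload.len() as u32));
}

/// Looks up the offset, seeks to it, and reads exactly one child's bytes.
pub fn read_child<R: Read + Seek>(
    children: &mut R,
    table: &OffsetTable,
    hash: u128,
) -> std::io::Result<Option<Vec<u8>>> {
    let (offset, len) = match table.get(&hash) {
        Some(&entry) => entry,
        None => return Ok(None),
    };
    children.seek(SeekFrom::Start(offset))?;
    let mut buf = vec![0u8; len as usize];
    children.read_exact(&mut buf)?;
    Ok(Some(buf))
}
```

Compared with a direct keyed lookup, every file read in this scheme goes through its parent's table first, which is the indirection a plain get_node on the LMDB store would avoid.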

What I really think the code should do is just call get_node. There shouldn't be any need to call get_children to get a file node IMO!

Doing this right now would be a really big ask though. I'd have to audit the entire codebase and make sure I track down all of these calls and change them correctly. That's a bit too much scope creep for this initial large refactor + LMDB implementation 😅 Something that's very much on my radar though!

/// The filesystem location of the local repository.
pub(super) repo_root: PathBuf,
/// The LMDB environment that contains the [`Database`] fields.
pub(super) lmdb_env: Env<WithTls>, // note: WithTls makes this !Send. Use AnyTls if need to send between threads.
Contributor

Nit: A small thought on the comment — Env<WithTls> doesn't seem to make LmdbBackend !Send, since the impl MerkleReader for LmdbBackend below compiles fine even with pub trait MerkleReader: Send + Sync. Could it be that you were thinking of the transactions created from Env<WithTls>? Those are !Send, though I'm not totally sure we need to mention it.

Collaborator Author

Parameter defining that read transactions are opened with Thread Local Storage (TLS) and cannot be sent between threads !Send. It is often faster to open TLS-backed transactions.

https://docs.rs/heed/latest/heed/enum.WithTls.html
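
For anyone following the `Send` discussion, the distinction can be checked at compile time with a small generic function. This is plain Rust with hypothetical stand-in types (`Handle`, `ThreadBoundTxn`); it does not use heed and makes no claim about how heed marks its own types.

```rust
use std::marker::PhantomData;

/// Compile-time check: this only type-checks when `T: Send + Sync`.
pub fn assert_send_sync<T: Send + Sync>() {}

/// Hypothetical stand-in for a long-lived handle that may cross threads.
pub struct Handle {
    pub path: String,
}

/// Hypothetical stand-in for a thread-bound transaction. The raw-pointer
/// marker opts the type out of the `Send` and `Sync` auto traits.
pub struct ThreadBoundTxn<'h> {
    pub handle: &'h Handle,
    _not_send: PhantomData<*const ()>,
}

impl Handle {
    /// Borrows the handle; the returned value must stay on this thread.
    pub fn begin(&self) -> ThreadBoundTxn<'_> {
        ThreadBoundTxn { handle: self, _not_send: PhantomData }
    }
}

/// Compiles because `Handle` contains only `Send + Sync` fields.
pub fn handle_is_send_sync() {
    assert_send_sync::<Handle>();
    // assert_send_sync::<ThreadBoundTxn<'static>>(); // would fail to compile
}
```

A check like `handle_is_send_sync` documents the thread-safety expectation in code, so a wrong comment about `!Send` would show up as a compile error instead of a review nit.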

Comment thread crates/lib/src/core/db/merkle_node/lmdb/lmdb_backend.rs
Comment thread crates/lib/src/core/db/merkle_node/lmdb/writer.rs Outdated
@malcolmgreaves malcolmgreaves force-pushed the mg/reorder_merkle_write_ops branch from 6353feb to 9924585 Compare May 15, 2026 03:32
@malcolmgreaves malcolmgreaves force-pushed the mg/reorder_merkle_write_ops branch from 9924585 to d599856 Compare May 15, 2026 03:47
@malcolmgreaves malcolmgreaves force-pushed the mg/reorder_merkle_write_ops branch from d599856 to bb81b05 Compare May 15, 2026 03:56
@malcolmgreaves malcolmgreaves force-pushed the mg/reorder_merkle_write_ops branch 2 times, most recently from f22d8d3 to 158fd2a Compare May 15, 2026 04:04
Base automatically changed from mg/reorder_merkle_write_ops to main May 15, 2026 04:15
@malcolmgreaves malcolmgreaves force-pushed the mg/merkle_lmdb_impl branch 5 times, most recently from 4804bb4 to 4935b29 Compare May 15, 2026 06:23
@malcolmgreaves malcolmgreaves merged commit 526222d into main May 15, 2026
9 checks passed
@malcolmgreaves malcolmgreaves deleted the mg/merkle_lmdb_impl branch May 15, 2026 19:37