Skip to content

● refactor: extract shared DbLifecycle trait from DbV0/DbV1#1050

Open
zancas wants to merge 8 commits intodevfrom
DRY_db
Open

● refactor: extract shared DbLifecycle trait from DbV0/DbV1#1050
zancas wants to merge 8 commits intodevfrom
DRY_db

Conversation

@zancas
Copy link
Copy Markdown
Member

@zancas zancas commented Apr 24, 2026

Fixes: #1049
Fixes: #1033

Move the ~70 LOC of byte-identical lifecycle scaffolding (status,
wait_until_ready, clean_trailing, zaino_db_handler_sleep, shutdown)
and the open_or_create_db helper from db/v0.rs and db/v1.rs into a
single pub(super) trait DbLifecycle and a free fn in the parent db.rs.

DbV0 and DbV1 now impl DbLifecycle via four field getters; their
DbCore::{status, shutdown} delegate to the trait. The duplicated
tokio::select! shutdown/abort block and the three-way
sleep-or-maintenance body now live in one place.

DbBackend::{status, shutdown} in db.rs use fully-qualified DbCore::
paths to disambiguate the two traits now in scope on DbV0/DbV1.

#1033 (notify_one race) preserved verbatim on DbLifecycle::shutdown —
fixing it is now a single-call-site change.

Parent: #862

Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com

…bV1 (#1049)

  Move the ~70 LOC of byte-identical lifecycle scaffolding (status,
  wait_until_ready, clean_trailing, zaino_db_handler_sleep, shutdown)
  and the open_or_create_db helper from db/v0.rs and db/v1.rs into a
  single pub(super) trait DbLifecycle and a free fn in the parent db.rs.

  DbV0 and DbV1 now impl DbLifecycle via four field getters; their
  DbCore::{status, shutdown} delegate to the trait. The duplicated
  tokio::select! shutdown/abort block and the three-way
  sleep-or-maintenance body now live in one place.

  DbBackend::{status, shutdown} in db.rs use fully-qualified DbCore::
  paths to disambiguate the two traits now in scope on DbV0/DbV1.

  #1033 (notify_one race) preserved verbatim on DbLifecycle::shutdown —
  fixing it is now a single-call-site change.

  Parent: #862

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@zancas zancas changed the title ● refactor(zaino-state): extract shared DbLifecycle trait from DbV0/D… ● refactor: extract shared DbLifecycle trait from DbV0/DbV1 Apr 24, 2026
@zancas zancas requested a review from AloeareV April 24, 2026 21:30
zancas added 2 commits April 24, 2026 14:36
… race

  DbLifecycle::shutdown signals waiters with Notify::notify_one, which
  wakes at most one task and stores at most one permit. With N > 1
  tasks awaiting the same shutdown_notify, N-1 are stranded.

  Adds a maximally narrow test in db.rs::shutdown that impls
  DbLifecycle on a minimal FakeDb, spawns N=3 tasks each registering
  interest via Notified::enable + Barrier, then asserts every waiter
  completes within 200ms after shutdown returns.

  As intended, this test uniquely fails among the 226 in-package
  tests (225 passed, 1 failed, 2 skipped). The fix — switching the
  signaling primitive to a wake-all mechanism (watch, flag +
  notify_waiters, CancellationToken) — will flip it green and is
  tracked in #1033.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@zancas
Copy link
Copy Markdown
Member Author

zancas commented Apr 24, 2026

How to fix #1033, I chose tokio_util "option C" below.

Analysis

B (watch::channel) is the minimum-dependency choice. The pattern is already in use a few hundred
lines away in finalised_state.rs:547, so adopting it here makes the codebase more uniform. Waiter
ergonomics (changed() + borrow() or wait_for) are slightly clunkier than C's flat cancelled().await,
but not much. No new direct crate edge.

C (tokio_util::sync::CancellationToken) is the semantic fit — purpose-built for exactly this scenario.
Waiter and notifier APIs are each one line. The decisive feature is child_token(): as DbV2 lands
(#860) and likely spawns multiple sub-tasks (validator, compactor, metrics), each can carry a child
token. Cancelling a child does not propagate up, which contains worker-bug blast radius — a property
watch does not offer without manual scoping. tokio-util is a near-universal companion to tokio; adding
it as a direct edge is cheap.

zancas added 2 commits April 24, 2026 15:33
  Replaces Arc<Notify> + notify_one with tokio_util CancellationToken in
  DbLifecycle. notify_one wakes at most one waiter and stores at most
  one permit, so N>1 background tasks awaiting the same shutdown_notify
  were stranded after shutdown. cancel() wakes every current waiter and
  persists state for late subscribers; cancelled() handles the
  late-poll case without an explicit register-before-wait dance.

  Adds tokio-util = "0.7" to the workspace and zaino-state crate.

  Trait change: shutdown_notify() -> &Arc<Notify> becomes cancel_token()
  -> &CancellationToken. The two consumer sites (DbLifecycle::shutdown
  notifier and zaino_db_handler_sleep waiter) and every Self {...} clone
  literal across db/v0.rs, db/v1.rs, db/v1/write_core.rs, and
  db/v1/compact_block.rs are updated to match.

  Regression test from #1049 (db::shutdown::wakes_every_shutdown_waiter)
  now passes — flips from the unique failure to a green invariant.

  Closes #1033.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@zancas zancas marked this pull request as ready for review April 24, 2026 22:35
@zancas zancas requested review from nachog00 and removed request for AloeareV April 24, 2026 23:46
@zancas zancas requested review from idky137 and removed request for nachog00 April 27, 2026 15:10
@zancas
Copy link
Copy Markdown
Member Author

zancas commented Apr 27, 2026

@idky137 I have not carefully evaluated the claims in "C", I would appreciate your insights worker threads management at that layer.

let tmp = tempfile::tempdir().unwrap();
let env = Arc::new(
lmdb::Environment::new()
.set_map_size(1 << 20)
Copy link
Copy Markdown
Member Author

@zancas zancas Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@idky137 I don't understand why this is necessary.

@idky137 idky137 assigned idky137 and unassigned nachog00 and idky137 Apr 27, 2026
@idky137
Copy link
Copy Markdown
Contributor

idky137 commented Apr 27, 2026

utACK modulo one architectural concern for the follow-up discussion.

This refactor looks fine for the current LMDB-backed DbV0/DbV1 implementations, and I do not think it blocks the planned graph-based migration work.

However, one thing worth clearing up is that DbLifecycle now bakes LMDB-specific details into the shared DbVx lifecycle abstraction via env() and clean_trailing(). The versioned DB model was intended to allow each DbVx to choose its own underlying storage implementation — LMDB today, but potentially RocksDB or a future Rust-native store later. This trait moves us slightly further toward “every DbVx has an LMDB Environment”.

Not a blocker for this PR, but before this abstraction spreads further we may want to make a small adjustment so the generic lifecycle remains storage-agnostic, and LMDB-specific cleanup/sync behaviour is provided by the LMDB-backed implementations only (Enabling moving to a pure rust DB impl when one is available in the future).

@zancas zancas self-assigned this Apr 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

3 participants