[3/n][guardian-integration] add local limiter (synced with guardian) to hashi nodes #484
Merged
0xsiddharthks merged 8 commits into main on Apr 28, 2026
bmwill
approved these changes
Apr 28, 2026
      optional uint64 next_seq = 3;
    }

    // Immutable limiter configuration.
This isn't exactly immutable since the guardian can opt to change it
Adds a `LocalLimiter` that mirrors the guardian's token-bucket state so each hashi node can project capacity and pick the next `seq` without a round-trip. Observational in this PR: no withdrawal-flow change yet; the rewiring lands in the next PR.

- `crates/hashi/src/guardian_limiter.rs`: `LocalLimiter` with `validate_consume` / `apply_consume` / `capacity_at` / `snap_to` under a `tokio::Mutex`, plus 8 unit tests.
- `Hashi::start`: fetch `GetGuardianInfo` once at startup, cache the Ed25519 signing pubkey, seed the `LocalLimiter`. Best-effort; the node still starts when the guardian is unreachable.
- `start_limiter_reconcile_service`: 30 s background task. Covers late bootstrap, non-leader drift, and leader rotation. Snap only when the guardian is strictly ahead (`next_seq` or `last_updated_at` strictly greater) to avoid moving local state backwards on a stale guardian snapshot. Also refreshes `guardian_signing_pubkey` if bootstrap missed it.
- `GetGuardianInfoResponse` extended with `LimiterState` + `LimiterConfig` so one RPC seeds the full emulator.
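To make the "project capacity without a round-trip" idea concrete, here is a minimal sketch of a token-bucket projection. It is not the code in `guardian_limiter.rs`: the struct layouts, the field names `max_capacity`, `refill_per_sec`, and `available`, and the use of whole seconds are all assumptions made for illustration. Only `next_seq`, `last_updated_at`, and the name `capacity_at` come from the PR.

```rust
/// Hypothetical stand-in for the proto's `LimiterConfig`.
/// Field names here are assumptions, not the real schema.
#[derive(Debug, Clone, Copy)]
pub struct LimiterConfig {
    pub max_capacity: u64,   // bucket size, in token units
    pub refill_per_sec: u64, // tokens restored per elapsed second
}

/// Hypothetical stand-in for the proto's `LimiterState`.
#[derive(Debug, Clone, Copy)]
pub struct LimiterState {
    pub available: u64,       // tokens left as of `last_updated_at` (assumed field)
    pub last_updated_at: u64, // unix seconds of the last accepted consume
    pub next_seq: u64,        // sequence number the next consume must carry
}

/// Project the capacity available at `now` without mutating any state.
/// Saturating arithmetic keeps a clock that runs backwards, or a very
/// long idle period, from panicking or overflowing.
pub fn capacity_at(cfg: &LimiterConfig, st: &LimiterState, now: u64) -> u64 {
    let elapsed = now.saturating_sub(st.last_updated_at);
    let refilled = elapsed.saturating_mul(cfg.refill_per_sec);
    st.available.saturating_add(refilled).min(cfg.max_capacity)
}
```

Because the projection is a pure function of the cached config, the cached state, and a timestamp, any node holding a seeded snapshot can evaluate it locally; only an accepted consume needs to change the snapshot.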
In the simplest case the 30 s reconciliation loop has nothing to do: the leader stays in sync via `apply_consume` after every successful guardian RPC, the next PR's error-path snaps from the guardian unconditionally on any RPC rejection, and the wasted-MPC-round window at leader promotion is better covered by an on-demand snap at the false→true leader edge (landing alongside the withdrawal wiring). Keep the bootstrap half of the sync story; replace the recurring reconcile with a short-lived retry.

- `try_seed_guardian_state`: idempotent helper that writes each field at most once (`OnceLock::set` for the pubkey, `is_none()` guard on the `local_limiter` slot). Called synchronously in `Hashi::start` so the happy path is seeded before any other service starts.
- `start_guardian_bootstrap_retry`: short-lived task that fast-exits when no guardian is configured or state is already seeded, else retries with bounded exponential backoff (1 s → 30 s) until the first success, then returns.
- Delete `start_limiter_reconcile_service` + `install_local_limiter`. The reconcile's "strictly ahead" clause had a dead branch anyway: `RateLimiter::consume` moves `last_updated_at` and `next_seq` in lockstep, and `revert` rolls both back, so they cannot diverge.

No proto, guardian-side, or `guardian_limiter.rs` changes. Non-guardian baseline is untouched.
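The write-at-most-once behaviour described for `try_seed_guardian_state` can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `GuardianState`, its fields' types, and `try_seed` are hypothetical names, a `u64` stands in for the limiter, and `std::sync::Mutex` replaces the `tokio::Mutex` the PR mentions so the sketch needs no external crates.

```rust
use std::sync::{Mutex, OnceLock};

/// Hypothetical container for guardian-derived state.
pub struct GuardianState {
    pub signing_pubkey: OnceLock<[u8; 32]>,
    pub local_limiter: Mutex<Option<u64>>, // u64 is a placeholder for the limiter
}

impl GuardianState {
    pub fn new() -> Self {
        Self {
            signing_pubkey: OnceLock::new(),
            local_limiter: Mutex::new(None),
        }
    }

    /// Writes each field at most once and reports whether both are populated.
    /// Calling it again with different values leaves the first values in place.
    pub fn try_seed(&self, pubkey: [u8; 32], limiter: u64) -> bool {
        // `OnceLock::set` returns Err if a value is already present; ignore it.
        let _ = self.signing_pubkey.set(pubkey);

        let mut slot = self.local_limiter.lock().unwrap();
        if slot.is_none() {
            *slot = Some(limiter);
        }
        self.signing_pubkey.get().is_some() && slot.is_some()
    }
}
```

Idempotence is what lets a synchronous call and a background retry share the same helper: whichever runs second becomes a no-op instead of overwriting state that may already have advanced.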
The two-variant enum was a glorified bool — the retry loop only checked for FullySeeded. Switch to bool to drop the indirection.
The synchronous seed in start() was best-effort (return ignored) and duplicated the retry task. Drop it and let the background task own seeding, with the first attempt firing at delay = 0. No caller relies on guardian state being populated before start() returns.
Replace the hand-rolled exponential-backoff loop with backon's ExponentialBuilder + Retryable, matching the pattern already used in crates/hashi/src/communication/timeout_and_retry.rs. Also bump the GetGuardianInfo failure log from debug to warn so operators see unreachable-guardian failures at the default log level.
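For reference, the bounded schedule (1 s → 30 s) mentioned in the earlier commit can be sketched with the standard library alone. This does not show backon's API or the deleted loop; `backoff_delay` is a hypothetical helper, and the doubling factor and absence of jitter are assumptions.

```rust
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): starts at `min`,
/// doubles each attempt, and never exceeds `max`.
pub fn backoff_delay(attempt: u32, min: Duration, max: Duration) -> Duration {
    // 2^attempt, clamped so a large attempt count cannot overflow.
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    // If the multiplication itself overflows, fall back to the cap.
    min.checked_mul(factor).unwrap_or(max).min(max)
}
```

With `min` = 1 s and `max` = 30 s the delays run 1, 2, 4, 8, 16 s and then stay at 30 s, so an unreachable guardian is polled at most once every 30 s.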
Match the file's convention of short or no doc comments. Function names and bodies carry the meaning here.
snap_to was a recovery primitive the new design doesn't need — the local limiter is bootstrapped from the guardian at startup and advanced only on accepted consumes, with no force-resync path. Drop the function, its test, and the references in module and apply_consume docs. Also trim docstrings that just restated method names and inline test arithmetic that re-derived constants from the test setup.
Each committee member will validate against a leader-supplied seq at MPC signing time, so `validate_consume` becomes a pure check rather than a peek-and-return. Returns `()` on success; the seq mismatch case shares the existing `SeqMismatch` variant (with `local`/`incoming` field names that read naturally for both validate and apply).
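A rough sketch of the check/apply split described above, assuming a much simplified bucket. Only `validate_consume`, `apply_consume`, and the `SeqMismatch { local, incoming }` shape come from the PR; `Bucket`, `LimiterError`, the `InsufficientCapacity` variant, and the omission of time-based refill are assumptions made to keep the example short.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum LimiterError {
    SeqMismatch { local: u64, incoming: u64 },
    // Hypothetical variant; the PR does not name its capacity error.
    InsufficientCapacity { available: u64, requested: u64 },
}

/// Simplified local view of the bucket (refill over time is left out).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Bucket {
    pub available: u64,
    pub next_seq: u64,
}

impl Bucket {
    /// Pure check: takes `&self`, returns `()` on success, mutates nothing.
    pub fn validate_consume(&self, seq: u64, amount: u64) -> Result<(), LimiterError> {
        if seq != self.next_seq {
            return Err(LimiterError::SeqMismatch { local: self.next_seq, incoming: seq });
        }
        if amount > self.available {
            return Err(LimiterError::InsufficientCapacity {
                available: self.available,
                requested: amount,
            });
        }
        Ok(())
    }

    /// Re-runs the same check, then commits the consume and advances `seq`.
    pub fn apply_consume(&mut self, seq: u64, amount: u64) -> Result<(), LimiterError> {
        self.validate_consume(seq, amount)?;
        self.available -= amount;
        self.next_seq += 1;
        Ok(())
    }
}
```

Routing `apply_consume` through `validate_consume` means both paths report a mismatch with the same `local`/`incoming` fields, and a rejected apply leaves the bucket untouched.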
Adds a local emulator of the guardian's token bucket (and sequence).
- `LocalLimiter` (`crates/hashi/src/guardian_limiter.rs`)
- `GetGuardianInfoResponse` extended with `LimiterState` + `LimiterConfig`

We are not currently updating the local limiter when processing new transactions; that is being done in #495.