Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion .agent/notes/counter-poll-audit-core.md
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,7 @@ Searched `rivetkit-rust/packages/rivetkit-core/src/` for:

- `actor/sleep.rs::wait_for_shutdown_tasks`
- Classification: event-driven.
- Uses `AsyncCounter::wait_zero(deadline)` for shutdown tasks and websocket callbacks, plus `prevent_sleep_notify` for the prevent-sleep flag.
- Uses `AsyncCounter::wait_zero(deadline)` for shutdown tasks and websocket callbacks, plus `keep_awake_notify` for the keep-awake flag.

- `actor/sleep.rs::wait_for_internal_keep_awake_idle`
- Classification: event-driven.
Expand Down
2 changes: 1 addition & 1 deletion .agent/notes/ralph-prd-review-state.json
Original file line number Diff line number Diff line change
Expand Up @@ -436,7 +436,7 @@
"commit": "880e45207",
"verdict": "PASS",
"medCritIssues": [],
"notes": "Main payoff story. AsyncCounter gained register_zero_notify fan-out (Weak<Notify> observer pattern). idle_notify is pinged by keep_awake + internal_keep_awake + http_request_counter via register_zero_notify in WorkRegistry::new + lookup_http_request_counter. wait_for_shutdown_tasks composes shutdown_counter + websocket_callback + prevent_sleep_notify via tokio::select!. set_prevent_sleep calls notify_prevent_sleep_changed only on flip. Old AtomicUsize shims removed; can_sleep reads from WorkRegistry. Two deterministic tests use tokio::test(start_paused=true). Grep confirms zero Duration::from_millis(10) remain in sleep.rs. Minor: lookup_http_request_counter re-registers on every miss (bounded by envoy reconfigures, Weak refs prevent leak)."
"notes": "Main payoff story. AsyncCounter gained register_zero_notify fan-out (Weak<Notify> observer pattern). idle_notify is pinged by keep_awake + internal_keep_awake + http_request_counter via register_zero_notify in WorkRegistry::new + lookup_http_request_counter. wait_for_shutdown_tasks composes shutdown_counter + websocket_callback + idle notification via tokio::select!. keep_awake regions notify when their counters change. Old AtomicUsize shims removed; can_sleep reads from WorkRegistry. Two deterministic tests use tokio::test(start_paused=true). Grep confirms zero Duration::from_millis(10) remain in sleep.rs. Minor: lookup_http_request_counter re-registers on every miss (bounded by envoy reconfigures, Weak refs prevent leak)."
},
"US-007": {
"commit": "13d606e31",
Expand Down
2 changes: 1 addition & 1 deletion .agent/notes/rivetkit-core-walkthrough.md
Original file line number Diff line number Diff line change
Expand Up @@ -122,7 +122,7 @@ When an inspector attaches, `inspector_attach_count` increments. Every 50ms, if

## Chapter 8 — Sleep (Two Phases)

Sleep is triggered by the idle timer. When `sleep_deadline` fires, `SleepController::can_sleep` checks: not-ready, `prevent_sleep`, `no_sleep` config, active HTTP, keep-awake regions, non-hibernatable connections, or outstanding WebSocket callbacks all block it.
Sleep is triggered by the idle timer. When `sleep_deadline` fires, `SleepController::can_sleep` checks: not-ready, `keep_awake`, `no_sleep` config, active HTTP, keep-awake regions, non-hibernatable connections, or outstanding WebSocket callbacks all block it.

### Phase 1: SleepGrace

Expand Down
2 changes: 1 addition & 1 deletion .agent/notes/sleep-grace-abort-run-wait.md
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ Located in `rivetkit-rust/packages/rivetkit-core/src/actor/task.rs`:
`mark_destroy_requested` at `actor/context.rs:466` (the destroy path).
- `wait_for_sleep_idle_window` polls `ActorContext::can_sleep_state`
(`actor/sleep.rs::229-260`). `can_sleep_state` checks ready/started,
`prevent_sleep`, `no_sleep`, `active_http_request_count`,
`keep_awake`, `no_sleep`, `active_http_request_count`,
`sleep_keep_awake_count`, `sleep_internal_keep_awake_count`,
`pending_disconnect_count`, non-empty conns, and
`websocket_callback_count`. **It does NOT check whether the `run_handle`
Expand Down
122 changes: 61 additions & 61 deletions .agent/notes/us120-postfix/run1.log

Large diffs are not rendered by default.

122 changes: 61 additions & 61 deletions .agent/notes/us120-postfix/run2.log

Large diffs are not rendered by default.

120 changes: 60 additions & 60 deletions .agent/notes/us120-postfix/run3.log

Large diffs are not rendered by default.

120 changes: 60 additions & 60 deletions .agent/notes/us120-postfix/run4.log

Large diffs are not rendered by default.

108 changes: 54 additions & 54 deletions .agent/notes/us120-postfix/run5.log

Large diffs are not rendered by default.

120 changes: 60 additions & 60 deletions .agent/notes/us120-repro/run1.log

Large diffs are not rendered by default.

120 changes: 60 additions & 60 deletions .agent/notes/us120-repro/run2.log

Large diffs are not rendered by default.

120 changes: 60 additions & 60 deletions .agent/notes/us120-repro/run3.log

Large diffs are not rendered by default.

120 changes: 60 additions & 60 deletions .agent/notes/us120-repro/run4.log

Large diffs are not rendered by default.

108 changes: 54 additions & 54 deletions .agent/notes/us120-repro/run5.log

Large diffs are not rendered by default.

12 changes: 6 additions & 6 deletions .agent/specs/lifecycle-shutdown-unified-drain.md
Original file line number Diff line number Diff line change
Expand Up @@ -62,7 +62,7 @@ All wake sources must route through `reset_sleep_timer`. The existing `AsyncCoun

### 2. Two readiness functions

`can_sleep_state` (`sleep.rs:264-300`) today mixes concerns: readiness flags (`ready`, `started`), activity flags (`prevent_sleep`, `no_sleep`), run state (`run_handler_active_count`), drain counters, conn state. Split into two:
`can_sleep_state` (`sleep.rs:264-300`) today mixes concerns: readiness flags (`ready`, `started`), activity flags (`keep_awake`, `no_sleep`), run state (`run_handler_active_count`), drain counters, conn state. Split into two:

**`can_arm_sleep_timer() -> CanSleep`** (async, for `Started` only). Preserves existing `can_sleep_state` semantics. Used to decide whether `sleep_deadline` is armed.

Expand All @@ -74,7 +74,7 @@ All wake sources must route through `reset_sleep_timer`. The existing `AsyncCoun
- `active_http_request_count == 0`
- `websocket_callback_count == 0`
- `pending_disconnect_count == 0`
- `!prevent_sleep` (honors `lifecycle.mdx:818,746` promise)
- `!keep_awake` (honors `lifecycle.mdx:818,746` promise)

Explicitly **not** checked in `can_finalize_sleep`:
- `ready` / `started` — flipped to `false` at grace entry (§4); not relevant to drain.
Expand Down Expand Up @@ -343,7 +343,7 @@ Every site that mutates an input of `can_arm_sleep_timer` or `can_finalize_sleep

- All four drain counters' increment/decrement sites. Ensure each decrement-to-zero calls `reset_sleep_timer`. Today `AsyncCounter::register_change_notify(&activity_notify)` (`sleep.rs:615`) covers counter changes via `notify_waiters`; that wiring is replaced per §1 with a callback that invokes `reset_sleep_timer`.
- `set_ready`, `set_started` — add `reset_sleep_timer` calls (they currently don't). `transition_to` in `task.rs:2147-2167` will invoke them.
- `notify_prevent_sleep_changed` (`sleep.rs:569`) — add `reset_sleep_timer`.
- Sleep-affecting activity changes call `reset_sleep_timer`.
- `conn` add/remove — already call `reset_sleep_timer` (`context.rs:748, :755`).
- `handle_run_handle_outcome` — add `reset_sleep_timer` after `self.run_handle = None` (`task.rs:1322`).
- `ActorContext::on_state_change` callback completion — new; see 11.2.
Expand Down Expand Up @@ -377,7 +377,7 @@ Before committing §10's unbounded-channel change, enumerate everything that cur
>
> The entire window is bounded by `sleepGracePeriod` on sleep or `onDestroyTimeout` on destroy (defaults: 15 seconds each). If it is exceeded, the actor force-aborts any remaining work and proceeds to state save anyway.

- Update options table default for `sleepGracePeriod`: "Default 15000ms. Total graceful shutdown window for hooks, waitUntil, async raw WebSocket handlers, disconnects, and waiting for `preventSleep` to clear."
- Update options table default for `sleepGracePeriod`: "Default 15000ms. Total graceful shutdown window for hooks, waitUntil, async raw WebSocket handlers, disconnects, and waiting for `keepAwake` to clear."

## Invariants (post-change)

Expand All @@ -397,7 +397,7 @@ Before committing §10's unbounded-channel change, enumerate everything that cur

Each step is independently shippable and revertable. Tests must pass before the next step starts.

**Step 1 — Unify signal primitive.** Rewrite `reset_sleep_timer` / `notify_activity_dirty` as notify-only (§1). Delete `LifecycleEvent::ActivityDirty` variant, handler, `drain_activity_dirty`, parallel arm. Add `reset_sleep_timer` calls at `set_ready`/`set_started`/`notify_prevent_sleep_changed`/`handle_run_handle_outcome` (§11.1). Replace `AsyncCounter::register_change_notify` consumer with a callback that calls `reset_sleep_timer`. Existing tests for sleep timer + activity dedup must still pass.
**Step 1 — Unify signal primitive.** Rewrite `reset_sleep_timer` / `notify_activity_dirty` as notify-only (§1). Delete `LifecycleEvent::ActivityDirty` variant, handler, `drain_activity_dirty`, parallel arm. Add `reset_sleep_timer` calls at `set_ready`/`set_started`/`handle_run_handle_outcome` (§11.1). Replace `AsyncCounter::register_change_notify` consumer with a callback that calls `reset_sleep_timer`. Existing tests for sleep timer + activity dedup must still pass.

**Step 2 — Split readiness.** Introduce `can_arm_sleep_timer` (rename of `can_sleep_state`) and new `can_finalize_sleep`. Update existing `Started`-state callers. No grace callers yet. Tests unchanged.

Expand Down Expand Up @@ -429,7 +429,7 @@ Each step is independently shippable and revertable. Tests must pass before the
- `core_counter_decrements_on_hook_completion`: verify that the completion callback decrements `core_dispatched_hooks` exactly once per event, and that grace exits via drain path only when counter reaches zero.
- `hibernatable_conn_preserved_on_sleep`: hibernatable conn's state is flushed via `pending_hibernation_updates`, `onDisconnect` NOT called.
- `hibernatable_conn_fires_ondisconnect_on_destroy`: same conn on destroy fires `onDisconnect`.
- `preventSleep_during_grace_delays_finalize`: `setPreventSleep(true)` in `onSleep` → grace waits until `setPreventSleep(false)` or deadline.
- `keepAwake_during_grace_delays_finalize`: `keepAwake(true)` in `onSleep` → grace waits until `keepAwake(false)` or deadline.
- `alarm_does_not_fire_during_grace`: scheduled alarm due during grace does not invoke user alarm handler.
- `dispatch_drained_on_grace_entry`: dispatch in inbox at Stop arrival completes as tracked work, not dropped.
- `activity_signal_dedup`: 1000 rapid `reset_sleep_timer` calls produce ≤ a few main-loop re-evaluations.
Expand Down
8 changes: 4 additions & 4 deletions .agent/specs/rivetkit-core-event-driven-drains.md
Original file line number Diff line number Diff line change
Expand Up @@ -104,7 +104,7 @@ struct WorkRegistry {
shutdown_counter: Arc<AsyncCounter>,
shutdown_tasks: Mutex<JoinSet<()>>,
idle_notify: Notify, // composed: fires when any of keep_awake / internal_keep_awake / http reaches zero
prevent_sleep_notify: Notify, // pinged on every ctx.set_prevent_sleep flip
keep_awake_notify: Notify, // pinged on every ctx.keep_awake flip
}

impl SleepController {
Expand Down Expand Up @@ -188,12 +188,12 @@ The aggregate drain requires `keep_awake == 0 && internal_keep_awake == 0 && act

Simpler alternative: expose an `AsyncCounter::subscribe(Arc<Notify>)` that pipes zero-transitions to an external Notify. Both approaches work; pick whichever reads cleanest.

### `prevent_sleep` bool
### `keep_awake` bool

Rarely flipped. Two options:

1. `watch::channel<bool>` on `ActorContext`, subscribers re-check on every send.
2. Dedicated `prevent_sleep_notify: Notify` pinged on every flip.
2. Dedicated `keep_awake_notify: Notify` pinged on every flip.

Either is fine. Recommend (2) for symmetry with the other notify sites.

Expand All @@ -212,7 +212,7 @@ Audit `ctx.destroy()` at `context.rs:382-389` for consistency — it has no slee
| File:line | Function | Action |
|-----------|----------|--------|
| `actor/sleep.rs:240-258` | `wait_for_sleep_idle_window` | Replace poll loop with `idle_notify`-driven wait (composed over keep_awake, internal_keep_awake, http counters) |
| `actor/sleep.rs:260-281` | `wait_for_shutdown_tasks` | Replace with `shutdown_counter.wait_zero(deadline).await` + `websocket_callback.wait_zero` + `prevent_sleep` notify |
| `actor/sleep.rs:260-281` | `wait_for_shutdown_tasks` | Replace with `shutdown_counter.wait_zero(deadline).await` + `websocket_callback.wait_zero` + `keep_awake` notify |
| `actor/sleep.rs:283-303` | `wait_for_internal_keep_awake_idle` | Replace with `internal_keep_awake.wait_zero(deadline)` |
| `actor/sleep.rs:305-326` | `wait_for_http_requests_drained` | Replace with `envoy_handle.http_request_counter(...).wait_zero(deadline)` |
| `actor/sleep.rs:24-26` | `keep_awake_count`, `internal_keep_awake_count`, `websocket_callback_count` AtomicUsize fields | Replace with `Arc<AsyncCounter>` fields on `WorkRegistry` |
Expand Down
3 changes: 1 addition & 2 deletions .agent/specs/rivetkit-rust-typed-event-loop.md
Original file line number Diff line number Diff line change
Expand Up @@ -388,8 +388,7 @@ impl<A: Actor> Ctx<A> {
// Lifecycle signaling (envoy-visible)
pub fn sleep(&self);
pub fn destroy(&self);
pub fn set_prevent_sleep(&self, enabled: bool);
pub fn prevent_sleep(&self) -> bool;
pub pub fn keep_awake(&self) -> bool;
pub fn wait_until(&self, future: impl Future<Output = ()> + Send + 'static);

// Typed broadcast + connection enumeration
Expand Down
12 changes: 5 additions & 7 deletions .agent/specs/rivetkit-rust.md
Original file line number Diff line number Diff line change
Expand Up @@ -136,8 +136,7 @@ impl ActorContext {
// Sleep control
fn sleep(&self); // Defers to next tick. Does NOT fire abort signal.
fn destroy(&self); // Defers to next tick. Fires abort signal immediately.
fn set_prevent_sleep(&self, prevent: bool);
fn prevent_sleep(&self) -> bool;
fn keep_awake(&self) -> bool;

// Background work tracking
fn wait_until(&self, future: impl Future<Output = ()> + Send + 'static);
Expand Down Expand Up @@ -575,8 +574,7 @@ impl<A: Actor> Ctx<A> {
fn aborted(&self) -> bool;
fn sleep(&self);
fn destroy(&self);
fn set_prevent_sleep(&self, prevent: bool);
fn prevent_sleep(&self) -> bool;
fn keep_awake(&self) -> bool;
fn wait_until(&self, future: impl Future<Output = ()> + Send + 'static);

// Typed event broadcast
Expand Down Expand Up @@ -766,7 +764,7 @@ Note: step 14 is clarification that the abort signal fires at the beginning of `
- No pending disconnect callbacks
- No active WebSocket callbacks
7. Call `on_sleep` (with remaining deadline budget).
8. Wait for shutdown tasks: `wait_until` futures, WebSocket callback futures, `prevent_sleep` to clear.
8. Wait for shutdown tasks: `wait_until` futures, WebSocket callback futures, `keep_awake` to clear.
9. Disconnect all non-hibernatable connections.
10. Wait for shutdown tasks again.
11. Save state immediately. Wait for all pending KV/SQLite writes.
Expand All @@ -793,7 +791,7 @@ Destroy does NOT wait for idle sleep window.

ALL must be true:
- `ready` AND `started`
- `prevent_sleep` is false
- `keep_awake` is false
- `no_sleep` config is false
- No active HTTP requests
- No active `keep_awake` / `internal_keep_awake` regions
Expand Down Expand Up @@ -855,7 +853,7 @@ rivetkit-rust/packages/rivetkit-core/
│ ├── lifecycle.rs # Startup + shutdown sequences (sleep + destroy)
│ ├── state.rs # State dirty tracking, throttled save, PersistedActor
│ ├── vars.rs # Vars (transient, recreated each start)
│ ├── sleep.rs # can_sleep(), auto-sleep timer, prevent_sleep, keep_awake, internal_keep_awake
│ ├── sleep.rs # can_sleep(), auto-sleep timer, keep_awake, keep_awake, internal_keep_awake
│ ├── schedule.rs # Schedule API, PersistedScheduleEvent, alarm sync, invoke_action_by_name
│ ├── action.rs # Action dispatch, timeout wrapping, on_before_action_response
│ ├── connection.rs # ConnHandle, lifecycle hooks, hibernation persistence, subscription tracking
Expand Down
8 changes: 4 additions & 4 deletions .agent/specs/rivetkit-task-architecture.md
Original file line number Diff line number Diff line change
Expand Up @@ -357,7 +357,7 @@ The explicit end state is to remove `ActorVars`, `ActorContext::vars`, `ActorCon

These TS APIs must be mirrored on `ActorContext` and are load-bearing for existing driver tests.

- `ctx.set_prevent_sleep(enabled: bool)` — toggle the `prevent_sleep` flag observed by `#canSleep` and by the `#waitShutdownTasks` drain loop. While set, the drain loop keeps looping up to the grace deadline even if every tracked counter is zero.
- `ctx.keep_awake(future)` — toggle the `keep_awake` flag observed by `#canSleep` and by the `#waitShutdownTasks` drain loop. While set, the drain loop keeps looping up to the grace deadline even if every tracked counter is zero.
- `ctx.keep_awake<F>(future: F) -> impl Future` — enter an external keep-awake region for the duration of `future`. Increments `active_async_regions.keep_awake` on entry and decrements on exit via a guard. User-facing.
- `ctx.internal_keep_awake<F>(future: F) -> impl Future` — same pattern but increments `active_async_regions.internal_keep_awake`. Subsystems (queue, websocket) use the thunk form to enter the region before user callback starts, avoiding a race where the sleep timer fires underneath newly scheduled work.
- `ctx.cancelled() -> impl Future<Output = ()>` and `ctx.is_cancelled() -> bool` — alias the existing `abort_signal()` and `is_cancelled()` surface in `context.rs` (`context.rs:142, 324, 362-367`). Do not remove the existing names; add the new names as aliases to avoid churning callers.
Expand Down Expand Up @@ -436,7 +436,7 @@ The lifetime task owns the socket loop and invokes open/message/close callbacks.
Sleep readiness stays centralized in core. It reads concurrent counters/snapshots matching the TS `#canSleep()` check (`instance/mod.ts:2497-2528` on `feat/sqlite-vfs-v2`):

- not started / not ready
- `prevent_sleep` flag
- `keep_awake` flag
- no-sleep config
- active HTTP requests
- user tasks in flight (`active_async_regions.user_task`)
Expand Down Expand Up @@ -473,7 +473,7 @@ Mirrors `instance/mod.ts:.onStop("sleep")` at `:942-1022` on `feat/sqlite-vfs-v2
6. Wait for the `run` handler to finish with `run_stop_timeout` (default 15s). Done first so `on_sleep` observes `run` already stopped.
7. Compute the shutdown task deadline = `now + effective_sleep_grace_period`.
8. Run `on_sleep` with `on_sleep_timeout`.
9. Drain tracked work until the shutdown task deadline: `preventSleep` flag must be clear AND every tracked counter must hit zero. Any newly-entered `preventSleep` region keeps the drain loop running until deadline.
9. Drain tracked work until the shutdown task deadline: `keepAwake` flag must be clear AND every tracked counter must hit zero. Any newly-entered `keepAwake` region keeps the drain loop running until deadline.
10. Persist hibernatable connections.
11. Disconnect non-hibernatable connections. Hibernatable connections stay attached so they can be re-delivered on wake.
12. Drain tracked work again (this lets WS close callbacks finish).
Expand Down Expand Up @@ -534,7 +534,7 @@ Tracked work blocks sleep and destroy until it completes or the effective sleep
- `wait_until` registrations
- `ctx.keep_awake(...)` regions
- `ctx.internal_keep_awake(...)` regions
- `prevent_sleep` flag (holds the drain loop open even if all counters are zero)
- `keep_awake` flag (holds the drain loop open even if all counters are zero)
- `on_state_change` runner task
- State saves
- SQLite cleanup
Expand Down
Loading
Loading