Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions .changeset/fix-requestsnapshot-publish-ordering.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
---
'@electric-sql/client': patch
---

Fix `requestSnapshot()` so it resolves only after the injected snapshot batch has been delivered to subscribers, including async and reentrant subscriber paths.
31 changes: 31 additions & 0 deletions docs/requestsnapshot-pr-comment.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,31 @@
## PR comment draft

I narrowed the reentrant publish bypass so it only applies to snapshot injection from `requestSnapshot()`, rather than to any nested publish while `#isPublishing` is true.

Concretely:

- `#onMessages(...)` and `#publish(...)` now accept an internal opt-in flag for reentrant bypass
- ordinary stream batches, including SSE batches, still serialize through `#messageChain`
- only `requestSnapshot()` calls `#onMessages(..., { allowReentrantPublishBypass: true })`

This preserves the original deadlock fix:

- if a subscriber handling batch `M1` does `await requestSnapshot(...)`, the injected snapshot batch can still publish immediately instead of being queued behind the subscriber that is awaiting it

But it avoids the broader regression:

- later SSE batches no longer bypass earlier in-flight publishes just because `#isPublishing === true`

I added tests for both sides of the behavior:

1. **SSE ordering regression test**
- proves a later SSE batch is not delivered before subscriber callbacks for the earlier batch complete

2. **Bystander subscriber behavior test**
- explicitly documents current behavior that a reentrant `requestSnapshot()` may re-enter bystander subscribers before their earlier callback completes

So the resulting contract is:

- **ordinary stream traffic remains serialized**
- **snapshot injection is the only allowed reentrant bypass**
- **bystander reentrancy during snapshot injection remains allowed and now has test coverage**
50 changes: 40 additions & 10 deletions packages/typescript-client/src/client.ts
Original file line number Diff line number Diff line change
Expand Up @@ -609,6 +609,12 @@ export class ShapeStream<T extends Row<unknown> = Row>
#tickPromiseResolver?: () => void
#tickPromiseRejecter?: (reason?: unknown) => void
#messageChain = Promise.resolve<void[]>([]) // promise chain for incoming messages
// Tracks when subscriber callbacks are actively being delivered from
// #messageChain. requestSnapshot can inject a nested batch from inside a
// subscriber; in that reentrant case #publish uses this as an intentional
// escape hatch to deliver the nested snapshot batch immediately rather than
// queueing it behind the subscriber that is awaiting it.
Comment on lines +612 to +616
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// Tracks when subscriber callbacks are actively being delivered from
// #messageChain. requestSnapshot can inject a nested batch from inside a
// subscriber; in that reentrant case #publish uses this as an intentional
// escape hatch to deliver the nested snapshot batch immediately rather than
// queueing it behind the subscriber that is awaiting it.

I think we should try to avoid agent comment-creep - the flag's function is fairly clear with the comment left when used in code further down

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bump on this

#isPublishing = false
#snapshotTracker = new SnapshotTracker()
#pauseLock: PauseLock
#currentFetchUrl?: URL // Current fetch URL for computing shape key
Expand Down Expand Up @@ -1374,7 +1380,11 @@ export class ShapeStream<T extends Row<unknown> = Row>
return true
}

async #onMessages(batch: Array<Message<T>>, isSseMessage = false) {
async #onMessages(
batch: Array<Message<T>>,
isSseMessage = false,
opts: { allowReentrantPublishBypass?: boolean } = {}
) {
if (!Array.isArray(batch)) {
console.warn(
`[Electric] #onMessages called with non-array argument (${typeof batch}). ` +
Expand Down Expand Up @@ -1426,7 +1436,9 @@ export class ShapeStream<T extends Row<unknown> = Row>
return true // Always process control messages
})

await this.#publish(messagesToProcess)
await this.#publish(messagesToProcess, {
allowReentrantBypass: opts.allowReentrantPublishBypass,
Comment thread
icehaunter marked this conversation as resolved.
})
}

/**
Expand Down Expand Up @@ -1718,12 +1730,11 @@ export class ShapeStream<T extends Row<unknown> = Row>
}
}

async #publish(messages: Message<T>[]): Promise<void[]> {
// We process messages asynchronously
// but SSE's `onmessage` handler is synchronous.
// We use a promise chain to ensure that the handlers
// execute sequentially in the order the messages were received.
this.#messageChain = this.#messageChain.then(() =>
async #publish(
messages: Message<T>[],
opts: { allowReentrantBypass?: boolean } = {}
): Promise<void[]> {
const deliver = () =>
Promise.all(
Array.from(this.#subscribers.values()).map(async ([callback, __]) => {
try {
Expand All @@ -1735,7 +1746,24 @@ export class ShapeStream<T extends Row<unknown> = Row>
}
})
)
)

// We process messages asynchronously but SSE's `onmessage` handler is
// synchronous. Use a promise chain to ensure handlers execute sequentially
// in the order messages were received. Only requestSnapshot's injected
// snapshot batch is allowed to bypass the queue reentrantly; ordinary
// stream batches (including SSE batches) must remain serialized.
if (this.#isPublishing && opts.allowReentrantBypass) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we even need this whole isPublishing mechanism anymore if we have the opts flag? Why not just have the flag dictate whether the publishing needs to be queued or not - the snapshot can set the flag as not needing to be queued, and it always just publishes out of band. Makes everything simpler - am I missing something?

return deliver()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we call deliver() without awaiting i think we can still get "wrong" execution order.
Concrete scenario with two subscribers A and B:

  1. Batch M1 arrives. #isPublishing flips to true, outer deliver() starts.
  2. Promise.all spins up callbackA(M1) and callbackB(M1) concurrently. Both run up to their first await.
  3. A awaits requestSnapshot(). B is still mid-await processing M1 (e.g. an async write, a fetch, awaiting downstream state).
  4. requestSnapshot() calls #onMessages(snapshot)#publish(snapshot) → reentrant branch fires → deliver() invokes both callbackA(snapshot) and callbackB(snapshot) immediately.
  5. B's callback is now re-entered with the snapshot batch before its M1 invocation has returned.

So the per-subscriber invariant "my callback won't be re-entered while a previous invocation is in flight" is broken for any bystander subscriber whose M1 callback is still awaiting something when the reentrant publish fires. The messageChain-level ordering is preserved (next queued batch still waits), but B can observe M1-start → snapshot-start → snapshot-end → M1-end interleaving in its own callback.

In practice this may be fine — if subscribers are effectively stateless across awaits, the window is invisible. But a subscriber that mutates this.something across an await in its callback could see corruption.

}

this.#messageChain = this.#messageChain.then(async () => {
this.#isPublishing = true
try {
return await deliver()
} finally {
this.#isPublishing = false
}
})

return this.#messageChain
}
Expand Down Expand Up @@ -1901,7 +1929,9 @@ export class ShapeStream<T extends Row<unknown> = Row>
metadata,
new Set(data.map((message) => message.key))
)
this.#onMessages(dataWithEndBoundary, false)
await this.#onMessages(dataWithEndBoundary, false, {
allowReentrantPublishBypass: true,
})

// On cold start the stream's offset is still at "now". Advance it
// to the snapshot's position so no updates are missed in between.
Expand Down
9 changes: 5 additions & 4 deletions packages/typescript-client/test/client.test.ts
Original file line number Diff line number Diff line change
Expand Up @@ -1694,10 +1694,11 @@ describe.for(fetchAndSse)(
limit: 100,
})

// Wait until shape reflects the snapshot
await vi.waitFor(() => {
expect(shape.currentRows.length).toBe(data.length)
})
// requestSnapshot must not resolve until subscriber callbacks for the
// injected snapshot batch have completed. Callers such as TanStack DB's
// on-demand loadSubset rely on this to make immediate reads after await
// consistent.
expect(shape.currentRows.length).toBe(data.length)

// Compare keys in stream vs returned snapshot data
const returnedKeys = new Set(data.map((m) => m.key))
Expand Down
Loading
Loading