Skip to content

Fix race in AsyncPollerTest.testSuspendPolling#2848

Draft
donald-pinckney wants to merge 1 commit intomasterfrom
d/fix-flaky
Draft

Fix race in AsyncPollerTest.testSuspendPolling#2848
donald-pinckney wants to merge 1 commit intomasterfrom
d/fix-flaky

Conversation

@donald-pinckney
Copy link
Copy Markdown
Contributor

What was changed

Remove the race: override reserveSlot on the slot supplier to count invocations, and wait for reserveCalls == 2 before suspending. That's a direct signal that iter 2 has passed the suspend check and called reserveSlot (its future can't complete until iter 1's slot is released, but the call itself happens immediately). After this signal, iter 2 is pinned past the suspend check, so the subsequent suspendPolling() only affects iter 3+.

Also drop the now-redundant first assertEventually: once pollLatch counts down (inside the mocked poll()), iter 1's slot has already been issued, so reservedCount == 1 is synchronously visible.

Why?

The test relied on the poller's second iteration racing past its suspend-latch check before the test thread called suspendPolling(). If the test thread won the race, iter 2 blocked at the suspend latch and never reached reserveSlot — so reservedCount stayed at 1 and the later expected:<2> but was:<1> assertion failed. Bumping the assertEventually timeout didn't help; the race either wins or loses deterministically on a given run.

AI Transparency

This PR was fully generated by Claude, and I have not yet reviewed it for accuracy with regards to fixing the flakiness. Don't yet review until I review it more.

The test relied on the poller's second iteration racing past its
suspend-latch check before the test thread called suspendPolling(). If
the test thread won the race, iter 2 blocked at the suspend latch and
never reached reserveSlot — so reservedCount stayed at 1 and the later
`expected:<2> but was:<1>` assertion failed. Bumping the assertEventually
timeout didn't help; the race either wins or loses deterministically on
a given run.

Remove the race: override reserveSlot on the slot supplier to count
invocations, and wait for reserveCalls == 2 before suspending. That's a
direct signal that iter 2 has passed the suspend check and called
reserveSlot (its future can't complete until iter 1's slot is released,
but the call itself happens immediately). After this signal, iter 2 is
pinned past the suspend check, so the subsequent suspendPolling() only
affects iter 3+.

Also drop the now-redundant first assertEventually: once pollLatch
counts down (inside the mocked poll()), iter 1's slot has already been
issued, so reservedCount == 1 is synchronously visible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant