Skip to content

fabtests: Add comment clarifying loop invariant in ft_spin_for_comp#12224

Open
sunkuamzn wants to merge 1 commit intoofiwg:mainfrom
sunkuamzn:spin-comp-underflow
Open

fabtests: Add comment clarifying loop invariant in ft_spin_for_comp#12224
sunkuamzn wants to merge 1 commit intoofiwg:mainfrom
sunkuamzn:spin-comp-underflow

Conversation

@sunkuamzn
Copy link
Copy Markdown
Contributor

@sunkuamzn sunkuamzn commented May 8, 2026

The do...while loop in ft_spin_for_comp use unsigned subtraction
(total - *cur > 0) as the loop condition. This is correct as long as
callers maintain the invariant *cur <= total. Document this requirement
with a comment.

@sunkuamzn sunkuamzn requested review from a team, aingerson and j-xiong May 8, 2026 00:40
charlesstoll
charlesstoll previously approved these changes May 8, 2026
@sunkuamzn
Copy link
Copy Markdown
Contributor Author

bot:aws:retest

j-xiong
j-xiong previously approved these changes May 8, 2026
@sunkuamzn
Copy link
Copy Markdown
Contributor Author

@j-xiong is Intel CI failure related?

@j-xiong
Copy link
Copy Markdown
Contributor

j-xiong commented May 8, 2026

shm test kept timing out. Could be an infrastructure issue.

@zachdworkin
Copy link
Copy Markdown
Contributor

shm test kept timing out. Could be an infrastructure issue.

I rebooted the problematic node however, there did not seem to be any issues with it. I will re-run this PR when its back up

@alekswn
Copy link
Copy Markdown
Contributor

alekswn commented May 8, 2026

I don't think underflow should be logically possible in these loops. total is fixed number and *cur only increments by 1 in the loop.
If it overflows there might be some other bug.

@alekswn
Copy link
Copy Markdown
Contributor

alekswn commented May 8, 2026

Might be ft_read_cq was called with total < *cur?

@zachdworkin
Copy link
Copy Markdown
Contributor

Seems like it was some weird node issue. I thought there was some bad cleanup of /dev/shm from an earlier run but that location was empty. I don't know why rebooting it fixed the issue but the problematic test just passed. The full run should finish in another 15-20 min

The do...while loop in ft_spin_for_comp use unsigned subtraction
(total - *cur > 0) as the loop condition. This is correct as long as
callers maintain the invariant *cur <= total. Document this requirement
with a comment.

Signed-off-by: Sai Sunku <sunkusa@amazon.com>
@sunkuamzn sunkuamzn dismissed stale reviews from j-xiong and charlesstoll via c539ef0 May 9, 2026 01:14
@sunkuamzn sunkuamzn force-pushed the spin-comp-underflow branch from 816416a to c539ef0 Compare May 9, 2026 01:14
@sunkuamzn
Copy link
Copy Markdown
Contributor Author

@alekswn is right. The bug I was chasing was someplace else. I modified the commit to add a comment in case others run into a similar issue in the future.

@sunkuamzn sunkuamzn changed the title fabtests: Do not subtract unsigned integers and compare with 0 fabtests: Add comment clarifying loop invariant in ft_spin_for_comp May 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants