Conversation
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 191a544bb8
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 81ca7fc939
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| struct pollfd pfd = {socket->s, POLLIN | POLLOUT, 0}; | ||
| int ret = poll(&pfd, 1, 0); | ||
| if (ret >= 0 && !(pfd.revents & (POLLERR | POLLHUP | POLLNVAL))) { |
There was a problem hiding this comment.
Reject peer-closed sockets before returning from pool
This liveness check treats a socket as healthy whenever poll() has no POLLERR|POLLHUP|POLLNVAL, but a graceful peer close is commonly reported as readable EOF (POLLIN, and POLLRDHUP only if requested) without POLLHUP. In that case this code returns a dead pooled socket, so callers still hit a failure on first I/O and the stale-socket issue remains. Please include POLLRDHUP in requested/checked events or perform an EOF probe (for example recv(..., MSG_PEEK)) before reusing the socket.
Useful? React with 👍 / 👎.
| // Verify the socket is still alive before returning it. | ||
| if (socket->state.load() != STCPManager::Socket::CLOSED) { | ||
| struct pollfd pfd = {socket->s, POLLIN | POLLOUT, 0}; | ||
| int ret = poll(&pfd, 1, 0); |
There was a problem hiding this comment.
Retry interrupted poll instead of discarding socket
If poll() is interrupted by a signal (ret == -1 with errno == EINTR), this path falls through to stale-socket discard, even though the socket may be perfectly valid. In environments with periodic signals (profilers, timers), this can cause avoidable connection churn and extra reconnect latency. Treat EINTR as retryable here rather than marking the socket dead.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
The chances of this happening are so low and even if it happens the cost is just opening a new socket, I don't think this matters.
Details
This uses
poll()to check if a socket has died when removing it from the pool, returning a different one if it has.It was previously possible for a socket to die while sitting in the pool unused until we tried to write to it, at which point we'd notice the error. This checks for errors before returning the socket to the caller.
Note that it's still possible for a socket to die at any time, including after we have done this check but before the caller acts on the socket, but the exposure is now greatly reduced.
Fixed Issues
Fixes https://github.com/Expensify/Expensify/issues/613505#event-24230562693
Tests
Internal Testing Reminder: when changing bedrock, please compile auth against your new changes