Fix/issue 817 idle timeout log level#824
Open
lutz-grex wants to merge 2 commits intomodelcontextprotocol:mainfrom
Open
Fix/issue 817 idle timeout log level#824lutz-grex wants to merge 2 commits intomodelcontextprotocol:mainfrom
lutz-grex wants to merge 2 commits intomodelcontextprotocol:mainfrom
Conversation
Idle keep-alive timeout is normal zombie-session cleanup, not a transport failure. Route it through a dedicated WorkerQuitReason::IdleTimeout variant. Log it at debug level instead of treating it as a fatal error. Remove the unused LocalSessionWorkerError::KeepAliveTimeout variant. Closes modelcontextprotocol#817
Swallow SessionServiceTerminated in close_session when the worker has already exited. This prevents a spurious ERROR log during the post-exit cleanup path in spawn_session_worker.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Idle keep-alive timeout is normal zombie-session cleanup, not a transport failure. This PR stops it from producing
ERROR-level log lines that are indistinguishable from real transport fatals.Motivation and Context
When
LocalSessionWorkerhits its configured keep-alive timeout (default 5 min), it returnsWorkerQuitReason::Fatal, which the transport worker logs attracing::error!. TheSessionConfig::keep_alivedocstring explicitlyframes this as a safety net for silently-dropped HTTP connections — normal lifecycle cleanup, not an error condition.
This causes every idle reap to emit:
Operators who wire ERROR-level alerts get paged on something that is normal by design. Downstream consumers cannot filter idle reaps without also silencing real transport fatals.
Additionally, the post-exit cleanup path in
close_sessioncould fail withSessionServiceTerminatedwhen the worker had already exited due to idle timeout, producing a second spurious error.Closes #817
How Has This Been Tested?
test_keep_alive_timeout_does_not_emit_error_log) that spins up a Streamable HTTP server with a 200ms keep-alive, connects a client, waits for idle reap, and asserts: no ERROR-level logs areemitted, and a DEBUG log with
IdleTimeoutis present.test_explicit_close_on_live_session_succeeds) that verifiesclose_sessionon a still-alive worker succeeds without error.cargo test).Breaking Changes
None.
WorkerQuitReasonis#[non_exhaustive], so the newIdleTimeoutvariant is additive. The removedLocalSessionWorkerError::KeepAliveTimeoutvariant was not part of the public API.Types of changes
Checklist
Additional context
This implements Option A from #817: a dedicated
WorkerQuitReason::IdleTimeout(Duration)variant that is logged atdebuglevel alongside the other expected-exit arms (Cancelled,TransportClosed,HandlerTerminated). Truetransport failures continue to flow through
Fataland keep their ERROR severity.Changes across two commits:
d41d3c8— AddWorkerQuitReason::IdleTimeout, return it from the keep-alive arm, log at debug. Remove unusedKeepAliveTimeouterror variant.2463369— SwallowSessionServiceTerminatedinclose_sessionwhen the worker has already exited, preventing a spurious error during post-exit cleanup.