feat: SSE keepalive for chat completions streaming + fix cancellation propagation #1745

Open
dale-lakes wants to merge 2 commits into exo-explore:main from dale-lakes:feat/sse-keepalive-and-cancellation-fix

@dale-lakes dale-lakes commented Mar 16, 2026

Summary

Two related fixes for the OpenAI chat completions SSE streaming endpoint:

1. SSE keepalive during silent streaming periods

During thinking phases, the model generates tokens but the parser consumes them, so no data flows on the SSE stream. Idle HTTP connections can be dropped by intermediaries or client timeouts after prolonged silence.

This PR wraps the chat completions SSE output with sse_with_keepalive(), which sends an SSE comment (: keepalive\n\n) every 15 seconds when no data chunks are available. Comment lines are part of the SSE specification and are ignored by compliant clients.
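
A minimal sketch of how such a wrapper could look, assuming the queue + task pattern mentioned in the commit message. This is not the PR's actual code: the _DONE sentinel, the interval parameter, and the function body are illustrative assumptions.

```python
import asyncio
from typing import AsyncIterator

KEEPALIVE_COMMENT = ": keepalive\n\n"  # SSE comment line, ignored by clients
_DONE = object()  # hypothetical sentinel marking the end of the source stream


async def sse_with_keepalive(
    source: AsyncIterator[str], interval: float = 15.0
) -> AsyncIterator[str]:
    """Yield chunks from source, emitting a comment whenever it stays silent."""
    queue: asyncio.Queue = asyncio.Queue()

    async def produce() -> None:
        try:
            async for chunk in source:
                queue.put_nowait(chunk)
        finally:
            # Always signal completion, including on error or cancellation.
            queue.put_nowait(_DONE)

    producer = asyncio.create_task(produce())
    try:
        while True:
            try:
                item = await asyncio.wait_for(queue.get(), timeout=interval)
            except asyncio.TimeoutError:
                yield KEEPALIVE_COMMENT  # nothing arrived within the interval
                continue
            if item is _DONE:
                await producer  # re-raise any error from the source
                return
            yield item
    finally:
        # Closing the wrapper (e.g. on client disconnect) cancels the producer.
        producer.cancel()
```

Because the producer runs in a task created with asyncio.create_task, closing the wrapper cancels that task with asyncio.CancelledError, which is why the second fix below matters.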

2. Catch asyncio.CancelledError in stream cleanup

Previously, _token_chunk_stream caught only anyio.get_cancelled_exc_class() for cancellation cleanup. Because the keepalive wrapper uses asyncio.create_task, client disconnects arrive as asyncio.CancelledError and bypass the cleanup that sends TaskCancelled to the worker. This left the worker generating indefinitely after the client disconnected.

The cleanup handler now catches both anyio.get_cancelled_exc_class() and asyncio.CancelledError.
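
The cleanup pattern can be sketched with the standard library alone. The generator below is a hypothetical stand-in for _token_chunk_stream, and on_cancel stands in for sending TaskCancelled to the worker; neither is taken from the PR's code.

```python
import asyncio


async def token_chunk_stream_sketch(tokens, on_cancel):
    """Hypothetical stand-in for _token_chunk_stream."""
    try:
        for token in tokens:
            await asyncio.sleep(0.01)  # simulate waiting on the worker
            yield token
    except asyncio.CancelledError:
        # The PR's handler also lists anyio.get_cancelled_exc_class() here.
        on_cancel()  # stand-in for sending TaskCancelled to the worker
        raise  # re-raise so cancellation keeps propagating


async def demo():
    cancelled = []

    async def consume():
        stream = token_chunk_stream_sketch(
            range(1000), lambda: cancelled.append(True)
        )
        async for _ in stream:
            pass

    task = asyncio.create_task(consume())
    await asyncio.sleep(0.05)
    task.cancel()  # roughly what a client disconnect does to the producer task
    try:
        await task
    except asyncio.CancelledError:
        pass
    return cancelled
```

Running asyncio.run(demo()) should show that the cleanup callback fired once when the consuming task was cancelled.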

Test plan

  • SSE keepalive comments sent during long thinking phases
  • Worker stops generating within seconds of client disconnect
  • Normal streaming unaffected
  • Stream terminates cleanly with data: [DONE] on normal completion
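
For checking the first and last items by hand, a small client-side parser can help. This is an illustrative sketch, not part of the PR: parse_sse is a hypothetical helper that skips comment lines, as a compliant client would, and stops at the [DONE] marker.

```python
def parse_sse(raw: str) -> list:
    """Return data payloads, skipping comments and stopping at [DONE]."""
    payloads = []
    for event in raw.split("\n\n"):
        data_lines = []
        for line in event.split("\n"):
            if line.startswith(":"):
                continue  # comment line, e.g. ": keepalive"
            if line.startswith("data:"):
                value = line[len("data:"):]
                if value.startswith(" "):
                    value = value[1:]  # drop the single optional leading space
                data_lines.append(value)
        if not data_lines:
            continue  # event carried only comments or was empty
        payload = "\n".join(data_lines)
        if payload == "[DONE]":
            break
        payloads.append(payload)
    return payloads
```

Feeding it a captured response body should yield only the JSON payloads, with keepalive comments dropped.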

🤖 Generated with Claude Code

dale-lakes and others added 2 commits March 16, 2026 17:35
anyio task groups can't be used with yield (crossing task boundaries).
Instead, wrap the final SSE string stream at the StreamingResponse
level using a plain asyncio queue + task pattern.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The sse_with_keepalive wrapper uses asyncio.create_task for its
producer. When the HTTP client disconnects, the producer task gets
cancelled with asyncio.CancelledError. But _token_chunk_stream only
catches anyio's cancellation exception, so it never sends
TaskCancelled to the worker — leaving the model generating
indefinitely after the client is gone.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@dale-lakes dale-lakes changed the title from "feat: SSE keepalive during streaming + fix cancellation propagation" to "feat: SSE keepalive for chat completions streaming + fix cancellation propagation" Mar 17, 2026
@dale-lakes dale-lakes marked this pull request as ready for review March 17, 2026 19:32