fix(llmobs): capture reasoning_content from streamed chat completions#18274
Conversation
The streamed-chunk aggregator never read delta.reasoning_content, so OpenAI-compatible reasoning providers (DeepSeek, Qwen, etc.) had their reasoning text silently dropped on the LLM Obs span while reasoning_output_tokens was still reported. Add the missing accumulation; downstream openai_set_meta_tags_from_chat already emits the role: "reasoning" output message when the key is present. Fixes #18257 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Codeowners resolved as |
|
Address review feedback: initialize reasoning_content as empty string and pop if still empty at the end, matching how tool_calls is handled in the same function. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Match the chunk_content pattern on the adjacent line. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
/merge |
|
View all feedbacks in Devflow UI.
The expected merge time in
Tests failed on this commit b0b5498: What to do next?
|
|
This change is marked for backport to 4.9 and it does not conflict with that branch. |
|
This change is marked for backport to 4.8 and it does not conflict with that branch. |
|
/merge |
|
View all feedbacks in Devflow UI.
This pull request is not mergeable according to GitHub. Common reasons include pending required checks, missing approvals, or merge conflicts — but it could also be blocked by other repository rules or settings.
The expected merge time in
|
…#18274) ## Summary Fixes #18257. The OpenAI + Litellm streamed-chunk aggregator `openai_construct_message_from_streamed_chunks` never read `delta.reasoning_content` from streamed chunks, so OpenAI-compatible reasoning providers (DeepSeek-V3/V4, Qwen reasoning models on Baseten/Fireworks, etc.) had their reasoning text silently dropped on the LLM Obs span. Both the OpenAI and LiteLLM integrations call this aggregator (`ddtrace/contrib/internal/openai/utils.py:151`, `ddtrace/contrib/internal/litellm/utils.py:55`) for streamed chat responses and pass the result straight to `openai_set_meta_tags_from_chat`, which already checks for `reasoning_content` messages but reasoning content messages were never constructed from the streamed response. The fix checks for and accumulates `delta.reasoning_content` message chunks. ## Notes for reviewers - The OpenAI Python SDK does not declare `reasoning_content` as a typed field on `ChoiceDelta`, but its `BaseModel` is configured with `extra="allow"` so the field passes through as an attribute when emitted by an OpenAI-compatible provider. LiteLLM's `Delta` type exposes `reasoning_content` directly. - Avoiding an E2E regression test for now because I don't have a deepseek API key 😢 but the unit tests should be sufficient to get this fix out. Claude session: `948c6399-4afc-4ad8-acfc-d05db310902d` Resume: `claude --resume 948c6399-4afc-4ad8-acfc-d05db310902d` Co-authored-by: yun.kim <yun.kim@datadoghq.com> (cherry picked from commit b4420fa) Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
…#18274) ## Summary Fixes #18257. The OpenAI + Litellm streamed-chunk aggregator `openai_construct_message_from_streamed_chunks` never read `delta.reasoning_content` from streamed chunks, so OpenAI-compatible reasoning providers (DeepSeek-V3/V4, Qwen reasoning models on Baseten/Fireworks, etc.) had their reasoning text silently dropped on the LLM Obs span. Both the OpenAI and LiteLLM integrations call this aggregator (`ddtrace/contrib/internal/openai/utils.py:151`, `ddtrace/contrib/internal/litellm/utils.py:55`) for streamed chat responses and pass the result straight to `openai_set_meta_tags_from_chat`, which already checks for `reasoning_content` messages but reasoning content messages were never constructed from the streamed response. The fix checks for and accumulates `delta.reasoning_content` message chunks. ## Notes for reviewers - The OpenAI Python SDK does not declare `reasoning_content` as a typed field on `ChoiceDelta`, but its `BaseModel` is configured with `extra="allow"` so the field passes through as an attribute when emitted by an OpenAI-compatible provider. LiteLLM's `Delta` type exposes `reasoning_content` directly. - Avoiding an E2E regression test for now because I don't have a deepseek API key 😢 but the unit tests should be sufficient to get this fix out. Claude session: `948c6399-4afc-4ad8-acfc-d05db310902d` Resume: `claude --resume 948c6399-4afc-4ad8-acfc-d05db310902d` Co-authored-by: yun.kim <yun.kim@datadoghq.com> (cherry picked from commit b4420fa) Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
Summary
Fixes #18257.
The OpenAI + Litellm streamed-chunk aggregator
openai_construct_message_from_streamed_chunksnever readdelta.reasoning_contentfrom streamed chunks, so OpenAI-compatible reasoning providers (DeepSeek-V3/V4, Qwen reasoning models on Baseten/Fireworks, etc.) had their reasoning text silently dropped on the LLM Obs span.Both the OpenAI and LiteLLM integrations call this aggregator (
ddtrace/contrib/internal/openai/utils.py:151,ddtrace/contrib/internal/litellm/utils.py:55) for streamed chat responses and pass the result straight toopenai_set_meta_tags_from_chat, which already checks forreasoning_contentmessages but reasoning content messages were never constructed from the streamed response.The fix checks for and accumulates
delta.reasoning_contentmessage chunks.Notes for reviewers
reasoning_contentas a typed field onChoiceDelta, but itsBaseModelis configured withextra="allow"so the field passes through as an attribute when emitted by an OpenAI-compatible provider. LiteLLM'sDeltatype exposesreasoning_contentdirectly.Claude session:
948c6399-4afc-4ad8-acfc-d05db310902dResume:
claude --resume 948c6399-4afc-4ad8-acfc-d05db310902d