
fix(telemetry): emit OTel-standard gen_ai.usage.cache_read.input_tokens across providers #1666

Merged

gold-silver-copper merged 3 commits into 0xPlaygrounds:main from alwayys-afk:fix/openai-otel-cache-token-attrs on Apr 28, 2026

fix(telemetry): emit OTel-standard gen_ai.usage.cache_read.input_tokens across providers#1666
gold-silver-copper merged 3 commits into0xPlaygrounds:mainfrom
alwayys-afk:fix/openai-otel-cache-token-attrs

Conversation

@alwayys-afk
Contributor

Summary

  • Rename the tracing/OTel attribute gen_ai.usage.cached_tokens to gen_ai.usage.cache_read.input_tokens across every provider that emits it.
  • gen_ai.usage.cache_read.input_tokens is the canonical attribute name listed in the OpenTelemetry GenAI semantic-conventions registry. The previous name was non-standard and would not be picked up by OTel-aware backends.
  • Every instance (span declarations and span.record(...) call sites, on both streaming and non-streaming paths) has been updated so that a single provider never emits two different attribute names; see the sketch below.
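
For concreteness, here is a minimal sketch of what the rename looks like at a span declaration and its record site. The span name and surrounding code are illustrative; only the attribute names come from this PR.

```rust
use tracing::field::Empty;

fn main() {
    // Hypothetical provider span; only the attribute names are real.
    let span = tracing::info_span!(
        "chat_completion",
        gen_ai.usage.input_tokens = Empty,
        gen_ai.usage.output_tokens = Empty,
        // before: gen_ai.usage.cached_tokens = Empty,
        gen_ai.usage.cache_read.input_tokens = Empty,
    );

    // ...once the provider response has been parsed:
    span.record("gen_ai.usage.input_tokens", 1_000u64);
    span.record("gen_ai.usage.output_tokens", 200u64);
    // before: span.record("gen_ai.usage.cached_tokens", 600u64);
    span.record("gen_ai.usage.cache_read.input_tokens", 600u64);
}
```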

Providers touched

openai (chat + responses_api, streaming + non-streaming), openai chat-completions-compatible shared helper, anthropic (via the existing cached_input_tokens plumbing), azure, chatgpt, cohere, copilot, deepseek, galadriel, gemini (completion, streaming, interactions_api), groq, huggingface, hyperbolic, llamafile, mira, mistral, moonshot, ollama, openrouter, perplexity, together, xai.

Notes

  • No provider wire-format changes. Provider response structs that happen to deserialize a JSON field named cached_tokens (e.g. DeepSeek's prompt_tokens_details.cached_tokens, OpenAI's input_tokens_details.cached_tokens) keep those names — those are the provider's API, not our telemetry.
  • Per OTel guidance, cache_read.input_tokens SHOULD be included in gen_ai.usage.input_tokens (see the sketch after these notes). Existing providers already follow that convention; this PR does not change any token-accounting logic.
  • This is a breaking change for any downstream dashboards/alerts that queried the old attribute name.
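
A rough sketch of how the wire-format and token-accounting notes above fit together; the struct shapes and helper are hypothetical stand-ins rather than rig's actual types, and only the provider JSON field name and the telemetry attribute names are taken from this PR:

```rust
use serde::Deserialize;
use tracing::Span;

// Hypothetical response shapes: the wire-format field keeps the provider's
// own name (`cached_tokens`), while the span attribute uses the OTel name.
#[derive(Deserialize)]
struct PromptTokensDetails {
    cached_tokens: u64,
}

#[derive(Deserialize)]
struct Usage {
    prompt_tokens: u64,
    prompt_tokens_details: Option<PromptTokensDetails>,
}

fn record_usage(span: &Span, usage: &Usage) {
    // cache_read.input_tokens is a subset of input_tokens (per the OTel
    // guidance above), so it is recorded alongside it, not added on top.
    span.record("gen_ai.usage.input_tokens", usage.prompt_tokens);
    if let Some(details) = &usage.prompt_tokens_details {
        span.record(
            "gen_ai.usage.cache_read.input_tokens",
            details.cached_tokens,
        );
    }
}
```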

Test plan

  • cargo fmt -- --check
  • cargo clippy -p rig-core --all-targets --all-features — clean
  • cargo test -p rig-core --all-features --lib — 555 passed, 0 failed, 8 ignored
  • grep -rn "gen_ai.usage.cached_tokens" rig/ returns no matches

Commits

Provider spans declared `gen_ai.usage.cached_tokens` (non-OTel) but the
shared telemetry helpers (`SpanCombinator::record_token_usage`,
`openai_chat_completions_compatible::record_usage`) write the OTel-
standard `gen_ai.usage.cache_read.input_tokens`. Because `tracing`
silently drops `.record()` for fields not declared on the span, cached-
token values were being computed and thrown away on every provider whose
span did not declare the OTel name.
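
A minimal sketch of that failure mode (nothing here is rig's actual code; it only demonstrates the `tracing` behavior described above):

```rust
use tracing::field::Empty;

fn main() {
    // Span that declares only the old, non-standard field name.
    let span = tracing::info_span!("chat", gen_ai.usage.cached_tokens = Empty);

    // A shared helper writes the OTel-standard name; the span never declared
    // that field, so `tracing` drops the value silently: no error, and no
    // attribute on the exported span.
    span.record("gen_ai.usage.cache_read.input_tokens", 512u64);

    // Only records that target a declared field are kept.
    span.record("gen_ai.usage.cached_tokens", 512u64);
}
```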

Declare `gen_ai.usage.cache_read.input_tokens` on every affected span
across OpenAI, Azure, Cohere, Gemini (incl. interactions_api), Groq,
DeepSeek, HuggingFace, Mistral, OpenRouter, Together, xAI, Copilot,
and llamafile. Emit `cache_read.input_tokens` from the OpenAI chat-
completions compatible helper and the OpenAI Responses API record
sites. Do not emit `cache_creation.input_tokens` on OpenAI-family spans
— those APIs have no cache-creation concept and a hardcoded 0 would be
misleading rather than informative. Anthropic, which does report cache
creation, is unchanged.

Remove the non-OTel `gen_ai.usage.cached_tokens` attribute from every
path this change touches: drop the span declaration and, on paths whose
only recording sites are modified here, drop the record calls too. Only
the OTel-standard attribute is emitted from these paths.

Spans that record cache tokens entirely through their own inline
`span.record("gen_ai.usage.cached_tokens", ...)` calls (e.g. non-
streaming paths of Groq, DeepSeek, Copilot; standalone files like
Galadriel, Hyperbolic, Mira, Moonshot, Ollama, Perplexity, ChatGPT;
non-streaming Together and xAI) are out of scope for this change and
continue to emit only `cached_tokens`.
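
To make that scoping concrete, here is a hedged sketch of the two recording styles; the span names and the helper signature are stand-ins for illustration, not the actual rig code:

```rust
use tracing::{field::Empty, Span};

// Stand-in for the shared record_usage helper: it writes the OTel-standard
// attribute, so any span handed to it must declare that exact field name.
fn record_usage(span: &Span, cached_read: u64) {
    span.record("gen_ai.usage.cache_read.input_tokens", cached_read);
}

fn main() {
    // Style A: inline recording with the old name. These are the paths left
    // out of scope above; they keep emitting `cached_tokens`.
    let inline_span =
        tracing::info_span!("completion", gen_ai.usage.cached_tokens = Empty);
    inline_span.record("gen_ai.usage.cached_tokens", 256u64);

    // Style B: recording delegated to the shared helper, which is why the
    // span declaration must be renamed for the value to land.
    let helper_span = tracing::info_span!(
        "completion",
        gen_ai.usage.cache_read.input_tokens = Empty,
    );
    record_usage(&helper_span, 256u64);
}
```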
Galadriel, Hyperbolic, Mira, Moonshot, and Perplexity all route their
streaming paths through `send_compatible_streaming_request`, whose
shared `record_usage` helper was updated in the prior commit to write
the OTel-standard `gen_ai.usage.cache_read.input_tokens`. These five
provider spans still declared `gen_ai.usage.cached_tokens`, so the
newly-recorded value was silently dropped by `tracing` — and since
none of these files inline-records `cached_tokens` either, their
streaming paths were emitting no cache-read metric at all after the
prior commit landed.

Rename the declaration to `gen_ai.usage.cache_read.input_tokens` on
both the non-streaming and streaming span in each file, matching the
pattern the prior commit already applied across the other providers.
The non-streaming rename is a no-op (no recorder targets the field on
that path) but keeps both spans in each file consistent.

The prior two commits partially renamed gen_ai.usage.cached_tokens to the
canonical OTel GenAI attribute gen_ai.usage.cache_read.input_tokens but
left several providers (and some streaming-vs-non-streaming paths within
a provider) emitting the old name. Finish the rename in chatgpt,
copilot, deepseek (non-streaming), groq (non-streaming), ollama,
together/completion, and xai/completion so every span consistently
emits cache_read.input_tokens.

@anish-kristipati left a comment


Good catch on the caching telemetry issues

@gold-silver-copper added this pull request to the merge queue on Apr 28, 2026
Merged via the queue into 0xPlaygrounds:main with commit 3a07cb4 on Apr 28, 2026
6 checks passed
