Add TTL cache + cache exclusion handling #299

Open
Elijah Kurien (elijah0528) wants to merge 1 commit into main from elijah/anthropic-cost-consistency-fix

@elijah0528

Add support in lingua for the TTL cache that Anthropic supports, and convert its usage to the OpenAI convention so that logging is consistent.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 69f7224b26

Comment on lines +377 to +384
prompt_cache_creation_5m_tokens: usage
    .get("cache_creation")
    .and_then(|c| c.get("ephemeral_5m_input_tokens"))
    .and_then(Value::as_i64),
prompt_cache_creation_1h_tokens: usage
    .get("cache_creation")
    .and_then(|c| c.get("ephemeral_1h_input_tokens"))
    .and_then(Value::as_i64),

P1: Keep TTL usage parsing behind typed boundaries

The project guide in AGENTS.md says provider-boundary code must not inspect serde_json::Value directly for provider semantics and must deserialize into typed structs/enums first; these new TTL fields pluck cache_creation and its children directly from the raw map. Since Anthropic already has generated Usage/CacheCreation types for this shape, parsing the TTL split through a typed view avoids bypassing the typed-boundary check and drifting from the canonical serde renames.
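
As a rough illustration of the typed-view idea, the sketch below reads the TTL split from struct fields instead of chained `Value::get` lookups. All type and function names here (`CacheCreation`, `AnthropicUsage`, `TtlSplit`, `ttl_split`) are hypothetical stand-ins; the generated Anthropic types in the repository may be shaped differently. The deserialization step itself is assumed to have already happened and is left out so the sketch depends only on the standard library.

```rust
// Hypothetical typed view of the Anthropic usage payload. Field names mirror
// the JSON keys quoted in the diff; the real generated types may differ.
#[derive(Debug, Default, Clone, PartialEq)]
struct CacheCreation {
    ephemeral_5m_input_tokens: Option<i64>,
    ephemeral_1h_input_tokens: Option<i64>,
}

#[derive(Debug, Default, Clone, PartialEq)]
struct AnthropicUsage {
    input_tokens: Option<i64>,
    cache_creation: Option<CacheCreation>,
}

// Hypothetical slice of the provider-neutral usage struct: only the two
// TTL fields that the diff adds.
#[derive(Debug, Default, PartialEq)]
struct TtlSplit {
    prompt_cache_creation_5m_tokens: Option<i64>,
    prompt_cache_creation_1h_tokens: Option<i64>,
}

// Read the TTL split through typed fields. A missing `cache_creation`
// object or a missing bucket both map to `None`.
fn ttl_split(usage: &AnthropicUsage) -> TtlSplit {
    let cache_creation = usage.cache_creation.as_ref();
    TtlSplit {
        prompt_cache_creation_5m_tokens: cache_creation
            .and_then(|c| c.ephemeral_5m_input_tokens),
        prompt_cache_creation_1h_tokens: cache_creation
            .and_then(|c| c.ephemeral_1h_input_tokens),
    }
}
```

With this shape, a misspelled key becomes a compile error on the field access rather than a silent `None` at runtime, which is the drift the comment above warns about.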

Comment on lines +411 to +425
pub fn inclusive_prompt_tokens(&self) -> Option<i64> {
    if !self.prompt_tokens_exclude_cache {
        return self.prompt_tokens;
    }
    if self.prompt_tokens.is_none()
        && self.prompt_cached_tokens.is_none()
        && self.prompt_cache_creation_tokens.is_none()
    {
        return None;
    }
    Some(
        self.prompt_tokens.unwrap_or(0)
            + self.prompt_cached_tokens.unwrap_or(0)
            + self.prompt_cache_creation_tokens.unwrap_or(0),
    )

P2: Use inclusive prompt totals for OpenAI-style usage

This helper is added to compute OpenAI-convention prompt totals, but the provider serializers still read self.prompt_tokens directly for OpenAI/Responses/Google. For an Anthropic or Converse response with input_tokens: 10 and cache read/write tokens, transforming to OpenAI-style output will still emit prompt_tokens: 10 and an understated total_tokens, even though the new flag says those prompt tokens exclude cache buckets. Wire this helper into the OpenAI-style to_provider_value paths and Google UsageMetadata conversion so cached tokens are counted consistently.
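
The wiring suggested above might look roughly like the following sketch. It is not the repository's serializer: `UniversalUsage`, `OpenAiStyleUsage`, and `to_openai_style` are hypothetical names, and the struct is trimmed to the fields the helper needs. Only the body of `inclusive_prompt_tokens` is taken from the quoted diff.

```rust
// Hypothetical, trimmed-down usage struct carrying the fields the helper reads.
#[derive(Debug, Default, Clone)]
struct UniversalUsage {
    prompt_tokens: Option<i64>,
    completion_tokens: Option<i64>,
    prompt_cached_tokens: Option<i64>,
    prompt_cache_creation_tokens: Option<i64>,
    // True when `prompt_tokens` excludes the cache read/write buckets
    // (the Anthropic-style convention).
    prompt_tokens_exclude_cache: bool,
}

impl UniversalUsage {
    // Same logic as the helper quoted in the diff.
    fn inclusive_prompt_tokens(&self) -> Option<i64> {
        if !self.prompt_tokens_exclude_cache {
            return self.prompt_tokens;
        }
        if self.prompt_tokens.is_none()
            && self.prompt_cached_tokens.is_none()
            && self.prompt_cache_creation_tokens.is_none()
        {
            return None;
        }
        Some(
            self.prompt_tokens.unwrap_or(0)
                + self.prompt_cached_tokens.unwrap_or(0)
                + self.prompt_cache_creation_tokens.unwrap_or(0),
        )
    }
}

// Hypothetical OpenAI-style output shape (not the real serializer output).
#[derive(Debug, PartialEq)]
struct OpenAiStyleUsage {
    prompt_tokens: Option<i64>,
    completion_tokens: Option<i64>,
    total_tokens: Option<i64>,
}

// Build the OpenAI-style view from the inclusive prompt total rather than
// from `prompt_tokens` directly, so cache buckets are counted.
fn to_openai_style(usage: &UniversalUsage) -> OpenAiStyleUsage {
    let prompt = usage.inclusive_prompt_tokens();
    let total = match (prompt, usage.completion_tokens) {
        (None, None) => None,
        (p, c) => Some(p.unwrap_or(0) + c.unwrap_or(0)),
    };
    OpenAiStyleUsage {
        prompt_tokens: prompt,
        completion_tokens: usage.completion_tokens,
        total_tokens: total,
    }
}
```

For a usage record with `prompt_tokens: 10`, 100 cached tokens, 50 cache-creation tokens, 20 completion tokens, and the exclusion flag set, this sketch reports `prompt_tokens: 160` and `total_tokens: 180`, where reading `prompt_tokens` directly would give 10 and 30.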
