fix(langchain): stop double-counting anthropic cache tokens in prompt totals #510

Merged
Abhijeet Prasad (AbhiPrasad) merged 3 commits into main from fix/langchain-anthropic-cache-double-count on Jun 11, 2026

Conversation

@AbhiPrasad

Member

bhaveshklaviyo and others added 3 commits June 9, 2026 18:16
… totals

langchain-anthropic has folded cache read/creation tokens into
usage_metadata input_tokens since 0.2.3 (earlier versions don't emit
input_token_details at all), and langchain-aws does the same. Per the
langchain-core UsageMetadata contract, input_token_details is a breakdown
of input_tokens, not an addition to it.
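
To make the contract concrete, here is a minimal sketch of the invariant it implies: the cache entries in `input_token_details` should never sum to more than `input_tokens`. The helper name `details_within_input` and the token counts are made up for illustration; only the `input_tokens`, `input_token_details`, `cache_read`, and `cache_creation` keys come from the UsageMetadata shape.

```python
from typing import Any, Mapping


def details_within_input(usage: Mapping[str, Any]) -> bool:
    """Hypothetical check: cache details are a breakdown of input_tokens."""
    details = usage.get("input_token_details") or {}
    breakdown = sum(int(details.get(key) or 0) for key in ("cache_read", "cache_creation"))
    return breakdown <= int(usage.get("input_tokens") or 0)


# Illustrative numbers only: 1,000 input tokens, 950 of which touched the cache.
folded_usage = {
    "input_tokens": 1_000,
    "output_tokens": 20,
    "total_tokens": 1_020,
    "input_token_details": {"cache_read": 800, "cache_creation": 150},
}

# A payload that reports cache tokens separately violates the invariant.
separate_usage = {
    "input_tokens": 50,
    "output_tokens": 20,
    "total_tokens": 70,
    "input_token_details": {"cache_read": 800, "cache_creation": 150},
}
```

Under these assumptions the first payload satisfies the check and the second does not.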

The cache normalization from #411/#445 detected "separate cache token
accounting" by the presence of cache_creation/ephemeral_* detail keys,
which langchain-anthropic always emits, so every cached Anthropic call
had cache tokens added to prompt_tokens a second time. With a warm cache
this roughly doubles reported prompt tokens (e.g. a real trace reported
75,387 prompt tokens for a 37,694-token request with 37,324 cache reads
and 369 cache writes).
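
The double count can be reproduced with a rough reconstruction of a presence-based check. This is a sketch of the behavior described above, not the code from #411/#445; the function name `presence_based_prompt_tokens` is hypothetical, and the token counts are the ones quoted from the trace.

```python
from typing import Any, Mapping


def presence_based_prompt_tokens(usage: Mapping[str, Any]) -> int:
    """Hypothetical reconstruction: treat cache detail keys as proof of separate accounting."""
    details = usage.get("input_token_details") or {}
    prompt = int(usage.get("input_tokens") or 0)
    looks_separate = any(
        key == "cache_creation" or key.startswith("ephemeral_") for key in details
    )
    if looks_separate:
        # Wrong when input_tokens already includes the cache tokens.
        prompt += int(details.get("cache_read") or 0) + int(details.get("cache_creation") or 0)
    return prompt


# input_tokens already contains the cache reads and writes.
trace_usage = {
    "input_tokens": 37_694,
    "input_token_details": {"cache_read": 37_324, "cache_creation": 369},
}

reported = presence_based_prompt_tokens(trace_usage)  # 37,694 + 37,324 + 369
```

With the trace's numbers the sketch yields 75,387, matching the inflated figure, while the request itself was 37,694 tokens.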

Detect separate accounting arithmetically instead: only fold cache
tokens into prompt/total when they exceed the reported prompt total,
which is impossible under the UsageMetadata contract but is exactly the
inconsistency the original normalization (BT-5150) was added to repair.
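
A minimal sketch of the arithmetic check, assuming a UsageMetadata-shaped mapping as input. The helper name `normalize_prompt_tokens`, the output keys, and the `output_tokens` values are hypothetical and chosen for illustration; the actual change in this PR may be structured differently.

```python
from typing import Any, Mapping


def normalize_prompt_tokens(usage: Mapping[str, Any]) -> dict[str, int]:
    """Hypothetical helper: fold cache tokens in only when they cannot already be included."""
    details = usage.get("input_token_details") or {}
    cache_tokens = int(details.get("cache_read") or 0) + int(details.get("cache_creation") or 0)

    prompt = int(usage.get("input_tokens") or 0)
    completion = int(usage.get("output_tokens") or 0)
    total = int(usage.get("total_tokens") or prompt + completion)

    # A breakdown can never exceed its total, so this can only be separate accounting.
    if cache_tokens > prompt:
        prompt += cache_tokens
        total += cache_tokens

    return {"prompt_tokens": prompt, "completion_tokens": completion, "total_tokens": total}


cache_details = {"cache_read": 37_324, "cache_creation": 369}

# Folded convention: input_tokens already includes cache tokens, so nothing is added.
folded = normalize_prompt_tokens(
    {"input_tokens": 37_694, "output_tokens": 10, "total_tokens": 37_704,
     "input_token_details": cache_details}
)

# Separate convention: cache tokens exceed input_tokens, so they are folded in.
separate = normalize_prompt_tokens(
    {"input_tokens": 1, "output_tokens": 10, "total_tokens": 11,
     "input_token_details": cache_details}
)
```

In this sketch both payloads normalize to the same 37,694 prompt tokens. One property worth noting: a separately accounted payload whose cache tokens do not exceed its uncached input would pass through unchanged, since the check cannot distinguish it from a valid breakdown.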

Strengthen the VCR prompt-caching test to assert span prompt/total
tokens equal the usage_metadata the model reported, and add unit
coverage for the folded (Anthropic), subset (OpenAI), and separate
(legacy) conventions.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@AbhiPrasad Abhijeet Prasad (AbhiPrasad) merged commit bf9dfaf into main Jun 11, 2026
82 checks passed
@AbhiPrasad Abhijeet Prasad (AbhiPrasad) deleted the fix/langchain-anthropic-cache-double-count branch June 11, 2026 18:23

3 participants