
Multi-step tool calls overwrite token counts instead of accumulating (output, reasoning, cache.write lost) #21913

@KonstantinMirin

Description


Bug

In multi-step tool-call assistant messages, processor.ts overwrites assistantMessage.tokens on each finish-step event instead of accumulating additive fields. Only the last step's token counts survive.

Root cause: processor.ts line ~362:

```ts
ctx.assistantMessage.tokens = usage.tokens  // overwrite!
```

While `ctx.assistantMessage.cost += usage.cost` correctly accumulates, `tokens` is replaced wholesale.

Impact

For an assistant message with N tool-call steps:

| Field | Current | Correct | Why |
|---|---|---|---|
| `input` | Last step's value | Last step's value | Each step's `inputTokens` includes the full conversation prompt → last step is already correct |
| `cache.read` | Last step's value | Last step's value | Cache read reflects current cache state → snapshot, not cumulative |
| `output` | Last step only | Sum across all steps | Each step produces new output tokens |
| `reasoning` | Last step only | Sum across all steps | Each step produces new reasoning tokens |
| `cache.write` | Last step only | Sum across all steps | Each step may write new entries to cache |
| `total` | API `totalTokens` from last step | Derived from components | `totalTokens = inputTokens + outputTokens`, but our `input` is adjusted (cache subtracted) |
| `cost` | Correctly accumulated | No change | Already uses `+=` |
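The snapshot-vs-sum semantics in the table can be sketched as a small fold over per-step usage (the types and values here are illustrative, not opencode's actual definitions):

```typescript
// Illustrative token shape; the real record also carries a total field.
interface TokenUsage {
  input: number
  output: number
  reasoning: number
  cache: { read: number; write: number }
}

// Fold one finish-step usage into the running message record:
// input and cache.read are snapshots (keep the latest value), while
// output, reasoning, and cache.write are additive across steps.
function accumulate(prev: TokenUsage | undefined, step: TokenUsage): TokenUsage {
  return {
    input: step.input,
    output: (prev?.output ?? 0) + step.output,
    reasoning: (prev?.reasoning ?? 0) + step.reasoning,
    cache: {
      read: step.cache.read,
      write: (prev?.cache?.write ?? 0) + step.cache.write,
    },
  }
}

// Two hypothetical tool-call steps:
const step1: TokenUsage = { input: 100, output: 20, reasoning: 5, cache: { read: 0, write: 50 } }
const step2: TokenUsage = { input: 180, output: 30, reasoning: 10, cache: { read: 50, write: 10 } }
const merged = accumulate(accumulate(undefined, step1), step2)
// merged: input 180 (snapshot), output 50, reasoning 15, cache.read 50, cache.write 60
```

With the current overwrite behavior, `merged.output` would be 30 (step 2 only) instead of 50.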

Also fixes:

  • Context % display and compaction used total (which double-counted cached tokens) instead of deriving from components
  • Custom provider models without limit.context defaulted to 0, breaking context % display and disabling auto-compaction entirely

Changes

1. Token accumulation in processor.ts

Replace the overwrite with field-wise accumulation:

```ts
const prev = ctx.assistantMessage.tokens
ctx.assistantMessage.tokens = {
  total: usage.total,
  input: usage.tokens.input,
  output: (prev?.output ?? 0) + usage.tokens.output,
  reasoning: (prev?.reasoning ?? 0) + usage.tokens.reasoning,
  cache: {
    read: usage.tokens.cache.read,
    write: (prev?.cache?.write ?? 0) + usage.tokens.cache.write,
  },
}
```

2. Derive total from components in getUsage()

total is now computed as input + output + reasoning + cache.read + cache.write instead of using the API's totalTokens (which double-counts cached tokens since AI SDK v6 includes them in inputTokens).
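A minimal sketch of the double-count this avoids, assuming (as the AI SDK v6 note above states) that the reported `inputTokens` already includes cached tokens:

```typescript
interface Tokens {
  input: number
  output: number
  reasoning: number
  cache: { read: number; write: number }
}

// Derive total from components so every token is counted exactly once.
function derivedTotal(t: Tokens): number {
  return t.input + t.output + t.reasoning + t.cache.read + t.cache.write
}

// Hypothetical step: the API reports inputTokens = 1000, of which 800
// were cache reads. After subtracting the cached portion from input,
// the component record is:
const t: Tokens = { input: 200, output: 50, reasoning: 10, cache: { read: 800, write: 0 } }
derivedTotal(t)  // 1060 — the 800 cached tokens appear exactly once
// The API's totalTokens would be 1000 + 50 = 1050, but combining it with
// the adjusted input would misstate the breakdown.
```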

3. Add MessageV2.promptSize() and MessageV2.totalSize() helpers

  • totalSize = input + output + reasoning + cache.read + cache.write — full conversation size after this turn (used for context % display and compaction threshold)
  • promptSize = input + cache.read + cache.write — current prompt footprint (input tokens sent to the LLM)

overflow.ts uses totalSize for compaction because output/reasoning tokens become part of the context on the next turn.
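The two helpers reduce to simple sums over the token record (shapes here are assumed, not the actual `MessageV2` signatures):

```typescript
interface Tokens {
  input: number
  output: number
  reasoning: number
  cache: { read: number; write: number }
}

// Full conversation size after this turn: output/reasoning join the
// context on the next turn, so compaction must count them.
const totalSize = (t: Tokens) =>
  t.input + t.output + t.reasoning + t.cache.read + t.cache.write

// Current prompt footprint: what was actually sent to the LLM this turn.
const promptSize = (t: Tokens) => t.input + t.cache.read + t.cache.write

const t: Tokens = { input: 200, output: 50, reasoning: 10, cache: { read: 800, write: 40 } }
promptSize(t)  // 1040
totalSize(t)   // 1100
```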

4. Fix limit.context default from 0 to 128,000

Custom provider models not in models.dev no longer get `context: 0`, which broke context % display and disabled auto-compaction. `limit.output` likewise now defaults to 4,096 instead of 0.

5. ACP usage fix

acp/agent.ts now includes cache.write in the used token count for usage reporting (input + cache.read + cache.write).
