feat(cli): track local-model cost savings against a paid baseline (#421) #423

Open
justingheorghe wants to merge 1 commit into getagentseal:main from justingheorghe:feature/issue-421-local-model-savings

@justingheorghe justingheorghe commented Jun 1, 2026

Implements the full local-model cost-savings accounting surface for the dashboard, JSON/CSV exports, menubar, GNOME, and macOS clients. Local-model calls now report both the actual spend (still $0) and the counterfactual avoided spend against a user-chosen paid baseline, kept as separate fields so the two never get summed into a misleading "real cost".

Forecasting is intentionally out of scope for this PR per the issue conversation; everything else in the plan is in.

Closes #421

Configuration

A new top-level localModelSavings mapping joins modelAliases in ~/.config/codeburn/config.json. It is distinct from modelAliases, which rewrites a model's identity for actual cost: a localModelSavings entry keeps the local call at $0 and reports what the same tokens would have cost on the baseline.

{
  "localModelSavings": {
    "llama3.1:8b": "gpt-4o",
    "qwen2.5-coder:32b": "claude-3-5-sonnet-20241022"
  }
}

CLI management mirrors model-alias:

codeburn model-savings <local-model> <baseline-model>   # add
codeburn model-savings --list                           # list
codeburn model-savings --remove <local-model>           # remove

When the same model is also present in modelAliases, the new command prints a one-time warning that local-savings wins for actual cost (and the parser enforces this — cost stays 0 even if a stale modelAliases entry exists).
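To make the relationship between the two maps concrete, here is a minimal sketch of the config shape and of the conflict check behind the warning. Only the field names `modelAliases` and `localModelSavings` come from this PR; the interface body and the `findAliasConflicts` helper are hypothetical and not the code in `src/config.ts` or `src/main.ts`.

```typescript
// Hypothetical, trimmed-down config shape. Only the two field names are
// taken from the PR description.
interface CodeburnConfig {
  // Rewrites a model's identity, so the call is priced as the target model.
  modelAliases?: Record<string, string>;
  // Keeps the local call at $0 and names the paid baseline used for savings.
  localModelSavings?: Record<string, string>;
}

const config: CodeburnConfig = {
  localModelSavings: {
    "llama3.1:8b": "gpt-4o",
    "qwen2.5-coder:32b": "claude-3-5-sonnet-20241022",
  },
};

// Hypothetical helper: returns the model names present in both maps, which
// are the cases where the CLI would print its conflict warning.
function findAliasConflicts(cfg: CodeburnConfig): string[] {
  const aliases = cfg.modelAliases ?? {};
  const savings = cfg.localModelSavings ?? {};
  return Object.keys(savings).filter((name) =>
    Object.prototype.hasOwnProperty.call(aliases, name),
  );
}

console.log(findAliasConflicts(config)); // no aliases configured, so no conflicts
```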

Semantics

  • costUSD continues to mean actual spend. For a local call mapped via model-savings, it is forced to 0.
  • savingsUSD is the counterfactual baseline cost the same tokens would have incurred against the configured paid model. The baseline is priced through the normal calculateCost pipeline (aliases, canonicalization, fast multiplier, 1h cache multiplier, web search).
  • The two are exposed as separate fields on every aggregate (call, session, project, day, period, model, category, activity, skill, subagent) and a top-level localModelSavings block in the menubar payload.
  • getLocalModelSavingsConfigHash() produces a stable hash for the daily cache to detect when the user changes their baseline mapping and rebuild history.
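The cost/savings split described above can be sketched as follows. This is an illustration under stated assumptions, not the `applyLocalModelSavings` implementation: the `Call` type is a simplified stand-in for `ParsedApiCall`, `priceTokens` stands in for `calculateCost`, and the price table and model names are made up.

```typescript
// Simplified stand-in for ParsedApiCall; the real type has more fields.
interface Call {
  model: string;
  inputTokens: number;
  outputTokens: number;
  costUSD: number;
  savingsUSD?: number;
  savingsBaselineModel?: string;
  isLocalSavings?: boolean;
}

// Made-up per-million-token prices, for illustration only.
const PRICES: Record<string, { input: number; output: number }> = {
  "paid-baseline": { input: 2, output: 8 },
};

// Stand-in for calculateCost: prices tokens against a model, 0 if unknown.
function priceTokens(model: string, inputTokens: number, outputTokens: number): number {
  if (!Object.prototype.hasOwnProperty.call(PRICES, model)) return 0;
  const p = PRICES[model];
  return (inputTokens * p.input + outputTokens * p.output) / 1e6;
}

// For a mapped local model: actual cost is forced to 0 and the baseline
// price of the same tokens is recorded separately as savings.
function applySavings(call: Call, savingsMap: Record<string, string>): Call {
  if (!Object.prototype.hasOwnProperty.call(savingsMap, call.model)) return call;
  const baseline = savingsMap[call.model];
  return {
    ...call,
    costUSD: 0,
    savingsUSD: priceTokens(baseline, call.inputTokens, call.outputTokens),
    savingsBaselineModel: baseline,
    isLocalSavings: true,
  };
}

const mapped = applySavings(
  { model: "my-local", inputTokens: 1000000, outputTokens: 500000, costUSD: 0 },
  { "my-local": "paid-baseline" },
);
console.log(mapped.costUSD, mapped.savingsUSD); // 0 6
```

Keeping `costUSD` and `savingsUSD` as two fields on the same record is what lets every aggregate sum them independently.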

Code changes (19 files, +747 / -113)

Core pricing & config

  • src/config.ts — new localModelSavings field on CodeburnConfig
  • src/models.ts: setLocalModelSavings, getLocalSavingsBaseline, calculateLocalModelSavings, getLocalModelSavingsConfigHash; updated unknown-model hint to mention model-savings; defensive Object.hasOwn lookup so a hostile __proto__ model name cannot reach Object.prototype (regression test included)
  • src/types.ts: ParsedApiCall gets savingsUSD?, savingsBaselineModel?, isLocalSavings?; SessionSummary and ProjectSummary get totalSavingsUSD; every per-call breakdown (modelBreakdown, categoryBreakdown, skillBreakdown, subagentBreakdown) gets savingsUSD
  • src/parser.ts — new applyLocalModelSavings helper applied to Claude parse, providerCallToTurn, and cachedCallToApiCall; buildSessionSummary and the three ProjectSummary construction sites track savings totals
  • src/daily-cache.ts: DAILY_CACHE_VERSION 7 → 8, MIN_SUPPORTED_VERSION 8 (old v7 backups remain), savingsUSD on day/model/category/provider, savingsConfigHash on the cache; ensureCacheHydrated accepts a hash and discards cached days when the hash mismatches
  • src/day-aggregator.ts: DailyEntry and PeriodData carry savingsUSD; buildPeriodDataFromDays rolls up savings per model/category
  • src/menubar-json.ts — new LocalModelSavings type, distinct from optimize.savingsUSD and routingWaste.totalSavingsUSD; PeriodData.savingsUSD; savings on topModels, topActivities, topProjects, topSessions, daily history, and per-daily top-models
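As an illustration of the hash-based cache invalidation listed for `src/daily-cache.ts`, here is a rough sketch. It is not the PR's code: the djb2-style hash is a stand-in (the hash actually used by `getLocalModelSavingsConfigHash` is not stated here), and `DailyCache` and `hydrate` are simplified, hypothetical versions of the cache type and `ensureCacheHydrated`.

```typescript
// Stable hash of a savings map. Keys are sorted first so that insertion
// order does not change the result. djb2 (xor variant) is a stand-in.
function savingsConfigHash(map: Record<string, string>): string {
  const canonical = Object.keys(map)
    .sort()
    .map((k) => JSON.stringify([k, map[k]]))
    .join(",");
  let h = 5381;
  for (let i = 0; i < canonical.length; i++) {
    h = (((h << 5) + h) ^ canonical.charCodeAt(i)) >>> 0;
  }
  return h.toString(16);
}

// Hypothetical, trimmed-down cache shape.
interface DailyCache {
  version: number;
  savingsConfigHash: string;
  days: { date: string; costUSD: number; savingsUSD: number }[];
}

// Keeps the cache when the stored hash matches the current config;
// otherwise drops the cached days so they are rebuilt with the new baseline.
function hydrate(cache: DailyCache, currentHash: string): DailyCache {
  if (cache.savingsConfigHash === currentHash) return cache;
  return { ...cache, savingsConfigHash: currentHash, days: [] };
}

const before = savingsConfigHash({ "llama3.1:8b": "gpt-4o" });
const after = savingsConfigHash({ "llama3.1:8b": "gpt-5" });
const cache: DailyCache = {
  version: 8,
  savingsConfigHash: before,
  days: [{ date: "2026-06-01", costUSD: 0, savingsUSD: 1.2 }],
};
console.log(hydrate(cache, before).days.length); // 1 (hash matches, days kept)
console.log(hydrate(cache, after).days.length);  // 0 (baseline changed, days dropped)
```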

CLI

  • src/main.ts: preAction calls setLocalModelSavings(config.localModelSavings ?? {}) and threads the hash to ensureCacheHydrated; new model-savings command (set/list/remove + alias-conflict warning); buildJsonReport adds savings to overview/daily/projects/models/activities/skills/subagents/topSessions; buildPeriodData carries savings; status --format menubar-json computes a localModelSavings rollup (byModel + byProvider, top-5 each) and threads savings into topModels/topProjects/topSessions/daily history; status --format json adds today.savings, month.savings, and an optional localModelSavings block
  • src/models-report.ts: ModelReportRow gets savingsUSD + savingsBaselineModel; sort key is now cost + savings; default --min-cost 0.01 filter ORs in savingsUSD >= minCost so local models with $0 actual cost but > 0 savings still surface; new Saved column in table/markdown/CSV; JSON includes savingsUSD + savingsBaselineModel; drop priority keeps Saved even when other columns are pruned for narrow terminals
  • src/dashboard.tsx — green "saved $X by local models" footer line in the overview when any savings are present
  • src/export.ts: Saved (CODE) column on summary.csv, daily.csv, models.csv, projects.csv, sessions.csv; sort orders include savings
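The `--min-cost` and sort-key changes in the models report amount to roughly the following. This is a simplified sketch: `ModelRow` and `visibleRows` are hypothetical names, and the real `ModelReportRow` carries more columns.

```typescript
// Hypothetical, trimmed-down report row.
interface ModelRow {
  model: string;
  costUSD: number;
  savingsUSD: number;
}

// A row survives the filter if either its actual cost or its savings
// reaches the threshold. Rows are ranked by cost + savings, but the two
// values stay in separate fields for display.
function visibleRows(rows: ModelRow[], minCost: number): ModelRow[] {
  return rows
    .filter((r) => r.costUSD >= minCost || r.savingsUSD >= minCost)
    .sort((a, b) => (b.costUSD + b.savingsUSD) - (a.costUSD + a.savingsUSD));
}

const rows: ModelRow[] = [
  { model: "paid-model", costUSD: 1.25, savingsUSD: 0 },
  { model: "local-model", costUSD: 0, savingsUSD: 3.4 },
  { model: "tiny-model", costUSD: 0.001, savingsUSD: 0 },
];

console.log(visibleRows(rows, 0.01).map((r) => r.model));
// [ 'local-model', 'paid-model' ]
```

Without the OR on `savingsUSD`, `local-model` would be filtered out at the default threshold because its actual cost is $0.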

GUI consumers

  • mac/Sources/CodeBurnMenubar/Data/MenubarPayload.swift — new LocalModelSavings + LocalModelSavingsByModel + LocalModelSavingsByProvider Codable structs; savings fields on CurrentBlock (new required localModelSavings), DailyModelBreakdown, DailyHistoryEntry, ModelEntry, SessionModelEntry, SessionDetailEntry, ProjectEntry, TopSessionEntry, ActivityEntry; every struct uses custom init(from:) with decodeIfPresent defaults for backward compatibility; MenubarPayload.empty updated
  • mac/Sources/CodeBurnMenubar/Views/HeroSection.swift — green leaf "Saved $X with local models" caption below the hero amount (shown only when savings > 0)
  • mac/Sources/CodeBurnMenubar/Views/ModelsSection.swift — new green Saved column on the model row, distinct from Cost
  • gnome/indicator.js — hero meta line adds "saved $X" when current.localModelSavings.totalUSD > 0; _buildModelRow gets a codeburn-model-saved column for the same data

Tests

  • tests/local-model-savings.test.ts (NEW, 9 tests, 94 lines) — config helpers, hash stability, baseline pricing through calculateCost, defensive __proto__ / constructor / toString lookups
  • tests/parser-local-savings.test.ts (NEW, 4 tests, 138 lines) — end-to-end on parsed JSONL: unconfigured local stays at $0; configured local flips to $0 + savings; savings precedence over modelAliases
  • tests/day-aggregator-savings.test.ts (NEW, 2 tests, 150 lines) — day/model/category/provider savings rollup; buildPeriodDataFromDays savings
  • tests/menubar-savings.test.ts (NEW, 6 tests, 91 lines) — localModelSavings block, savings on topModels/topProjects/topSessions/topActivities, history daily savings
  • tests/cli-model-savings.test.ts (NEW, 3 tests, 76 lines) — set/list/remove flow, alias-conflict warning, remove-unknown error
  • tests/daily-cache.test.ts (UPDATED, +48 lines) — new savingsConfigHash field on test fixtures, new ensureCacheHydrated: savings config invalidation describe block (hash mismatch drops cached days, hash match keeps them)
  • tests/day-aggregator.test.ts (UPDATED) — fixture updates for savingsUSD: 0 on model/category/provider
  • tests/models-report.test.ts (UPDATED) — fixture updates for savingsUSD + savingsBaselineModel; markdown/CSV header expectations include Saved
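The defensive-lookup tests guard against the failure mode sketched below. The PR uses `Object.hasOwn`; this sketch uses the equivalent `Object.prototype.hasOwnProperty.call` so it also compiles against older `lib` targets. `getBaseline` and `naiveLookup` are hypothetical names, not the functions in `src/models.ts`.

```typescript
// Naive lookup: inherited members of Object.prototype leak through, so a
// model literally named "constructor" or "toString" appears to be mapped.
function naiveLookup(map: Record<string, string>, model: string): unknown {
  return map[model];
}

// Defensive lookup: only own properties of the config map count.
function getBaseline(map: Record<string, string>, model: string): string | undefined {
  return Object.prototype.hasOwnProperty.call(map, model) ? map[model] : undefined;
}

const savingsMap: Record<string, string> = { "llama3.1:8b": "gpt-4o" };

console.log(typeof naiveLookup(savingsMap, "constructor")); // "function"
console.log(getBaseline(savingsMap, "constructor"));        // undefined
console.log(getBaseline(savingsMap, "__proto__"));          // undefined
console.log(getBaseline(savingsMap, "llama3.1:8b"));        // "gpt-4o"
```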

Verification

All commands were run on darwin with Node >=22.13.0, using commit 69b1736 ("refactor(cli): share persistent-codeburn resolver; tighten Antigravity hook ownership") as the base.

TypeScript

$ ./node_modules/.bin/tsc --noEmit
$ echo $?
0

Build

$ npm run build

codeburn@0.9.11 build
node scripts/bundle-litellm.mjs && tsup && node -e "..."

Bundled 3673 models → src/data/litellm-snapshot.json
CLI Building entry: src/main.ts
CLI Using tsconfig: tsconfig.json
CLI tsup v8.5.1
CLI Using tsup config: /Users/justingheorghe/Documents/Software Projects/codeburn/tsup.config.ts
CLI Target: node20
CLI Cleaning output folder
ESM Build start
ESM dist/main.js 914.65 KB
ESM dist/main.js.map 1.64 MB
ESM ⚡️ Build success in 47ms

Vitest — full suite

$ ./node_modules/.bin/vitest run --reporter=default

Test Files 73 passed (73)
Tests 1028 passed (1028)
Duration 12.87s

Breakdown of the new savings test files:

$ ./node_modules/.bin/vitest run tests/local-model-savings.test.ts \
tests/parser-local-savings.test.ts \
tests/day-aggregator-savings.test.ts \
tests/menubar-savings.test.ts \
tests/cli-model-savings.test.ts
Test Files 5 passed (5)
Tests 25 passed (25)

Swift — macOS build

$ cd mac && swift build
[16/19] Write Objects.LinkFileList
[17/19] Linking CodeBurnMenubar
[18/19] Applying CodeBurnMenubar
Build complete!
$ echo $?
0

Swift — tests

The `mac/Tests/CodeBurnMenubarTests` target fails to compile in this environment with `error: no such module 'Testing'` (the Swift Testing framework is not available in the installed Swift toolchain). This is pre-existing on `main` — I verified it by stashing my changes and running `swift test` against `69b1736` and observed the identical `no such module 'Testing'` failure. No files under `mac/Tests/` were modified by this PR; the failure is independent of these changes. `swift build` (the compile step) passes cleanly.

CLI smoke test

```bash
$ TMP=$(mktemp -d)
$ HOME="$TMP" ./dist/cli.js model-savings "foo" "gpt-4o"
Savings mapping saved: foo -> gpt-4o
Config: /var/folders/.../tmp.D9UKCxSp2t/.config/codeburn/config.json

$ cat "$TMP/.config/codeburn/config.json"
{
  "localModelSavings": {
    "foo": "gpt-4o"
  }
}
```

Design notes

  • Savings is never summed with cost. Every aggregate and every surface reports the two as separate fields. The `buildTopModels` / `buildTopProjects` / `buildTopSessions` sort keys use `cost + savings` only for ranking; the displayed values stay separate.
  • Explicit > automatic for local-model detection. I did not change `looksLikeLocalModel` (which only suppresses an "unknown model" warning) into a billing semantic. The user's config is the source of truth. This avoids surprising users who happen to have a model name like `qwen2.5-coder:32b` that LiteLLM does index.
  • Daily cache invalidation is on the config hash, not on version. Version bump (7 → 8) handles the schema change. The `savingsConfigHash` separately forces a rebuild when the user changes their baseline mapping (e.g. `gpt-4o` → `gpt-5`), so historical saved-spend numbers never lie about a baseline that is no longer current.
  • `modelAliases` semantics are unchanged. A user who has relied on `model-alias` for actual-spend corrections (e.g. renaming a vendor model to a known model with pricing) sees no behavior change. New `model-savings` is opt-in.
  • Swift Codable backward compatibility. Every new field on a payload struct uses a custom `init(from:)` with `decodeIfPresent` and a sensible default. Old payloads (no `localModelSavings`, no `savingsUSD`) decode into a `CurrentBlock` whose `localModelSavings` is the empty `{ totalUSD: 0, calls: 0, byModel: [], byProvider: [] }` block. The UI gates all savings display on `> 0` so the user never sees a placeholder "saved $0.00".
  • GNOME schema discovery. `gnome/dataClient.js` parses permissively and stores the whole payload, so new fields are inert until the indicator learns to render them. The indicator picks up `current.localModelSavings.totalUSD` for the hero meta and `topModels[].savingsUSD` for the model row.
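The backward-compatibility and display-gating rules above can be expressed in TypeScript roughly as follows. This is an analogue of the behavior described for the Swift and GNOME consumers, not their code: the `Payload` type, `readSavings`, `savedCaption`, and the exact caption format are assumptions; only the empty-block shape and the `> 0` gate come from the notes.

```typescript
interface LocalModelSavings {
  totalUSD: number;
  calls: number;
  byModel: unknown[];
  byProvider: unknown[];
}

// Hypothetical payload shape: older payloads have no localModelSavings block.
interface Payload {
  current: { localModelSavings?: Partial<LocalModelSavings> };
}

// The empty block that old payloads decode into, as described above.
const EMPTY_SAVINGS: LocalModelSavings = { totalUSD: 0, calls: 0, byModel: [], byProvider: [] };

// Fills in defaults for any missing field, like decodeIfPresent with a default.
function readSavings(p: Payload): LocalModelSavings {
  return { ...EMPTY_SAVINGS, ...(p.current.localModelSavings ?? {}) };
}

// Returns a caption only when there is something to show, so the UI never
// renders a placeholder "saved $0.00".
function savedCaption(p: Payload): string | null {
  const s = readSavings(p);
  return s.totalUSD > 0 ? `saved $${s.totalUSD.toFixed(2)}` : null;
}

console.log(savedCaption({ current: {} })); // null
console.log(
  savedCaption({ current: { localModelSavings: { totalUSD: 3.4, calls: 2, byModel: [], byProvider: [] } } }),
); // "saved $3.40"
```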

Out of scope (per the issue discussion)

  • Forecasting based on local-savings ratio. Requires an additional model layer and a longer-history read path. Can be added on top of the historical `savingsUSD` totals this PR exposes without re-touching the core accounting.
  • `codeburn model-savings` integration into the `model-alias` command. The plan deliberately keeps them separate so existing `model-alias` semantics stay intact.

Related issues

…tagentseal#421)

Add a new `localModelSavings` config and `codeburn model-savings` CLI
that maps a local-model name (e.g. llama3.1:8b) to a paid baseline
(e.g. gpt-4o). The local call still costs $0; the new `savingsUSD`
field tracks the counterfactual spend avoided by running locally and
is reported separately from `costUSD` everywhere a number is shown.

* Parser normalization (`applyLocalModelSavings`) runs on Claude
  parse, direct provider calls, and the cached-call path. It forces
  `costUSD` to 0 and attaches `savingsUSD` + `savingsBaselineModel`
  + `isLocalSavings` on the `ParsedApiCall`. Local-savings wins for
  actual cost even when the same model is also in `modelAliases`.
* Session, project, day, model, category, activity, skill, and
  subagent rollups all carry `savingsUSD` alongside `costUSD`.
* `status --format json` adds `today.savings` and `month.savings`.
* `status --format menubar-json` adds a `current.localModelSavings`
  block (totalUSD, calls, byModel, byProvider) plus savings on
  topModels, topProjects, topSessions, topActivities, and history
  daily entries. Schema fields default-decode for backward compat.
* `report --format json` adds savings across overview/daily/
  projects/models/activities/skills/subagents/topSessions, with
  the active paid baseline name on each model row.
* `models` command gains a `Saved` column on table/markdown/CSV
  and a `savingsUSD`/`savingsBaselineModel` pair in JSON. Default
  `--min-cost 0.01` filter now ORs in `savingsUSD >= minCost` so
  local models with $0 actual cost but >0 savings still surface.
* CSV/JSON exports add a `Saved (CODE)` column on summary/daily/
  models/projects/sessions.
* Dashboard TUI shows a green 'saved $X by local models' footer
  line in the overview when any savings are present.
* macOS Swift payload gains a `LocalModelSavings` Codable block
  and savings fields on every model/activity/session/daily
  struct. Hero shows a green leaf 'Saved $X' caption, models
  section gets a green `Saved` column. `swift build` clean.
* GNOME indicator adds 'saved $X' to the hero meta line and a
  `codeburn-model-saved` column to the model row.
* Daily cache schema bumped to v8 (`savingsUSD` on day/model/
  category/provider). `savingsConfigHash` invalidates the cache
  when the user changes their baseline mapping so historical
  saved-spend numbers never lie about a stale baseline.
* Defensive `Object.hasOwn` lookup in `getLocalSavingsBaseline`
  closes the prototype-pollution case that previously surfaced via
  the savings path with a hostile `__proto__` model name.
* New tests (5 files, 25 tests, 549 lines) cover pricing helpers,
  end-to-end parser normalization, day aggregator savings,
  menubar payload savings, CLI set/list/remove, and
  daily-cache hash invalidation. Existing tests for daily-cache
  / day-aggregator / models-report updated for the new fields.
  Full vitest suite: 1028/1028 passing across 73 test files.
  `tsc --noEmit` clean. `npm run build` clean.
  (Note: `mac/Tests` has a pre-existing `no such module 'Testing'`
  environment error on the installed Swift toolchain, confirmed
  on `main` before this PR; not caused by these changes.)
@camggould

@justingheorghe thanks for taking this on! Can you include some screenshots of CLI/Mac dashboards to show how this is being displayed?

What is the current UX for mapping the local model to various paid models? Can it be done easily via the application UI rather than config tweaks? That's a major convenience feature for evaluating cost savings between models.

Development

Successfully merging this pull request may close these issues.

[Feature] Local model cost saving reports
