Phoebe: one-hop loader gap + reconcile-on-re-rate (Hugo decisions) by hhuuggoo · Pull Request #17 · saturncloud/phoebe

hhuuggoo · 2026-06-15T22:53:22Z

Fix pass on the fine-tune rating money path, implementing the latest #16 battery findings and Hugo's re-rate decision (decisions ledger, "Re-rate semantics"). Stacked on yaml-rater-fixes-2. Money path — every behavioral change pairs with an invariant-named test (RED demonstrated before each fix), full gate incl. live-PG integration + e2e is green.

Contracts

Loader one-hop enforcement (closes the $0-derived-fine-tune hole). E3's "one hop only" was enforced on the event-carried path (isDerivationBase in derivedRates/ResolveEvent) but NOT on the file-declared derived_from path. The loader only checked the target was present in pb.base — but an own-rate fine-tune IS in pb.base, so ft:child derived_from: ft:ownrate was accepted, and with a small premium billed $0 (never caught by the round-to-zero guard, which iterates only true bases). The loader now enforces isDerivationBase on every derived_from target. Closing this ALSO closes the round-to-zero coverage hole: a file-declared derived_from can now only point at a true base, so its derived projection is covered by validateDerivedRatesNonZero.
Re-rate RECONCILES (delete superseded rollups), not upsert-only — Hugo's decision. A re-run of a window now converges rated_usage to exactly the latest run's output. A deleted CTE removes any in-window rollup the run no longer produces (became ambiguous/unpriced, or its events vanished), atomically with the upsert in the same snapshot — so a rollup that billed clean in a prior run cannot keep billing at its stale cost. "What the latest run says is what bills." Window predicate is half-open + hour-scoped (cmd/rater enforces hour-aligned bounds), so adjacent windows are never touched; a clean identical re-run is a no-op. Surfaced as RateResult.ReconciledDeletions, logged at INFO. This is an idempotency-contract change (was upsert-only); doc.go, rater.go, and store.go updated to state it. Not a reprice of served traffic — the applied rate is still frozen per row; what changes on re-rate is which rollups exist.
Migration downgrade symmetry. c2f1a3b4d5e6 downgrade dropped billing_event_rating_instant_ix without IF EXISTS (asymmetric with the idempotent upgrade). Now DROP INDEX IF EXISTS; verified up/down/up clean against live PG (and confirmed the old form errors on a second drop).

Fixes + tests

FIX 1 — loader one-hop gap (pricebook.go)

TestLoad_FineTuneDerivingFromFineTuneRejected — ft:child derived_from an own-rate ft: (and a derived ft:) → load error; true-base control loads. RED before.
TestLoad_DerivedRateRoundsToZeroFailsClosed — added a derived_from-a-true-base case whose premium zeroes the base → rejected.

FIX 2 — reconcile-on-re-rate (store.go, rater.go, oracle in rater_test.go)

TestRater_ReRateDeletesSupersededRollup — A bills clean; mutate → B ambiguous; old row GONE, deletions=1. RED before.
TestRater_ReRateSupersedesOneKeepsAnotherSameWindow — in one window, a superseded rollup is deleted while a co-window survivor is updated (disjoint delete/upsert sets).
TestRater_ReRateDeletesVanishedRollup — events vanish on re-rate → deleted.
TestRater_ReRateIdenticalDataIsNoOp — identical re-run deletes nothing; an adjacent-window rollup survives a scoped re-rate.
TestIntegration_ReRateReconciles (live PG) — the deleted CTE fires atomically with the upsert; co-window survivor untouched; identical re-run no-op.
TestIntegration_ReRateReconcileLeavesOtherWindowsUntouched (live PG) — window predicate doesn't delete out-of-scope billing.

FIX 3 — mechanical (c2f1a3b4d5e6_add_rating.py, rater_test.go)

Migration downgrade IF EXISTS (above).
TestRater_RollupCostSelfAudits docstring reworded to what it tests (cost reconstructs from each row's own frozen rate, across heterogeneous base AND derived rollups); the MIN-masking concern is pointed to the ambiguity tests, where the >1-base case is actually excluded from billing — a written rollup is single-rate by construction.
TestRater_FineTuneAmbiguousBaseModelFailsLoud — gave ft:clean two same-base events so the "same ft: id with the same base across events is fine" claim is actually exercised (gate keys on COUNT(DISTINCT base_model), not event count: 2 events → 1 rollup).

Constraints honored

No float for money; oracle mirrors production (oracle now reconciles too). New guards have proven-teeth (RED-demonstrated) tests.
Did not touch WAL/abort/drainer; no base_model re-plumbing.
Mixed/unknown-base under one ft: id kept as the fail-loud AmbiguousBaseEvents tripwire — no "bill the known portion" logic added (can't-happen per the ledger).

🤖 Generated with Claude Code

The last pass enforced E3's one-hop rule on the EVENT-carried path (isDerivationBase in derivedRates/ResolveEvent) but NOT on the FILE-declared derived_from path. The loader only checked the derived_from target was PRESENT in pb.base — but an own-rate fine-tune IS in pb.base, so `ft:child derived_from: ft:ownrate` (a fine-tune deriving from a fine-tune) was ACCEPTED. With a small premium that derived rate bills $0, and the round-to-zero guard (validateDerivedRatesNonZero) iterates only TRUE bases, so it never covered that key — a silent free fine-tune. Enforce isDerivationBase (the one-hop predicate already shared by the SQL projection and the oracle) in the loader's derived_from validation. This closes both the one-hop hole AND the $0-derived coverage hole: a file-declared derived_from can now only point at a true base, so its derived projection IS covered by validateDerivedRatesNonZero. Tests (RED before, GREEN after): - TestLoad_FineTuneDerivingFromFineTuneRejected: ft:child derived_from an own-rate ft: (and a derived ft:) -> load error; true-base control loads. - TestLoad_DerivedRateRoundsToZeroFailsClosed: + a derived_from-a-true-base case whose premium zeroes the base -> rejected (the round-to-zero path). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

Hugo's decision (decisions ledger, 'Re-rate semantics'): when the hourly rater re-runs a window, a rollup that billed CLEAN in a prior run but is now excluded from priced (became ambiguous-base or unpriced, or its events vanished) must have its stale rated_usage row DELETED — 'what the latest run says is what bills.' Previously upsert-only: the stale row kept billing at its old cost forever. Add a `deleted` CTE to the single rating statement: after computing priced, DELETE every rated_usage row whose (auth_id, model_id, window_start) is IN the rated window [$1,$2) but NOT reproduced in priced (NOT EXISTS), atomic with the upsert in the same snapshot. The delete-set and upsert-set are disjoint rows, so Postgres runs both modifying CTEs against one table cleanly. Window predicate is half-open + hour-scoped: bounds are hour-aligned (cmd/rater enforces), so an adjacent window's rollups are never touched. A clean identical re-run reproduces every prior row -> delete matches nothing -> no-op. Surfaced as RateResult.ReconciledDeletions and logged at INFO. The oracle store now reconciles too (mirrors production), so the unit tests exercise the real convergence. doc.go updated: idempotency model is now reconcile, not upsert-only. Tests (RED before, GREEN after): - TestRater_ReRateDeletesSupersededRollup: A bills clean; mutate -> B ambiguous; old row GONE, deletions=1. - TestRater_ReRateDeletesVanishedRollup: events vanish on re-rate -> deleted. - TestRater_ReRateIdenticalDataIsNoOp: identical re-run deletes nothing; an adjacent-window rollup survives a scoped re-rate. - TestIntegration_ReRateReconciles (live PG): the deleted CTE fires atomically with the upsert. - TestIntegration_ReRateReconcileLeavesOtherWindowsUntouched (live PG): window predicate doesn't delete out-of-scope billing. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

…trings Mechanical FIX 3 items from the battery: - Migration c2f1a3b4d5e6 downgrade was asymmetric with its idempotent upgrade: it re-adds base_model with ADD COLUMN IF NOT EXISTS, but dropped billing_event_rating_instant_ix WITHOUT IF EXISTS, so a re-run / partially- applied downgrade could error. Make it DROP INDEX IF EXISTS. Verified up/down/up clean against live PG (and confirmed the old form errors on a second drop). - TestRater_RollupCostSelfAudits claimed 'no MIN() masking a divergent rate' but the fixture never exercises a divergent rate within a rollup (it can't: a written rollup is single-model and, by the ambiguity gate, single-base). Reword the docstring to what it tests (cost reconstructs from each row's OWN frozen rate, across heterogeneous base AND derived rollups) and point the MIN-masking concern to the ambiguity tests where the >1-base case is actually excluded from billing. - TestRater_FineTuneAmbiguousBaseModelFailsLoud's docstring claimed 'the same ft: id with the SAME base across events is fine' but ft:clean had a single event. Give it TWO same-base events so the legitimate case is actually exercised: the gate keys on COUNT(DISTINCT base_model), not event count, so a same-ft-same-base multi-event rollup rates as one (2 events, 1 rollup). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

hhuuggoo · 2026-06-16T01:03:09Z

Landed on main via the squashed rating merge (3b22908). Closing the stack.

hhuuggoo and others added 3 commits June 15, 2026 22:43

hhuuggoo mentioned this pull request Jun 15, 2026

Phoebe: reconcile exit-code (loud routine/quiet backfill) + window_start index + single-flight doc #18

Closed

hhuuggoo closed this Jun 16, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Phoebe: one-hop loader gap + reconcile-on-re-rate (Hugo decisions)#17

Phoebe: one-hop loader gap + reconcile-on-re-rate (Hugo decisions)#17
hhuuggoo wants to merge 3 commits into
yaml-rater-fixes-2from
yaml-rater-fixes-3

hhuuggoo commented Jun 15, 2026

Uh oh!

hhuuggoo commented Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

hhuuggoo commented Jun 15, 2026

Contracts

Fixes + tests

Constraints honored

Uh oh!

hhuuggoo commented Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant