Phoebe: one-hop loader gap + reconcile-on-re-rate (Hugo decisions)#17
Closed
hhuuggoo wants to merge 3 commits into
Closed
Phoebe: one-hop loader gap + reconcile-on-re-rate (Hugo decisions)#17hhuuggoo wants to merge 3 commits into
hhuuggoo wants to merge 3 commits into
Conversation
The last pass enforced E3's one-hop rule on the EVENT-carried path (isDerivationBase in derivedRates/ResolveEvent) but NOT on the FILE-declared derived_from path. The loader only checked the derived_from target was PRESENT in pb.base — but an own-rate fine-tune IS in pb.base, so `ft:child derived_from: ft:ownrate` (a fine-tune deriving from a fine-tune) was ACCEPTED. With a small premium that derived rate bills $0, and the round-to-zero guard (validateDerivedRatesNonZero) iterates only TRUE bases, so it never covered that key — a silent free fine-tune. Enforce isDerivationBase (the one-hop predicate already shared by the SQL projection and the oracle) in the loader's derived_from validation. This closes both the one-hop hole AND the $0-derived coverage hole: a file-declared derived_from can now only point at a true base, so its derived projection IS covered by validateDerivedRatesNonZero. Tests (RED before, GREEN after): - TestLoad_FineTuneDerivingFromFineTuneRejected: ft:child derived_from an own-rate ft: (and a derived ft:) -> load error; true-base control loads. - TestLoad_DerivedRateRoundsToZeroFailsClosed: + a derived_from-a-true-base case whose premium zeroes the base -> rejected (the round-to-zero path). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Hugo's decision (decisions ledger, 'Re-rate semantics'): when the hourly rater re-runs a window, a rollup that billed CLEAN in a prior run but is now excluded from priced (became ambiguous-base or unpriced, or its events vanished) must have its stale rated_usage row DELETED — 'what the latest run says is what bills.' Previously upsert-only: the stale row kept billing at its old cost forever. Add a `deleted` CTE to the single rating statement: after computing priced, DELETE every rated_usage row whose (auth_id, model_id, window_start) is IN the rated window [$1,$2) but NOT reproduced in priced (NOT EXISTS), atomic with the upsert in the same snapshot. The delete-set and upsert-set are disjoint rows, so Postgres runs both modifying CTEs against one table cleanly. Window predicate is half-open + hour-scoped: bounds are hour-aligned (cmd/rater enforces), so an adjacent window's rollups are never touched. A clean identical re-run reproduces every prior row -> delete matches nothing -> no-op. Surfaced as RateResult.ReconciledDeletions and logged at INFO. The oracle store now reconciles too (mirrors production), so the unit tests exercise the real convergence. doc.go updated: idempotency model is now reconcile, not upsert-only. Tests (RED before, GREEN after): - TestRater_ReRateDeletesSupersededRollup: A bills clean; mutate -> B ambiguous; old row GONE, deletions=1. - TestRater_ReRateDeletesVanishedRollup: events vanish on re-rate -> deleted. - TestRater_ReRateIdenticalDataIsNoOp: identical re-run deletes nothing; an adjacent-window rollup survives a scoped re-rate. - TestIntegration_ReRateReconciles (live PG): the deleted CTE fires atomically with the upsert. - TestIntegration_ReRateReconcileLeavesOtherWindowsUntouched (live PG): window predicate doesn't delete out-of-scope billing. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…trings Mechanical FIX 3 items from the battery: - Migration c2f1a3b4d5e6 downgrade was asymmetric with its idempotent upgrade: it re-adds base_model with ADD COLUMN IF NOT EXISTS, but dropped billing_event_rating_instant_ix WITHOUT IF EXISTS, so a re-run / partially- applied downgrade could error. Make it DROP INDEX IF EXISTS. Verified up/down/up clean against live PG (and confirmed the old form errors on a second drop). - TestRater_RollupCostSelfAudits claimed 'no MIN() masking a divergent rate' but the fixture never exercises a divergent rate within a rollup (it can't: a written rollup is single-model and, by the ambiguity gate, single-base). Reword the docstring to what it tests (cost reconstructs from each row's OWN frozen rate, across heterogeneous base AND derived rollups) and point the MIN-masking concern to the ambiguity tests where the >1-base case is actually excluded from billing. - TestRater_FineTuneAmbiguousBaseModelFailsLoud's docstring claimed 'the same ft: id with the SAME base across events is fine' but ft:clean had a single event. Give it TWO same-base events so the legitimate case is actually exercised: the gate keys on COUNT(DISTINCT base_model), not event count, so a same-ft-same-base multi-event rollup rates as one (2 events, 1 rollup). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Contributor
Author
|
Landed on main via the squashed rating merge (3b22908). Closing the stack. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fix pass on the fine-tune rating money path, implementing the latest #16 battery findings and Hugo's re-rate decision (decisions ledger, "Re-rate semantics"). Stacked on
yaml-rater-fixes-2. Money path — every behavioral change pairs with an invariant-named test (RED demonstrated before each fix), full gate incl. live-PG integration + e2e is green.Contracts
Loader one-hop enforcement (closes the $0-derived-fine-tune hole). E3's "one hop only" was enforced on the event-carried path (
isDerivationBaseinderivedRates/ResolveEvent) but NOT on the file-declaredderived_frompath. The loader only checked the target was present inpb.base— but an own-rate fine-tune IS inpb.base, soft:child derived_from: ft:ownratewas accepted, and with a small premium billed $0 (never caught by the round-to-zero guard, which iterates only true bases). The loader now enforcesisDerivationBaseon everyderived_fromtarget. Closing this ALSO closes the round-to-zero coverage hole: a file-declaredderived_fromcan now only point at a true base, so its derived projection is covered byvalidateDerivedRatesNonZero.Re-rate RECONCILES (delete superseded rollups), not upsert-only — Hugo's decision. A re-run of a window now converges
rated_usageto exactly the latest run's output. AdeletedCTE removes any in-window rollup the run no longer produces (became ambiguous/unpriced, or its events vanished), atomically with the upsert in the same snapshot — so a rollup that billed clean in a prior run cannot keep billing at its stale cost. "What the latest run says is what bills." Window predicate is half-open + hour-scoped (cmd/rater enforces hour-aligned bounds), so adjacent windows are never touched; a clean identical re-run is a no-op. Surfaced asRateResult.ReconciledDeletions, logged at INFO. This is an idempotency-contract change (was upsert-only);doc.go,rater.go, andstore.goupdated to state it. Not a reprice of served traffic — the applied rate is still frozen per row; what changes on re-rate is which rollups exist.Migration downgrade symmetry.
c2f1a3b4d5e6downgrade droppedbilling_event_rating_instant_ixwithoutIF EXISTS(asymmetric with the idempotent upgrade). NowDROP INDEX IF EXISTS; verified up/down/up clean against live PG (and confirmed the old form errors on a second drop).Fixes + tests
FIX 1 — loader one-hop gap (
pricebook.go)TestLoad_FineTuneDerivingFromFineTuneRejected— ft:childderived_froman own-rate ft: (and a derived ft:) → load error; true-base control loads. RED before.TestLoad_DerivedRateRoundsToZeroFailsClosed— added aderived_from-a-true-base case whose premium zeroes the base → rejected.FIX 2 — reconcile-on-re-rate (
store.go,rater.go, oracle inrater_test.go)TestRater_ReRateDeletesSupersededRollup— A bills clean; mutate → B ambiguous; old row GONE, deletions=1. RED before.TestRater_ReRateSupersedesOneKeepsAnotherSameWindow— in one window, a superseded rollup is deleted while a co-window survivor is updated (disjoint delete/upsert sets).TestRater_ReRateDeletesVanishedRollup— events vanish on re-rate → deleted.TestRater_ReRateIdenticalDataIsNoOp— identical re-run deletes nothing; an adjacent-window rollup survives a scoped re-rate.TestIntegration_ReRateReconciles(live PG) — thedeletedCTE fires atomically with the upsert; co-window survivor untouched; identical re-run no-op.TestIntegration_ReRateReconcileLeavesOtherWindowsUntouched(live PG) — window predicate doesn't delete out-of-scope billing.FIX 3 — mechanical (
c2f1a3b4d5e6_add_rating.py,rater_test.go)IF EXISTS(above).TestRater_RollupCostSelfAuditsdocstring reworded to what it tests (cost reconstructs from each row's own frozen rate, across heterogeneous base AND derived rollups); the MIN-masking concern is pointed to the ambiguity tests, where the >1-base case is actually excluded from billing — a written rollup is single-rate by construction.TestRater_FineTuneAmbiguousBaseModelFailsLoud— gaveft:cleantwo same-base events so the "same ft: id with the same base across events is fine" claim is actually exercised (gate keys on COUNT(DISTINCT base_model), not event count: 2 events → 1 rollup).Constraints honored
ft:id kept as the fail-loudAmbiguousBaseEventstripwire — no "bill the known portion" logic added (can't-happen per the ledger).🤖 Generated with Claude Code