Rating fix pass: ratify quantize-then-multiply, sub-9dp guard, base_model plumbing (#14 battery + decisions) #15
hhuuggoo wants to merge 4 commits into
…le + prove teeth

E1 stores the applied per-token rate on each rated_usage row, which REQUIRES the billing rate be a 9dp NUMERIC the row can hold. Rating therefore quantizes the per-token rate (premium applied to the exact base rate first) to 9dp, then multiplies by token counts. Hugo ratified this quantize-then-multiply model as the spec; it differs from the old sum-then-round only on a sub-nano premium residue, and is what self-auditing rows demand (the stored 9dp rate x tokens must reconstruct the cost).

The code already quantized (the SQL price table is NUMERIC(20,9); the in-Go oracle fed .Quantized()), but two surfaces lied or were mis-calibrated:

- doc.go's ROUNDING + PRODUCTION-vs-ORACLE sections still documented sum-then-round. Rewrite to document quantize-then-multiply, why E1 forces it, and that it is a deliberate ratified choice.
- store_integration_test.go's TestIntegration_RateWindow_ConformsToOracle fed the oracle the UN-quantized resolved rate while production bills the quantized one, a latent miscalibration (the existing fixture has no residue, so it never diverged). Feed rate.Quantized() so the oracle mirrors production.
- oracle_test.go's Rate() doc claimed sum-then-round; correct it (Rate is faithful only when fed a quantized rate, as all conformance callers now do).

Add TestConformance_OracleQuantizesBeforeMultiply_OnResidue: a 1-nano base x 1.5 fine-tune (0.0000000015 -> 0.000000002) rated through the REAL SQL, asserting the SQL agrees with the quantized oracle (6 nano over 3 tokens) AND proving the guard has teeth: the old sum-then-round value (5 nano) demonstrably differs, so a revert to the un-quantized oracle flips the test RED. Closes the latent miscalibration permanently.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
A per-token rate finer than 9dp (nano-USD, the NUMERIC(20,9) money scale) is silently rounded at projection. One that rounds to zero (e.g. "0.0000000001" -> 0.000000000) would serve the model for FREE, the precise silent-lost-revenue outcome this package exists to prevent. An operator who writes a nonzero number intends a nonzero price, so a round-to-zero is a MIS-PRICED model, not a free one.

Guard at price-file LOAD time (parseNonNegRate, the per-rate validator): reject a rate that is nonzero in the file but quantizes to $0 at 9dp, fail-closed with an error naming the offending model + field + rate. A literal "0" (an intentional free rate) is still allowed; the guard targets only "nonzero number we'd round to zero". Covers base rates AND fine-tune own-rates (same parseRate3 path).

Test TestLoad_SubNanoRateRoundsToZeroFailsClosed pins all four arms: sub-nano nonzero rejected (naming the model), half-up boundary 0.0000000005 -> 1 nano still loads, literal $0 still loads, fine-tune own-rate sub-nano rejected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ing event (E3)

A fine-tune's price key is ft:<checkpoint> (E3), a per-deployment checkpoint id the price file never names. To price it at base x premium the rater needs the base, which the file can't hold. Hugo's decision (option a): stamp the base model onto the metering event at deploy time, where Atlas has already validated it present (a fine-tune cannot deploy without a base; that is a hard precondition). Phoebe now CARRIES and USES it.

Plumbing (additive, hot-path-safe; empty base_model is valid for a base model):

- metering.Event gains BaseModel; billing_event gains a base_model column (0001 SQL + the billing_event Alembic create, plus an idempotent ADD COLUMN IF NOT EXISTS in the rating migration / 0002 so an already-applied billing_event picks it up).
- identity: new X-Saturn-Base-Model header (the Atlas-side injection is the documented seam), read defensively (absent = ""), carried on Identity and stamped onto the Event in proxy.emit, for BOTH the completion and pre-header-abort paths.
- drain store: base_model added to the INSERT column list + eventArgs (nullStr).

Rater (the money path):

- PriceBook.ResolveEvent(modelID, baseModel): direct model_id price wins; else an ft: id with a priced base_model resolves to base x premium (one hop); else ErrNoPrice.
- store.go projects a second TEMP table rating_derived (base_model -> premium-applied, 9dp-quantized rate) and the SQL COALESCEs direct-over-derived, deriving ONLY for an ft: model_id that missed the direct join and carries a base_model.

FAIL-CLOSED INVARIANT: an ft: model_id with an EMPTY base_model is a base_model propagation bug, NOT a free model. It resolves to ErrNoPrice, is counted UNPRICED, and screams (exit-nonzero), never silently $0-billed. Pinned by name in TestRater_FineTuneWithoutBaseModelFailsLoud and the SQL TestIntegration_FineTunePricesViaBaseModel.

Tests: TestRater_FineTunePricesViaBaseModelOnEvent, TestResolveEvent_FineTuneViaBaseModel, TestRater_FineTuneWithoutBaseModelFailsLoud, the live-PG TestIntegration_FineTunePricesViaBaseModel, and the end-to-end TestE2E_FineTuneBillsAtBaseTimesPremium (ft: request carrying the base_model header bills at base x 1.5 through the whole pipe).

doc.go/pricebook.go lose the "flagged/unlinked gap" non-goal: ft: pricing now works via the event's base_model.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Grouped mechanical cleanups, no behavioral money change beyond the event_count width:

- event_count widened to BIGINT end to end (the earlier fast-follow's widening didn't carry into the YAML-rater rewrite): the SQL COUNT(*) cast (::int -> ::bigint), the rated_usage column (0002 SQL + rating Alembic + the integration schema), and the e2e scan (int -> int64). An INTEGER column silently caps a hot (auth, model, hour) bucket at 2^31 while SUM(event_count) is already ::bigint.
- Deleted the dead ErrDerivationChain sentinel: it was documented as returned by the oracle but never was. One-hop derivation is enforced at LOAD (buildPriceBook rejects multi-hop / dangling derived_from) and in ResolveEvent, so a deeper chain can never reach the oracle, and there is nothing to return. Replaced with a comment stating where the invariant actually lives.
- Renamed the overclaiming TestConformance_SQLModelMatchesRateOracle -> TestOracleModel_SelfConsistent: it runs NO SQL, it pins the in-Go oracle's self-consistency; the REAL SQL conformance is the integration test (cross-reference fixed).
- TestLoad_MalformedYAMLFailsClosed: the doc now accurately enumerates every shape it pins, INCLUDING the inconsistent-premium-policy cases (multiplier-no-factor, markup-with-factor, unknown-policy) it already tested but didn't mention.
- Corrected the stale "pointer-not-copy rule" comment on TestLoad_FineTuneIdentityPremium (it tests identity-default = base exactly, not the propagation rule).
- Fixed RateResult.TotalCost doc: the SQL COALESCEs SUM to 0, so an empty window returns "0", never "". The doc now matches reality.
- migrations/README: the fine-tune base-linkage "gap (flagged)" is now "closed" (carried on billing_event.base_model).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
🔋 Battery review: Round 1 (status: ESCALATE)

Tier-2 review of the fine-tune-pricing fix pass. 19 raw → 5 refuted → 14 confirmed + 5 persona. The base_model plumbing works, but the derived-rate path has correctness gaps the round-1 fixes didn't cover. Not merge-ready.

🔴 High-sev mechanical (will fix): the round-to-zero guard misses the derived path

The sub-9dp guard added last pass only protects file-declared rates. A fine-tune premium that drives the derived rate to $0 (factor …

🛑 Decisions for Hugo

1. Same …
2. Own-rate fine-tune as a derivation base → multi-hop pricing (the one-hop contract). A fine-tune with its own explicit rate is currently projected as a possible base for another fine-tune's premium → a fine-tune deriving from a fine-tune, which the one-hop rule was meant to forbid. The Go oracle mirrors this, so the conformance test can't catch it. Decide the intended one-hop contract: exclude own-rate fine-tunes from being derivation bases (my lean, since it matches "one hop only"), or accept fine-tune-of-fine-tune pricing. Then the filter + oracle get fixed together.

🟡 Smaller design/contract (I'll fix once #1–#2 land)
Landed on main via the squashed rating merge (3b22908). Closing the stack.
Fix pass on PR #14 (the YAML-rater rework), implementing Hugo's E1–E4 decisions and the round-1 battery findings. Stacked on yaml-rater. Money-path; every behavioral change pairs with an invariant-named test, and the oracle/conformance discipline is preserved (proven-teeth on the new sub-nano residue fixture).

Contracts (read first: the load-bearing surface)
- base_model event field + billing_event.base_model column. A fine-tune's HF base id rides on the metering event (E3, option a), stamped by Atlas at deploy. Atlas-side plumbing seam: atlas-auth must inject the base id on the X-Saturn-Base-Model header and add it to Traefik's authResponseHeaders allowlist (exactly as for X-Saturn-Auth-Id). Phoebe reads it defensively (absent = ""). An ft: model with an empty base_model fails loud (ErrNoPrice / UNPRICED), never silently $0.
- Quantize-then-multiply rating. The 9dp per-token rate stored on each rated_usage row × tokens exactly reconstructs the billed cost (E1 self-auditing). This differs from sum-then-round only on a sub-nano premium residue and is a forced consequence of storing the rate on the row.
- Sub-9dp guard. A rate that is nonzero in the price file but quantizes to $0 at 9dp is rejected at load, fail-closed. A literal 0 is still allowed.

The fixes (one commit each)
- Ratified quantize-then-multiply. Rewrote doc.go's ROUNDING / PRODUCTION-vs-ORACLE sections (they still documented sum-then-round); fixed the integration oracle to feed rate.Quantized() (it fed the un-quantized rate while production bills the quantized one, a latent miscalibration, since the existing fixture had no residue); corrected Rate()'s doc. Added TestConformance_OracleQuantizesBeforeMultiply_OnResidue, a 1-nano × 1.5 = 0.0000000015 → 0.000000002 fixture that bills 6 nano over 3 tokens and proves the guard has teeth (sum-then-round gives 5 nano; a revert flips it RED).
- Sub-9dp guard. A nonzero rate that quantizes to $0 at 9dp is rejected at price-file load time (parseNonNegRate), naming the offending model/field/rate. TestLoad_SubNanoRateRoundsToZeroFailsClosed pins all arms (sub-nano rejected, half-up boundary loads, literal $0 loads, fine-tune own-rate covered).
- base_model plumbing so ft:<checkpoint> ids price. Added metering.Event.BaseModel + billing_event.base_model column (0001 SQL + billing_event Alembic create, plus idempotent ADD COLUMN IF NOT EXISTS in the rating migration / 0002); X-Saturn-Base-Model identity header carried onto the Event in proxy.emit (completion AND pre-header-abort paths); drain store INSERT extended. Rater: PriceBook.ResolveEvent + a second TEMP rating_derived table (base_model → premium-applied, quantized) with a direct-over-derived COALESCE, deriving only for an ft: id that missed the direct join and carries a base_model. Tests: TestRater_FineTunePricesViaBaseModelOnEvent, TestRater_FineTuneWithoutBaseModelFailsLoud, TestResolveEvent_FineTuneViaBaseModel, live-PG TestIntegration_FineTunePricesViaBaseModel, and end-to-end TestE2E_FineTuneBillsAtBaseTimesPremium.
- Mechanical cleanups. event_count widened to BIGINT end to end (SQL cast, column in 0002/Alembic/integration schema, e2e scan); deleted dead ErrDerivationChain (one-hop is enforced at load); renamed the overclaiming TestConformance_SQLModelMatchesRateOracle → TestOracleModel_SelfConsistent (it runs no SQL); fixed TestLoad_MalformedYAMLFailsClosed's doc to enumerate the inconsistent-premium cases it tests; corrected the stale "pointer-not-copy" comment; fixed RateResult.TotalCost doc ("0", not "").

Gate
go build ./..., gofmt -l . (empty), go vet ./... and go vet -tags=integration ./..., go test -race ./..., golangci-lint v1.64.8 (default + integration tags): all clean. Live-Postgres integration and the e2e pipeline test run green with -race -tags=integration.

Should land on #14's lineage. Once merged into yaml-rater, re-run the battery to dry.