Phoebe: rate from YAML price config (E1) by hhuuggoo · Pull Request #14 · saturncloud/phoebe

hhuuggoo · 2026-06-15T18:31:37Z

Reworks phoebe's rater to price from a versioned, operator-authored YAML price file instead of the DB price tables, per the locked E1–E4 decisions (token-factory-rating-atlas-decisions.md). The rater loads the file at run start, projects the resolved per-token rates into a transient TEMP table, rates the last complete hour entirely in SQL, and freezes the applied rate onto each rated_usage row so the row is self-auditing and immutable.

Money discipline is unchanged: NUMERIC throughout, cost computed + summed in SQL, the cached-subset billable-prompt formula, fail-loud-never-$0, idempotent deterministic-id upsert, session-TZ-independent bucketing, and the Rate() oracle + live-Postgres conformance pinning the SQL.
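As a rough illustration of what "last complete hour" and session-TZ-independent bucketing can mean, here is a minimal Go sketch. The PR does the bucketing in SQL; the function names and the half-open `[start, end)` window below are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"time"
)

// LastCompleteHour returns the half-open UTC window [start, end) covering
// the most recent hour that has fully elapsed at `now`.
// (Hypothetical helper; not taken from the PR.)
func LastCompleteHour(now time.Time) (start, end time.Time) {
	// Truncate rounds down on absolute time, so the zone attached to
	// `now` does not change which hour is selected.
	end = now.UTC().Truncate(time.Hour)
	return end.Add(-time.Hour), end
}

// WindowFor is a string-in, string-out wrapper used for the demo below.
func WindowFor(nowRFC3339 string) (string, string, error) {
	now, err := time.Parse(time.RFC3339, nowRFC3339)
	if err != nil {
		return "", "", err
	}
	start, end := LastCompleteHour(now)
	return start.Format(time.RFC3339), end.Format(time.RFC3339), nil
}

func main() {
	// The same instant written in two zones yields the same window.
	for _, s := range []string{"2026-06-15T18:31:37Z", "2026-06-16T00:01:37+05:30"} {
		start, end, err := WindowFor(s)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> [%s, %s)\n", s, start, end)
	}
}
```

Both inputs name the same instant, so both map to the window starting at 17:00 UTC and ending at 18:00 UTC.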

Contracts

1. The price file — the new operator-facing contract (`config/prices.example.yaml`)

Prices are a versioned YAML file the operator authors and git-tracks; the file's history IS the price audit trail (no price table, no effective-dating, no price-management UI). Rates are exact-decimal strings, never float. The loader fails closed on anything malformed (missing file, bad YAML, unknown version, float-shaped/negative rate, missing component, inconsistent premium, dangling derived_from).

version: 1

base_models:                                  # keyed on the HF model id (E3)
  "meta-llama/Llama-3.1-8B-Instruct":
    prompt:     "0.000000200"                 # per-token USD, exact decimal string
    cached:     "0.000000050"                 # distinct, discounted; cached ⊆ prompt
    completion: "0.000000600"

fine_tune_premium:                            # the SINGLE global premium policy
  policy: multiplier                          # identity | multiplier | markup
  factor: "1.5"                               # set iff multiplier
  # markup: "0.000000100"                     # set iff markup (per-token USD)

fine_tunes:                                   # OPTIONAL ft:<checkpoint> entries (E3)
  "ft:1f0c2d3e4a5b6c7d8e9f0a1b2c3d4e5f":
    derived_from: "meta-llama/Llama-3.1-8B-Instruct"   # base × premium (one hop)
  # an ft may instead carry its own `rate:` (escape hatch; bypasses the premium)

gpu_floor_rates:                              # per-GPU floor (uptime meter later;
  "A100-80GB": "0.000000000"                  # PARSED + VALIDATED now, not yet wired)
  "H100-80GB": "0.000000000"
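The "exact-decimal strings, never float" rule for the rates in this file could be enforced along these lines. This is a sketch under assumptions: `ParseRate`, `ErrBadRate`, and the reading of "float-shaped" as "anything other than plain digits with an optional fractional part" are illustrative and not taken from the actual loader.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
	"regexp"
)

// plainDecimal accepts only digits with an optional fractional part:
// no sign, no exponent, no leading or trailing dot.
var plainDecimal = regexp.MustCompile(`^[0-9]+(\.[0-9]+)?$`)

// ErrBadRate is a hypothetical sentinel; the real loader's error names
// are not shown in the PR.
var ErrBadRate = errors.New("malformed rate")

// ParseRate converts an exact-decimal string into a big.Rat without ever
// passing through float64. Anything not shaped like a plain non-negative
// decimal is rejected (fail closed).
func ParseRate(s string) (*big.Rat, error) {
	if !plainDecimal.MatchString(s) {
		return nil, fmt.Errorf("%w: %q", ErrBadRate, s)
	}
	r, ok := new(big.Rat).SetString(s)
	if !ok {
		return nil, fmt.Errorf("%w: %q", ErrBadRate, s)
	}
	return r, nil
}

func main() {
	for _, s := range []string{"0.000000200", "2e-7", "-0.1", ".5", ""} {
		r, err := ParseRate(s)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", r.FloatString(9))
	}
}
```

Only the first input is accepted; the exponent form, the negative value, the bare fraction, and the empty string are all rejected before any arithmetic happens.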

Resolution mirrors the SQL exactly: own rate wins (base, or an ft with its own rate); else an ft's derived_from base × the global premium (one hop); else ErrNoPrice (never $0). The premium is applied to the exact base rate, then the final per-token rate is quantized to 9dp — the rate that bills and the rate stored on the row are bit-identical, so cost is always reconstructable from the row.
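The resolution order described above can be sketched in Go as follows. The types and field names (`Book`, `Premium`, `Own`, `DerivedFrom`) are hypothetical, only one rate component is modeled for brevity, and treating `markup` as an additive per-token amount is an assumption read off the YAML comments. `ErrNoPrice` is named in the PR, but its definition here is assumed.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

// ErrNoPrice is named in the PR; this definition is an assumption.
var ErrNoPrice = errors.New("no price for model")

// Premium mirrors the single global fine-tune premium policy.
type Premium struct {
	Policy string   // "identity", "multiplier", or "markup"
	Factor *big.Rat // set iff Policy == "multiplier"
	Markup *big.Rat // set iff Policy == "markup" (assumed additive, per-token USD)
}

// Book holds one rate component per model to keep the sketch short.
type Book struct {
	Own         map[string]*big.Rat // base models and fine-tunes with their own rate
	DerivedFrom map[string]string   // fine-tune id -> base model id (one hop)
	Premium     Premium
}

// Resolve returns the per-token rate quantized to 9dp, as a decimal string.
func (b Book) Resolve(model string) (string, error) {
	// 1. An own rate wins and bypasses the premium.
	if r, ok := b.Own[model]; ok {
		return r.FloatString(9), nil
	}
	// 2. Otherwise derive from the declared base, one hop only.
	base, ok := b.Own[b.DerivedFrom[model]]
	if !ok {
		// 3. No own rate and no resolvable base: fail loud, never $0.
		return "", fmt.Errorf("%w: %s", ErrNoPrice, model)
	}
	rate := new(big.Rat).Set(base)
	switch b.Premium.Policy {
	case "identity":
		// The base rate is used unchanged.
	case "multiplier":
		rate.Mul(rate, b.Premium.Factor)
	case "markup":
		rate.Add(rate, b.Premium.Markup)
	default:
		return "", fmt.Errorf("unknown premium policy %q", b.Premium.Policy)
	}
	// Premium is applied to the exact base; quantize once, at the end.
	return rate.FloatString(9), nil
}

func rat(s string) *big.Rat {
	r, ok := new(big.Rat).SetString(s)
	if !ok {
		panic("bad decimal literal: " + s)
	}
	return r
}

// sampleBook mirrors the prompt rate and premium from the example file.
func sampleBook() Book {
	return Book{
		Own:         map[string]*big.Rat{"meta-llama/Llama-3.1-8B-Instruct": rat("0.000000200")},
		DerivedFrom: map[string]string{"ft:1f0c2d3e4a5b6c7d8e9f0a1b2c3d4e5f": "meta-llama/Llama-3.1-8B-Instruct"},
		Premium:     Premium{Policy: "multiplier", Factor: rat("1.5")},
	}
}

func main() {
	b := sampleBook()
	for _, m := range []string{"meta-llama/Llama-3.1-8B-Instruct", "ft:1f0c2d3e4a5b6c7d8e9f0a1b2c3d4e5f", "ft:unknown"} {
		rate, err := b.Resolve(m)
		fmt.Println(m, rate, err)
	}
}
```

With the example file's numbers, the base model resolves to 0.000000200, the declared fine-tune to 0.000000300 (base × 1.5), and the undeclared fine-tune returns an error rather than a zero rate.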

2. `rated_usage` — applied-rate columns added

applied_prompt_rate     NUMERIC(20,9) NOT NULL DEFAULT 0,
applied_cached_rate     NUMERIC(20,9) NOT NULL DEFAULT 0,
applied_completion_rate NUMERIC(20,9) NOT NULL DEFAULT 0,

The exact per-token rates each rollup was billed at, frozen onto the row from the file the run loaded. The row is then immutable and self-auditing — "we never reprice traffic you've already served" holds by construction; re-rating is a deliberate, audited re-run.
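To make "self-auditing" concrete, the sketch below recomputes a row's cost from nothing but the row. It is illustrative only: the token-count field names are hypothetical, and the billable-prompt formula (prompt tokens minus cached tokens billed at the prompt rate, cached tokens billed at the cached rate) is an assumption inferred from the "cached ⊆ prompt" note rather than copied from the rater's SQL.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

// RatedRow models the parts of a rated_usage row needed to recompute cost.
// Token field names are hypothetical; the rate fields match the new columns.
type RatedRow struct {
	PromptTokens, CachedTokens, CompletionTokens int64
	AppliedPromptRate                            string // NUMERIC(20,9) as text
	AppliedCachedRate                            string
	AppliedCompletionRate                        string
}

// ReconstructCost recomputes the cost in exact decimal arithmetic.
// Assumed formula: (prompt - cached)*prompt_rate + cached*cached_rate
// + completion*completion_rate, so cached tokens are not double counted.
func ReconstructCost(row RatedRow) (string, error) {
	if row.CachedTokens < 0 || row.CachedTokens > row.PromptTokens {
		return "", errors.New("cached tokens must be a subset of prompt tokens")
	}
	parts := []struct {
		tokens int64
		rate   string
	}{
		{row.PromptTokens - row.CachedTokens, row.AppliedPromptRate},
		{row.CachedTokens, row.AppliedCachedRate},
		{row.CompletionTokens, row.AppliedCompletionRate},
	}
	total := new(big.Rat)
	for _, p := range parts {
		rate, ok := new(big.Rat).SetString(p.rate)
		if !ok {
			return "", fmt.Errorf("bad rate %q", p.rate)
		}
		total.Add(total, rate.Mul(rate, new(big.Rat).SetInt64(p.tokens)))
	}
	// Integer token counts times 9dp rates are exact at 9dp.
	return total.FloatString(9), nil
}

func main() {
	row := RatedRow{
		PromptTokens: 1000, CachedTokens: 400, CompletionTokens: 200,
		AppliedPromptRate:     "0.000000200",
		AppliedCachedRate:     "0.000000050",
		AppliedCompletionRate: "0.000000600",
	}
	cost, err := ReconstructCost(row)
	fmt.Println(cost, err)
}
```

With the example file's rates, 600 billable prompt tokens, 400 cached tokens, and 200 completion tokens come to 0.000260000 USD.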

3. Dropped: `model_price` + `derivation_policy`

The whole temporal price-book apparatus is gone — both tables, the btree_gist GiST exclusion constraints, effective-dating, and the SQL price seed (seed_example_prices.sql). Prices are config now.

Migration approach: clean rewrite of 0002_rating.sql + atlas/c2f1a3b4d5e6_add_rating.py (create only rated_usage with the applied-rate columns). Justified because there is no prod data and the Alembic file was never copied into saturn/alembic (it's a ready-to-copy artifact maintained here). The Alembic docstring flags that if anyone has applied a model_price/derivation_policy version of this revision, they must add a follow-up drop+alter instead.

Flagged gap — fine-tune base linkage

billing_event carries only the engine-reported model name (no derived_from/base_model column). So a fine-tune's base is not plumbed to the rater. Base-direct models price fully today. An ft:<checkpoint> id prices only if the file declares its derived_from (or own rate); otherwise it is unpriced — fail loud, never $0 (tested). Closing the gap means the metering path stamping the base (saturn.io/...base_model) onto the event, or shipping a fine-tune→base map in the file. The premium machinery is complete and tested; only the linkage source is pending.

S3 seam (out of scope, left clean)

The price file loads from a local path (-prices flag / priceFile setting). LoadPriceBook(localPath) is the seam: fetch-from-S3-to-local then load. The create-time price gate (E4) and the rater must read the same file/version — a single fetched copy is the shared artifact.

Tests (all named; full gate + live-PG green)

yaml-base-price-applied, yaml-fine-tune-premium-multiplier, yaml-fine-tune-premium-markup, fine-tune-identity/own-rate-bypass, missing-price-fails-loud-not-zero, applied-rate-stored-on-row, cached-subset-not-double-count, numeric-exactness-no-float, idempotent-rerun, malformed-yaml-fails-closed (16 sub-cases), nil-book-fails-closed, gpu-floor parse, example-file-valid. Integration (live PG): RateWindow_ConformsToOracle (asserts the applied-rate columns), PremiumQuantizedBeforeBilling (self-audit: stored 9dp rate reconstructs cost), UTC-bucketing, index-serves-scan, and the e2e pipeline test (price source swapped to a YAML book).

Gate: go build, go test -race ./..., go vet (+ -tags=integration), golangci-lint v1.64.8, gofmt -l — all clean. Live-Postgres -tags=integration (incl. -race) — all green.

🤖 Generated with Claude Code

Move prices off DB tables (model_price + derivation_policy) onto a versioned operator-authored YAML price file. The rater loads the file at run start, projects the resolved per-token rates (fine-tune premium applied in exact decimal) into a transient TEMP table, rates the last complete hour entirely in SQL, and FREEZES the applied per-token rates onto each rated_usage row so the row is self-auditing and immutable.

Contracts:
- New operator-facing price file (config/prices.example.yaml): base per-token rates keyed on the HF model id, the single global fine-tune premium policy (identity|multiplier|markup), and per-GPU floor rates. Rates are exact-decimal strings (never float). Loader fails closed on anything malformed.
- rated_usage gains applied_prompt_rate / applied_cached_rate / applied_completion_rate NUMERIC(20,9): the exact rate each rollup billed at.
- Dropped model_price + derivation_policy (and the GiST exclusion constraints, effective-dating, and the SQL seed). Clean rewrite of the 0002 migration + Alembic (no prod data; the Alembic was never applied to saturn).

Keeps the money discipline: NUMERIC throughout, cost computed+summed in SQL, cached-subset billable-prompt formula, fail-loud-never-$0 (ErrNoPrice / unpriced count), idempotent deterministic-id upsert, session-TZ-independent bucketing, and the Rate() oracle + live-Postgres conformance pinning the SQL.

Fine-tune base linkage is a flagged gap: billing_event carries only the engine model NAME, so an ft:<checkpoint> id prices only if the file declares its derived_from (or own rate); otherwise it is unpriced (fail loud). Base-direct models price fully.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

hhuuggoo · 2026-06-15T18:46:10Z

🔋 Battery review — Round 1 (status: ESCALATE)

Tier-2 money-path review of the YAML-rater rework. 19 raw → 4 refuted → 15 confirmed + 7 persona. Stopped on design/money decisions that must not be auto-patched. Two need Hugo; two are confirm-the-non-goal; the rest are mechanical (held back).

🛑 Money decisions for Hugo

1. Rounding model changed — quantize-then-multiply vs sum-then-round (this PR flipped it). Merged main summed exact per-event products and rounded once. This PR quantizes each per-token rate to 9dp first, then multiplies — because E1's "store the applied rate on the row" requires a 9dp rate that reconstructs the cost. For a fine-tune whose premium yields a sub-nano residue (e.g. 1-nano base × 1.5 = 0.0000000015 → rounds to 0.000000002), quantize-then-multiply bills slightly more. This is a forced consequence of the store-rate-on-row decision — sum-then-round is incompatible with self-auditing rows. The only actual defect: doc.go still documents sum-then-round, so code and stated contract disagree. Decision: ratify quantize-then-multiply as the spec (recommended — it's what self-auditing rows require) and fix the docs + oracle to match; OR keep sum-then-round and drop the rate-on-row guarantee. The former is consistent with your E1 call.

2. Effective-dated pricing removed — confirm the late-arrival semantics. The price table + GiST + effective-dating are gone (intended — prices are YAML now). "Never reprice served traffic" now holds because the row freezes its applied rate, not because price is resolved as-of-event-time. Consequence: a late-arriving event in an already-rated hour, re-rated after a YAML price change, bills at the new rate, not the rate when it was served. Given the rater runs hourly and prices change rarely, the window is tiny — but confirm this is acceptable (it's the same call you reasoned through when choosing rate-on-row).
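For decision 1, the difference between the two rounding models can be reproduced with the 1-nano example from that item. This is a sketch, not the rater's code: the function names are hypothetical, and rounding the sum-then-round total to 9dp is an assumption about how main rounded.

```go
package main

import (
	"fmt"
	"math/big"
)

// mustRat parses an exact-decimal literal; it panics on bad input because
// this sketch only feeds it constants.
func mustRat(s string) *big.Rat {
	r, ok := new(big.Rat).SetString(s)
	if !ok {
		panic("bad decimal: " + s)
	}
	return r
}

// quantize9 rounds to 9 decimal places, halves away from zero
// (the rounding that big.Rat.FloatString applies).
func quantize9(r *big.Rat) *big.Rat {
	return mustRat(r.FloatString(9))
}

// bill sums rate*tokens over all events exactly.
func bill(rate *big.Rat, events []int64) *big.Rat {
	total := new(big.Rat)
	for _, tokens := range events {
		total.Add(total, new(big.Rat).Mul(rate, new(big.Rat).SetInt64(tokens)))
	}
	return total
}

// QuantizeThenMultiply rounds the per-token rate to 9dp first, then bills
// every event at that stored rate.
func QuantizeThenMultiply(base, factor string, events []int64) string {
	rate := quantize9(new(big.Rat).Mul(mustRat(base), mustRat(factor)))
	return bill(rate, events).FloatString(9)
}

// SumThenRound bills at the unrounded rate and rounds once at the end.
func SumThenRound(base, factor string, events []int64) string {
	rate := new(big.Rat).Mul(mustRat(base), mustRat(factor))
	return bill(rate, events).FloatString(9)
}

func main() {
	events := []int64{400, 600} // tokens per event
	fmt.Println("quantize-then-multiply:", QuantizeThenMultiply("0.000000001", "1.5", events))
	fmt.Println("sum-then-round:        ", SumThenRound("0.000000001", "1.5", events))
}
```

For 1,000 tokens at a 1-nano base with a 1.5 multiplier, quantize-then-multiply bills 0.000002000 while sum-then-round bills 0.000001500. Only the first figure can be recomputed from a 9dp rate stored on the row.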

🛑 Confirm-the-non-goal (self-flagged, not bugs)

Fine-tune base-linkage gap — an ft:<checkpoint> whose derived_from isn't in the YAML → ErrNoPrice (fail-loud, correct). The rater structurally can't price the primary fine-tune id format until the metering path plumbs base_model through. Confirm this is an accepted v1 non-goal before merge. (This is the (a)-vs-(b) call already on your plate.)

✅ Mechanical (held back — will fix once decisions land)

- pricebook.go:362 (CONFIRMED/high): a base rate finer than 9dp is silently rounded, possibly to 0. A price like 0.0000000001 becomes 0.000000000. Needs a load-time guard: reject sub-9dp rates that round to zero (fail-closed on a price the operator clearly meant to be nonzero).
- The conformance oracle feeds the unquantized rate while production bills the quantized one (latent: the current fixture has no residue, but the conformance guard is mis-calibrated for the day one appears). Fix: oracle uses rate.Quantized().
- event_count is int/INTEGER again (the BIGINT widening didn't carry into the rewrite); two overclaiming test names; dead ErrDerivationChain; stale "pointer-not-copy" comment.
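The load-time guard requested for the sub-9dp finding might look like the sketch below. `CheckRate` and `ErrRoundsToZero` are hypothetical names, and the sketch assumes the string has already passed the loader's format checks (plain decimal, non-negative).

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
)

// ErrRoundsToZero is a hypothetical sentinel for this sketch.
var ErrRoundsToZero = errors.New("nonzero rate rounds to zero at 9dp")

// CheckRate rejects a rate that is nonzero as written but becomes zero
// once quantized to 9 decimal places. An explicit zero is allowed.
func CheckRate(s string) error {
	r, ok := new(big.Rat).SetString(s)
	if !ok {
		return fmt.Errorf("malformed rate %q", s)
	}
	// FloatString(9) rounds to 9dp with halves away from zero.
	q, ok := new(big.Rat).SetString(r.FloatString(9))
	if !ok {
		return fmt.Errorf("malformed rate %q", s)
	}
	if r.Sign() != 0 && q.Sign() == 0 {
		return fmt.Errorf("%w: %q", ErrRoundsToZero, s)
	}
	return nil
}

func main() {
	for _, s := range []string{"0.000000200", "0.000000000", "0.0000000001"} {
		fmt.Printf("%-14s -> %v\n", s, CheckRate(s))
	}
}
```

This guard is deliberately narrow: a rate such as 0.0000000005 still passes because it rounds to a nonzero 9dp value. Rejecting every rate with more than nine fractional digits would be a stricter alternative, but the review only asks for the round-to-zero case.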

Battery wf_d0f65d1b-dc7, 25 agents, ~1.2M tokens. #14 is NOT merge-ready: it needs decision 1 (rounding spec) + the high-sev sub-9dp-price guard, then a fix pass + re-battery to dry.

hhuuggoo · 2026-06-16T01:03:02Z

Landed on main via the squashed rating merge (3b22908). Closing the stack.

This was referenced Jun 15, 2026

Rating fix pass: ratify quantize-then-multiply, sub-9dp guard, base_model plumbing (#14 battery + decisions) #15

Closed

Phoebe: D1 truncation logging + D2 event_count BIGINT (Hugo's decisions) #13

Closed

hhuuggoo closed this Jun 16, 2026

hhuuggoo wants to merge 1 commit into main from yaml-rater
