From 5c8a1ed1e00719d935a70d33ea6f7238fc027384 Mon Sep 17 00:00:00 2001
From: Ryota Yoshikawa
Date: Wed, 6 May 2026 21:45:06 +0900
Subject: [PATCH] docs(readme): add Known limitations section for retention

Captures the conclusion of the 2026-05 retention discussion: skaldberg
has no row-level retention story today and the right move is to wait
for either upstream path to open rather than ship an external Athena
DELETE workaround.

- iceberg-rust gaining row-level DELETE / RowDelta. PRs in flight:
  apache/iceberg-rust#2185 (CoW OverwriteAction), #2203 (RowDelta for
  MoR), #2367 (delete-files in snapshot producer). Realistic landing
  window 0.10 / 0.11 (~mid-2026).
- AWS extending PutTableRecordExpirationConfiguration from AWS-managed
  tables (S3 Storage Lens / SageMaker Catalog) to customer-created S3
  Tables. Mechanism exists, no roadmap signal.

Storage cost growth is documented (~9 GB/year per 100 samples/s
sustained, $0.025/GB-month) so the "what does deferring cost"
trade-off is explicit. External Athena DELETE is mentioned as the
escape hatch for users who can't wait, with the caveat that it adds an
ops piece outside skaldberg's "operation-less" surface.

Also folded:
- `without (...)` modifier moved from the fallback list into the
  pushdown table (PR #37 made it SQL-pushed).
- Roadmap "Open" replaced with "In progress" pointing at the Phase 9
  dogfood scenario (the actual current activity).
- Phase 8 "Done" entry now lists `without (...)` so the status line
  matches what's in the tree.
---
 README.md | 56 ++++++++++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 49 insertions(+), 7 deletions(-)

diff --git a/README.md b/README.md
index 8c9cde0..e01e758 100644
--- a/README.md
+++ b/README.md
@@ -81,11 +81,11 @@ Pushed entirely into SQL (instant + matrix paths):
 | `histogram_quantile(q, rate(bucket[r]))` | window-based cumulative interpolation |
 | ` ` (1:1 label match) | `string_agg`-keyed JOIN |
 | `topk(n, rate(metric[r])) [by (...)]` | ranking over the per-series rate CTE |
+| `(...) without (k1, k2, ...)` | `string_agg` group key over labels minus excluded |
 
 Everything else (or richer variants) falls back to a single SQL fetch plus
 a Rust post-step:
 
-- `without (...)` modifier (would need the full label set up front)
 - `on (...)` / `ignoring (...)` / `group_left` / `group_right` on `vec × vec`
 - nested aggregations (`sum(sum(...))`) and other non-selector inners
 - arbitrary inner expressions on `histogram_quantile`
@@ -212,6 +212,49 @@ SKALDBERG_TABLE_BUCKET_ARN=arn:aws:s3tables:... \
 - **Backpressure.** When the buffer reaches 256 MiB the ingest endpoints
   return 503 so producers retry rather than OOM the server.
 
+## Known limitations
+
+### No row-level retention (waiting on upstream)
+
+Skaldberg has no built-in story for deleting old samples. The
+`samples` table grows for as long as the server ingests, and the
+server itself has no path to issue an Iceberg `DELETE`. Two upstream
+paths could close this; we deliberately don't ship an external
+workaround:
+
+1. **`iceberg-rust`** gains row-level DELETE / `RowDelta`
+   transaction actions. PRs in flight as of 2026-05:
+   [apache/iceberg-rust#2185](https://github.com/apache/iceberg-rust/pull/2185)
+   (CoW `OverwriteAction`),
+   [#2203](https://github.com/apache/iceberg-rust/pull/2203)
+   (`RowDelta` for MoR),
+   [#2367](https://github.com/apache/iceberg-rust/pull/2367)
+   (snapshot producer delete-files). Realistic landing window is
+   the 0.10 or 0.11 release (~mid-2026 if the cadence holds).
+2. **AWS** extends [`PutTableRecordExpirationConfiguration`](https://docs.aws.amazon.com/AmazonS3/latest/userguide/s3-tables-record-expiration.html)
+   from AWS-managed tables (S3 Storage Lens / SageMaker Catalog) to
+   customer-created S3 Tables. The mechanism, IAM, and console UI
+   exist; coverage extension is not on a published roadmap.
+
+Storage cost without retention scales linearly with ingest
+throughput — roughly **9 GB / year per 100 samples/s** sustained,
+at S3 Tables's `$0.025/GB-month`. At hobby / small-prod scale
+(≤1 k samples/s) the bucket footprint is single-digit dollars per
+year and growing slowly. If retention is a hard requirement before
+either upstream path opens, scheduling an Athena `DELETE FROM
+samples WHERE timestamp < ?` via EventBridge (~$2.5/year for weekly
+cleanup) is technically straightforward but adds an ops piece that
+sits outside skaldberg's "operation-less" surface.
+
+### Other
+
+- `on (...)` / `ignoring (...)` / `group_left` / `group_right` on
+  `vec × vec`: still Rust-side, no pushdown.
+- Real-Prometheus end-to-end smoke (Prometheus remote_write into
+  skaldberg, Grafana dashboard against `/api/v1/query_range`)
+  remains synthetic-data only — see the in-flight Phase 9 dogfood
+  scenario in `examples/grafana/`.
+
 ## Roadmap
 
 Done:
@@ -225,13 +268,12 @@ Done:
 - **Phase 8.** PromQL → SQL pushdown for selectors, aggregations,
   topk/bottomk, scalar × vector, rate-family, `(rate(...))`,
   `histogram_quantile(q, rate(...))`, vector × vector,
-  `topk(n, rate(...))` — instant + matrix paths. End-to-end verified
-  against a real S3 Tables bucket.
+  `topk(n, rate(...))`, `without (...)` modifier — instant + matrix
+  paths. End-to-end verified against a real S3 Tables bucket.
 
-Open:
-- `without (...)`, `on/ignoring/group_*` modifiers in SQL pushdown.
-- Compaction / retention story for the `samples` and `series` tables.
-- Real Prometheus connection smoke test.
+In progress:
+- **Phase 9.** Dogfood: emit synthetic metrics into skaldberg, view
+  them through Grafana, fix what breaks.
 
 ## License
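
The storage-cost figures quoted in the new "Known limitations" text can be sanity-checked with a short calculation. The sketch below uses only the two numbers stated in the patch (9 GB per year per 100 samples/s, and $0.025 per GB-month). The function names are hypothetical, growth is assumed to be perfectly linear, and request, compaction, and query charges are ignored, so treat the output as a rough estimate rather than a bill.

```python
# Figures quoted in the README text added by this patch.
GB_PER_YEAR_PER_100_SPS = 9.0  # storage growth per 100 samples/s sustained
PRICE_PER_GB_MONTH = 0.025     # USD per GB-month


def stored_gb(samples_per_sec: float, years: float) -> float:
    """Footprint after `years` of sustained ingest, assuming linear growth."""
    return GB_PER_YEAR_PER_100_SPS * (samples_per_sec / 100.0) * years


def monthly_bill_usd(samples_per_sec: float, years: float) -> float:
    """Storage charge for a single month once `years` have elapsed."""
    return stored_gb(samples_per_sec, years) * PRICE_PER_GB_MONTH


def cumulative_cost_usd(samples_per_sec: float, years: float) -> float:
    """Total storage spend over the whole period.

    With linear growth from zero, the average footprint over the period
    is half of the final footprint.
    """
    months = years * 12.0
    average_gb = stored_gb(samples_per_sec, years) / 2.0
    return average_gb * PRICE_PER_GB_MONTH * months


if __name__ == "__main__":
    for sps in (100, 1000):
        print(
            f"{sps} samples/s after 1 year: "
            f"{stored_gb(sps, 1):.1f} GB stored, "
            f"{cumulative_cost_usd(sps, 1):.3f} USD spent on storage"
        )
```

Under these assumptions, 100 samples/s costs about $1.35 in storage over the first year, and 1,000 samples/s about $13.50. Because nothing is deleted, each later year costs more than the one before it. This is a useful cross-check on the "single-digit dollars per year" wording at the upper end of the stated range.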