feat(BA-5936): add Pruner repository abstraction#11460
Draft
feat(BA-5936): add Pruner repository abstraction#11460
Conversation
fregataa
added a commit
that referenced
this pull request
May 1, 2026
Contributor
There was a problem hiding this comment.
Pull request overview
Adds a new repository-layer “Pruner” abstraction for race-safe bulk deletions, aligned with existing Creator/Updater/Purger patterns, plus integration tests exercising cascading deletes and optional RBAC association cleanup.
Changes:
- Introduce
PrunerSpec,CascadeChild,PrunerResult, andexecute_pruner()to performSELECT ... FOR UPDATE-based pruning with optional cascades. - Add RBAC association cleanup support via
PrunerSpec.entity_type()+cascade_rbac. - Add real-DB integration tests covering basic pruning, runtime conditions, limit behavior, FK cascades, and RBAC cleanup.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
src/ai/backend/manager/repositories/base/pruner.py |
New pruner abstraction + executor implementing lock-then-delete flow with cascade support. |
src/ai/backend/manager/repositories/base/__init__.py |
Re-export new pruner APIs from the base repository package. |
tests/unit/manager/repositories/base/test_pruner.py |
Integration tests validating pruning, cascade deletes, and RBAC cleanup behavior. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Introduces a declarative spec for bulk-delete (prune) operations parallel to the existing Creator/Updater/Purger patterns: - PrunerSpec[TRow] abstract base with row_class/returning_id/prune_condition classmethods plus per-call conditions/cascade/limit/cascade_rbac fields. - CascadeChild abstraction for FK-dependent child tables (e.g., kernels of pruned sessions) deleted within the same transaction. - entity_type() classmethod opts the spec into RBAC association cleanup via association_scopes_entities (cross-cutting, built into the executor). - execute_pruner runs SELECT pk FOR UPDATE LIMIT once, materializes IDs, then issues cascade DELETEs and the parent DELETE...RETURNING from the same locked set — race-safe and avoids re-evaluating the parent SELECT per cascade. - DEFAULT_PRUNE_LIMIT (100k) caps lock set, memory, and transaction duration; operators run multiple calls to drain larger backlogs. Includes integration tests covering basic prune, runtime conditions, limit, FK cascade, RBAC cleanup with entity_type filtering, and the default no-op (entity_type=None) path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Wrap all DELETEs (cascade, RBAC, parent) in a single try/except for IntegrityError so cascade failures also surface as RepositoryIntegrityError instead of raw SQLAlchemy errors. - Drop PrunerSpec.returning_id() — derive the parent PK column directly from row_class().__table__.primary_key, mirroring the Purger pattern. Reject composite-PK tables with UnsupportedCompositePrimaryKeyError. - Update CascadeChild docstring to reflect the materialized-id-list approach (the previous SQL-subquery wording was stale). - Tighten test_no_cascade_with_fk_violation_raises to expect ForeignKeyViolationError, also covering the parse_integrity_error path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Parameterize CascadeChild on the cascade table's Row class (``CascadeChild[TRow: Base]``) for consistency with PrunerSpec[TRow] and to give subclass authors precise typing on ``row_class()`` — e.g., ``CascadeChild[KernelRow]`` makes ``row_class()`` return ``type[KernelRow]`` instead of the over-broad ``type[Base]``. PrunerSpec.cascade is typed as ``list[CascadeChild[Any]]`` since a spec composes cascades over heterogeneous tables. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
8fb7909 to
2324c77
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PrunerSpec[TRow]abstraction parallel to existing Creator/Updater/Purger patterns. Subclasses declaratively encode the entity's prune contract viarow_class(),returning_id(),prune_condition(), and optionalentity_type()classmethods; per-call params (conditions,cascade,limit,cascade_rbac) live on the instance.CascadeChildabstraction handles FK-dependent child tables (e.g., kernels of pruned sessions);entity_type()opts into RBAC association cleanup baked intoexecute_prunerfor cross-cuttingassociation_scopes_entitiessemantics.execute_prunerrunsSELECT pk FOR UPDATE LIMITonce, materializes IDs, then issues cascade DELETEs and the parentDELETE...RETURNINGfrom the same locked set — race-safe and avoids re-evaluating the parent SELECT per cascade. Hard-capped atDEFAULT_PRUNE_LIMIT = 100_000to bound lock set, memory, and transaction duration.Test plan
pants test tests/unit/manager/repositories/base/test_pruner.py— 11 tests covering basic prune, runtime conditions, limit, FK cascade, RBAC cleanup with entity_type filtering, and the default no-op (entity_type=None) path.pants fmt,pants fix,pants lint,pants checkall pass on changed files.Notes
This adds the abstraction only. Concrete per-entity Pruner specs (session, vfolder, etc.) and the v2 REST
pruneendpoint follow as separate PRs under epic BA-5935.Resolves BA-5936
🤖 Generated with Claude Code