feat: opt-out with_counts param + capped COUNT(*) for paginated list endpoints#1233
Draft
Add a with_counts query parameter to LimitOffsetPaginationWithPermissions. When with_counts is not provided or set to false (the default), the expensive COUNT(*) query is skipped and count is returned as null. A limit+1 fetch strategy is used to determine next/previous links without needing the full count. Existing tests that asserted on the count value are updated to pass with_counts=true explicitly.

Co-Authored-By: Claude <noreply@anthropic.com>
Agent-Logs-Url: https://github.com/RolnickLab/antenna/sessions/08338cc2-3ec7-4991-b383-ddba7fc5f357
Co-authored-by: mihow <158175+mihow@users.noreply.github.com>
- Move replace_query_param and remove_query_param imports to top-level
- Use remove_query_param when offset is 0 in get_previous_link
- Narrow exception catch to ValidationError instead of bare Exception

Co-Authored-By: Claude <noreply@anthropic.com>
Agent-Logs-Url: https://github.com/RolnickLab/antenna/sessions/08338cc2-3ec7-4991-b383-ddba7fc5f357
Co-authored-by: mihow <158175+mihow@users.noreply.github.com>
When with_counts=true is requested, run a bounded COUNT(*) via a LIMIT N+1 subquery instead of a full table scan. Result sets ≤ LARGE_QUERYSET_THRESHOLD (10,000) get an exact count; larger ones fall back to count:null with probe-based next/previous links.

Co-Authored-By: Claude <noreply@anthropic.com>
Agent-Logs-Url: https://github.com/RolnickLab/antenna/sessions/cf5a994a-a9df-4c62-80c3-2808f589bbcc
Co-authored-by: mihow <158175+mihow@users.noreply.github.com>
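The bounded count described in this commit can be illustrated outside Django. The following is a minimal sketch that uses the standard-library `sqlite3` module rather than Postgres or the ORM; the `capped_count` helper and the `occurrence` table are hypothetical names chosen for illustration, and the real paginator would apply the limit to a queryset instead of raw SQL.

```python
import sqlite3
from typing import Optional

LARGE_QUERYSET_THRESHOLD = 10_000  # value quoted in the commit message


def capped_count(
    conn: sqlite3.Connection,
    table: str,
    threshold: int = LARGE_QUERYSET_THRESHOLD,
) -> Optional[int]:
    """Return the exact row count if it is <= threshold, otherwise None."""
    # The inner SELECT stops after threshold + 1 rows, so the outer COUNT(*)
    # never inspects more than that many rows, whatever the table size.
    # The table name is interpolated for brevity (hypothetical helper);
    # only do this with trusted identifiers.
    sql = f"SELECT COUNT(*) FROM (SELECT 1 FROM {table} LIMIT ?) AS sub"
    (n,) = conn.execute(sql, (threshold + 1,)).fetchone()
    return n if n <= threshold else None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE occurrence (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO occurrence (id) VALUES (?)", [(i,) for i in range(25)])

small = capped_count(conn, "occurrence", threshold=100)  # under the cap: exact count
large = capped_count(conn, "occurrence", threshold=10)   # over the cap: None
```

Fetching `threshold + 1` rows rather than `threshold` is what distinguishes "exactly at the cap" (still an exact count) from "over the cap" (fall back to `None`).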
Copilot created this pull request from a session on behalf of mihow, April 15, 2026 02:01
✅ Deploy Preview for antenna-preview canceled.
✅ Deploy Preview for antenna-ssec canceled.
with_counts=true requests…only

Flip the default so existing callers (and the React UI) keep receiving `count` exactly as before. Callers that don't need the total can opt out with `?with_counts=false`.

The capped COUNT(*) safety valve still applies on the default path: result sets that exceed LARGE_QUERYSET_THRESHOLD return `count: null` and probe-based next/previous links.

A follow-up PR will:
- Update React Query hooks and PaginationBar to tolerate `count: null`
- Switch list pages to request counts only when needed (e.g. via a second `with_counts=true` call to populate "showing N of total")
- Flip the server default to `with_counts=false`

Tests updated to assert that the default response now includes an integer count, with explicit opt-in/opt-out coverage and a fallback test for the threshold path.

Co-Authored-By: Claude <noreply@anthropic.com>
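One way to read the default described in this commit is as a small decision function: parse the flag, then defer to the capped count. The sketch below is illustrative only. The names `parse_with_counts` and `resolve_count`, the accepted spellings of true and false, and the fall-back-to-default behavior on unrecognized values are all assumptions, not code from the PR.

```python
from typing import Callable, Mapping, Optional

TRUE_VALUES = {"true", "1", "yes"}    # assumed spellings
FALSE_VALUES = {"false", "0", "no"}   # assumed spellings


def parse_with_counts(query_params: Mapping[str, str], default: bool = True) -> bool:
    """Read ?with_counts=...; use the default when absent or unrecognized."""
    raw = query_params.get("with_counts")
    if raw is None:
        return default
    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    return default  # assumption: unrecognized values behave like "not provided"


def resolve_count(
    query_params: Mapping[str, str],
    capped_count: Callable[[], Optional[int]],
    default: bool = True,
) -> Optional[int]:
    """Return an integer count, or None if the caller opted out or the cap was exceeded."""
    if not parse_with_counts(query_params, default):
        return None  # opted out: the count query is never issued
    return capped_count()  # may itself return None above the threshold
```

With `default=True`, a request without the parameter behaves like `with_counts=true`; flipping the server default later would only change the `default` argument.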
Summary
List endpoints currently run `SELECT COUNT(*)` on every paginated request, which becomes the dominant cost on large tables (`Occurrence`, `Detection`, `SourceImage`) even when the page query itself is fast and well-indexed. This PR introduces a `with_counts` query parameter so callers can opt out of the count when they don't need it, and adds a capped-count safety valve so the worst-case scan is bounded even on the default code path.

No behavior changes for existing callers. The default is `with_counts=true`, which preserves DRF's current response shape (`count` is returned as before, except for result sets larger than the cap described under Changes). A follow-up PR will update the React UI to handle `count: null` and then flip the server default.

Motivation
Composite indexes on `Occurrence` (determination, project, event, score) and `SourceImage` (deployment, timestamp) make page queries fast. But `COUNT(*)` over the filtered result set can't use those indexes effectively, and on large projects it dominates total list-endpoint latency. Most UI views don't need the exact total; they need "is there a next page?", so we end up paying for something we mostly throw away.

Changes
All changes are in `ami/base/pagination.py` (`LimitOffsetPaginationWithPermissions`, the project-wide default paginator declared in `config/settings/base.py:483`).

- `with_counts` query parameter (default `true`).
  - `with_counts=true` (default): the response includes `count` exactly as before.
  - `with_counts=false`: skips the COUNT(*) entirely. The paginator fetches `limit + 1` rows and uses the extra row to decide whether to return a `next` link. The response payload returns `count: null`.
- Capped COUNT(*) on the default path. Even when counting, we wrap the queryset in `queryset[:LARGE_QUERYSET_THRESHOLD + 1].count()`, which Postgres plans as `SELECT COUNT(*) FROM (SELECT … LIMIT N) sub`. This is O(N) regardless of total table size. `LARGE_QUERYSET_THRESHOLD` defaults to 10,000. If the capped count exceeds the threshold, the response falls back to `count: null` with probe-based `next`/`previous` links. This protects every list endpoint from a runaway count without requiring callers to opt in.
- Probe-based `next`/`previous` link computation when count is absent, so the pagination contract (`count`, `next`, `previous`, `results`) is preserved in either mode.
- OpenAPI schema marks `count` as nullable (it can be `null` when callers opt out, or when the capped count exceeds the threshold).

What is NOT in this PR (planned follow-ups)
- The React UI currently uses `total: data?.count ?? 0`, and `PaginationBar` computes `numPages` from `total`. A follow-up PR will:
  - Tolerate `count: null` (e.g. show "Page N" without a total, or show "Showing 1–10" without an "of M").
  - Request `with_counts=true` to populate "showing N of total" lazily, on demand.
  - Flip the server default to `with_counts=false`.
- A `default_with_counts` class attribute so small, cheap list endpoints (projects, pipelines, processing services) keep returning counts by default even after the global default flips.
- `pg_class.reltuples` fuzzy counts for unfiltered tables, denormalized cached counts on `Project`/`Deployment` (already exist for some), materialized views.

Test plan
- `ami/main/tests.py::TestPaginationWithCounts`:
  - `with_counts=true` returns integer count
  - `with_counts=false` returns `count: null`
  - `with_counts=false`: `next`/`previous` correct on first/middle/last page
  - Above `LARGE_QUERYSET_THRESHOLD`: `count: null` with working links
- Benchmark `/api/v2/occurrences/?project_id=18` with and without `with_counts=false` and add numbers here before merge.
- Confirm `django-cachalot` still caches the capped-count subquery on the default path.
- […] `main`.

Rollout
- This PR: `with_counts` parameter and capped COUNT(*) safety valve. Default = `true`. No UI changes required.
- Follow-up PR: update `useOccurrences`, `useCaptures`, … and `PaginationBar` to handle `count: null` (likely with an explicit "Show total" affordance or a deferred second call).
- Then: flip the server default to `with_counts=false`.
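The probe-based `next`/`previous` computation described under Changes can be sketched without Django or DRF. This is a minimal illustration, not the code in `ami/base/pagination.py`: `set_query` and `paginate_with_probe` are hypothetical helpers built on `urllib.parse`, the `example.org` URL is a placeholder, and a plain list stands in for the queryset.

```python
from typing import Any, Optional, Sequence
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_query(url: str, **params: Optional[int]) -> str:
    """Set query parameters on a URL; a value of None removes the parameter."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    for key, value in params.items():
        if value is None:
            query.pop(key, None)
        else:
            query[key] = str(value)
    return urlunsplit(parts._replace(query=urlencode(query)))


def paginate_with_probe(rows: Sequence[Any], url: str, limit: int, offset: int) -> dict:
    # Fetch one row more than requested; the extra row only signals "there is more".
    window = rows[offset : offset + limit + 1]
    has_next = len(window) > limit

    next_link = set_query(url, limit=limit, offset=offset + limit) if has_next else None

    if offset <= 0:
        previous_link = None
    else:
        prev_offset = offset - limit
        # Drop the offset parameter when the previous page is the first page.
        previous_link = set_query(
            url, limit=limit, offset=prev_offset if prev_offset > 0 else None
        )

    return {
        "count": None,  # no COUNT(*) was issued
        "next": next_link,
        "previous": previous_link,
        "results": list(window[:limit]),  # the probe row is never returned
    }


rows = list(range(25))
base = "https://example.org/api/v2/occurrences/?with_counts=false"
first = paginate_with_probe(rows, base, limit=10, offset=0)
middle = paginate_with_probe(rows, base, limit=10, offset=10)
last = paginate_with_probe(rows, base, limit=10, offset=20)
```

The response keeps the four keys of the pagination contract, so a client that only follows `next` and `previous` needs no changes; only code that reads `count` has to tolerate `None`.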