feat: opt-out with_counts param + capped COUNT(*) for paginated list endpoints #1233

Draft
Copilot wants to merge 4 commits into main from copilot/add-query-param-with-counts

Copilot AI commented Apr 15, 2026

Summary

List endpoints currently run SELECT COUNT(*) on every paginated request, which becomes the dominant cost on large tables (Occurrence, Detection, SourceImage) even when the page query itself is fast and well-indexed. This PR introduces a with_counts query parameter so callers can opt out of the count when they don't need it, and adds a capped-count safety valve so the worst-case scan is bounded even on the default code path.

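No behavior change for existing callers unless a filtered result set exceeds LARGE_QUERYSET_THRESHOLD. The default is with_counts=true, which preserves DRF's current response shape: count is always present, and it is an integer except when the capped count described below falls back to null. A follow-up PR will update the React UI to handle count: null and then flip the server default.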

Motivation

Composite indexes on Occurrence (determination, project, event, score) and SourceImage (deployment, timestamp) make page queries fast. But COUNT(*) over the filtered result set can't use those indexes effectively, and on large projects it dominates total list-endpoint latency. Most UI views don't need the exact total — they need "is there a next page?" — so we end up paying for something we mostly throw away.

Changes

All changes are in ami/base/pagination.py (LimitOffsetPaginationWithPermissions, the project-wide default paginator declared in config/settings/base.py:483).

  1. with_counts query parameter (default true).

    • with_counts=true (default): the response includes count exactly as before.
    • with_counts=false: skips the COUNT(*) entirely. The paginator fetches limit + 1 rows and uses the extra row to decide whether to return a next link. The response payload returns count: null.
  2. Capped COUNT(*) on the default path.
    Even when counting, we wrap the queryset in queryset[:LARGE_QUERYSET_THRESHOLD + 1].count(), which Django compiles to SELECT COUNT(*) FROM (SELECT … LIMIT N) sub, where N = LARGE_QUERYSET_THRESHOLD + 1. The count is therefore bounded by N matching rows rather than by total table size. LARGE_QUERYSET_THRESHOLD defaults to 10,000. If the capped count exceeds the threshold (that is, it comes back as N), the response falls back to count: null with probe-based next/previous links. This protects every list endpoint from a runaway count without requiring callers to opt in.

  3. Probe-based next/previous link computation when count is absent, so the pagination contract (count, next, previous, results) is preserved in either mode.

  4. OpenAPI schema marks count as nullable (it can be null when callers opt out, or when the capped count hits the threshold).
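
A minimal sketch of how items 1 to 3 fit together, written against plain Python sequences rather than Django querysets so it runs on its own. This is not the code in ami/base/pagination.py: the function names capped_count and paginate are hypothetical, and next/previous are returned as integer offsets instead of URLs to keep the example short.

```python
from itertools import islice
from typing import Iterable, Optional, Sequence

LARGE_QUERYSET_THRESHOLD = 10_000  # default named in this PR


def capped_count(rows: Iterable, threshold: int = LARGE_QUERYSET_THRESHOLD) -> Optional[int]:
    """Count at most threshold + 1 rows (hypothetical helper).

    Returns the exact count when it is <= threshold, otherwise None.
    Stands in for queryset[:threshold + 1].count().
    """
    seen = sum(1 for _ in islice(rows, threshold + 1))
    return seen if seen <= threshold else None


def paginate(
    rows: Sequence,
    limit: int,
    offset: int,
    with_counts: bool = True,
    threshold: int = LARGE_QUERYSET_THRESHOLD,
) -> dict:
    """Return a (count, next, previous, results) payload (hypothetical helper)."""
    # Skip counting entirely when the caller opts out.
    count = capped_count(rows, threshold) if with_counts else None

    # Fetch limit + 1 rows; the extra row is only a probe for "is there a next page?".
    window = rows[offset : offset + limit + 1]
    results = list(window[:limit])

    if count is not None:
        has_next = offset + limit < count
    else:
        has_next = len(window) > limit  # probe-based decision

    return {
        "count": count,
        "next": offset + limit if has_next else None,
        "previous": max(offset - limit, 0) if offset > 0 else None,
        "results": results,
    }
```

With 25 rows and limit=10, the last page (offset=20) fetches only 5 rows, so the probe finds no extra row and next is None, whether or not a count was computed.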

What is NOT in this PR (planned follow-ups)

  • Frontend updates. React Query hooks currently read total: data?.count ?? 0 and PaginationBar computes numPages from total. A follow-up PR will:
    • Teach those components to tolerate count: null (e.g. show "Page N" without a total, or show "Showing 1–10" without an "of M").
    • Optionally make a second call with with_counts=true to populate "showing N of total" lazily, on demand.
    • Then flip the server default to with_counts=false.
  • Per-view override. A follow-up may add a default_with_counts class attribute so small, cheap list endpoints (projects, pipelines, processing services) keep returning counts by default even after the global default flips.
  • Other count strategies. Out of scope here: pg_class.reltuples fuzzy counts for unfiltered tables, denormalized cached counts on Project/Deployment (these already exist for some fields), materialized views.
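
One way the per-view override could resolve against the query parameter, as a rough sketch only. The function resolve_with_counts and the accepted spellings of true/false are assumptions for illustration; default_with_counts is the attribute name proposed above, and nothing here exists in the codebase yet.

```python
from typing import Mapping

# Assumed spellings; the real parser may accept a different set.
TRUE_VALUES = {"true", "1", "yes"}
FALSE_VALUES = {"false", "0", "no"}


def resolve_with_counts(
    query_params: Mapping[str, str],
    view: object,
    global_default: bool = True,
) -> bool:
    """Decide whether to count (hypothetical helper).

    Precedence: explicit ?with_counts= value, then the view's
    default_with_counts attribute, then the global default.
    """
    raw = query_params.get("with_counts")
    if raw is None:
        return getattr(view, "default_with_counts", global_default)

    value = raw.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"with_counts must be true or false, got {raw!r}")
```

Under this ordering, a small endpoint that sets default_with_counts = True would keep returning counts after the global default flips, while any caller could still opt out explicitly.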

Test plan

  • Unit tests in ami/main/tests.py::TestPaginationWithCounts:
    • default response includes integer count
    • with_counts=true returns integer count
    • with_counts=false returns count: null
    • with_counts=false: next/previous correct on first/middle/last page
    • threshold fallback returns count: null with working links
  • Measure cold-query latency on /api/v2/occurrences/?project_id=18 with and without with_counts=false and add numbers here before merge.
  • Confirm django-cachalot still caches the capped-count subquery on the default path.
  • Spot-check at least one paginated UI page against a server with this PR. For result sets at or below LARGE_QUERYSET_THRESHOLD, count behavior should be identical to main.
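
For the latency measurement, a small timing harness along these lines may be enough. It is a sketch: list_url and median_latency_ms are hypothetical helpers, and the base URL passed to list_url is an assumption about the environment being tested. Repeated runs warm caches, so a cold-query number needs runs=1 against a freshly restarted or cache-cleared server.

```python
import statistics
import time
from typing import Callable
from urllib.parse import urlencode


def list_url(base_url: str, path: str, **params: object) -> str:
    """Build a list-endpoint URL with query parameters (hypothetical helper)."""
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}"


def median_latency_ms(call: Callable[[], object], runs: int = 5) -> float:
    """Time `call` `runs` times and return the median wall-clock latency in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples.append((time.perf_counter() - start) * 1000)
    return statistics.median(samples)
```

A comparison would pass a callable that fetches list_url(base, "/api/v2/occurrences/", project_id=18) and another that adds with_counts="false", for example using urllib.request.urlopen(url).read() with whatever authentication the server requires.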

Rollout

  1. This PR: add the with_counts parameter and capped COUNT(*) safety valve. Default = true. No UI changes required.
  2. Follow-up PR: update useOccurrences, useCaptures, … and PaginationBar to handle count: null (likely with an explicit "Show total" affordance or a deferred second call).
  3. Follow-up PR: flip server-side default to with_counts=false.

Copilot AI and others added 3 commits April 14, 2026 19:07
Add a with_counts query parameter to LimitOffsetPaginationWithPermissions.
When with_counts is not provided or set to false (the default), the
expensive COUNT(*) query is skipped and count is returned as null.
A limit+1 fetch strategy is used to determine next/previous links
without needing the full count.

Existing tests that asserted on the count value are updated to pass
with_counts=true explicitly.

Co-Authored-By: Claude <noreply@anthropic.com>

Agent-Logs-Url: https://github.com/RolnickLab/antenna/sessions/08338cc2-3ec7-4991-b383-ddba7fc5f357

Co-authored-by: mihow <158175+mihow@users.noreply.github.com>
- Move replace_query_param and remove_query_param imports to top-level
- Use remove_query_param when offset is 0 in get_previous_link
- Narrow exception catch to ValidationError instead of bare Exception

Co-Authored-By: Claude <noreply@anthropic.com>

Agent-Logs-Url: https://github.com/RolnickLab/antenna/sessions/08338cc2-3ec7-4991-b383-ddba7fc5f357

Co-authored-by: mihow <158175+mihow@users.noreply.github.com>
When with_counts=true is requested, run a bounded COUNT(*) via a LIMIT N+1
subquery instead of a full table scan. Result sets ≤ LARGE_QUERYSET_THRESHOLD
(10,000) get an exact count; larger ones fall back to count:null with
probe-based next/previous links.

Co-Authored-By: Claude <noreply@anthropic.com>

Agent-Logs-Url: https://github.com/RolnickLab/antenna/sessions/cf5a994a-a9df-4c62-80c3-2808f589bbcc

Co-authored-by: mihow <158175+mihow@users.noreply.github.com>
netlify Bot commented Apr 15, 2026

Deploy Preview for antenna-preview canceled.

Name Link
🔨 Latest commit 0b19045
🔍 Latest deploy log https://app.netlify.com/projects/antenna-preview/deploys/69e2c6bea5bb7300081f5b82

netlify Bot commented Apr 15, 2026

Deploy Preview for antenna-ssec canceled.

Name Link
🔨 Latest commit 0b19045
🔍 Latest deploy log https://app.netlify.com/projects/antenna-ssec/deploys/69e2c6be395b73000850490a

Copilot AI requested a review from mihow April 15, 2026 02:04
mihow changed the title from "feat: capped COUNT(*) safety valve for with_counts=true requests" to "feat: speed up list views by deferring big counts" on Apr 15, 2026
…only

Flip the default so existing callers (and the React UI) keep receiving
`count` exactly as before. Callers that don't need the total can opt out
with `?with_counts=false`. The capped COUNT(*) safety valve still applies
on the default path: result sets that exceed LARGE_QUERYSET_THRESHOLD
return `count: null` and probe-based next/previous links.

A follow-up PR will:
- Update React Query hooks and PaginationBar to tolerate `count: null`
- Switch list pages to request counts only when needed (e.g. via a
  second `with_counts=true` call to populate "showing N of total")
- Flip the server default to `with_counts=false`

Tests updated to assert that the default response now includes an
integer count, with explicit opt-in/opt-out coverage and a fallback
test for the threshold path.

Co-Authored-By: Claude <noreply@anthropic.com>
mihow changed the title from "feat: speed up list views by deferring big counts" to "feat: opt-out with_counts param + capped COUNT(*) for paginated list endpoints" on Apr 17, 2026