fix(counts): refresh deployment cached counts in pipeline.save_results#1243

Merged
mihow merged 2 commits into main from feat/refresh-counts-after-ml-job
Apr 17, 2026

Conversation


@mihow mihow commented Apr 17, 2026

Summary

save_results() in ami/ml/models/pipeline.py calls
update_calculated_fields_for_events for the batch's events but never
refreshes the parent Deployment. As a result, deployment.occurrences_count
and deployment.taxa_count stay stale until something else (e.g. a manual
deployment.save()) runs. This is the user-reported bug "Station
counts for occurrences and taxa are not always getting updated".

Stations in the UI are Deployments, and their cached counts simply
don't move after an ML job processes new images.

The patch adds a per-batch deployment refresh next to the existing
event refresh:

  • Fires per save_results() batch, so it covers every result-write
    path: sync MLJob.process_images, async NATS
    process_nats_pipeline_result, and the Celery batch wrapper in
    ami/ml/tasks.py.
  • Provides incremental updates during long jobs rather than only at
    job-end, since each batch refreshes the deployments it touched.
  • Co-located with the existing event refresh — no new infrastructure,
    no signals, no new tasks.

Test plan

  • New regression test TestSaveResultsRefreshesDeploymentCounts
    in ami/ml/tests.py exercises save_results end-to-end with a real
    Project + Deployment + Event + SourceImage and asserts both
    occurrences_count and taxa_count move off zero.
  • Verified test FAILS on origin/main (AssertionError: 0 not greater than 0) — proves the bug.
  • Verified test PASSES with the patch applied.
  • Manual smoke: trigger an ML job from the UI on a fresh
    collection and confirm Station/Deployment counts update without
    needing a manual save.

Notes

  • Considered but rejected: a post_save signal on Job SUCCESS that
    enqueues a Celery task to walk job → images → deployments. That
    approach was ~300 lines and would only refresh at job-end. The
    in-save_results fix is ~3 lines, fires per-batch, and reuses
    existing infrastructure.
  • Does not conflict with PR #1045 ("Default taxa filter follow-up:
    handle cached counts and add default taxa to projects", branch
    fix/default-filters-followup-default-taxa-and-cache), which touches
    ami/main/signals.py and ami/main/tasks.py for project
    default-filter changes; different files, different scope.

Summary by CodeRabbit

Release Notes

  • Bug Fixes
    • Deployment occurrence and taxa counts now update automatically when processing pipeline results.

save_results() ran update_calculated_fields_for_events for the
batch but never refreshed the parent Deployment, leaving
deployment.occurrences_count and deployment.taxa_count stale until
something else (a manual deployment.save) ran. Reproduced as the
"Station counts for occurrences and taxa are not always getting
updated" report.

Adds a per-batch deployment refresh next to the existing event refresh,
so it covers every result-write path (sync MLJob.process_images, async
NATS process_nats_pipeline_result, and the Celery batch wrapper) and
fires incrementally during long jobs instead of only at the end.

Includes a regression test that exercises save_results end-to-end with
a real deployment + event + image and asserts the cached counts move
off zero. Verified to fail on origin/main and pass with the patch.

Co-Authored-By: Claude <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 17, 2026 05:41

netlify Bot commented Apr 17, 2026

Deploy Preview for antenna-preview canceled.

🔨 Latest commit: 40e9368
🔍 Latest deploy log: https://app.netlify.com/projects/antenna-preview/deploys/69e1ce09dacd8b000843a3bc


netlify Bot commented Apr 17, 2026

Deploy Preview for antenna-ssec canceled.

🔨 Latest commit: 40e9368
🔍 Latest deploy log: https://app.netlify.com/projects/antenna-ssec/deploys/69e1ce09480826000892fcd6


coderabbitai Bot commented Apr 17, 2026

📝 Walkthrough

The save_results method in the pipeline model now recalculates precomputed fields for associated Deployment records after processing pipeline results. A corresponding test class verifies that deployment occurrence and taxa counts are properly refreshed.

Changes

  • Deployment Count Refresh Logic (ami/ml/models/pipeline.py): added
    logic to save_results that fetches the Deployment records associated
    with processed source images and calls
    update_calculated_fields(save=True) on each to refresh precomputed
    counts.
  • Test Coverage (ami/ml/tests.py): added imports for the Deployment
    and Event models; introduced the TestSaveResultsRefreshesDeploymentCounts
    test class with setup, a helper method _fake_results() that
    generates test pipeline results, and a test method
    test_deployment_counts_refresh_after_save_results that verifies
    occurrence and taxa counts are updated after saving results.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 A pipeline runs and counts do flow,
Deployments wake from slumber's glow,
With tests to guard each field so dear,
The rabbits hop with joy and cheer! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 40.00%, below
    the required threshold of 80.00%. Resolution: write docstrings for
    the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Title check: the title 'fix(counts): refresh deployment cached
    counts in pipeline.save_results' directly and specifically describes
    the main change: refreshing deployment counts in the save_results
    method to fix the stale count bug.
  • Description check: the description covers Summary, List of Changes,
    Related Issues, Detailed Description with side effects, and How to
    Test, matching the template requirements. Deployment Notes and
    Screenshots are not applicable.





Copilot AI left a comment


Pull request overview

Fixes stale cached counts on Deployment (Stations in the UI) by ensuring save_results() refreshes the parent deployments’ calculated fields after writing ML results, alongside the existing per-event refresh. This addresses the reported issue where deployment.occurrences_count / deployment.taxa_count don’t update after ML processing until a later manual save.

Changes:

  • Refresh Deployment calculated fields for deployments touched by a save_results() batch.
  • Add a regression test that reproduces the stale deployment counts and verifies they update after save_results().

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
ami/ml/models/pipeline.py Refreshes calculated fields on affected deployments after saving detections/classifications/occurrences.
ami/ml/tests.py Adds an end-to-end regression test asserting deployment cached counts increase after save_results().


Comment thread ami/ml/models/pipeline.py

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
ami/ml/models/pipeline.py (2)

998-1001: Refresh is correct; note the downstream cost.

Deployment.update_calculated_fields(save=True) calls self.save(update_calculated_fields=False), which short-circuits recursion and also skips update_children() / regroup_events — so this is a minimal, targeted refresh per batch, as intended. Good.

Be aware that update_calculated_fields itself re-runs self.captures.count(), data_source_total_size aggregate, get_detections_count(), and two occurrences queries with apply_default_filters + a values("determination_id").distinct().count(). For large deployments processed in many small batches (NATS/Celery wrapper paths), this will run on every batch. If that becomes a hotspot, a lighter "counts-only" path (or debouncing per deployment within a job) is worth considering — not a blocker here.
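One way to realize the reviewer's "at most once per job" suggestion is a small coalescing guard. A hypothetical sketch under stated assumptions: PerJobRefreshCoalescer and should_refresh are invented names, not anything in the codebase.

```python
class PerJobRefreshCoalescer:
    """Tracks which deployments a job has already refreshed, so repeated
    small batches (the NATS / Celery wrapper paths) trigger at most one
    refresh per deployment per job."""

    def __init__(self):
        self._refreshed: set[int] = set()

    def should_refresh(self, deployment_id: int) -> bool:
        # First sighting of this deployment in the job: record it and
        # allow the refresh; every later sighting is skipped.
        if deployment_id in self._refreshed:
            return False
        self._refreshed.add(deployment_id)
        return True

coalescer = PerJobRefreshCoalescer()
decisions = [coalescer.should_refresh(d) for d in [10, 10, 11, 10, 11]]
print(decisions)  # [True, False, True, False, False]
```

As the review notes, this is an optimization to consider only if the per-batch refresh becomes a hotspot.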


998-1000: Consider isolating per-deployment refresh failures.

If any single deployment's update_calculated_fields(save=True) raises (DB hiccup, stale FK, signal side-effect, etc.), the entire save_results Celery task fails after detections/classifications/occurrences have already been persisted — counts are now stale and the job is marked failed. Since this refresh is a best-effort cache update (the underlying data is already correct), a per-deployment try/except that logs and continues would be more resilient, and keeps parity with how update_calculated_fields_for_events above is fire-and-forget.

Proposed change:

```diff
     deployment_ids = {img.deployment_id for img in source_images if img.deployment_id}
     for deployment in Deployment.objects.filter(pk__in=deployment_ids):
-        deployment.update_calculated_fields(save=True)
+        try:
+            deployment.update_calculated_fields(save=True)
+        except Exception:
+            job_logger.exception(
+                f"Failed to refresh calculated fields for deployment {deployment.pk}; continuing"
+            )
```
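The isolation idea generalizes: run each refresh under its own try/except so one failure cannot abort the rest. A generic, self-contained illustration; refresh_all and flaky_refresh are made-up stand-ins, not project code.

```python
import logging

logger = logging.getLogger(__name__)

def refresh_all(deployment_ids, refresh_one):
    """Best-effort refresh: log and continue on per-item failures,
    returning the ids that refreshed successfully."""
    succeeded = []
    for pk in deployment_ids:
        try:
            refresh_one(pk)
            succeeded.append(pk)
        except Exception:
            logger.exception("Failed to refresh deployment %s; continuing", pk)
    return succeeded

def flaky_refresh(pk):
    # Simulate a DB hiccup on one particular deployment.
    if pk == 11:
        raise RuntimeError("db hiccup")

print(refresh_all([10, 11, 12], flaky_refresh))  # [10, 12]
```

The traceback for the failing id goes to the log, while the remaining deployments still get their counts refreshed.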
Inline comment on ami/ml/tests.py (around lines 1424-1431): the test
uses a naive datetime for event_time, which triggers Django warnings
when USE_TZ=True. Make event_time timezone-aware (e.g.
datetime.datetime(..., tzinfo=datetime.timezone.utc), or wrap it with
django.utils.timezone.make_aware) before calling Event.objects.create,
so Event.start, Event.end, and any SourceImage.timestamp usages are
saved with an explicit timezone.
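The fix the comment asks for is a one-liner. A stdlib-only sketch of naive versus aware datetimes; the Event.objects.create call itself is omitted since it needs a configured Django project.

```python
import datetime

# A naive datetime: no tzinfo, so Django warns when USE_TZ=True.
naive = datetime.datetime(2026, 4, 17, 5, 41)
print(naive.tzinfo)  # None

# Attach an explicit timezone at construction time...
aware = datetime.datetime(2026, 4, 17, 5, 41, tzinfo=datetime.timezone.utc)
print(aware.tzinfo)  # UTC

# ...or convert an existing naive value.
also_aware = naive.replace(tzinfo=datetime.timezone.utc)
assert also_aware == aware
```

Inside a Django test, django.utils.timezone.make_aware(naive) does the same conversion using the project's configured timezone instead of hard-coding UTC.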



📥 Commits

Reviewing files that changed from the base of the PR and between 938841e and 40e9368.

📒 Files selected for processing (2)
  • ami/ml/models/pipeline.py
  • ami/ml/tests.py

Comment thread ami/ml/tests.py

mihow commented Apr 17, 2026

@annavik confirmed that occurrence & taxa counts are finally going up during and after processing! I put this off because I thought the answer was much more difficult!

[screenshots attached]

@mihow mihow merged commit 201cfa7 into main Apr 17, 2026
7 checks passed
@mihow mihow deleted the feat/refresh-counts-after-ml-job branch April 17, 2026 06:22