
Worker implementation for processing images in the Antenna task queue (PSv2)#94

Merged
mihow merged 46 commits into RolnickLab:main from carlosgjs:carlosg/pulldl
Jan 30, 2026

Conversation


@carlosgjs carlosgjs commented Oct 20, 2025

Implements a worker service (ami worker) that processes images queued by users in the Antenna web platform. When users upload images to Antenna and request processing, this worker pulls tasks from the queue via the jobs API, runs them through the local ML pipeline (detection + classification), and posts results back. This allows processing to run behind a firewall (for example, in a university HPCC) and lets any number of workers process images in parallel.

This is the counterpart to RolnickLab/antenna#987 which adds the job queue API to Antenna.

Usage

# Configure via environment variables or .env file
export AMI_ANTENNA_API_AUTH_TOKEN=<token>
export AMI_ANTENNA_API_BASE_URL=http://localhost:8000/api/v2  # default

# Register pipelines with projects (optional)
ami worker register "My Worker" --project 1 --project 2

# Start processing jobs - all pipelines by default
ami worker

# Or run only specific pipeline(s)
ami worker --pipeline uk_denmark_moths_2023
ami worker --pipeline moth_binary --pipeline panama_moths_2024

Or configure in settings:

  • antenna_api_base_url - Antenna API endpoint (default: http://localhost:8000/api/v2)
  • antenna_api_auth_token - Authentication token for Antenna project
  • antenna_api_batch_size - Number of tasks per batch (default: 4)
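These settings follow the project's AMI_-prefixed environment variable convention. A minimal stdlib sketch of how they might map to values; the real project reads them through its Settings object, so the helper below is purely illustrative:

```python
import os

# Hypothetical helper mirroring the settings above; names and defaults are
# taken from this PR's description, but the real code uses a Settings object.
def load_antenna_config() -> dict:
    return {
        "base_url": os.environ.get(
            "AMI_ANTENNA_API_BASE_URL", "http://localhost:8000/api/v2"
        ),
        "auth_token": os.environ.get("AMI_ANTENNA_API_AUTH_TOKEN"),
        "batch_size": int(os.environ.get("AMI_ANTENNA_API_BATCH_SIZE", "4")),
    }
```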

Architecture

The worker functionality is implemented in a dedicated trapdata/antenna/ module for separation of concerns and future portability:

  • trapdata/antenna/client.py - API client for fetching jobs and posting results
  • trapdata/antenna/worker.py - Worker loop and job processing logic
  • trapdata/antenna/registration.py - Pipeline registration with Antenna projects
  • trapdata/antenna/schemas.py - Pydantic models for Antenna API requests/responses
  • trapdata/antenna/datasets.py - RESTDataset for streaming tasks from the API
  • trapdata/antenna/tests/ - Integration tests with mock Antenna API server
  • trapdata/cli/worker.py - Thin CLI wrapper (~75 lines) that delegates to the antenna module

Changes

Worker implementation (trapdata/antenna/worker.py)

  • Polls /jobs endpoint for available jobs by pipeline slug
  • Fetches tasks from /jobs/{id}/tasks
  • Processes images through localization + classification pipeline
  • Posts results back to /jobs/{id}/result/
  • Continuous loop with configurable polling interval
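The loop above can be sketched without any of the real Antenna plumbing. fetch_jobs, fetch_tasks, process, and post_results below are hypothetical stand-ins for the client calls, not the module's actual API:

```python
from typing import Callable

# Dependency-free sketch of the poll/process/post cycle described above.
def run_worker_once(
    fetch_jobs: Callable[[], list],
    fetch_tasks: Callable[[int], list],
    process: Callable[[list], list],
    post_results: Callable[[int, list], None],
) -> int:
    """Process every currently available job once; return tasks handled."""
    handled = 0
    for job_id in fetch_jobs():
        while True:
            tasks = fetch_tasks(job_id)  # empty list means the job is drained
            if not tasks:
                break
            post_results(job_id, process(tasks))
            handled += len(tasks)
    return handled
```

The real worker wraps this in a continuous loop with a configurable sleep between polls.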

API client (trapdata/antenna/client.py)

  • get_jobs() - Fetches job IDs for a given pipeline
  • post_batch_results() - Posts processed results back to Antenna

Pipeline registration (trapdata/antenna/registration.py)

  • register_pipelines() - Registers available pipelines with Antenna projects
  • get_user_projects() - Fetches accessible projects from Antenna API
  • Reads pipeline configurations from CLASSIFIER_CHOICES

Session handling (trapdata/api/utils.py)

  • get_http_session() - Creates HTTP session with connection pooling
  • Automatic retries with exponential backoff via urllib3 adapter (hardcoded: 3 retries, 0.5s backoff)
  • Only retries on server errors (5XX) and network failures, not client errors (4XX)
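The retry semantics can be illustrated with a dependency-free sketch. The actual session wires urllib3's Retry into a requests HTTPAdapter; the helper and its send callable here are hypothetical and only mirror the policy described above:

```python
import time

# Sketch of the policy: 3 retries, 0.5s backoff factor, server errors only.
def request_with_retries(send, max_retries=3, backoff_factor=0.5, sleep=time.sleep):
    """Call send() until a non-5XX status is returned or retries run out."""
    attempt = 0
    while True:
        status = send()
        if status < 500:  # client errors (4XX) are not retried
            return status
        if attempt >= max_retries:
            return status
        sleep(backoff_factor * (2**attempt))  # 0.5s, 1s, 2s, ...
        attempt += 1
```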

Schemas (trapdata/antenna/schemas.py)

  • Pydantic models for Antenna API: AntennaJobsListResponse, AntennaTasksListResponse, AntennaTaskResult
  • AsyncPipelineRegistrationRequest and AsyncPipelineRegistrationResponse for pipeline registration

Dataset (trapdata/antenna/datasets.py)

  • RESTDataset - IterableDataset that streams tasks from Antenna API
  • Downloads images on-the-fly with error handling
  • Reports download failures back to Antenna
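Stripped of torch, the streaming and error-reporting pattern looks roughly like this; fetch_batch and download are hypothetical stand-ins for the API call and image fetch:

```python
# Torch-free sketch: stream task batches from an API, download each image,
# and record per-task download errors instead of crashing the whole batch.
def stream_tasks(fetch_batch, download):
    """Yield one row per task until fetch_batch() returns an empty list."""
    while True:
        tasks = fetch_batch()
        if not tasks:
            return
        for task in tasks:
            row = {"image_id": task["image_id"], "image": None, "error": None}
            try:
                row["image"] = download(task["image_url"])
            except Exception as exc:  # report the failure back, keep going
                row["error"] = f"failed to load image: {exc}"
            yield row
```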

Settings (trapdata/settings.py)

  • Added antenna_api_base_url, antenna_api_auth_token, antenna_api_batch_size with Kivy UI integration

CLI (trapdata/cli/worker.py)

  • ami worker - Default command to run the worker (processes all pipelines)
  • ami worker --pipeline <slug> - Process specific pipeline(s) (repeatable flag)
  • ami worker register <name> - Register pipelines with Antenna projects
  • Uses @cli.callback(invoke_without_command=True) to make worker the default action

Tests (trapdata/antenna/tests/)

  • Integration tests with real ML inference (detector + classifier)
  • Mock Antenna API server for testing worker interactions
  • Tests for registration client functions
  • End-to-end test: register → get jobs → process → post results

Credits

  • @carlosgjs - Original worker and pipeline registration implementation
  • @mihow - Integration tests, session handling, Settings pattern refactoring, module extraction

Test plan

  • Run ami test all - all tests pass
  • Manual test with local Antenna instance
  • Integration tests with real ML inference
  • Worker handles network errors gracefully (via retry logic)
  • Test with GPU and CPU-only environments

Summary by CodeRabbit

Release Notes

  • New Features

    • Added Antenna Worker with pipeline execution and registration capabilities
    • Introduced REST API dataset integration for streaming data processing
    • Added pipeline registration across multiple projects
    • Extended CLI with new worker subcommand group
  • Bug Fixes

    • Fixed conditional CUDA cache clearing for systems without GPU support
  • Configuration

    • Added Antenna API configuration environment variables
    • Updated pre-commit hooks to latest versions
    • Enhanced pytest test discovery paths
  • Documentation

    • Added comprehensive Antenna Worker setup and usage guide
  • Tests

    • Added mock Antenna API server for integration testing
    • Implemented comprehensive worker integration and unit test coverage


@carlosgjs changed the title from "WIP: Worker implemention for PyTorch models" to "Worker implemention for PyTorch models" on Oct 24, 2025
@carlosgjs marked this pull request as ready for review on October 24, 2025 at 19:43
@carlosgjs changed the title from "Worker implemention for PyTorch models" to "Worker implementation for PyTorch models" on Oct 24, 2025
Copilot AI left a comment


Pull request overview

Adds an end-to-end “pulling (v2)” worker implementation for PyTorch inference that polls an Antenna service for queued tasks, runs detection/classification, and posts results back.

Changes:

  • Added ami worker CLI command and worker runtime to poll /api/v2/jobs and post /result/ payloads.
  • Implemented REST-backed IterableDataset + DataLoader plumbing for streaming tasks and image downloads.
  • Added small API/model utilities (service-info caching, reset helpers, task schema) to support the worker flow.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 7 comments.

Summary per file:

  • trapdata/common/utils.py - Adds a small timing/logging helper used by the worker.
  • trapdata/cli/worker.py - New worker loop: fetch jobs, process batches, post results.
  • trapdata/cli/base.py - Registers worker CLI command and validates pipeline args.
  • trapdata/api/schemas.py - Adds PipelineProcessingTask schema for REST task payloads.
  • trapdata/api/models/localization.py - Adds detector reset helper; simplifies result saving loop.
  • trapdata/api/models/classification.py - Adds classifier reset + shared “update detection classification” helper.
  • trapdata/api/datasets.py - Adds REST task IterableDataset, collate fn, and DataLoader factory.
  • trapdata/api/api.py - Caches /info response via FastAPI lifespan initialization.
  • .vscode/launch.json - Adds VS Code launch configs for debugging worker and API.


mihow and others added 9 commits January 23, 2026 19:22
Add three new settings to configure the Antenna API worker:
- antenna_api_base_url: Base URL for Antenna API (defaults to localhost:8000/api/v2)
- antenna_api_auth_token: Authentication token for Antenna project
- antenna_api_batch_size: Number of tasks to fetch per batch (default: 4)

These settings replace hardcoded environment variables and follow the
existing Settings pattern with AMI_ prefix and Kivy metadata.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add schemas to validate API responses from Antenna:
- AntennaJobListItem: Single job with id field
- AntennaJobsListResponse: List of jobs from GET /api/v2/jobs
- AntennaTasksListResponse: List of tasks from GET /api/v2/jobs/{id}/tasks

Also rename PipelineProcessingTask to AntennaPipelineProcessingTask for clarity.

These schemas provide type safety, validation, and clear documentation of
the expected API response format.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Changes:
- Replace os.environ.get() with Settings object for configuration
- Add validation for antenna_api_auth_token with clear error message
- Use Pydantic AntennaJobsListResponse schema for type-safe API parsing
- Use urljoin for safe URL construction instead of f-strings
- Improve error handling with separate exception catch for validation errors

This follows the existing Settings pattern and provides better type safety
and validation.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Changes:
- Use Pydantic AntennaTasksListResponse schema for type-safe API parsing
- Raise exceptions instead of returning None for network errors (more Pythonic)
- Fix error tuple bug: row["error"] was incorrectly wrapped in tuple
- Use urljoin for safe URL construction
- Add API contract documentation about atomic task dequeue
- Update to use Settings object for configuration

The exception-based error handling is clearer than checking for None vs
empty list. The retry logic now catches RequestException explicitly.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add proper type annotations for the predictions parameter:
- seconds_per_item: float
- image_id: str
- detection_idx: int
- predictions: ClassifierResult (instead of comment)

This improves type checking and IDE support.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add comprehensive documentation for running the Antenna worker:
- Setup instructions with environment variable configuration
- Example commands for running with single or multiple pipelines
- Explanation of worker behavior and safety with parallel workers
- Notes about authentication and safety

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Poetry lock file regenerated by Poetry 2.1.2 with updated dependencies:
- alembic: 1.14.0 → 1.18.1
- anyio: 4.6.2.post1 → 4.12.1
- Added: annotated-doc 0.0.4
- Format changes: category → groups, added platform markers

This is a side effect of running the development environment.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Changes:
- Remove urljoin import from datasets.py and worker.py
- Replace urljoin() calls with f"{base_url.rstrip('/')}/path" pattern
- Remove base_url trailing slash manipulation in RESTDataset.__init__

The urljoin behavior is unintuitive: it treats the last path segment as a
file and replaces it when joining relative paths. This required every call
site to ensure the base URL had a trailing slash, which is fragile.

The f-string approach is clearer and handles all edge cases (no slash, one
slash, multiple slashes) without requiring state modification or
scattered string checks.

Files changed:
- trapdata/api/datasets.py:5, 143-144, 160
- trapdata/cli/worker.py:6, 43-46, 70-73

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
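The pitfall that motivated this change is easy to reproduce with the standard library:

```python
from urllib.parse import urljoin

base = "http://localhost:8000/api/v2"

# urljoin treats the trailing "v2" as a file segment and replaces it:
joined = urljoin(base, "jobs")       # -> "http://localhost:8000/api/jobs"

# the f-string pattern keeps the full base path intact:
direct = f"{base.rstrip('/')}/jobs"  # -> "http://localhost:8000/api/v2/jobs"
```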
* Pipeline registration

* Convert worker tests to integration tests with real ML inference

Replaces fully mocked unit tests with integration tests that validate
the Antenna API contract and run actual ML models. Tests now exercise
the worker's unique code path (RESTDataset → rest_collate_fn) with real
image loading and inference.

Changes:
- Add trapdata/api/tests/utils.py with shared test utilities
- Add trapdata/api/tests/antenna_api_server.py to mock Antenna API
- Rewrite test_worker.py as integration tests (17 tests, all passing)
- Update test_api.py to use shared utilities

Tests validate: real detector/classifier inference, HTTP image loading,
schema compliance, batch processing, and end-to-end workflow.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

* Add AsyncPipelineRegistrationResponse schema

Add Pydantic model to validate responses from pipeline registration API.
Fields: pipelines_created, pipelines_updated, processing_service_id.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Refactor registration functions to use get_http_session

Update get_user_projects() and register_pipelines_for_project() to use
the session-based HTTP pattern established in PR RolnickLab#104:
- Use get_http_session() context manager for connection pooling
- Add retry_max and retry_backoff parameters with defaults
- Remove manual header management (session handles auth)
- Standardize URL paths (base_url now includes /api/v2)
- Use Pydantic model validation for API responses
- Fix error handling with hasattr() check

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add integration tests for pipeline registration

Add mock Antenna API endpoints:
- GET /api/v2/projects/ - list user's projects
- POST /api/v2/projects/{id}/pipelines/ - register pipelines

Add TestRegistrationIntegration with 2 client tests:
- test_get_user_projects
- test_register_pipelines_for_project

Update TestWorkerEndToEnd.test_full_workflow_with_real_inference to include
registration step: register → get jobs → process → post results.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add git add -p to recommended development practices

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Read retry settings from Settings in get_http_session()

When max_retries or backoff_factor are not explicitly provided,
get_http_session() now reads defaults from Settings (antenna_api_retry_max
and antenna_api_retry_backoff). This centralizes retry configuration and
allows callers to omit these low-level parameters.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Use Settings pattern in register_pipelines()

- Accept Settings object instead of base_url/auth_token params
- Remove direct os.environ.get() calls for ANTENNA_API_* vars
- Fix error message to reference correct env var (AMI_ANTENNA_API_AUTH_TOKEN)
- Remove retry params from get_user_projects() and register_pipelines_for_project()
  since get_http_session() now reads settings internally
- Remove unused os import

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Carlos Garcia Jurado Suarez <carlos@irreverentlabs.com>
Co-authored-by: Carlos Garcia Jurado Suarez <carlosgjs@live.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@trapdata/cli/worker.py`:
- Around line 239-245: Clamp bbox coordinates from dresp.bbox to the image
tensor bounds before slicing: retrieve image shape from
image_tensors[dresp.source_image_id], convert bbox coords to ints and clamp
x1,x2 to [0,width] and y1,y2 to [0,height]; verify x2>x1 and y2>y1
(skip/continue for invalid boxes) and only then perform the crop and
unsqueeze—update the crop logic around image_tensors, dresp.bbox, and crop to
include these checks.
- Around line 198-203: Validate lengths before performing any zip operations to
avoid silent truncation: check that len(image_ids) equals len(images) before
zipping image_ids and images (the zip(image_ids, images) call) and check that
len(image_ids) == len(images) == len(reply_subjects) == len(image_urls) before
zipping image_ids, reply_subjects, image_urls, images; if lengths differ, raise
a ValueError with a clear message. Also update the zip calls to use zip(...,
strict=True) (Python 3.10+) so mismatches fail fast; reference the variables
image_ids, images, reply_subjects, image_urls and the zip usages in the worker
function when making these changes.
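The clamping the first comment asks for can be sketched independently of the tensor code. clamp_bbox is a hypothetical helper, not the PR's actual function; the real code applies the same checks before slicing image tensors:

```python
# Clamp bbox coordinates to the image bounds and reject degenerate boxes
# before cropping, as suggested in the review comment above.
def clamp_bbox(bbox, width, height):
    """Return an int (x1, y1, x2, y2) clamped to the image, or None if invalid."""
    x1, y1, x2, y2 = (int(v) for v in bbox)
    x1, x2 = max(0, min(x1, width)), max(0, min(x2, width))
    y1, y2 = max(0, min(y1, height)), max(0, min(y2, height))
    if x2 <= x1 or y2 <= y1:
        return None  # degenerate box: skip this detection
    return (x1, y1, x2, y2)
```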


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@trapdata/api/tests/antenna_api_server.py`:
- Around line 10-19: The import list in trapdata/api/tests/antenna_api_server.py
includes an unused symbol AntennaTaskResults; remove AntennaTaskResults from the
from trapdata.api.schemas import (...) statement so the module only imports the
actually used names (e.g., AntennaJobListItem, AntennaJobsListResponse,
AntennaPipelineProcessingTask, AntennaTaskResult, AntennaTasksListResponse,
AsyncPipelineRegistrationRequest, AsyncPipelineRegistrationResponse).

In `@trapdata/api/tests/test_worker.py`:
- Around line 40-41: The test class TestRestCollateFn is missing
unittest.TestCase inheritance so unittest discovery won't run it; update the
class definition to inherit from unittest.TestCase (i.e., class
TestRestCollateFn(unittest.TestCase):) and add an import for unittest if absent,
and while here convert any bare assert statements inside TestRestCollateFn to
unittest assertions (self.assertEqual / self.assertRaises, etc.) so the tests
run correctly under the project's unittest runner.
🧹 Nitpick comments (1)
trapdata/api/tests/antenna_api_server.py (1)

30-45: Consider using filter parameters for more realistic mock behavior.

The pipeline__slug, ids_only, and incomplete_only parameters are accepted but ignored. While acceptable for a basic mock, using these parameters would enable tests to verify that the worker sends correct query params. For instance, you could filter _jobs_queue by a stored pipeline slug or return full job objects when ids_only=0.

If this level of fidelity isn't needed now, consider adding a # noqa: ARG001 comment to suppress the linter warnings.

- Initialize worker_id before try block to prevent UnboundLocalError
- Remove unused AntennaTaskResults import from antenna_api_server
- Remove unnecessary noqa directive from test.py (not flagged by Ruff)
- Add TestCase inheritance to TestRestCollateFn for consistency

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@trapdata/api/datasets.py`:
- Around line 193-218: In _load_image, fix orientation by applying EXIF-based
transpose immediately after opening the image and before converting mode or
applying image_transforms: after Image.open(BytesIO(response.content)) call
ImageOps.exif_transpose (or equivalent EXIF transpose helper) to correct
rotation, then continue with the existing mode check and conversion to "RGB",
and then call image_transforms; keep existing error handling and use the same
symbols (image_fetch_session, BytesIO, Image, image_transforms).

mihow and others added 2 commits January 29, 2026 12:02
Changes get_http_session to only retry GET requests by default, preventing
unintended duplicate operations from POST retries. Adds retry_methods
parameter (default: ("GET",)) to allow callers to explicitly opt-in to POST
retries for idempotent endpoints.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add length validation before zip operations in worker.py to prevent silent
  truncation; use strict=True for Python 3.10+ fail-fast behavior
- Replace bare assert with explicit ValueError in classification.py for
  clearer error messages when image_id mismatches occur
- Fix comment reference in antenna_api_server.py test helper

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Fix all issues with AI agents
In `@trapdata/api/tests/antenna_api_server.py`:
- Around line 29-44: The get_jobs handler currently receives FastAPI query
params pipeline__slug, ids_only, and incomplete_only but doesn't use them; to
satisfy linters without renaming the parameters, explicitly mark them as
intentionally unused inside get_jobs (e.g., assign them to a throwaway variable
or use typing.cast/# pragma if preferred) so the function signature stays
unchanged; update the body of get_jobs to reference pipeline__slug, ids_only,
and incomplete_only in a no-op way (e.g., _ = pipeline__slug; _ = ids_only; _ =
incomplete_only) before computing job_ids.

In `@trapdata/cli/worker.py`:
- Line 171: Replace unguarded calls to torch.cuda.empty_cache() with a guarded
check using torch.cuda.is_available(): wrap the call as if
torch.cuda.is_available(): torch.cuda.empty_cache(). Update the call in the
worker module where torch.cuda.empty_cache() is invoked and make the same change
in the model base module (trapdata.ml.models.base) so CPU-only builds skip CUDA
cache clearing and device selection logic auto-detects GPU availability.
- Around line 310-312: post_batch_results(...) return value is ignored causing
silent data loss when posting fails; change the call in the worker loop to
capture its boolean result (e.g., success = post_batch_results(settings, job_id,
batch_results)), only add its timing to total_save_time when success, and on
failure either requeue the batch (so RESTDataset tasks aren’t lost) or raise/log
a clear error so the job can be retried/inspected; update the code around
post_batch_results, job_id, batch_results and total_save_time to implement this
check and appropriate error handling.

mihow and others added 3 commits January 29, 2026 12:13
- Remove test_single_item (covered by test_all_successful)
- Remove duplicate test_empty_queue tests (keep one in TestProcessJobIntegration)
- Remove test_query_params_sent (weak test with no real assertions)
- Remove TestRegistrationIntegration class (covered by E2E test)
- Remove basic RESTDataset tests covered by integration tests

Reduces test count from 18 to 11 while maintaining meaningful coverage.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Move Antenna platform integration code from cli/worker.py into a
dedicated trapdata/antenna/ module for better separation of concerns
and future portability to a standalone worker app.

New module structure:
- antenna/client.py: API client for fetching jobs and posting results
- antenna/worker.py: Worker loop and job processing logic
- antenna/registration.py: Pipeline registration with Antenna projects
- antenna/schemas.py: Pydantic models for Antenna API
- antenna/datasets.py: RESTDataset for streaming tasks from API
- antenna/tests/: Worker integration tests

cli/worker.py is now a thin CLI wrapper (~70 lines) that delegates
to the antenna module.

Co-Authored-By: Carlos Garcia <carlosgjs@users.noreply.github.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 9

🤖 Fix all issues with AI agents
In `@docs/claude/planning/antenna-module-refactor.md`:
- Around line 108-116: Update the documented test paths and validation checklist
to use the new module layout: replace occurrences of
"trapdata/api/tests/test_worker.py" with "trapdata/antenna/tests/test_worker.py"
(including the pytest command in the example code block and any checklist
entries noted around lines referenced, e.g., the validation checklist and the
commands block showing "pytest ..."). Search the file for any other mentions of
the old path and update them to the new "trapdata/antenna/tests/test_worker.py"
so the example commands and checklist accurately reflect the current structure.

In `@docs/claude/planning/simplify-worker-tests.md`:
- Around line 15-26: The markdown tables containing rows like
`TestRESTDatasetIntegration.test_empty_queue`,
`TestGetJobsIntegration.test_empty_queue`,
`TestProcessJobIntegration.test_empty_queue`,
`TestRESTDatasetIntegration.test_multiple_batches`, and
`TestWorkerEndToEnd.test_multiple_batches_processed` are flagged by markdownlint
MD060 for pipe spacing; fix by normalizing pipe spacing and column alignment
(ensure a single space after and before each pipe or run a markdown table
formatter) so each row and header aligns consistently, then re-run the linter/CI
to confirm the MD060 warning is resolved.

In `@docs/claude/planning/worker-integration-tests.md`:
- Around line 336-345: The markdown table starting with the header "| Test Class
| Tests | Type | What It Tests |" is triggering MD060 due to inconsistent pipe
spacing; fix by normalizing column pipe alignment and spacing (e.g., ensure a
single space on both sides of each pipe for every row) or run the project's
markdown formatter (prettier/markdownlint) to reflow the table, and if alignment
cannot be preserved, add an inline markdownlint disable comment for MD060 above
the table to suppress the warning; target the table block and the similar table
at the later occurrence (the block beginning with the same header).

In `@scripts/validate_dwc_export.py`:
- Around line 69-74: The subspecies count uses a misspelled field 'tqaxonRank'
which will KeyError; update the generator to check the correct field name
'taxonRank' (same as used for species) when computing subspecies from rows
(i.e., fix the expression that computes subspecies to reference row['taxonRank']
instead of row['tqaxonRank']), leaving the rest of the logic (variables species,
subspecies, rows and the print statements) unchanged.
- Around line 36-38: There is a typo in the loop over rank_counts: the computed
percentage variable is named pct but the print uses qpct causing a NameError;
update the print statement inside the loop that iterates "for rank, count in
sorted(rank_counts.items())" to reference pct instead of qpct so it prints f" 
{rank}: {count} ({pct:.1f}%)".

In `@trapdata/antenna/client.py`:
- Around line 3-56: The broad except in get_jobs should be narrowed: replace the
generic "except Exception" that wraps resp.json() and
AntennaJobsListResponse.model_validate() with two specific handlers catching
json.JSONDecodeError (from calling resp.json()) and pydantic.ValidationError
(from AntennaJobsListResponse.model_validate()). Import json and
pydantic.ValidationError (or from pydantic import ValidationError), log distinct
error messages (e.g., "Failed to decode JSON" and "Failed to validate jobs
response") referencing base_url and the exception, and return [] in each
handler; keep the existing requests.RequestException handler for request errors.

In `@trapdata/antenna/schemas.py`:
- Around line 5-9: The import list in the module currently includes an unused
symbol ProcessingServiceInfoResponse; remove ProcessingServiceInfoResponse from
the from-import tuple (the line importing PipelineConfigResponse,
PipelineResultsResponse, ProcessingServiceInfoResponse) so only the actually
used symbols (PipelineConfigResponse and PipelineResultsResponse) are imported
to satisfy flake8.

In `@trapdata/cli/base.py`:
- Around line 2-8: The file imports unused symbols Annotated and
CLASSIFIER_CHOICES which will fail linting; remove Annotated from the typing
import (leave Optional) and delete the import of CLASSIFIER_CHOICES from
trapdata.api.api so only used symbols remain, then re-run flake8 to verify no
unused-import errors.

In `@trapdata/cli/test.py`:
- Line 27: The subprocess call using the bare "pytest" is PATH-dependent; change
the invocation of subprocess.call that sets return_code (the call using
subprocess.call(["pytest", "-v"])) to use the current Python interpreter by
replacing the args with [sys.executable, "-m", "pytest", "-v"] and add an import
sys at the top of the module so the active virtualenv's pytest module is used
reliably.
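The narrowed handling suggested for client.py can be sketched with the JSON half alone. parse_jobs is illustrative; the real code validates with AntennaJobsListResponse.model_validate and would catch pydantic.ValidationError in the same separate-handler style:

```python
import json
import logging

logger = logging.getLogger(__name__)

# Catch JSON decoding separately from schema validation, returning [] with a
# distinct log message for each failure mode, as the review suggests.
def parse_jobs(body: str, base_url: str) -> list:
    try:
        payload = json.loads(body)
    except json.JSONDecodeError as exc:
        logger.error(f"Failed to decode JSON from {base_url}: {exc}")
        return []
    return [job["id"] for job in payload.get("results", [])]
```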
🧹 Nitpick comments (6)
scripts/validate_dwc_export.py (1)

21-23: Consider handling empty files to avoid division by zero.

If the TSV file contains no data rows, total_taxa will be 0, causing ZeroDivisionError on subsequent percentage calculations (lines 29, 37, 53, 60, 66).

🛡️ Proposed early exit for empty files
     total_taxa = len(rows)
+    if total_taxa == 0:
+        print("No taxa found in file.")
+        return
     print(f"Total Taxa: {total_taxa}")
trapdata/api/api.py (1)

358-373: Avoid eager model initialization when building pipeline configs.
initialize_service_info() instantiates detector/classifier objects for every pipeline; if constructors load weights, startup and CLI registration can become heavy. Consider a lightweight metadata path (e.g., classmethod/constant config) and defer model loading until inference.

trapdata/antenna/schemas.py (1)

64-71: Prefer default_factory for list fields for consistency with the rest of the file.
Other list fields in this file (lines 57, 79–88) use pydantic.Field(default_factory=list). This pattern is also recommended in Pydantic v2 to make intent explicit.

♻️ Suggested change
 class AsyncPipelineRegistrationRequest(pydantic.BaseModel):
     """
     Request to register pipelines from an async processing service
     """
 
     processing_service_name: str
-    pipelines: list[PipelineConfigResponse] = []
+    pipelines: list[PipelineConfigResponse] = pydantic.Field(default_factory=list)
trapdata/antenna/datasets.py (2)

122-137: Consider handling EXIF orientation for rotated images.

The code converts to RGB correctly, but per coding guidelines images should also handle EXIF orientation. Some camera trap images may have EXIF rotation metadata that could cause the model to receive incorrectly oriented images.

The broad Exception catch is acceptable here since external image loading can fail in many unpredictable ways (network errors, invalid formats, corrupted data, etc.).

♻️ Proposed fix to handle EXIF orientation
             response = self.image_fetch_session.get(image_url, timeout=30)
             response.raise_for_status()
             image = Image.open(BytesIO(response.content))
 
+            # Handle EXIF orientation
+            from PIL import ImageOps
+            image = ImageOps.exif_transpose(image)
+
             # Convert to RGB if necessary
             if image.mode != "RGB":
                 image = image.convert("RGB")

As per coding guidelines: "Handle EXIF orientation when preprocessing images; ensure models receive RGB format"


179-205: Minor: Redundant conditional and commented-out debug code.

  1. Line 204: The ternary `"; ".join(errors) if errors else None` is redundant since we're already inside an `if errors:` block, so `errors` is guaranteed truthy.

  2. Lines 182, 186: The commented-out `log_time` calls should be removed, or uncommented if they are still needed for debugging.

♻️ Proposed cleanup
                 for task in tasks:
                     errors = []
                     # Load the image
-                    # _, t = log_time()
                     image_tensor = (
                         self._load_image(task.image_url) if task.image_url else None
                     )
-                    # _, t = t(f"Loaded image from {image_url}")
 
                     if image_tensor is None:
                         errors.append("failed to load image")
 
                     if errors:
                         logger.warning(
                             f"Worker {worker_id}: Errors in task for image '{task.image_id}': {', '.join(errors)}"
                         )
 
                     # Yield the data row
                     row = {
                         "image": image_tensor,
                         "reply_subject": task.reply_subject,
                         "image_id": task.image_id,
                         "image_url": task.image_url,
                     }
                     if errors:
-                        row["error"] = "; ".join(errors) if errors else None
+                        row["error"] = "; ".join(errors)
                     yield row
trapdata/antenna/tests/test_worker.py (1)

374-383: Consider extracting shared _make_settings helper to reduce duplication.

The _make_settings method is duplicated between TestProcessJobIntegration and TestWorkerEndToEnd classes. Consider extracting it to a module-level helper or a shared test base class.

♻️ Proposed refactor
# At module level or in a shared test utilities module
def make_test_settings():
    """Create mock settings for worker tests."""
    settings = MagicMock()
    settings.antenna_api_base_url = "http://testserver/api/v2"
    settings.antenna_api_auth_token = "test-token"
    settings.antenna_api_batch_size = 2
    settings.antenna_api_retry_max = 3
    settings.antenna_api_retry_backoff = 0.5
    settings.num_workers = 0
    settings.localization_batch_size = 2
    return settings

Then use in both test classes:

def _make_settings(self):
    return make_test_settings()

Also applies to: 224-234

mihow and others added 8 commits January 29, 2026 17:04
- Remove retry_max and retry_backoff from Settings (hardcoded in get_http_session)
- get_http_session(auth_token=None) takes optional auth param
- Client functions take base_url and auth_token explicitly
- RESTDataset takes auth_token for API session, no auth for image fetching

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Prevents crashes on CPU-only builds by checking torch.cuda.is_available()
before calling torch.cuda.empty_cache() in worker and model base modules.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Ensures the active virtualenv's pytest module is used by calling
[sys.executable, "-m", "pytest", "-v"] instead of relying on PATH.
This prevents failures when pytest is not in PATH or using the wrong
pytest version.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Captures the boolean return value from post_batch_results() and raises
RuntimeError if posting fails, preventing silent data loss. Only increments
total_save_time on successful posts.

This makes API posting failures visible rather than silently discarding
processed results. External retry mechanisms (systemd, supervisord) can
handle job-level retries.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…ine flag

- Change from 'ami worker run' to just 'ami worker' using @cli.callback(invoke_without_command=True)
- Change flag from --pipelines (plural) to --pipeline (singular, repeatable)
- Update README with new command structure and registration examples
- Follows standard Docker/kubectl pattern for repeatable options

Usage:
  ami worker                                    # all pipelines
  ami worker --pipeline moth_binary             # single
  ami worker --pipeline moth1 --pipeline moth2  # multiple

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Captures the boolean return value from post_batch_results() and raises
RuntimeError if posting fails, preventing silent data loss. Only increments
total_save_time on successful posts.

Added exception handling in run_worker() loop to catch job processing
failures and continue to next job rather than crashing the worker. This
ensures the worker keeps consuming jobs even when individual jobs fail.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
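The failure-handling behavior described in the commits above can be sketched as a loop that raises on a failed post inside a job, but contains each failure at the job level so the worker keeps consuming. Function and attribute names here are illustrative stand-ins for the actual client/worker API:

```python
import logging

logger = logging.getLogger("worker")


def run_pipeline(batch):
    # Stand-in for the real detection + classification step
    return {"batch": batch, "detections": []}


def process_job(client, job):
    """Process one job; raise if results cannot be posted (no silent data loss)."""
    for batch in job.batches:
        results = run_pipeline(batch)
        ok = client.post_batch_results(results)
        if not ok:
            raise RuntimeError(f"Failed to post results for job {job.id}")


def run_worker(client):
    """Keep consuming jobs; one failing job must not crash the worker."""
    while True:
        job = client.next_job()
        if job is None:
            break
        try:
            process_job(client, job)
        except Exception:
            # Log and move on; external supervisors (systemd, supervisord)
            # can handle job-level retries
            logger.exception("Job %s failed; continuing to next job", job.id)
```

The key design point from the commits: a `False` return from `post_batch_results()` becomes a loud `RuntimeError` rather than a silently dropped batch, while the `try/except` in `run_worker()` limits the blast radius to a single job.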

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@trapdata/antenna/tests/test_worker.py`:
- Around lines 139-146: Replace the hardcoded auth token strings in tests with an
environment-backed test constant. Import os and define a single constant once at
the top of the test module (e.g., TEST_AUTH_TOKEN =
os.getenv("AMI_TEST_AUTH_TOKEN", "test-token")), then use it wherever
RESTDataset is instantiated (see _make_dataset and the other test helpers that
call RESTDataset), so calls like RESTDataset(..., auth_token="test-token")
become auth_token=TEST_AUTH_TOKEN. This keeps the tests deterministic and
satisfies Ruff S105/S106.
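The suggested fix amounts to a few lines at the top of the test module; `AMI_TEST_AUTH_TOKEN` is the env var name the bot proposes, and the `"test-token"` fallback keeps runs deterministic when it is unset:

```python
import os

# Define the token literal exactly once; every RESTDataset(...) call in the
# tests then passes auth_token=TEST_AUTH_TOKEN instead of a hardcoded string,
# which satisfies Ruff's S105/S106 hardcoded-password checks.
TEST_AUTH_TOKEN = os.getenv("AMI_TEST_AUTH_TOKEN", "test-token")
```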
🧹 Nitpick comments (2)
trapdata/antenna/datasets.py (1)

96-121: Consider handling EXIF orientation when loading images.

The _load_image method converts images to RGB but doesn't handle EXIF orientation metadata. Images from cameras may appear rotated incorrectly. As per coding guidelines: "Handle EXIF orientation when preprocessing images."

♻️ Proposed fix to handle EXIF orientation
+from PIL import ImageOps
 ...
             image = Image.open(BytesIO(response.content))
 
+            # Handle EXIF orientation before any other processing
+            image = ImageOps.exif_transpose(image)
+
             # Convert to RGB if necessary
             if image.mode != "RGB":
                 image = image.convert("RGB")
trapdata/antenna/tests/test_worker.py (1)

207-487: Consider gating the real-inference integration tests behind an opt-in flag.
These tests run real ML inference (not mocked) and spin up local file servers; if included in default CI runs, they can be slow or fail on CPU-only or model-less environments.

♻️ Suggested opt‑in gating (unittest)
+import os
-from unittest import TestCase
+from unittest import TestCase, skipUnless
@@
+RUN_INTEGRATION = os.getenv("RUN_INTEGRATION_TESTS") == "1"
@@
+@skipUnless(RUN_INTEGRATION, "integration tests are opt-in")
 class TestProcessJobIntegration(TestCase):
@@
+@skipUnless(RUN_INTEGRATION, "integration tests are opt-in")
 class TestWorkerEndToEnd(TestCase):
