
Eliminate toarray() for spectral/spatial metrics #1719

Open
Bisho2122 wants to merge 3 commits into master from Fix/eliminate-toarray


@Bisho2122
Collaborator

Problem

Follow-up to #1718. The annotation pipeline sometimes crashes with an out-of-memory (OOM) error on datasets where the spectra form isolated blobs inside a large bounding box. #1718 fixed chaos_metric; the remaining crash point is formula_validator.py, which still materialises the full h×w dense array for every peak of every formula when building iso_imgs_flat.
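
For a rough sense of scale, the sketch below compares the memory of one dense h×w image with one masked-flat array of length n_spectra. The bounding-box size, spectrum count, and 4-byte float element size are hypothetical values chosen for illustration, not measurements from the pipeline.

```python
# Hypothetical blob dataset: every number here is an illustrative assumption.
h, w = 10_000, 10_000        # bounding box in pixels
n_spectra = 50_000           # pixels that actually hold a spectrum
BYTES_PER_VALUE = 4          # assumes float32 image values

dense_bytes = h * w * BYTES_PER_VALUE       # one dense h x w image
flat_bytes = n_spectra * BYTES_PER_VALUE    # one masked-flat image

print(f"dense: {dense_bytes / 1e6:.1f} MB per peak image")
print(f"flat:  {flat_bytes / 1e6:.1f} MB per peak image")
print(f"ratio: {dense_bytes // flat_bytes}x")
```

The dense cost is paid once per peak per formula, so it multiplies quickly even when almost all of the bounding box is empty.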

Change

  • imzml_reader gets a pixel_to_flat_idx lookup table built once at init, which maps any pixel coordinate directly to its position in the masked-flat metrics array.
  • formula_validator.py uses it to scatter the sparse COO values directly into a 1D array of size n_spectra, which removes the toarray() call and the intermediate dense iso_imgs list entirely.
  • pipeline.py logs a warning when pixel density is below 5% so blob datasets are visible in logs.
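
The three changes above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: the function names, the boolean `mask` input, the `-1` sentinel for empty pixels, and the definition of pixel density as n_spectra / (h × w) are all assumptions. The `rows`, `cols`, and `vals` arrays stand in for the `row`, `col`, and `data` attributes of a SciPy COO matrix.

```python
import logging

import numpy as np

logger = logging.getLogger(__name__)


def build_pixel_to_flat_idx(mask):
    """Hypothetical helper: h x w table of masked-flat indices, -1 if empty."""
    lookup = np.full(mask.shape, -1, dtype=np.int64)
    # Boolean assignment fills in row-major order, the same order as dense[mask].
    lookup[mask] = np.arange(np.count_nonzero(mask))
    return lookup


def scatter_coo_to_flat(rows, cols, vals, pixel_to_flat_idx, n_spectra):
    """Scatter COO triplets into a 1D array without building an h x w image."""
    flat = np.zeros(n_spectra, dtype=np.float32)
    idx = pixel_to_flat_idx[rows, cols]
    keep = idx >= 0  # ignore entries that fall outside the mask
    # np.add.at sums duplicate coordinates, as a dense conversion would.
    np.add.at(flat, idx[keep], vals[keep])
    return flat


def warn_if_sparse(n_spectra, h, w, threshold=0.05):
    """Log a warning when spectra cover less than `threshold` of the box."""
    density = n_spectra / (h * w)
    if density < threshold:
        logger.warning("Low pixel density: %.2f%% of bounding box", density * 100)
    return density


# Tiny example: 3 spectra inside a 3 x 4 bounding box.
mask = np.zeros((3, 4), dtype=bool)
mask[0, 0] = mask[0, 1] = mask[2, 3] = True
lookup = build_pixel_to_flat_idx(mask)

rows = np.array([0, 2, 0])
cols = np.array([1, 3, 1])
vals = np.array([1.0, 2.0, 0.5], dtype=np.float32)
flat = scatter_coo_to_flat(rows, cols, vals, lookup, n_spectra=3)

# Cross-check against the dense route that the scatter replaces.
dense = np.zeros(mask.shape, dtype=np.float32)
np.add.at(dense, (rows, cols), vals)
assert np.array_equal(flat, dense[mask])
```

In this sketch the lookup table is still h×w, but it is allocated once rather than once per peak of every formula, which is where the dense arrays accumulated.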

@Bisho2122 Bisho2122 requested a review from lmacielvieira April 17, 2026 16:31
@Bisho2122 Bisho2122 self-assigned this Apr 17, 2026
@Bisho2122 Bisho2122 added the enhancement New feature or request label Apr 17, 2026
