phase3: iclabel classification + non-brain IC flagging#32

Draft
neuromechanist wants to merge 1 commit into main from feature/issue-4-phase3-iclabel


Summary

Phase 3 of the HBN ERSP epic. Per subject: load Phase 2 .set → pop_iclabel('default') → flag ICs where brain probability < 0.69 (the locked threshold per .context/ideas.md) → render class summary + brain-IC topo PNGs → save .set checkpoint → write qa_iclabel.csv row.

Flag, not delete. ICs below threshold get written to EEG.reject.gcompreject so Phase 4 (epoching) can still inspect them. The rich classifications stay in EEG.etc.ic_classification.ICLabel for downstream use.
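
The flag-not-delete step can be sketched as below. This is a minimal illustration on a synthetic struct, not the code in flag_non_brain_ics.m: the variable names and example probabilities are assumptions, and only the field paths and the 0.69 threshold come from this PR.

```matlab
% Synthetic stand-in for an EEG struct after pop_iclabel (values are made up).
% Columns follow ICLabel's class order: Brain, Muscle, Eye, Heart,
% Line Noise, Channel Noise, Other.
EEG = struct();
EEG.etc.ic_classification.ICLabel.classifications = [ ...
    0.90 0.02 0.02 0.01 0.01 0.01 0.03; ...
    0.40 0.50 0.02 0.02 0.02 0.02 0.02; ...
    0.68 0.10 0.05 0.05 0.04 0.04 0.04];

brainThreshold = 0.69;  % locked threshold from this PR

% Brain probability is the first column.
brainProb = EEG.etc.ic_classification.ICLabel.classifications(:, 1);

% Flag (do not remove) every IC whose brain probability is below threshold.
% Stored as a 1-by-nICs 0/1 row vector; the ICA weights stay untouched.
EEG.reject.gcompreject = double(brainProb < brainThreshold).';

fprintf('%d ICs, %d kept, %d flagged\n', numel(brainProb), ...
    sum(~EEG.reject.gcompreject), sum(EEG.reject.gcompreject));
```

Because only the flag vector changes, a later phase can still inspect the flagged components or override the flags.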

Refs epic #1, closes #4 once the 3-subject derivatives + QA review land on this branch.

Code (this commit)

  • src/matlab/phase3_iclabel.m — entrypoint. BrainThreshold default 0.69, IcLabelVersion default "default" (locked), MinBrainIcs default 5 with status="ok_low_brain" on subjects below the floor.
  • src/matlab/+hbn/run_iclabel.m — wraps pop_iclabel. Validates that EEG.etc.ic_classification.ICLabel is populated. Heads-up: ICLabel's docstring says ic_classifications (plural) but the implementation writes ic_classification (singular). Caught locally; downstream code uses the singular form everywhere.
  • src/matlab/+hbn/flag_non_brain_ics.m — validates the class ordering (Brain, Muscle, Eye, Heart, Line Noise, Channel Noise, Other), writes gcompreject, returns argmax-based per-class counts.
  • src/matlab/+hbn/save_ic_class_figure.m — two-panel PNG: per-class bar (brain split into kept vs dropped) + brain-prob scatter vs IC index with the 0.69 threshold line.
  • src/matlab/+hbn/save_brain_ic_topo_figure.m — kept brain ICs on a tight tile grid. Uses the snapshot-handles pattern from Phase 2 so pop_topoplot's own figure gets captured. Writes a labeled placeholder PNG if no brain IC survives, instead of silently erroring.
  • src/matlab/+hbn/write_qa_iclabel_csv.m — schema: participant_id, status, n_components, brain_count, muscle_count, eye_count, heart_count, line_noise_count, channel_noise_count, other_count, brain_threshold, iclabel_version, duration_s, error_message.
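
As an illustration of the class-ordering check and the argmax-based counts described above, here is a hedged sketch on made-up data. The variable names (icl, winner, keptBrain, droppedBrain) and the error identifier are hypothetical, and the helper in this PR may be structured differently.

```matlab
% Expected ICLabel class order, as listed in this PR.
expectedClasses = {'Brain', 'Muscle', 'Eye', 'Heart', ...
                   'Line Noise', 'Channel Noise', 'Other'};

% Synthetic ICLabel output for four ICs (probabilities are made up).
icl = struct();
icl.classes = expectedClasses;
icl.classifications = [ ...
    0.90 0.02 0.02 0.01 0.01 0.01 0.03; ...
    0.40 0.50 0.02 0.02 0.02 0.02 0.02; ...
    0.60 0.10 0.10 0.05 0.05 0.05 0.05; ...
    0.05 0.05 0.80 0.02 0.02 0.03 0.03];

% Fail loudly if the column order ever differs from what the code assumes.
assert(isequal(icl.classes(:).', expectedClasses), ...
    'hbn:iclabel:classOrder', 'Unexpected ICLabel class ordering.');

% Argmax-based counts: each IC is assigned to its most probable class.
[~, winner] = max(icl.classifications, [], 2);
counts = accumarray(winner, 1, [numel(expectedClasses) 1]).';

% Split argmax-brain ICs into kept (>= threshold) and dropped (< threshold).
brainThreshold = 0.69;
isBrain = (winner == 1);
keptBrain = sum(isBrain & icl.classifications(:, 1) >= brainThreshold);
droppedBrain = counts(1) - keptBrain;
```

Note that an IC can win the argmax as Brain and still be flagged, which is why the bar figure splits the brain class into kept and dropped.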

Tests

  • tests/matlab/test_phase3_smoke.m — chains Phase 1 → Phase 2 (IcaMethod=runica, MaxIter=100) → Phase 3 on the fixture. Asserts classifications attached, gcompreject length matches IC count, qa_iclabel.csv row consistent, params.json schema complete. Accepts ok or ok_low_brain (the 60s/100-iter wiring smoke is structurally below the brain floor — expected).
  • tests/matlab/run_all_tests.m — adds test_phase3_smoke (6 tests total, all green locally).

Runner

  • scripts/run_phase3_three_subjects.m — production runner. Regenerates Phase 1 + Phase 2 if their .set checkpoints are missing (they are gitignored). Currently running locally; derivatives + QA agent review will be added as a follow-up commit on this branch.

Local verification

test_phase3_smoke: OK (118 ICs, 3 brain kept, 115 flagged)

(Wiring smoke; not a quality signal. Real interpretive numbers come from the AMICA-based 3-subject run.)

Test plan

  • Local smoke (runica + 100 iter): 118 ICs / 3 brain kept / 115 flagged. Smoke wiring works.
  • CI green on this branch.
  • 3-subject AMICA-based run on the local R3 dataset (in flight, ~80 min wall-time).
  • eeg-qa-neuroscientist Phase 3 review on the real-data figures.
  • Derivatives committed; PR body updated.

Follow-up

Phase 4 (event expansion + epoching) picks up the flagged .set files. ICs already in EEG.reject.gcompreject will be honored by std_precomp in Phase 5.

Phase 3 of the HBN ERSP epic. Per-subject pipeline:
  load Phase 2 .set
  -> pop_iclabel('default')
  -> flag ICs where brain probability < 0.69 (locked threshold)
  -> ic-classes bar + brain-IC topo PNG figures
  -> save .set checkpoint
  -> qa_iclabel.csv row

src/matlab/phase3_iclabel.m (entrypoint)
- BrainThreshold default 0.69 (mustBeInRange 0..1). IcLabelVersion
  'default' (locked). MinBrainIcs default 5; subjects below that floor
  get status 'ok_low_brain' and are flagged for Phase 6 escalation
  per .context/research.md L92-94 watch list.
- Flag, not delete: writes EEG.reject.gcompreject so Phase 4 can
  still inspect dropped ICs.

src/matlab/+hbn/ helpers
- run_iclabel.m         : wraps pop_iclabel. Validates that
                          EEG.etc.ic_classification.ICLabel is populated
                          on return (the pop_iclabel docstring claims
                          'ic_classifications' (plural) but the code
                          writes 'ic_classification' (singular); handled
                          locally with an explicit assertion on the
                          singular field).
- flag_non_brain_ics.m  : validates class ordering (Brain, Muscle, Eye,
                          Heart, Line Noise, Channel Noise, Other);
                          writes EEG.reject.gcompreject; returns
                          per-class counts (argmax-based).
- save_ic_class_figure.m: two-panel PNG with per-class bar (brain
                          split into kept vs dropped) plus brain-prob
                          scatter with the 0.69 threshold line.
- save_brain_ic_topo_figure.m: kept brain IC topographies on a tight
                          tile grid. Uses the snapshot-handles pattern
                          from Phase 2 so pop_topoplot's own figure
                          gets captured (not a pre-allocated empty one).
                          Emits a labelled placeholder PNG when no
                          brain IC survives, instead of silently
                          erroring on an unrecoverable empty find().
- write_qa_iclabel_csv.m: schema: participant_id, status, n_components,
                          brain_count, muscle_count, eye_count,
                          heart_count, line_noise_count,
                          channel_noise_count, other_count,
                          brain_threshold, iclabel_version,
                          duration_s, error_message.

tests/matlab/test_phase3_smoke.m
- Chains phase1 -> phase2 (IcaMethod=runica, MaxIter=100) -> phase3
  on the fixture. Asserts classifications attached, gcompreject
  length matches IC count, qa_iclabel.csv row sane, params.json
  schema complete. Allows 'ok' or 'ok_low_brain' status (the
  60s/100-iter wiring smoke is structurally below the brain floor).

tests/matlab/run_all_tests.m: adds test_phase3_smoke.

scripts/run_phase3_three_subjects.m: production runner. Regenerates
Phase 1+2 if checkpoints are missing (.set/.fdt are gitignored).

Refs #1, refs #4.

Development

Successfully merging this pull request may close these issues.

Phase 3: ICLabel classification and non-brain IC rejection
