phase2: multi-task AMICA concat to lift k-factor (issue #33) #34
Draft
neuromechanist wants to merge 1 commit into
ThePresent alone gives k = samples/n_chans^2 = 1.1 for 128-channel ICA; standard ICA wants k >= 20. Under-determined AMICA produces components that collapse to single electrodes (visual inspection on PR #31's 3 subjects confirms). Reference pipeline mitigates by concatenating multiple HBN tasks for ICA training while keeping ThePresent as the analysis target.

Adds IcaTasks opt to phase2_amica (default: the four passive-viewing movie tasks DespicableMe, DiaryOfAWimpyKid, FunwithFractals, ThePresent). When |IcaTasks| > 1: ensure Phase 1 .set exists for each task (regenerate via phase1_preprocess if missing), load all, restrict to common channel intersection, pop_mergeset, train AMICA on merged data, transplant weights onto the Task=ThePresent EEG via hbn.apply_ica_weights, then dipfit + figures + checkpoint as before.

src/matlab/phase2_amica.m
- IcaTasks (1,:) string, default 4 movie tasks. Set to [Task] for the legacy single-task path (smoke test uses this).
- BidsRoot opt added so prepare_multitask_amica_input can regenerate missing Phase 1 .set files.
- qaRow now carries ica_tasks, ica_samples, k_factor, common_channels.

src/matlab/+hbn/prepare_multitask_amica_input.m (new)
- For each task in IcaTasks: load Phase 1 .set or regenerate via phase1_preprocess. Intersect channel labels across tasks (>= 32 channels required). pop_select to intersection. pop_mergeset. Returns merged EEG + info struct.

src/matlab/+hbn/apply_ica_weights.m (new)
- Copies icaweights/icasphere/icachansind/icawinv from sourceEEG to targetEEG, asserting channel-count match. After multi-task AMICA the target ThePresent EEG must already be on the same channel intersection (pop_select before the call).

src/matlab/+hbn/write_qa_amica_csv.m
- Schema gains ica_tasks, ica_samples, k_factor, common_channels. Insertion-order placement keeps participant_id, status, ica_method as the leftmost columns for readability.

tests/matlab/test_phase2_smoke.m
- Pass IcaTasks="ThePresent" so the fixture-based smoke test stays on the single-task path (the fixture only has ThePresent).
- requiredFields includes IcaTasks.

scripts/run_phase2_three_subjects_multitask.m (new)
- Production runner. Regenerates Phase 1 ThePresent .set if missing. Per-task .sets for the other 3 movie tasks are produced lazily by prepare_multitask_amica_input.

Refs #1, refs #33.
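The multi-task path in the commit message hinges on two steps: intersecting channel labels across tasks, and copying the trained ICA fields onto the ThePresent EEG. Below is a minimal sketch of both, assuming plain structs in place of EEGLAB EEG structures so that it runs without EEGLAB. The toy labels, the `minChannels` value, and all variable names are illustrative; none of this is taken from `hbn.prepare_multitask_amica_input` or `hbn.apply_ica_weights`.

```matlab
% Illustrative sketch only; not the code in this PR.
% Plain structs stand in for EEGLAB EEG structures.

% Toy channel labels per task. With real data these would come from
% {EEG.chanlocs.labels} of each Phase 1 .set file.
labelsPerTask = { ...
    {'E1', 'E2', 'E3', 'E4'}, ...
    {'E1', 'E2', 'E4'}, ...
    {'E2', 'E1', 'E4', 'E5'}};

% Intersect labels across tasks, keeping the first task's channel order.
common = labelsPerTask{1};
for t = 2:numel(labelsPerTask)
    common = intersect(common, labelsPerTask{t}, 'stable');
end

% The PR requires >= 32 common channels; the toy data only has 3.
minChannels = 3;
assert(numel(common) >= minChannels, 'Too few common channels.');
nCommon = numel(common);

% "Source": stand-in for the merged training EEG after AMICA.
source.nbchan      = nCommon;
source.icaweights  = eye(nCommon);
source.icasphere   = 2 * eye(nCommon);
source.icawinv     = pinv(source.icaweights * source.icasphere);
source.icachansind = 1:nCommon;

% "Target": stand-in for the ThePresent EEG, already restricted to the
% common channels (the PR does this with pop_select before the call).
target.nbchan = nCommon;
target.data   = randn(nCommon, 10);

% Transplant the ICA fields, guarding against a channel-count mismatch.
assert(target.nbchan == source.nbchan, 'Channel count mismatch.');
for f = {'icaweights', 'icasphere', 'icawinv', 'icachansind'}
    target.(f{1}) = source.(f{1});
end
target.icaact = [];  % drop any stale activations

% Activations on the target follow from the transplanted unmixing matrix.
act = (target.icaweights * target.icasphere) * target.data(target.icachansind, :);
```

A channel-count check alone would not catch a target whose channels are in a different order from the training data; comparing the label lists would be a stricter guard if that ever becomes a concern.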
Summary
Refines Phase 2 to fix the under-determined ICA seen on PR #31. ThePresent alone gives k = samples / n_chans² = 1.1, far below the k ≥ 20 that standard ICA wants; the resulting AMICA components collapse to single electrodes (visual inspection confirmed). The reference pipeline mitigates this by concatenating multiple HBN tasks for the ICA training pool while keeping ThePresent as the analysis target.
This PR adds an `IcaTasks` option to `phase2_amica` (default = the four passive-viewing movie tasks: `DespicableMe`, `DiaryOfAWimpyKid`, `FunwithFractals`, `ThePresent`). This lifts k from 1.1 to ≈3.5 on the 100 Hz local data.

Refs epic #1, closes #33. Blocks PR #32 (Phase 3 derivatives) until merged.
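To put the k values in perspective, here is a rough worked example of k = samples / n_chans². The sample counts are back-computed from the k values quoted in this PR under the assumption that all 128 channels are retained; they are not measured recording lengths, and a smaller common channel set would change them.

```matlab
% Illustrative arithmetic only; sample counts are back-computed from the
% k values quoted in this PR, assuming all 128 channels are retained.
kFactor = @(nSamples, nChans) nSamples / nChans^2;

nChans = 128;
fs     = 100;                            % Hz, sampling rate of the local data

samplesSingle = round(1.1 * nChans^2);   % implied by k = 1.1 (ThePresent only)
samplesMulti  = round(3.5 * nChans^2);   % implied by k of about 3.5 (four tasks)
samplesTarget = 20 * nChans^2;           % needed to reach k = 20

fprintf('single-task: k = %.2f, about %.1f min of data\n', ...
    kFactor(samplesSingle, nChans), samplesSingle / fs / 60);
fprintf('multi-task:  k = %.2f, about %.1f min of data\n', ...
    kFactor(samplesMulti, nChans), samplesMulti / fs / 60);
fprintf('k = 20 would need about %.1f min of data\n', ...
    samplesTarget / fs / 60);
```

Because k scales with 1 / n_chans², a smaller common channel set raises k for the same recording length, so `common_channels` is worth reading alongside `k_factor` in the QA CSV.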
What changes
- `src/matlab/phase2_amica.m`: `IcaTasks` opt. When multi-task: load all per-task Phase 1 `.set` files, restrict to channel intersection, `pop_mergeset`, train AMICA on merged, transplant weights onto Task=ThePresent EEG.
- `src/matlab/+hbn/prepare_multitask_amica_input.m` (new): per-subject multi-task pre-merge step. Regenerates missing Phase 1 `.set` per task via `phase1_preprocess`.
- `src/matlab/+hbn/apply_ica_weights.m` (new): copies `icaweights`/`icasphere`/`icachansind`/`icawinv` from training EEG onto target EEG; asserts channel-count match.
- `src/matlab/+hbn/write_qa_amica_csv.m`: schema gains `ica_tasks`, `ica_samples`, `k_factor`, `common_channels`.
- `tests/matlab/test_phase2_smoke.m`: passes `IcaTasks="ThePresent"` so the fixture-based smoke test stays on the single-task path (fixture has only ThePresent).
- `scripts/run_phase2_three_subjects_multitask.m` (new): production 3-subject runner.

Why this isn't a brief violation
The `CLAUDE.md` rule "Never generalize to other HBN movies in this project (ThePresent only)" is about the analysis scope; the contrast, epoching, ERSP, and stats all stay ThePresent-only. The reference pipeline `study_handy_scripts.m` explicitly uses `task_group = ["surroundSupp", "RestingState", "DespicableMe", "ThePresent", "FunwithFractals", "DiaryOfAWimpyKid"]` for the ICA training pool and `pop_mergeset` before `runamica17_nsg`. We are doing the same, scoped to the four movie tasks for ecological consistency.

Local verification
(Single-task fallback path; the wiring smoke test only exercises one task.)
Test plan
Status
Draft. PR body will be updated with derivatives + QA review once the 3-subject run completes.