LAION-fMRI benchmark #2394
Open
KartikP wants to merge 8 commits into
- Switch regression metric to dual_{ridge,ridgecv}_split (kernel form)
so wide-feature models don't materialize the (n_features, n_targets)
coefficient matrix. Mathematically identical to the prior standard
ridge for the existing -ridge cells; resolves a memory cliff on the
persubject pool for models like resnext101.
- Register 16 new -ridgecv variants alongside the existing 16 -ridge
variants (32 ridge cells + 4 RSA cells = 36 headline total). -ridge
keeps fixed alpha=1; -ridgecv selects alpha via internal CV over a
21-value log-spaced sweep (1e-10 to 1e10) defined locally as
LAION_ALPHA_LIST. Sweep stays benchmark-local so Gifford / Papale /
Hebart_fmri are unaffected.
- Trim README and METHODS to a minimal description of what the
benchmark exposes; move long architectural rationale out of band.
Test count updated 20 -> 36. All data-free tests pass.
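The "mathematically identical" claim for the kernel form can be checked on toy data. The sketch below is illustrative only: the function names are hypothetical, it omits any centering or intercept handling, and it is not the benchmark's dual_ridge_split implementation. It compares the standard solve, which materializes an (n_features, n_targets) coefficient matrix, against the dual solve, which only needs an (n_samples, n_targets) matrix.

```python
import numpy as np


def primal_ridge_predict(X_train, Y_train, X_test, alpha=1.0):
    # Standard form: solve (X^T X + alpha I) W = X^T Y.
    # W has shape (n_features, n_targets), which is the costly part for wide models.
    n_features = X_train.shape[1]
    gram = X_train.T @ X_train + alpha * np.eye(n_features)
    W = np.linalg.solve(gram, X_train.T @ Y_train)
    return X_test @ W


def dual_ridge_predict(X_train, Y_train, X_test, alpha=1.0):
    # Kernel (dual) form: solve (X X^T + alpha I) A = Y.
    # A has shape (n_samples, n_targets); no (n_features, n_targets) matrix is built.
    n_samples = X_train.shape[0]
    kernel = X_train @ X_train.T + alpha * np.eye(n_samples)
    A = np.linalg.solve(kernel, Y_train)
    return (X_test @ X_train.T) @ A


# Synthetic data with many more features than samples (the wide-feature case).
rng = np.random.default_rng(0)
X_train = rng.standard_normal((40, 300))
Y_train = rng.standard_normal((40, 7))
X_test = rng.standard_normal((10, 300))

pred_primal = primal_ridge_predict(X_train, Y_train, X_test, alpha=1.0)
pred_dual = dual_ridge_predict(X_train, Y_train, X_test, alpha=1.0)
max_abs_diff = float(np.max(np.abs(pred_primal - pred_dual)))
```

The two predictions agree up to floating-point error because (X^T X + alpha I)^-1 X^T = X^T (X X^T + alpha I)^-1; the dual form simply moves the linear solve from feature space to sample space.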
Drop the 16 fixed-alpha -ridge registry entries; the 16 corresponding -ridgecv variants now carry the headline scoring. Total headline cells: 20 (16 ridgecv + 4 rdm-pearson), matching the original registry surface. Fixed-alpha ridge stays accessible via the factory with metric_type='ridge' for anyone who wants the faster fixed-alpha fit. Factory default switched ridge -> ridgecv to match the new registered defaults. Test suite updated to enumerate ridgecv identifiers and expect the 20-variant count.
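For readers unfamiliar with the -ridgecv idea, here is a minimal sketch of selecting alpha by internal cross-validation over the 21-value log-spaced grid described above. Everything in it is an assumption for illustration: the names ALPHA_GRID, dual_ridge_fit_predict, and select_alpha are hypothetical, the fold scheme and the mean-squared-error criterion are placeholders, and a single alpha is shared across all targets, which may differ from what the benchmark's ridgecv metric actually does.

```python
import numpy as np

# 21 log-spaced values from 1e-10 to 1e10, one per decade.
ALPHA_GRID = np.logspace(-10, 10, 21)


def dual_ridge_fit_predict(X_train, Y_train, X_test, alpha):
    # Kernel-form ridge without centering: solve (X X^T + alpha I) A = Y.
    n_samples = X_train.shape[0]
    kernel = X_train @ X_train.T + alpha * np.eye(n_samples)
    A = np.linalg.solve(kernel, Y_train)
    return (X_test @ X_train.T) @ A


def select_alpha(X, Y, alphas, n_folds=5):
    # Assign samples to folds round-robin, then pick the alpha with the
    # lowest mean held-out squared error (assumed criterion).
    fold_ids = np.arange(X.shape[0]) % n_folds
    mean_errors = []
    for alpha in alphas:
        fold_errors = []
        for fold in range(n_folds):
            held_out = fold_ids == fold
            pred = dual_ridge_fit_predict(X[~held_out], Y[~held_out], X[held_out], alpha)
            fold_errors.append(np.mean((pred - Y[held_out]) ** 2))
        mean_errors.append(np.mean(fold_errors))
    return float(alphas[int(np.argmin(mean_errors))])


# Synthetic regression problem: linear signal plus noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((60, 200))
W_true = rng.standard_normal((200, 4)) / np.sqrt(200)
Y = X @ W_true + 0.5 * rng.standard_normal((60, 4))

best_alpha = select_alpha(X, Y, ALPHA_GRID)
```

Fixing alpha=1 skips the loop over ALPHA_GRID entirely, which is why the fixed-alpha fit is the faster option.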
Adds a new vision benchmark family that scores models against the LAION-fMRI 7T dataset (Zerbe et al., VSS 2026), a densely sampled multi-subject fMRI dataset spanning broad natural-image diversity with built-in OOD stress tests. It registers 20 "headline" variants across the shared and per-subject stimulus pools, plus a thin factory API for non-headline variants (cluster CV, per-OOD-category, etc.).
Also introduces a reusable multi-subject benchmark scaffold in benchmark_helpers/ and wires every wrapper into the new bootstrap_error and validate_error helpers.

Dataset
benchmark.py. tau, ood, 9 ood_<category> variants, cluster_k5_{0..4}

20 registered headline variants, identifier pattern {family}.{region}-{split}-{metric}:
- LAION_fMRI (shared)
- LAION_fMRI_persubject
- LAION_fMRI (shared)

Non-headline variants (per-OOD-category, cluster_k5, IT_full ablation) are accessible via the factory API; see usage_examples.ipynb.

Reusable helpers (extracted, not laion-specific)
brainscore_vision/benchmark_helpers/multi_subject.py:
- MultiSubjectNeuralBenchmark: per-subject TrainTest aggregator with per-subject detail preserved in score.attrs
- KFoldNeuralBenchmark: average across k folds (any underlying benchmark), per-fold detail preserved
- block_diagonal_concat: stitch per-subject slices into (presentation × neuroid) with each subject on its own diagonal block; off-diagonal NaN

These exist as standalone helpers so other multi-subject / k-fold benchmarks can reuse them without copy-paste.
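The block-diagonal layout can be pictured with plain NumPy arrays. This is a sketch of the idea only, under the assumption that each subject contributes a (presentation × neuroid) array; the function name block_diagonal_concat_sketch is hypothetical, and the actual helper presumably works on assemblies with coordinates rather than bare arrays.

```python
import numpy as np


def block_diagonal_concat_sketch(subject_arrays):
    # Total rows = all presentations, total columns = all neuroids.
    n_rows = sum(a.shape[0] for a in subject_arrays)
    n_cols = sum(a.shape[1] for a in subject_arrays)

    # Start from all-NaN so every off-diagonal entry is NaN.
    stitched = np.full((n_rows, n_cols), np.nan)

    # Place each subject's slice on its own diagonal block.
    row, col = 0, 0
    for a in subject_arrays:
        r, c = a.shape
        stitched[row:row + r, col:col + c] = a
        row += r
        col += c
    return stitched


# Two toy subjects with different numbers of presentations and neuroids.
sub1 = np.ones((2, 3))
sub2 = 2.0 * np.ones((3, 2))
stitched = block_diagonal_concat_sketch([sub1, sub2])
```

Because subjects in the per-subject pool see different stimuli and have different voxels, no row of the stitched array has values for more than one subject, which is what the NaN padding encodes.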
Uncertainty reporting
Every laion_fmri wrapper now returns a Score with:
- error (finite SE on the ceiled scale, computed via bootstrap over subjects/folds with n_bootstrap=200)
- error_over, n_bootstrap provenance
- raw disaggregated per-unit scores
- declare_no_error with reason

TestUncertaintyContract in test.py verifies the gate passes.

Architectural decisions worth flagging
- .nc is registered separately; the benchmark loader iterates + filters + concatenates the small slices.
- internal_consistency's split-half Pearson. Sidesteps cross-subject NaN-padding incompatibility, and uses the publication-grade estimator the LAION-fMRI authors provide.
- IT definition matches NSD / Algonauts 2023's streams_ventral (laion-ventral \ retinotopic). IT_full (V4 ∪ IT) is exposed as a non-headline alias approximating the authors' broader ventral mask.
- LAIONfMRI(..., subjects=DEFAULT_SUBJECTS) is called externally; the headline registry takes the per-subject MultiSubject path for dense per-subject regression.

Reproduction / verification
- rebuild_assemblies.py semantic-verify step: bit-equivalent data + every coord element-wise vs published S3
- usage_examples.ipynb covers all 20 headline variants + 5 non-headline patterns end-to-end

Baseline sweep (5 models × 20 cells)
All values are ceiled scores (mean per-voxel correlation / ncsnr-derived ceiling, averaged across 5 subjects).
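As a rough illustration of how a ceiled score and its bootstrap standard error over subjects fit together, consider the sketch below. It rests on several assumptions: the data are synthetic, each voxel's correlation is divided by its own ceiling before averaging (the PR does not spell out the order of operations), the per-voxel ceilings are taken as given rather than derived from ncsnr, and the function names are hypothetical rather than the bootstrap_error helper itself.

```python
import numpy as np


def ceiled_subject_score(voxel_correlations, voxel_ceilings):
    # Assumed convention: normalize each voxel by its ceiling, then average.
    return float(np.mean(voxel_correlations / voxel_ceilings))


def bootstrap_se(unit_scores, n_bootstrap=200, seed=0):
    # Resample units (here: subjects) with replacement and take the
    # standard deviation of the resampled means as the standard error.
    rng = np.random.default_rng(seed)
    unit_scores = np.asarray(unit_scores, dtype=float)
    n_units = len(unit_scores)
    idx = rng.integers(0, n_units, size=(n_bootstrap, n_units))
    boot_means = unit_scores[idx].mean(axis=1)
    return float(boot_means.std(ddof=1))


# Synthetic per-voxel correlations and ceilings for 5 subjects x 100 voxels.
rng = np.random.default_rng(1)
correlations = rng.uniform(0.1, 0.5, size=(5, 100))
ceilings = rng.uniform(0.6, 0.9, size=(5, 100))

subject_scores = [
    ceiled_subject_score(correlations[s], ceilings[s]) for s in range(5)
]
headline_score = float(np.mean(subject_scores))
headline_se = bootstrap_se(subject_scores, n_bootstrap=200)
```

With only 5 subjects the bootstrap distribution is coarse, so the resulting SE is best read as a rough indication of between-subject spread.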
— = not run.

LAION_fMRI (shared 1,492-stim pool)

LAION_fMRI_persubject (5,833 stim/subject)

Heads up on the upstream pin
requirements.txt temporarily pins laion-fmri to my fork (KartikP/LAION-fMRI@fix/duplicate-force-include) because the upstream pyproject.toml has a redundant [tool.hatch.build.targets.wheel.force-include] that breaks wheel builds with hatchling 1.18+.

Will revert to ViCCo-Group/LAION-fMRI.git@main once upstream merges