LAION-fMRI benchmark by KartikP · Pull Request #2394 · brain-score/vision

KartikP · 2026-05-30T13:31:16Z

Adds a new vision benchmark family that scores models against the LAION-fMRI 7T dataset (Zerbe et al., VSS 2026), a densely sampled multi-subject fMRI dataset spanning broad natural-image diversity with built-in OOD stress tests. The PR registers 20 "headline" variants across the shared and per-subject stimulus pools, plus a thin factory API for non-headline variants (cluster CV, per-OOD-category, etc.).

Also introduces a reusable multi-subject benchmark scaffold in benchmark_helpers/ and wires every wrapper into the new bootstrap_error and validate_error helpers.

Dataset

- Citation: Zerbe, Roth, Mell, Herholz, Knapen, Hebart (VSS 2026). BibTeX in benchmark.py.
- Subjects: 5 (sub-01, sub-03, sub-05, sub-06, sub-07)
- Stimuli: 9.2 × 9.2 DVA (1000×1000 px on a BenQ-mirrored PROpixx projector at ~165 cm), DUA-gated and not redistributed by Brain-Score
- Pools:
  - Shared (Allen2022-style): 1,492 stimuli seen by every subject
  - Per-subject: 5,833 stimuli/subject (1,121 shared non-OOD + 4,712 unique + 371 OOD)
- Splits (bundled by the LAION-fMRI re:vision initiative): tau, ood, 9 ood_<category> variants, cluster_k5_{0..4}

20 registered headline variants, with identifier pattern {family}.{region}-{split}-{metric}:

| Family | Regions | Splits | Metric | Count |
| --- | --- | --- | --- | --- |
| `LAION_fMRI` (shared) | V1, V2, V4, IT | tau, ood | ridge | 8 |
| `LAION_fMRI_persubject` | V1, V2, V4, IT | tau, ood | ridge | 8 |
| `LAION_fMRI` (shared) | V1, V2, V4, IT | (no split) | rdm-pearson | 4 |
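
The registry surface above can be enumerated mechanically. The following is a minimal sketch, not the registration code in this PR: the function name is hypothetical, and the exact identifier for the RSA cells (which have no split segment) is an assumption.

```python
from itertools import product

REGIONS = ["V1", "V2", "V4", "IT"]
SPLITS = ["tau", "ood"]
FAMILIES = ["LAION_fMRI", "LAION_fMRI_persubject"]


def headline_identifiers(regression_metric="ridge"):
    """Enumerate identifiers following {family}.{region}-{split}-{metric}.

    Hypothetical helper for illustration; the real registry may differ.
    """
    ids = []
    # Regression cells: 2 families x 4 regions x 2 splits = 16
    for family, region, split in product(FAMILIES, REGIONS, SPLITS):
        ids.append(f"{family}.{region}-{split}-{regression_metric}")
    # RSA cells: shared family only, 4 regions.
    # ASSUMPTION: the split segment is simply omitted for these cells.
    for region in REGIONS:
        ids.append(f"LAION_fMRI.{region}-rdm-pearson")
    return ids


ids = headline_identifiers()
print(len(ids))
print(ids[0])
```

Passing a different `regression_metric` string produces the same 16 + 4 layout under another metric suffix.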

Non-headline variants (per-OOD-category, cluster_k5, IT_full ablation) are accessible via the factory API; see usage_examples.ipynb.

Reusable helpers (extracted, not laion-specific)

brainscore_vision/benchmark_helpers/multi_subject.py:

- MultiSubjectNeuralBenchmark: per-subject TrainTest aggregator with per-subject detail preserved in score.attrs
- KFoldNeuralBenchmark: average across k folds (any underlying benchmark), per-fold detail preserved
- block_diagonal_concat: stitch per-subject slices into (presentation × neuroid) with each subject on its own diagonal block; off-diagonal NaN

These exist as standalone helpers so other multi-subject / k-fold benchmarks can reuse them without copy-paste.
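
To make the block-diagonal layout concrete, here is a minimal NumPy sketch of the idea. It is not the helper in multi_subject.py (which presumably operates on assemblies with coordinates); the function name and the plain-array interface are assumptions for illustration.

```python
import numpy as np


def block_diagonal_concat_sketch(subject_slices):
    """Place each (presentation x neuroid) array on its own diagonal block.

    Illustrative stand-in only; every cell outside a subject's block is NaN.
    """
    n_rows = sum(s.shape[0] for s in subject_slices)
    n_cols = sum(s.shape[1] for s in subject_slices)
    out = np.full((n_rows, n_cols), np.nan)
    row = col = 0
    for s in subject_slices:
        out[row:row + s.shape[0], col:col + s.shape[1]] = s
        row += s.shape[0]
        col += s.shape[1]
    return out


sub_a = np.ones((2, 3))        # 2 presentations x 3 neuroids
sub_b = 2 * np.ones((1, 2))    # 1 presentation x 2 neuroids
combined = block_diagonal_concat_sketch([sub_a, sub_b])
print(combined.shape)
print(int(np.isnan(combined).sum()))
```

With S equally sized blocks, a fraction 1 - 1/S of the combined array is NaN padding, so the padded array grows much faster than the data it holds.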

Uncertainty reporting

Every laion_fmri wrapper now returns a Score with:

- error (finite SE on the ceiled scale, computed via bootstrap over subjects/folds with n_bootstrap=200)
- error_over, n_bootstrap provenance
- raw disaggregated per-unit scores

The single-subject leaf path uses declare_no_error with a reason.

TestUncertaintyContract in test.py verifies the gate passes.
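
As an illustration of the bootstrap over subjects, here is a minimal sketch. It is not the bootstrap_error helper itself: the function name, the seed handling, and the example scores are all made up for illustration.

```python
import numpy as np


def bootstrap_se_sketch(unit_scores, n_bootstrap=200, seed=0):
    """Standard error of the mean via resampling units with replacement.

    Units would be subjects or folds; this is an illustrative stand-in,
    not the helper wired into the benchmark wrappers.
    """
    rng = np.random.default_rng(seed)
    unit_scores = np.asarray(unit_scores, dtype=float)
    n_units = unit_scores.size
    # Each row is one bootstrap resample of unit indices.
    idx = rng.integers(0, n_units, size=(n_bootstrap, n_units))
    boot_means = unit_scores[idx].mean(axis=1)
    return float(boot_means.std(ddof=1))


per_subject = [0.33, 0.35, 0.31, 0.36, 0.32]  # made-up ceiled scores
se = bootstrap_se_sketch(per_subject)
print(se)
```

With only five subjects the resampling distribution is coarse, so the result is best read as a rough SE rather than a precise interval.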

Architectural decisions worth flagging

- Per-subject assembly storage instead of one block-diagonal monolith. Cross-subject concat-with-NaN ballooned to >5 GB during development and triggered OOMs. Each subject's .nc is registered separately; the benchmark loader iterates over subjects, filters, and concatenates the small slices.
- ncsnr-based ceiling. The ceiling uses the dataset's published noise-ceiling SNR (ncsnr) per voxel instead of internal_consistency's split-half Pearson. This sidesteps the cross-subject NaN-padding incompatibility and uses the publication-grade estimator the LAION-fMRI authors provide.
- IT definition matches NSD / Algonauts 2023's streams_ventral (laion-ventral \ retinotopic). IT_full (V4 ∪ IT) is exposed as a non-headline alias approximating the authors' broader ventral mask.
- Block-diagonal helper for the cross-subject shared pool. It is used only when LAIONfMRI(..., subjects=DEFAULT_SUBJECTS) is called externally; the headline registry takes the per-subject MultiSubject path for dense per-subject regression.
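
The description does not spell out how ncsnr is turned into a ceiling. The sketch below assumes the NSD convention (Allen et al., 2022), where the noise ceiling as a fraction of variance is ncsnr² / (ncsnr² + 1/n) for responses averaged over n repeats, and further assumes the correlation-scale ceiling is its square root. The function names, the `n_repeats` parameter, and the choice to average per-voxel ratios are all hypothetical, not taken from benchmark.py.

```python
import numpy as np


def ncsnr_to_correlation_ceiling(ncsnr, n_repeats):
    """Per-voxel correlation ceiling from noise-ceiling SNR.

    ASSUMPTION: NSD-style formula; n_repeats is the number of trials
    averaged per stimulus, which this PR description does not state.
    """
    ncsnr = np.asarray(ncsnr, dtype=float)
    variance_fraction = ncsnr**2 / (ncsnr**2 + 1.0 / n_repeats)
    return np.sqrt(variance_fraction)


def ceiled_score_sketch(raw_r, ncsnr, n_repeats):
    """Mean of per-voxel correlation divided by its ceiling.

    Voxels with a zero ceiling are skipped to avoid division by zero.
    """
    raw_r = np.asarray(raw_r, dtype=float)
    ceiling = ncsnr_to_correlation_ceiling(ncsnr, n_repeats)
    valid = ceiling > 0
    return float(np.mean(raw_r[valid] / ceiling[valid]))


raw_r = np.array([0.30, 0.60, 0.10])   # made-up per-voxel correlations
ncsnr = np.array([1.0, 3.0, 0.0])      # made-up ncsnr values
score = ceiled_score_sketch(raw_r, ncsnr, n_repeats=1)
print(score)
```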

Reproduction / verification

- Reproducibility was tested via the semantic-verify step in rebuild_assemblies.py: rebuilt assemblies are bit-equivalent in data and match every coord element-wise against the copies published on S3.
- usage_examples.ipynb covers all 20 headline variants + 5 non-headline patterns end-to-end.

Baseline sweep (5 models × 20 cells)

All values are ceiled scores (mean per-voxel correlation / ncsnr-derived ceiling, averaged across 5 subjects). — = not run.

`LAION_fMRI` (shared 1,492-stim pool)

| region-split | alexnet_random | alexnet | convnext_tiny | resnext101_32x8d_wsl | resnet50_tutorial |
| --- | --- | --- | --- | --- | --- |
| V1-tau | 0.242 | 0.334 | 0.393 | 0.388 | 0.462 |
| V1-ood | 0.164 | 0.231 | 0.321 | 0.311 | 0.367 |
| V1-rdm | — | 0.303 | 0.204 | — | 0.172 |
| V2-tau | 0.216 | 0.335 | 0.372 | 0.386 | 0.381 |
| V2-ood | 0.182 | 0.290 | 0.316 | 0.350 | 0.346 |
| V2-rdm | — | 0.285 | 0.204 | — | 0.143 |
| V4-tau | 0.097 | 0.277 | 0.271 | 0.294 | 0.300 |
| V4-ood | 0.089 | 0.294 | 0.311 | 0.360 | 0.353 |
| V4-rdm | — | 0.208 | 0.124 | — | 0.111 |
| IT-tau | 0.048 | 0.136 | 0.237 | 0.222 | 0.284 |
| IT-ood | 0.015 | 0.079 | 0.168 | 0.172 | 0.206 |
| IT-rdm | — | 0.374 | 0.167 | — | 0.293 |

`LAION_fMRI_persubject` (5,833 stim/subject)

| region-split | alexnet_random | alexnet | convnext_tiny | resnext101_32x8d_wsl | resnet50_tutorial |
| --- | --- | --- | --- | --- | --- |
| V1-tau | 0.157 | 0.147 | 0.232 | 0.233 | 0.296 |
| V1-ood | 0.183 | 0.156 | 0.263 | 0.281 | 0.340 |
| V2-tau | 0.157 | 0.190 | 0.202 | 0.245 | 0.243 |
| V2-ood | 0.224 | 0.230 | 0.262 | 0.310 | 0.324 |
| V4-tau | 0.055 | 0.139 | 0.114 | 0.167 | 0.172 |
| V4-ood | 0.097 | 0.223 | 0.225 | 0.310 | 0.309 |
| IT-tau | 0.053 | 0.031 | 0.091 | 0.139 | 0.188 |
| IT-ood | 0.034 | 0.019 | 0.084 | — | 0.168 |

Heads up on the upstream pin

requirements.txt temporarily pins laion-fmri to my fork (KartikP/LAION-fMRI@fix/duplicate-force-include) because the upstream pyproject.toml has a redundant [tool.hatch.build.targets.wheel.force-include] that breaks wheel builds with hatchling 1.18+.

Will revert to ViCCo-Group/LAION-fMRI.git@main once upstream merges the fix.

- Switch regression metric to dual_{ridge,ridgecv}_split (kernel form) so wide-feature models don't materialize the (n_features, n_targets) coefficient matrix. Mathematically identical to the prior standard ridge for the existing -ridge cells; resolves a memory cliff on the persubject pool for models like resnext101.
- Register 16 new -ridgecv variants alongside the existing 16 -ridge variants (32 ridge cells + 4 RSA cells = 36 headline total). -ridge keeps fixed alpha=1; -ridgecv selects alpha via internal CV over a 21-value log-spaced sweep (1e-10 to 1e10) defined locally as LAION_ALPHA_LIST. Sweep stays benchmark-local so Gifford / Papale / Hebart_fmri are unaffected.
- Trim README and METHODS to a minimal description of what the benchmark exposes; move long architectural rationale out of band.

Test count updated 20 -> 36. All data-free tests pass.
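
The equivalence between the primal and dual (kernel) ridge forms can be checked numerically. This is a minimal NumPy sketch on random data without an intercept term, not the dual_ridge_split metric itself; the variable names and the `alpha_sweep` array are illustrative, with the sweep built only to match the description of 21 log-spaced values from 1e-10 to 1e10.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_features, n_targets = 40, 10, 300, 7
X = rng.standard_normal((n_train, n_features))
Y = rng.standard_normal((n_train, n_targets))
X_test = rng.standard_normal((n_test, n_features))
alpha = 1.0

# Primal form: solves for an (n_features, n_targets) coefficient matrix.
W = np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)
pred_primal = X_test @ W

# Dual form: solves for an (n_train, n_targets) matrix instead, so the
# wide coefficient matrix is never formed.
dual_coef = np.linalg.solve(X @ X.T + alpha * np.eye(n_train), Y)
pred_dual = (X_test @ X.T) @ dual_coef

print(W.shape, dual_coef.shape)
print(np.allclose(pred_primal, pred_dual))

# A sweep consistent with the description; the real LAION_ALPHA_LIST
# may be defined differently.
alpha_sweep = np.logspace(-10, 10, 21)
```

The dual form pays off when n_features is much larger than the number of training stimuli, which is the wide-feature case described above.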

Drop the 16 fixed-alpha -ridge registry entries; the 16 corresponding -ridgecv variants now carry the headline scoring. Total headline cells: 20 (16 ridgecv + 4 rdm-pearson), matching the original registry surface. Fixed-alpha ridge stays accessible via the factory with metric_type='ridge' for anyone who wants the faster fixed-alpha fit. Factory default switched ridge -> ridgecv to match the new registered defaults. Test suite updated to enumerate ridgecv identifiers and expect the 20-variant count.

KartikP added 5 commits May 30, 2026 09:30

Full pipeline and benchmark

fb99e81

Merge branch 'master' into kp/laion-fmri

e9dbf88

Fix dependencies

2ce6547

Hotfix until core (kp/fail-loud-plugin-env-install)

28d203b

Robust error

f350a67

KartikP closed this Jun 1, 2026

KartikP reopened this Jun 1, 2026

KartikP closed this Jun 1, 2026

KartikP reopened this Jun 1, 2026

KartikP closed this Jun 2, 2026

KartikP reopened this Jun 2, 2026

KartikP closed this Jun 2, 2026

KartikP reopened this Jun 2, 2026

Use laion-fmri wheel error branch

38eeb41

KartikP closed this Jun 2, 2026

KartikP reopened this Jun 2, 2026

KartikP closed this Jun 2, 2026

KartikP reopened this Jun 2, 2026

KartikP wants to merge 8 commits into master from kp/laion-fmri
