[SCHEMA] Support structured survey data as a BIDS modality #2404
karl-koschutnig wants to merge 3 commits
Add survey as a first-class BIDS datatype with subject- and session-resolved file structure mirroring other modalities.

Changes:
- objects/suffixes.yaml: add 'survey' suffix
- objects/datatypes.yaml: add 'survey' datatype
- objects/modalities.yaml: add 'survey' modality
- rules/modalities.yaml: map survey modality to survey datatype
- rules/files/raw/survey.yaml: filename rules (sub+task required, ses+run optional)
- modality-specific-files/survey.md: documentation chapter
- mkdocs.yml: add survey chapter to navigation

Reference implementation: https://github.com/MRI-Lab-Graz/prism-studio
Codecov Report
✅ All modified and coverable lines are covered by tests.

Coverage Diff:

| | master | #2404 |
|---|---|---|
| Coverage | 83.07% | 83.07% |
| Files | 22 | 22 |
| Lines | 1696 | 1696 |
| Hits | 1409 | 1409 |
| Misses | 287 | 287 |
and a guide for using macros can be found at
https://github.com/bids-standard/bids-specification/blob/master/macros_doc.md
-->
{{ MACROS___make_filename_template("raw", datatypes=["survey"]) }}
quick one (edited/expanded 2026/05/12):
Overall, I like this approach, but an immediate question: is there a semantic difference between "phenotype" and "survey" in this PR? If not, I would prefer to stay consistent and make it phenotype/ (or pheno/) (I think I expressed something like that elsewhere; TODO: find refs here). It would then follow a consistent principle that we already have overall and are distilling further for BEPs, e.g. for stimuli/ (@bids-standard/bep044; TODO: find/add refs). See:
- [ENH] BEP044 - Stim-BIDS #2022
- BIDS 2̶.̶0̶1.0: flex BIDS layout (bids-2-devel/issues/54) #1809
- Formalize the concept of `[{leading entities}_]{entity_plural}.{tsv,json}` file(s) #2283
and potentially others, one way or another hinting on it, e.g
Then, if there is a desire to generalize "phenotype" into "survey" -- could be done for bids 2.0... WDYT?
Summary
This PR proposes adding schema support for structured instrument-based survey data as a valid BIDS representation, organized in the same subject-, session-, and run-resolved way as other BIDS modalities.
The goal is not to replace aggregated tabular phenotypic data. It is to complement them with a canonical acquisition-facing structure that preserves provenance, timing, and instrument context, while still allowing aggregated tables to be generated later when needed.
A working reference implementation already exists in PRISM Studio, with documentation at prism-studio.readthedocs.io.
Rationale
BIDS is strongest when its canonical structures reflect how data are actually acquired. For imaging, physio, events, and other modalities, the normative pattern is a subject-resolved structure first, with higher-level summaries and derivatives produced later.
Instrument-based phenotypic data fit this same pattern. Treating `phenotype/` as the only primary home for those data flattens acquisition context at the point where BIDS usually preserves it.
A structured survey modality would:
That direction is structurally stronger because aggregated tables can be written from a structured survey layout with little ambiguity, whereas reconstructing the original structure from only an aggregate table often requires extra assumptions.
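To make the direction of that argument concrete, here is a minimal sketch of how per-file survey tables could be flattened into one aggregate table per instrument. The filename pattern, the `aggregate` helper, the `run` column name, and the `pss01`/`pss02` item names are all assumptions for illustration; they are not taken from the schema in this PR or from PRISM Studio.

```python
import csv
import io
import re

# Assumed filename pattern (sub and task required, ses and run optional).
# The entity order is an illustration, not copied from the PR's schema.
NAME_RE = re.compile(
    r"sub-(?P<sub>[A-Za-z0-9]+)"
    r"(?:_ses-(?P<ses>[A-Za-z0-9]+))?"
    r"_task-(?P<task>[A-Za-z0-9]+)"
    r"(?:_run-(?P<run>[0-9]+))?"
    r"_survey\.tsv$"
)


def aggregate(files):
    """Group survey responses into one table (list of rows) per instrument.

    `files` maps a filename to its TSV text (header row plus response rows).
    """
    tables = {}
    for name in sorted(files):
        match = NAME_RE.search(name)
        if match is None:
            continue  # skip anything that does not fit the assumed pattern
        reader = csv.DictReader(io.StringIO(files[name]), delimiter="\t")
        for response in reader:
            # Acquisition context comes straight from the filename entities.
            row = {"participant_id": "sub-" + match["sub"]}
            if match["ses"]:
                row["session_id"] = "ses-" + match["ses"]
            if match["run"]:
                row["run"] = match["run"]  # hypothetical column name
            row.update(response)
            tables.setdefault(match["task"], []).append(row)
    return tables


# Hypothetical example data: two subjects, one instrument, one session.
example = {
    "sub-01_ses-01_task-pss_survey.tsv": "pss01\tpss02\n3\t1\n",
    "sub-02_ses-01_task-pss_survey.tsv": "pss01\tpss02\n2\t4\n",
}
tables = aggregate(example)
```

Going the other way would mean guessing from column names alone which session, run, and instrument each value belongs to, which is the extra assumption the paragraph above refers to.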
Proposed Changes
This PR implements a minimal but complete first pass:
- `src/modality-specific-files/survey.md`: documentation chapter
- `src/schema/objects/datatypes.yaml`: `survey` datatype
- `src/schema/objects/modalities.yaml`: `survey` modality
- `src/schema/objects/suffixes.yaml`: `survey` suffix
- `src/schema/rules/files/raw/survey.yaml`: `task` entity template; `.tsv` and `.json` extensions
- `src/schema/rules/modalities.yaml`: `survey` modality → `survey` datatype
- `mkdocs.yml`: survey chapter added to navigation

Example Structure
Below is the subject-, session-, and run-resolved structure this PR supports. A root-level sidecar applies to all matching files via the Inheritance Principle, avoiding duplication across sessions and runs. Per-file sidecars remain valid when instrument versions or languages differ between sessions.
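As a rough illustration of the layout described above, the sketch below builds a file path from the entities and resolves metadata for one file by letting a per-file sidecar override a root-level one. The `survey_path` and `resolve_sidecar` helpers, the entity order, and the `Language` key are assumptions made for this example; only `TaskName` and the `survey` directory and suffix come from the PR text.

```python
def survey_path(sub, task, ses=None, run=None, ext=".tsv"):
    """Build a relative path; sub and task are required, ses and run optional."""
    dirs = ["sub-" + sub]
    parts = ["sub-" + sub]
    if ses is not None:
        dirs.append("ses-" + ses)
        parts.append("ses-" + ses)
    parts.append("task-" + task)
    if run is not None:
        parts.append("run-" + str(run))
    filename = "_".join(parts) + "_survey" + ext
    return "/".join(dirs + ["survey", filename])


def resolve_sidecar(root_meta, file_meta=None):
    """Merge metadata so that the more specific (per-file) sidecar wins."""
    merged = dict(root_meta)
    merged.update(file_meta or {})
    return merged


# Root-level sidecar shared by every matching file (hypothetical values).
root = {"TaskName": "pss", "Language": "en"}

# A session where the instrument was given in another language.
override = {"Language": "de"}

path = survey_path("01", "pss", ses="02", run=1)
meta_default = resolve_sidecar(root)
meta_session = resolve_sidecar(root, override)
```

The shared sidecar is written once, and a per-file sidecar only needs to carry the fields that actually differ for that session or run.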
The `phenotype/` directory in this example is deliberate. It shows that aggregated outputs remain compatible with this proposal, but they are downstream views of structured data rather than the only canonical form.
Reference Implementation
This proposal is grounded in an existing toolchain rather than a purely abstract design discussion:
These references show that the proposed structure is already practical for curation, metadata authoring, template-based reuse, and validation.
Scope
This PR does not propose removing aggregated phenotype tables. It proposes that BIDS should also recognize a canonical modality-style structure for instrument-based phenotypic data, especially when those data are acquired repeatedly across sessions and runs.
Questions for Reviewers
- Should `phenotype/` remain an optional aggregate or derived representation rather than the only primary representation?
- Is `survey` the right directory and suffix label, or should the working group prefer another term such as `pheno`, `assess`, `form`, `inst`, or `meas`?
- This PR uses the `task` entity to identify the instrument (e.g. `task-pss`). Should a dedicated `instrument` entity be considered instead?
- The metadata fields (`TaskName`, `OriginalName`, `StimulusType`, `Respondent`, etc.) follow existing BIDS conventions. Are there missing fields, or any that should move between REQUIRED and OPTIONAL?