Changes from 6 commits
21 commits
e9500ed
Add common media file definitions (suffixes, extensions, metadata, si…
yarikoptic Mar 18, 2026
bd55318
Fix table separator padding for remark-lint compliance
yarikoptic Mar 19, 2026
0a9addd
Reuse existing RecordingDuration instead of introducing Duration
yarikoptic Mar 21, 2026
4faad34
Document AudioSampleRate vs SamplingFrequency distinction
yarikoptic Mar 21, 2026
56be0f6
Include .tif alongside .tiff in image formats table
yarikoptic Mar 21, 2026
8381389
Document relationship between media files and existing photo suffix
yarikoptic Mar 21, 2026
311e335
Render media metadata tables from schema using macros
yarikoptic Mar 23, 2026
4267efe
Render media suffix definitions from schema using macro
yarikoptic Mar 23, 2026
96dca84
Add make_extension_table macro and use it for media format tables
yarikoptic Mar 23, 2026
933b390
Add test for make_extension_table macro
yarikoptic Mar 24, 2026
be841b7
Remove overspecification for "photo" and clarify on variable rate
yarikoptic Jun 3, 2026
aba8721
Clarify Width/Height and add PixelFormat
yarikoptic Jun 3, 2026
a5b7aea
Add VideoFrameCount; prefix-align FrameRate, Width, Height
yarikoptic Jun 3, 2026
e576fea
Rename PixelFormat to ImagePixelFormat and move to MediaImageProperties
yarikoptic Jun 3, 2026
399713d
Minor wording tune up on the choices
yarikoptic Jun 3, 2026
c29ed8b
Add ImageBitDepth (OPTIONAL) under MediaImageProperties
yarikoptic Jun 3, 2026
6bf8f12
Merge branch 'master' into mediafiles
yarikoptic Jun 3, 2026
71b33c9
Add flac extension and AudioBitDepth to common media definitions
bendichter Jun 4, 2026
1330ad9
List flac in the media-files appendix audio formats table
bendichter Jun 4, 2026
811b8fc
Merge pull request #3 from bendichter/media-extra-formats
yarikoptic Jun 4, 2026
f49c5d9
Merge branch 'master' into mediafiles
bendichter Jun 11, 2026
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -47,6 +47,7 @@ nav:
- Coordinate systems: appendices/coordinate-systems.md
- Quantitative MRI: appendices/qmri.md
- Arterial Spin Labeling: appendices/arterial-spin-labeling.md
- Media files: appendices/media-files.md
- Cross modality correspondence: appendices/cross-modality-correspondence.md
- Changelog: CHANGES.md
- The BIDS Website:
184 changes: 184 additions & 0 deletions src/appendices/media-files.md
@@ -0,0 +1,184 @@
# Media Files

## Introduction

Several BIDS datatypes make use of media files — audio recordings, video recordings,
combined audio-video recordings, and still images.
This appendix defines the common file formats, metadata conventions,
and codec identification schemes shared across all datatypes that use media files.

Datatypes that incorporate media files (for example, behavioral recordings or stimuli)
define their own file-naming rules, directory placement, and datatype-specific metadata.
The conventions described here apply uniformly to all such datatypes.

### Relationship to the `photo` suffix

The media file definitions introduced here generalize how media content is handled across BIDS.
The existing `photo` suffix (used for photographs of anatomical landmarks,
head localization coils, and tissue samples) predates this framework and covers
a narrower use case — still images in specific electrophysiology and microscopy datatypes.

The media suffixes (`audio`, `video`, `audiovideo`, `image`) are intended as the
general-purpose mechanism for all media content in BIDS.
In practice, a "photo" could equally be a video of an experimental setup with verbal
narration, an audio recording describing electrode placement, or a drawing rather than
a photograph.
The media file framework should be generally adopted for new datatypes,
and a future proposal may deprecate the `photo` suffix in favor of the broader `image`
suffix with appropriate migration tooling
(see [bids-utils](https://github.com/bids-standard/bids-utils)).

## Supported Formats

### Audio formats

| Format | Extension | Description |
| ---------------------- | --------- | --------------------------------------------- |
| Waveform Audio (WAV) | `.wav` | Uncompressed PCM audio; lossless, large files |
| MP3 | `.mp3` | Lossy compressed audio; widely supported |
| Advanced Audio Coding | `.aac` | Lossy compressed audio; successor to MP3 |
| Ogg Vorbis | `.ogg` | Open lossy compressed audio format |

### Video container formats

| Format | Extension | Description |
| ---------------------- | --------- | ---------------------------------------- |
| MPEG-4 Part 14 | `.mp4` | Widely supported multimedia container |
| Audio Video Interleave | `.avi` | Legacy multimedia container |
| Matroska | `.mkv` | Open, flexible multimedia container |
| WebM | `.webm` | Open format optimized for web delivery |

### Image formats

| Format | Extension | Description |
| ------------------------- | --------------- | -------------------------------------------- |
| JPEG | `.jpg` | Lossy compressed photographic images |
| Portable Network Graphics | `.png` | Lossless compressed images with transparency |
| Scalable Vector Graphics | `.svg` | XML-based vector image format |
| WebP | `.webp` | Modern format supporting lossy and lossless |
| Tag Image File Format | `.tif`, `.tiff` | Lossless format common in scientific imaging |

When choosing a format, consider the trade-off between file size and data fidelity.
Uncompressed or lossless formats (WAV, PNG, TIFF) preserve full quality
but produce larger files.
Lossy formats (MP3, AAC, JPEG) significantly reduce file size
at the cost of some data loss.
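
To make the size side of this trade-off concrete, the following is a minimal sketch of the arithmetic for uncompressed PCM audio. The function name `pcm_size_bytes` is hypothetical, and the 16-bit sample depth is an assumption chosen to match the `pcm_s16le` codec listed later in this appendix; the other values mirror the example sidecar at the end of the appendix.

```python
def pcm_size_bytes(sample_rate_hz: int, channels: int, bit_depth: int, duration_s: float) -> int:
    """Approximate payload size of uncompressed PCM audio (container header excluded)."""
    bytes_per_sample = bit_depth // 8
    return int(sample_rate_hz * channels * bytes_per_sample * duration_s)


# 48 kHz, stereo, 16-bit (assumed), 312.5 seconds
size = pcm_size_bytes(48_000, 2, 16, 312.5)
print(f"{size / 1e6:.1f} MB")  # prints "60.0 MB"
```

A lossy encoding of the same recording would typically be several times smaller, which is the trade-off described above.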

## Media Stream Metadata

Media files SHOULD be accompanied by a JSON sidecar file
containing technical metadata about the media streams.
The following metadata fields are defined for media files:

### Duration

| Field | Suffix | Requirement Level |
| ------------------- | ------------------------------ | ----------------- |
| `RecordingDuration` | `audio`, `video`, `audiovideo` | RECOMMENDED |

`RecordingDuration` is the total duration of the media file in seconds.
This reuses the existing BIDS metadata field already defined for
electrophysiology recordings (EEG, iEEG, MEG, and others).

### Audio stream properties

| Field | Suffix | Requirement Level |
| ------------------- | --------------------- | ----------------- |
| `AudioCodec` | `audio`, `audiovideo` | RECOMMENDED |
| `AudioSampleRate` | `audio`, `audiovideo` | RECOMMENDED |
| `AudioChannelCount` | `audio`, `audiovideo` | RECOMMENDED |
| `AudioCodecRFC6381` | `audio`, `audiovideo` | OPTIONAL |

Note: `AudioSampleRate` is used instead of the existing `SamplingFrequency` field
because audio-video files require distinguishing the audio sampling rate from the
video frame rate. The `Audio` prefix makes this unambiguous in multi-stream containers.

### Visual properties

| Field | Suffix | Requirement Level |
| -------- | ----------------------------------- | ----------------- |
| `Width` | `video`, `audiovideo`, `image` | RECOMMENDED |
| `Height` | `video`, `audiovideo`, `image` | RECOMMENDED |

### Video stream properties

| Field | Suffix | Requirement Level |
| ------------------- | --------------------- | ----------------- |
| `VideoCodec` | `video`, `audiovideo` | RECOMMENDED |
| `FrameRate` | `video`, `audiovideo` | RECOMMENDED |
| `VideoCodecRFC6381` | `video`, `audiovideo` | OPTIONAL |

Comment thread

The proposal includes FrameRate as a recommended field, but it should clarify how to handle variable frame rate (VFR) video. With constant frame rate, a single number is sufficient and any frame's timestamp can be computed as frame_number / frame_rate. With VFR, that arithmetic breaks down and each frame needs an explicit timestamp to be aligned with data on other recordings.

The spec should indicate whether FrameRate is expected to be the average rate, the nominal rate, or undefined for VFR files, and whether a boolean field like VariableFrameRate should accompany it so that downstream tools know they cannot rely on uniform spacing.

Partial progress: the field is now VideoFrameRate (renamed in a5b7aea for prefix consistency) and its description says "For variable rate videos, this value should be the nominal frame rate." (be841b7, line-wrapped in aba8721). Still open from your original ask: a separate VariableFrameRate: boolean flag so downstream tools can short-circuit without parsing the description. Do you think the nominal-rate convention alone is sufficient, or do you still want the explicit boolean? If the latter, happy to add it.

I think it's nice to have both the approximate framerate and the VariableFrameRate: bool for that case
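
The constant-frame-rate arithmetic raised in the review thread (`frame_number / frame_rate`) and the reason it fails for variable-frame-rate video can be sketched as follows. This is only an illustration: the function names `cfr_timestamp` and `has_variable_spacing` are hypothetical, and the tolerance is an arbitrary assumption, not something the proposal defines.

```python
from fractions import Fraction


def cfr_timestamp(frame_number: int, frame_rate: Fraction) -> Fraction:
    """Timestamp in seconds of a frame in constant-frame-rate video."""
    return Fraction(frame_number) / frame_rate


def has_variable_spacing(timestamps: list, tolerance: float = 1e-6) -> bool:
    """Return True if inter-frame intervals differ by more than the tolerance."""
    intervals = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    if len(intervals) < 2:
        return False  # too few frames to compare intervals
    return max(intervals) - min(intervals) > tolerance


# Exact rational arithmetic avoids drift for rates such as 30000/1001 (NTSC).
print(cfr_timestamp(900, Fraction(30)))             # 30
print(cfr_timestamp(30000, Fraction(30000, 1001)))  # 1001
print(has_variable_spacing([0.0, 0.033, 0.1]))      # True
```

When `has_variable_spacing` returns `True`, a single rate value cannot reproduce the frame times, which is the case the proposed `VariableFrameRate` flag is meant to signal.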

## Codec Identification

Codec identification uses two complementary naming systems:

### FFmpeg codec names (RECOMMENDED)

The `AudioCodec` and `VideoCodec` fields use
[FFmpeg codec names](https://www.ffmpeg.org/ffmpeg-codecs.html) as the RECOMMENDED
convention. These names are widely used by media-processing tools and can be
extracted automatically from media files with `ffprobe`:

```bash
ffprobe -v quiet -print_format json -show_streams <file>
```
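
As a rough illustration of how the `ffprobe` output could feed a sidecar, the sketch below maps stream entries to the fields used in this revision of the appendix. The helper names (`probe_streams`, `first_stream`, `streams_to_sidecar`) are hypothetical, and using `avg_frame_rate` for `FrameRate` is an assumption; the proposal does not say which ffprobe value to use.

```python
import json
import subprocess
from fractions import Fraction


def probe_streams(path: str) -> list:
    """Run the ffprobe command shown above and return the parsed stream list."""
    result = subprocess.run(
        ["ffprobe", "-v", "quiet", "-print_format", "json", "-show_streams", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout).get("streams", [])


def first_stream(streams: list, codec_type: str):
    """Return the first stream of the given type, or None if there is none."""
    return next((s for s in streams if s.get("codec_type") == codec_type), None)


def streams_to_sidecar(streams: list) -> dict:
    """Map ffprobe stream entries to sidecar fields (assumed mapping)."""
    sidecar = {}
    video = first_stream(streams, "video")
    audio = first_stream(streams, "audio")
    if video is not None:
        sidecar["VideoCodec"] = video["codec_name"]
        sidecar["Width"] = video["width"]
        sidecar["Height"] = video["height"]
        try:
            # ffprobe reports rates as fraction strings such as "30/1".
            sidecar["FrameRate"] = float(Fraction(video["avg_frame_rate"]))
        except (KeyError, ZeroDivisionError, ValueError):
            pass  # rate unknown (for example "0/0"); leave the field out
    if audio is not None:
        sidecar["AudioCodec"] = audio["codec_name"]
        sidecar["AudioSampleRate"] = int(audio["sample_rate"])  # reported as a string
        sidecar["AudioChannelCount"] = audio["channels"]
    return sidecar


# Hand-written sample resembling ffprobe output, so the sketch runs without ffprobe.
sample_streams = [
    {"codec_type": "video", "codec_name": "h264", "width": 1920, "height": 1080,
     "avg_frame_rate": "30/1"},
    {"codec_type": "audio", "codec_name": "aac", "sample_rate": "48000", "channels": 2},
]
print(json.dumps(streams_to_sidecar(sample_streams), indent=2))
```

In real use, `streams_to_sidecar(probe_streams("recording.mp4"))` would replace the hand-written sample; the file name here is only a placeholder.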

### RFC 6381 codec strings (OPTIONAL)

The `AudioCodecRFC6381` and `VideoCodecRFC6381` fields use
[RFC 6381](https://datatracker.ietf.org/doc/html/rfc6381) codec strings.
These provide precise codec profile and level information useful for
web and broadcast interoperability.

### Common codec reference

| Codec | FFmpeg Name | RFC 6381 String | Notes |
| -------------- | ----------- | ------------------ | ----------------------- |
| H.264 / AVC | `h264` | `avc1.640028` | Most widely supported |
| H.265 / HEVC | `hevc` | `hev1.1.6.L93.B0` | High efficiency |
| VP9 | `vp9` | `vp09.00.10.08` | Open, royalty-free |
| AV1 | `av1` | `av01.0.01M.08` | Next-gen open codec |
| AAC-LC | `aac` | `mp4a.40.2` | Default audio for MP4 |
| MP3 | `mp3` | `mp4a.6B` | Legacy lossy audio |
| Opus | `opus` | `Opus` | Open, low-latency audio |
| FLAC | `flac` | `fLaC` | Open lossless audio |
| PCM 16-bit LE | `pcm_s16le` | — | Uncompressed (WAV) |

The FFmpeg name column shows the value to use for `VideoCodec` or `AudioCodec`.
The RFC 6381 column shows the value for `VideoCodecRFC6381` or `AudioCodecRFC6381`.
RFC 6381 strings vary by profile and level;
the values shown are representative examples.
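
To show what "vary by profile and level" means in practice, here is a small sketch that decodes the H.264 example string from the table. For `avc1` strings, the six hexadecimal digits after the dot encode the profile, the constraint flags, and the level, one byte each. The function name `parse_avc1` is hypothetical, and the profile-name table is deliberately incomplete.

```python
def parse_avc1(codec_string: str) -> dict:
    """Decode an RFC 6381 'avc1.PPCCLL' string into profile, flags, and level."""
    prefix, _, hex_part = codec_string.partition(".")
    if prefix not in ("avc1", "avc3") or len(hex_part) != 6:
        raise ValueError(f"not an H.264 RFC 6381 string: {codec_string!r}")
    profile_idc = int(hex_part[0:2], 16)
    constraint_flags = int(hex_part[2:4], 16)
    level_idc = int(hex_part[4:6], 16)
    # Partial lookup of common profile_idc values; others are reported as "unknown".
    profile_names = {66: "Baseline", 77: "Main", 88: "Extended", 100: "High"}
    return {
        "profile_idc": profile_idc,
        "profile": profile_names.get(profile_idc, "unknown"),
        "constraint_flags": constraint_flags,
        "level": level_idc / 10,  # level_idc 40 means level 4.0
    }


print(parse_avc1("avc1.640028"))
# {'profile_idc': 100, 'profile': 'High', 'constraint_flags': 0, 'level': 4.0}
```

A file encoded with the Main profile at level 3.1 would instead carry `avc1.4D401F`, which is why the table values are only representative.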

## Privacy Considerations

Media files — particularly audio and video recordings — may contain
personally identifiable information (PII), including but not limited to:

- Voices and speech content
- Facial features and other physical characteristics
- Background environments that could identify locations
- Metadata embedded in file headers (for example, GPS coordinates, device identifiers)

Researchers MUST ensure that sharing of media files complies with the
informed consent obtained from participants and with applicable privacy regulations.
De-identification techniques (for example, voice distortion, face blurring,
metadata stripping) SHOULD be applied where appropriate before data sharing.

## Example

A complete sidecar JSON file for an audio-video recording:

```json
{
  "RecordingDuration": 312.5,
  "VideoCodec": "h264",
  "VideoCodecRFC6381": "avc1.640028",
  "FrameRate": 30,
  "Width": 1920,
  "Height": 1080,
  "AudioCodec": "aac",
  "AudioCodecRFC6381": "mp4a.40.2",
  "AudioSampleRate": 48000,
  "AudioChannelCount": 2
}
```
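
A sidecar like this one could be checked against the RECOMMENDED fields from the tables above with a few lines of code. The sketch below is illustrative only: `RECOMMENDED_FIELDS` and `missing_recommended` are hypothetical names, the field sets are transcribed from the tables as they appear in this revision, and this is not how the BIDS validator works.

```python
# RECOMMENDED fields per suffix, transcribed from the metadata tables in this revision.
RECOMMENDED_FIELDS = {
    "audio": {"RecordingDuration", "AudioCodec", "AudioSampleRate", "AudioChannelCount"},
    "video": {"RecordingDuration", "Width", "Height", "VideoCodec", "FrameRate"},
    "audiovideo": {
        "RecordingDuration", "AudioCodec", "AudioSampleRate", "AudioChannelCount",
        "Width", "Height", "VideoCodec", "FrameRate",
    },
    "image": {"Width", "Height"},
}


def missing_recommended(suffix: str, sidecar: dict) -> list:
    """Return the RECOMMENDED fields absent from a sidecar, sorted by name."""
    return sorted(RECOMMENDED_FIELDS[suffix] - sidecar.keys())


example = {
    "RecordingDuration": 312.5,
    "VideoCodec": "h264",
    "FrameRate": 30,
    "Width": 1920,
    "Height": 1080,
    "AudioCodec": "aac",
    "AudioSampleRate": 48000,
    "AudioChannelCount": 2,
}
print(missing_recommended("audiovideo", example))       # []
print(missing_recommended("image", {"Width": 640}))     # ['Height']
```

Because the fields are RECOMMENDED rather than REQUIRED, a non-empty result would warrant a warning, not an error.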
62 changes: 62 additions & 0 deletions src/schema/objects/extensions.yaml
@@ -1,12 +1,24 @@
---
# This file describes valid file extensions in the specification.
aac:
  value: .aac
  display_name: Advanced Audio Coding
  description: |
    An [Advanced Audio Coding](https://en.wikipedia.org/wiki/Advanced_Audio_Coding)
    audio file.
ave:
  value: .ave
  display_name: AVE # not sure what ave stands for
  description: |
    File containing data averaged by segments of interest.

    Used by KIT, Yokogawa, and Ricoh MEG systems.
avi:
  value: .avi
  display_name: Audio Video Interleave
  description: |
    An [Audio Video Interleave](https://en.wikipedia.org/wiki/Audio_Video_Interleave)
    media container file.
bdf:
  value: .bdf
  display_name: Biosemi Data Format
@@ -153,6 +165,22 @@ md:
  display_name: Markdown
  description: |
    A Markdown file.
mkv:
  value: .mkv
  display_name: Matroska Video
  description: |
    A [Matroska](https://www.matroska.org/) media container file.
mp3:
  value: .mp3
  display_name: MP3 Audio
  description: |
    An [MP3](https://en.wikipedia.org/wiki/MP3) audio file.
mp4:
  value: .mp4
  display_name: MPEG-4 Part 14
  description: |
    An [MPEG-4 Part 14](https://en.wikipedia.org/wiki/MP4_file_format)
    media container file.
mefd:
  value: .mefd/
  display_name: Multiscale Electrophysiology File Format Version 3.0
@@ -201,6 +229,12 @@ nwb:
    A [Neurodata Without Borders](https://nwb-schema.readthedocs.io/en/latest/) file.

    Each recording consists of a single `.nwb` file.
ogg:
  value: .ogg
  display_name: Ogg Vorbis
  description: |
    An [Ogg](https://en.wikipedia.org/wiki/Ogg) audio file,
    typically containing Vorbis-encoded audio.
OMEBigTiff:
  value: .ome.btf
  display_name: Open Microscopy Environment BigTIFF
@@ -249,6 +283,11 @@ snirf:
  display_name: Shared Near Infrared Spectroscopy Format
  description: |
    HDF5 file organized according to the [SNIRF specification](https://github.com/fNIRS/snirf)
svg:
  value: .svg
  display_name: Scalable Vector Graphics
  description: |
    A [Scalable Vector Graphics](https://en.wikipedia.org/wiki/SVG) image file.
sqd:
  value: .sqd
  display_name: SQD
@@ -263,6 +302,12 @@ tif:
  display_name: Tag Image File Format
  description: |
    A [Tag Image File Format](https://en.wikipedia.org/wiki/TIFF) file.
tiff:
  value: .tiff
  display_name: Tag Image File Format
  description: |
    A [Tag Image File Format](https://en.wikipedia.org/wiki/TIFF) image file.
    The `.tiff` extension is the long form of `.tif`.
trg:
  value: .trg
  display_name: KRISS TRG
@@ -307,6 +352,23 @@ vmrk:
    A text marker file in the
    [BrainVision Core Data Format](https://www.brainproducts.com/support-resources/brainvision-core-data-format-1-0/).
    These files come in three-file sets, including a `.vhdr`, a `.vmrk`, and a `.eeg` file.
wav:
  value: .wav
  display_name: Waveform Audio
  description: |
    A [Waveform Audio File Format](https://en.wikipedia.org/wiki/WAV)
    audio file, typically containing uncompressed PCM audio.
webm:
  value: .webm
  display_name: WebM
  description: |
    A [WebM](https://www.webmproject.org/) media container file,
    typically containing VP8/VP9 video and Vorbis/Opus audio.
webp:
  value: .webp
  display_name: WebP Image
  description: |
    A [WebP](https://en.wikipedia.org/wiki/WebP) image file.
Any:
  value: .*
  display_name: Any Extension