IRCAM RAVE realtime audio VAE nodes for Daydream Scope.
RAVE (Realtime Audio Variational autoEncoder, Caillon & Esling, IRCAM) is a neural audio model that encodes waveforms into a compact latent space and decodes them back. Trained checkpoints cover speech, percussion, orchestral instruments, field recordings, and more; manipulating the latent between encode and decode gives real-time timbre transfer and neural audio effects.
This plugin exposes RAVE to Scope's node graph so you can build patches like:
AudioSource → RAVE Timbre → Sink
or the fully split variant for more control:
AudioSource → RAVE Encode → RAVE Latent Manip → RAVE Decode → Sink
Here RAVE Latent Manip takes its control input from a curve, OSC, or MIDI.
Install as an editable Scope plugin (development):
uv pip install --editable C:/_dev/projects/rave-scope-plugin
Or install from a git URL in Scope's plugin UI:
git+https://github.com/<you>/rave-scope-plugin
Restart Scope after install. The RAVE nodes appear under the rave category.
RAVE models are TorchScript .ts files exported from acids-ircam/RAVE with --streaming. The plugin scans:
~/.daydream-scope/rave-models/*.ts
(overridable via DAYDREAM_SCOPE_RAVE_MODELS_DIR). Drop .ts files into that directory and they show up in the RAVE Loader dropdown.
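The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the plugin's actual code: the function names `models_dir` and `list_models` are hypothetical, and using the file stem as the dropdown label is an assumption. Only the default path and the `DAYDREAM_SCOPE_RAVE_MODELS_DIR` variable come from the text.

```python
from __future__ import annotations

import os
from pathlib import Path

ENV_VAR = "DAYDREAM_SCOPE_RAVE_MODELS_DIR"
DEFAULT_DIR = Path.home() / ".daydream-scope" / "rave-models"


def models_dir() -> Path:
    """Return the override directory if the variable is set, else the default."""
    override = os.environ.get(ENV_VAR)
    return Path(override).expanduser() if override else DEFAULT_DIR


def list_models(directory: Path | None = None) -> list[str]:
    """Return sorted model names (file stems) for every *.ts file found.

    Hypothetical helper: the real plugin may label or filter models differently.
    """
    directory = directory if directory is not None else models_dir()
    if not directory.is_dir():
        return []
    return sorted(p.stem for p in directory.glob("*.ts") if p.is_file())
```

Calling `list_models()` with the variable unset scans the default directory; a missing directory yields an empty list rather than an error.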
Sources of pretrained checkpoints (all CC-BY-NC-4.0, non-commercial):
- IRCAM Forum: https://acids-ircam.github.io/rave_models_download
  - Direct download: https://play.forum.ircam.fr/rave-vst-api/get_model/<name>
  - Names include: vintage, percussion, VCTK, darbouka_onnx, nasa, sol_ordinario, musicnet
- HuggingFace, Intelligent-Instruments-Lab/rave-models: https://huggingface.co/Intelligent-Instruments-Lab/rave-models
  - Names include: birds_dawnchorus, voice_multivoice, guitar_iil, organ_bach, water_pondbrain, and many more
Example download from the IRCAM forum (curl is fine):
mkdir -p ~/.daydream-scope/rave-models
curl -L "https://play.forum.ircam.fr/rave-vst-api/get_model/vintage" \
-o ~/.daydream-scope/rave-models/vintage.ts
| Node | Purpose |
|---|---|
| RAVE Loader | Load a .ts model; outputs a rave handle |
| RAVE Timbre | One-shot audio → latent manipulation → audio in a single node (the headline demo) |
| RAVE Encode | Audio → RAVE latent |
| RAVE Decode | RAVE latent → audio |
| RAVE Latent Manip | Per-dim bias/scale + additive noise on a latent |
| RAVE Prior | Autonomous generation from the model's prior (--prior exports) |
All audio-producing nodes output at 48 kHz stereo to match Scope's audio bus. The plugin internally resamples to the model's native rate (usually 44.1 or 48 kHz), runs inference in the model's native channel count, and lifts the result back to stereo.
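The rate and channel handling above amounts to a small conversion chain on either side of inference. The sketch below shows one way it could look with NumPy. Everything here is an assumption made for illustration: the helper names are hypothetical, linear interpolation stands in for whatever resampler the plugin really uses, and downmixing by averaging and upmixing by duplication are guesses about how the channel count is matched.

```python
import numpy as np

BUS_RATE = 48_000  # Scope's audio bus rate, per the text above


def resample_linear(x: np.ndarray, src_rate: int, dst_rate: int) -> np.ndarray:
    """Resample (channels, samples) audio by linear interpolation.

    Illustrative quality only; a production resampler would band-limit first.
    """
    if src_rate == dst_rate:
        return x
    n_out = int(round(x.shape[-1] * dst_rate / src_rate))
    # Positions of the output samples, expressed in input-sample units.
    t_out = np.arange(n_out) * (src_rate / dst_rate)
    t_in = np.arange(x.shape[-1])
    return np.stack([np.interp(t_out, t_in, ch) for ch in x])


def to_model_format(stereo: np.ndarray, model_rate: int, model_channels: int) -> np.ndarray:
    """Convert (2, n) bus audio to the model's rate and channel count."""
    # Assumption: a mono model receives the average of left and right.
    x = stereo if model_channels == 2 else stereo.mean(axis=0, keepdims=True)
    return resample_linear(x, BUS_RATE, model_rate)


def to_bus_format(y: np.ndarray, model_rate: int) -> np.ndarray:
    """Convert model output back to 48 kHz stereo."""
    y = resample_linear(y, model_rate, BUS_RATE)
    # Assumption: mono output is duplicated to both channels.
    return y if y.shape[0] == 2 else np.repeat(y, 2, axis=0)
```

In this sketch the model itself would run between `to_model_format` and `to_bus_format`. For arbitrary block sizes the round trip may differ from the input length by a sample, which a streaming implementation would need to absorb with buffering.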
RAVE Timbre and RAVE Latent Manip expose eight per-dimension bias_* and scale_* sliders plus global noise and wet controls. RAVE's first ~8 latent dimensions carry the perceptually salient axes (brightness, attack, roughness, vowel colour, and so on; the exact axes vary per model). Sweep them live and the timbre morphs.
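As a rough picture of what those controls do to a latent, here is a minimal NumPy sketch. It is not the plugin's implementation: the function name `manipulate_latent` is hypothetical, and three details are assumptions, namely that scale is applied before bias, that noise is Gaussian and added to every dimension, and that wet blends between the untouched and processed latent rather than between audio signals.

```python
import numpy as np

N_CONTROLS = 8  # the eight bias_* / scale_* sliders described above


def manipulate_latent(z, bias, scale, noise=0.0, wet=1.0, rng=None):
    """Scale and bias the first eight latent dims, add noise, blend with the input.

    z: array of shape (latent_dim, frames); bias, scale: length-8 sequences.
    """
    rng = rng if rng is not None else np.random.default_rng()
    z = np.asarray(z, dtype=np.float32)
    bias = np.asarray(bias, dtype=np.float32)
    scale = np.asarray(scale, dtype=np.float32)
    n = min(N_CONTROLS, z.shape[0])

    out = z.copy()
    # Assumed order: scale first, then bias, broadcast across all frames.
    out[:n] = out[:n] * scale[:n, None] + bias[:n, None]
    if noise > 0.0:
        # Assumed: zero-mean Gaussian noise with standard deviation `noise`.
        out = out + noise * rng.standard_normal(out.shape).astype(np.float32)
    # Assumed: wet=0 returns the input latent, wet=1 the fully processed one.
    return (1.0 - wet) * z + wet * out
```

With all scales at 1, all biases at 0, and noise at 0, the function returns the latent unchanged, which matches the intuition of sliders at their neutral positions.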
This plugin is MIT-licensed. The RAVE code and pretrained models are licensed separately under CC-BY-NC-4.0 (non-commercial use). Do not redistribute the .ts files commercially; users must download their own.
Original RAVE work: Caillon & Esling 2021, arXiv:2111.05011. See github.com/acids-ircam/RAVE and the ACIDS-IRCAM team's publications for background.