OpenBEATs

OpenBEATs is a general-purpose audio encoder pre-trained on speech, music, environmental sound, and bioacoustics. This package runs it on audio and returns patch-level embeddings, plus class probabilities when a fine-tuned checkpoint is used.

Install

pip install openbeats

This adds two commands, openbeats-infer and openbeats-download. The dependencies are kept light (torch, torchaudio, numpy, huggingface-hub, pyyaml, soundfile), and torch is pinned loosely so an existing build is not replaced. To avoid touching an existing environment altogether, install the package into its own isolated environment with uv or pipx:

uv tool install openbeats     # or: pipx install openbeats

Usage

From the command line

Handy for a quick look:

openbeats-infer --checkpoint espnet/OpenBEATS-Large-i2-as20k \
    --audio audio.wav --out embeddings.npz

--checkpoint accepts a Hugging Face repo id (downloaded automatically), a local directory, or a checkpoint file. The output .npz holds patch_embeddings with shape (num_patches, 1024), plus logits and probs when the checkpoint has a classifier. Other options include --device cuda, --max-layer N, and --chunk-seconds 10 for long recordings.
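To show how the saved file might be consumed, here is a minimal sketch that reads the documented keys back with NumPy. So that it runs without the model, it first writes a stand-in embeddings.npz. The patch count (496), the class count (527), the sigmoid used to fake probs, and the assumption that probs is a single vector of class probabilities are all illustrative and not taken from the package.

```python
import os
import tempfile

import numpy as np

# Stand-in for the file written by openbeats-infer, so the sketch runs on its own.
# 496 patches and 527 classes are assumed values for illustration only.
rng = np.random.default_rng(0)
path = os.path.join(tempfile.mkdtemp(), "embeddings.npz")
fake_logits = rng.normal(size=527).astype(np.float32)
np.savez(
    path,
    patch_embeddings=rng.normal(size=(496, 1024)).astype(np.float32),
    logits=fake_logits,
    probs=1.0 / (1.0 + np.exp(-fake_logits)),  # assumed activation, illustration only
)

with np.load(path) as data:
    patches = data["patch_embeddings"]
    print("patch_embeddings:", patches.shape)

    # logits and probs exist only when the checkpoint has a classifier head.
    if "probs" in data.files:
        probs = data["probs"].reshape(-1)  # assumes one vector of class probabilities
        top5 = np.argsort(probs)[::-1][:5]  # indices of the five largest probabilities
        for idx in top5:
            print(f"class {idx}: {probs[idx]:.3f}")
```

With a real output file, drop the stand-in section and point path at the file passed to --out. Checking data.files before reading probs keeps the same script usable for checkpoints without a classifier.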

From Python

from openbeats.model import OpenBeats
from openbeats.utils import load_audio

# load model
model = OpenBeats.from_pretrained("espnet/OpenBEATS-Large-i2-as20k", device="cuda")

# encode directly from a file with any sample rate
out = model.encode_file("audio.wav")              # pass chunk_seconds=10 for long audio

# or load the waveform as a 16 kHz mono array with values in [-1, 1]
wav, sr = load_audio("audio.wav")
# and pass it to the model
out = model.encode(wav, sr)

print(out["patch_embeddings"].shape)               # (num_patches, 1024)
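Patch-level embeddings are often reduced to a single clip-level vector before retrieval or classification. The sketch below shows one common choice, mean pooling followed by L2 normalisation, applied to random stand-in arrays with the documented (num_patches, 1024) shape. The helper clip_embedding is hypothetical and not part of the openbeats package, and the pooling strategy is an assumption rather than something the authors prescribe.

```python
import numpy as np


def clip_embedding(patch_embeddings):
    """Mean-pool (num_patches, dim) patch embeddings into one unit-length vector."""
    patches = np.asarray(patch_embeddings, dtype=np.float32)
    pooled = patches.mean(axis=0)  # average over the patch axis
    return pooled / (np.linalg.norm(pooled) + 1e-12)  # guard against division by zero


# Random stand-ins for out["patch_embeddings"] from two clips of different length.
rng = np.random.default_rng(0)
patches_a = rng.normal(size=(496, 1024))
patches_b = rng.normal(size=(248, 1024))

emb_a = clip_embedding(patches_a)
emb_b = clip_embedding(patches_b)

# For unit-length vectors the dot product equals the cosine similarity.
similarity = float(emb_a @ emb_b)
print(emb_a.shape, round(similarity, 4))
```

If out["patch_embeddings"] turns out to be a torch tensor rather than a NumPy array, move it to the CPU and convert it (for example with .detach().cpu().numpy()) before passing it to the helper.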

Checkpoints

The variants (Base and Large, plus AudioSet and bioacoustics fine-tunes) live in the espnet OpenBEATs collection on Hugging Face.

Citation

If you use OpenBEATs, please cite:

@INPROCEEDINGS{11230965,
  author={Bharadwaj, Shikhar and Cornell, Samuele and Choi, Kwanghee and Fukayama, Satoru and Shim, Hye-Jin and Deshmukh, Soham and Watanabe, Shinji},
  booktitle={2025 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
  title={OpenBEATs: A Fully Open-Source General-Purpose Audio Encoder},
  year={2025},
  volume={},
  number={},
  pages={1-5},
  keywords={Training;Representation learning;Codes;Conferences;Pipelines;Signal processing;Cognition;Robustness;Reproducibility of results;Question answering (information retrieval)},
  doi={10.1109/WASPAA66052.2025.11230965}}

If you use the checkpoints trained for our ICME 2025 Audio Encoder Challenge submission, please also cite:

@article{bharadwaj2026cmu,
  title={The CMU-AIST submission for the ICME 2025 Audio Encoder Challenge},
  author={Bharadwaj, Shikhar and Cornell, Samuele and Choi, Kwanghee and Shim, Hye-jin and Deshmukh, Soham and Fukayama, Satoru and Watanabe, Shinji},
  journal={arXiv preprint arXiv:2601.16273},
  year={2026}
}
