What's All the FUSS About Free Universal Sound Separation Data?
Scott Wisdom, Hakan Erdogan, Daniel Ellis, Romain Serizel (MULTISPEECH), Nicolas Turpault (MULTISPEECH), Eduardo Fonseca, Justin Salamon, Prem Seetharaman, John Hershey

TL;DR
The paper introduces FUSS, an open-domain sound separation dataset released with simulation tools and a baseline model, to support research on separating mixtures of an unknown number of sounds.
Contribution
It presents the FUSS dataset of diverse sound classes, open-source tools for simulating new mixtures, and an open-source baseline model that separates a variable number of sources.
Findings
Baseline model achieves 9.8 dB SI-SNRi on multi-source mixtures.
Dataset covers 357 sound classes with 23 hours of data.
Open-source tools enable diverse mixture generation.
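As a rough illustration of the SI-SNRi figure quoted above, the sketch below computes scale-invariant SNR and its improvement over the unprocessed mixture with NumPy. It uses one common definition (projecting the estimate onto the reference, with no mean removal); the function names `si_snr` and `si_snr_improvement` and the `eps` constant are assumptions made for this example and are not taken from the FUSS code.

```python
import numpy as np

def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant SNR in dB between a 1-D estimate and a 1-D reference."""
    estimate = np.asarray(estimate, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    # Scale the reference by the least-squares gain that best matches the estimate.
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    # Everything in the estimate that the scaled reference does not explain.
    residual = estimate - target
    return 10.0 * np.log10(
        (np.dot(target, target) + eps) / (np.dot(residual, residual) + eps)
    )

def si_snr_improvement(estimate, reference, mixture):
    """SI-SNRi: gain of the estimate over using the raw mixture as the estimate."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)

# Toy usage with synthetic signals (not FUSS data).
rng = np.random.default_rng(0)
reference = rng.standard_normal(16000)
interference = rng.standard_normal(16000)
mixture = reference + interference
estimate = reference + 0.1 * interference  # interference amplitude reduced 10x
improvement_db = si_snr_improvement(estimate, reference, mixture)
```

Because the reference is rescaled before the comparison, multiplying the estimate by any nonzero gain leaves the score unchanged, which is the sense in which the metric is scale-invariant.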
Abstract
We introduce the Free Universal Sound Separation (FUSS) dataset, a new corpus for experiments in separating mixtures of an unknown number of sounds from an open domain of sound types. The dataset consists of 23 hours of single-source audio data drawn from 357 classes, which are used to create mixtures of one to four sources. To simulate reverberation, an acoustic room simulator is used to generate impulse responses of box shaped rooms with frequency-dependent reflective walls. Additional open-source data augmentation tools are also provided to produce new mixtures with different combinations of sources and room simulations. Finally, we introduce an open-source baseline separation model, based on an improved time-domain convolutional network (TDCN++), that can separate a variable number of sources in a mixture. This model achieves 9.8 dB of scale-invariant signal-to-noise ratio…
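The mixture construction described in the abstract (one to four sources, each reverberated by a room impulse response, then summed) can be sketched roughly as follows. This is a toy example, not the released FUSS tooling: `make_mixture` is a hypothetical helper, the 16 kHz sample rate is an assumption, and exponentially decaying noise stands in for the impulse responses that the acoustic room simulator would produce.

```python
import numpy as np

def make_mixture(sources, impulse_responses):
    """Reverberate each source with its impulse response, then sum them.

    Returns the mixture and the reverberant sources, which serve as the
    separation targets. Hypothetical helper, not part of the FUSS tools.
    """
    length = len(sources[0])
    reverberant = []
    for source, rir in zip(sources, impulse_responses):
        # Convolve with the impulse response and trim to the source length.
        reverberant.append(np.convolve(source, rir)[:length])
    reverberant = np.stack(reverberant)
    return reverberant.sum(axis=0), reverberant

rng = np.random.default_rng(0)
sample_rate = 16000  # assumed for this sketch
num_sources = int(rng.integers(1, 5))  # uniformly drawn from 1, 2, 3, 4

# Placeholder one-second sources; real use would load single-source clips.
sources = [rng.standard_normal(sample_rate) for _ in range(num_sources)]

# Placeholder impulse responses: noise with an exponential decay envelope.
decay = np.exp(-np.arange(800) / 100.0)
rirs = [rng.standard_normal(800) * decay for _ in range(num_sources)]

mixture, references = make_mixture(sources, rirs)
```

Drawing a new source count, new sources, and new impulse responses on each call is the kind of recombination the augmentation tools are described as providing.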
