Fast accuracy estimation of deep learning based multi-class musical source separation

Alexandru Mocanu; Benjamin Ricaud; Milos Cernak

arXiv:2010.09453 · cs.SD · December 2, 2021

TL;DR

This paper introduces a rapid, dataset-agnostic method to estimate the separability of musical instruments in audio recordings, aiding efficient neural network training without extensive data collection or model tuning.

Contribution

It proposes an oracle-based separability measure that accurately predicts deep learning separation performance, revealing limitations of waveform-based methods and the effectiveness of the ideal ratio mask.

Findings

01 The ideal ratio mask provides an accurate proxy for separation performance.

02 Waveform-based methods like TasNet face limitations similar to those of time-frequency (TF) methods.

03 The proposed measure enables efficient dataset and sample selection for training.

Abstract

Music source separation is the task of extracting all the instruments from a given song. Recent breakthroughs on this challenge have centered on a single dataset, MUSDB, which is limited to four instrument classes. Building larger datasets with more instruments is costly and time-consuming, both in collecting data and in training deep neural networks (DNNs). In this work, we propose a fast method to evaluate the separability of instruments in any dataset without training and tuning a DNN. This separability measure helps to select appropriate samples for the efficient training of neural networks. Based on the oracle principle with an ideal ratio mask, our approach is an excellent proxy for estimating the separation performance of state-of-the-art deep learning approaches such as TasNet or Open-Unmix. Our results contribute to revealing two essential points for audio source separation: 1) the ideal…
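
The oracle principle mentioned in the abstract can be illustrated with a minimal sketch. It assumes the isolated sources are available, builds an ideal ratio mask from their magnitude spectra, applies it to the mixture, and scores each reconstruction. Everything here is an assumption made for illustration and not taken from the paper: the function names, the non-overlapping rectangular framing (a practical system would use an overlapping windowed STFT), the frame size, and the plain energy-ratio SDR (not the BSS Eval definition).

```python
import numpy as np

def frames_fft(x, n_fft=1024):
    """Real FFT of non-overlapping frames (simplified stand-in for an STFT)."""
    n_frames = len(x) // n_fft
    return np.fft.rfft(x[: n_frames * n_fft].reshape(n_frames, n_fft), axis=1)

def frames_ifft(spec, n_fft=1024):
    """Inverse of frames_fft: invert each frame and concatenate."""
    return np.fft.irfft(spec, n=n_fft, axis=1).reshape(-1)

def ideal_ratio_masks(source_specs, eps=1e-12):
    """One mask per source: |S_i| / sum_j |S_j|, computed per bin."""
    mags = np.abs(np.stack(source_specs))
    return mags / (mags.sum(axis=0) + eps)

def sdr_db(reference, estimate, eps=1e-12):
    """Plain signal-to-distortion ratio in dB (assumed metric)."""
    noise = reference - estimate
    return 10 * np.log10((np.sum(reference**2) + eps) / (np.sum(noise**2) + eps))

def oracle_irm_sdr(sources, n_fft=1024):
    """Oracle score per source: mask the mixture with the IRM, then measure SDR."""
    specs = [frames_fft(s, n_fft) for s in sources]
    mix_spec = np.sum(specs, axis=0)  # the FFT is linear, so this is the mixture
    masks = ideal_ratio_masks(specs)
    scores = []
    for spec, mask in zip(specs, masks):
        estimate = frames_ifft(mask * mix_spec, n_fft)
        reference = frames_ifft(spec, n_fft)
        scores.append(sdr_db(reference, estimate))
    return scores

# Toy example: two synthetic "instruments" that barely overlap in frequency.
sr = 8000
t = np.arange(2 * sr) / sr
low = np.sin(2 * np.pi * 220 * t)
high = 0.5 * np.sin(2 * np.pi * 1760 * t)
scores = oracle_irm_sdr([low, high])
```

No network is trained at any point, which is why such a score is cheap to compute. Sources that occupy the same time-frequency bins would score lower under this sketch, which matches the intuition of a separability measure.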

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Music Technology and Sound Studies