All for One and One for All: Improving Music Separation by Bridging Networks

Ryosuke Sawata; Stefan Uhlich; Shusuke Takahashi; Yuki Mitsufuji

arXiv:2010.04228·eess.AS·May 12, 2021

TL;DR

This paper introduces a multi-domain loss (MDL) and two combination schemes to improve music separation with deep neural networks. MDL exploits both the frequency-domain and time-domain representations of audio, while the combination schemes exploit relationships among instruments; together they improve the performance of the Open-Unmix (UMX) system.

Contribution

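It proposes two loss functions, MDL and a combination loss (CL), that are used only during training and therefore leave the inference step unchanged, so they can be applied to many existing DNN-based separation methods. It also proposes an architecture modification, the CrossNet structure, that considers the instruments jointly.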

Findings

01 Performance of Open-Unmix improved with proposed schemes

02 Multi-domain loss enhances signal representation

03 Joint instrument consideration boosts separation quality

Abstract

This paper proposes several improvements for music separation with deep neural networks (DNNs), namely a multi-domain loss (MDL) and two combination schemes. First, by using MDL we take advantage of the frequency and time domain representation of audio signals. Next, we utilize the relationship among instruments by jointly considering them. We do this on the one hand by modifying the network architecture and introducing a CrossNet structure. On the other hand, we consider combinations of instrument estimates by using a new combination loss (CL). MDL and CL can easily be applied to many existing DNN-based separation methods as they are merely loss functions which are only used during training and which do not affect the inference step. Experimental results show that the performance of Open-Unmix (UMX), a well-known and state-of-the-art open source library for music separation, can be…
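To make the two training-only losses more concrete, the following is a minimal sketch and not the authors' implementation. It assumes mean squared error for both the frequency-domain and time-domain terms, a naive unwindowed DFT in place of a real STFT, a hypothetical `time_weight` parameter, and a combination loss that averages over every proper, non-empty subset of instruments; the abstract specifies none of these details. The instrument names and all function names are illustrative.

```python
import cmath
import math
from itertools import combinations


def magnitude_frames(signal, frame_len=8, hop=4):
    """Flattened magnitude spectra of overlapping frames (naive DFT, no window)."""
    mags = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        for k in range(frame_len // 2 + 1):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n, x in enumerate(frame)
            )
            mags.append(abs(coeff))
    return mags


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def multi_domain_loss(estimate, target, time_weight=1.0):
    """Frequency-domain term plus weighted time-domain term.

    Assumption: MSE in both domains; the abstract does not name the terms.
    """
    freq_term = mse(magnitude_frames(estimate), magnitude_frames(target))
    time_term = mse(estimate, target)
    return freq_term + time_weight * time_term


def combination_loss(estimates, targets, loss_fn=multi_domain_loss):
    """Average loss_fn over summed groups of instrument signals.

    Assumption: every proper, non-empty subset of instruments is used,
    so at least two instruments are required.
    """
    names = sorted(targets)
    losses = []
    for size in range(1, len(names)):
        for group in combinations(names, size):
            est_sum = [sum(s) for s in zip(*(estimates[n] for n in group))]
            tgt_sum = [sum(s) for s in zip(*(targets[n] for n in group))]
            losses.append(loss_fn(est_sum, tgt_sum))
    return sum(losses) / len(losses)


# Toy data: four hypothetical "instruments", 16 samples each.
targets = {
    name: [math.sin(2 * math.pi * freq * n / 16) for n in range(16)]
    for name, freq in [("bass", 1), ("drums", 2), ("other", 3), ("vocals", 4)]
}
# Imperfect estimates: the vocals estimate is scaled down by 10 %.
estimates = dict(targets)
estimates["vocals"] = [0.9 * x for x in targets["vocals"]]

perfect = combination_loss(targets, targets)
imperfect = combination_loss(estimates, targets)
```

Both functions only compare network outputs with targets, so in a sketch like this they would be called inside the training loop and dropped at inference, which matches the abstract's point that MDL and CL do not affect the inference step. In practice a tensor library with automatic differentiation would replace the plain-Python arithmetic.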

Taxonomy

Methods: Multi-Domain Loss · Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Bidirectional LSTM