Transfer Learning with Jukebox for Music Source Separation
W. Zai El Amri, O. Tautz, H. Ritter, A. Melnik

TL;DR
This paper uses transfer learning to adapt a pre-trained Jukebox model for music source separation from a single audio channel, achieving competitive results with fewer computational resources.
Contribution
It introduces a transfer learning approach that leverages Jukebox for source separation, offering a faster and more resource-efficient alternative to existing methods.
Findings
Performance comparable to state-of-the-art methods
Reduced training time and computational resources
Open-source implementation available
Abstract
In this work, we demonstrate how a publicly available, pre-trained Jukebox model can be adapted for the problem of audio source separation from a single mixed audio channel. Our neural network architecture, which uses transfer learning, is quick to train, and the results demonstrate performance comparable to other state-of-the-art approaches that require far more compute resources, training data, and time. We provide an open-source code implementation of our architecture (https://github.com/wzaielamri/unmix).
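The transfer learning idea mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' architecture and does not use Jukebox. It only shows one common generic pattern: keep pre-trained weights frozen and train a small new part on the target task. The names `PRETRAINED`, `encode`, and `fine_tune`, the weight values, and the toy regression task are all assumptions made up for illustration.

```python
import random

random.seed(0)

# Hypothetical "pre-trained" feature weights. In a real setting these would
# be loaded from a published checkpoint; the values here are made up.
PRETRAINED = (0.5, -1.0, 2.0)


def encode(x, enc_w):
    """Frozen feature extractor: one feature per pre-trained weight."""
    return [w * x for w in enc_w]


def predict(x, enc_w, head_w):
    """New task head: a linear combination of the frozen features."""
    return sum(h * f for h, f in zip(head_w, encode(x, enc_w)))


def mse(data, enc_w, head_w):
    """Mean squared error of the model over (input, target) pairs."""
    return sum((predict(x, enc_w, head_w) - y) ** 2 for x, y in data) / len(data)


def fine_tune(data, enc_w, head_w, lr=0.1, epochs=200):
    """Gradient descent on the head only; enc_w is never updated."""
    head_w = list(head_w)
    for _ in range(epochs):
        grads = [0.0] * len(head_w)
        for x, y in data:
            feats = encode(x, enc_w)
            err = sum(h * f for h, f in zip(head_w, feats)) - y
            for i, f in enumerate(feats):
                grads[i] += 2.0 * err * f / len(data)
        head_w = [h - lr * g for h, g in zip(head_w, grads)]
    return head_w


# Toy target task (an assumption): learn y = 3x from 20 random samples.
xs = [random.uniform(-1.0, 1.0) for _ in range(20)]
data = [(x, 3.0 * x) for x in xs]

initial_head = [0.0, 0.0, 0.0]
initial_loss = mse(data, PRETRAINED, initial_head)
tuned_head = fine_tune(data, PRETRAINED, initial_head)
final_loss = mse(data, PRETRAINED, tuned_head)
```

Only the three head weights are trained here, which is why this kind of setup can need far less compute and data than training every parameter from scratch. Which parts of a pre-trained model are frozen and which are trained is a design choice that varies between methods.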
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis
Methods: Convolution · Residual Connection · Layer Normalization · VQ-VAE · Dense Connections · Position-Wise Feed-Forward Layer · Dilated Convolution · Jukebox
