Deep Remix: Remixing Musical Mixtures Using a Convolutional Deep Neural Network
Andrew J. R. Simpson, Gerard Roma, Mark D. Plumbley

TL;DR
This paper uses a convolutional deep neural network to re-mix musical mixtures by adjusting the vocal level. Because re-mixing does not require complete source separation, small changes in vocal gain can be applied with very little distortion.
Contribution
The study uses a convolutional deep neural network (DNN), trained to estimate 'ideal' binary masks for separating voice from music, to re-mix the vocal balance directly on the magnitude components of the mixture spectrogram, so complete source separation is not required.
Findings
Small vocal gain changes cause minimal distortion
Method effectively re-mixes vocal levels in musical mixtures
Potential application in re-mixing existing audio tracks
Abstract
Audio source separation is a difficult machine learning problem and performance is measured by comparing extracted signals with the component source signals. However, if separation is motivated by the ultimate goal of re-mixing then complete separation is not necessary and hence separation difficulty and separation quality are dependent on the nature of the re-mix. Here, we use a convolutional deep neural network (DNN), trained to estimate 'ideal' binary masks for separating voice from music, to perform re-mixing of the vocal balance by operating directly on the individual magnitude components of the musical mixture spectrogram. Our results demonstrate that small changes in vocal gain may be applied with very little distortion to the ultimate re-mix. Our method may be useful for re-mixing existing mixes.
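The mask-based re-mix described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the DNN has already produced a binary mask marking which time-frequency bins of the mixture magnitude spectrogram belong to the voice, and it simply scales those bins by a gain. The function name `remix_vocal_gain` and its parameters are hypothetical.

```python
import numpy as np


def remix_vocal_gain(mix_mag, vocal_mask, gain_db):
    """Scale the bins flagged as vocal by gain_db; leave other bins unchanged.

    mix_mag    : magnitude spectrogram of the mixture (frequency x time)
    vocal_mask : binary mask of the same shape (1/True = vocal-dominated bin),
                 assumed here to come from a trained mask estimator
    gain_db    : vocal gain change in decibels (0 dB returns the input)
    """
    mix_mag = np.asarray(mix_mag, dtype=float)
    vocal_mask = np.asarray(vocal_mask, dtype=bool)
    if mix_mag.shape != vocal_mask.shape:
        raise ValueError("mix_mag and vocal_mask must have the same shape")

    # Convert the decibel gain to a linear amplitude factor.
    gain = 10.0 ** (gain_db / 20.0)

    # Apply the gain only where the mask marks the bin as vocal.
    return np.where(vocal_mask, mix_mag * gain, mix_mag)


# Toy 2x2 "spectrogram" with a hand-written mask (illustrative values only).
mix = np.array([[1.0, 2.0], [3.0, 4.0]])
mask = np.array([[1, 0], [0, 1]])
boosted = remix_vocal_gain(mix, mask, gain_db=20.0)
unchanged = remix_vocal_gain(mix, mask, gain_db=0.0)
```

With a gain of 0 dB the mixture is returned unchanged regardless of mask errors, which is consistent with the abstract's point that small gain changes are the least demanding case. To obtain audio, the modified magnitudes would typically be recombined with the mixture phase and inverted, a step not shown here.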
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Blind Source Separation Techniques
