Self-refining of Pseudo Labels for Music Source Separation with Noisy Labeled Data
Junghyun Koo, Yunkee Chae, Chang-Bin Jeon, Kyogu Lee

TL;DR
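This paper presents a self-refining method for correcting noisy labels in music source separation (MSS) datasets. MSS models trained on the refined data perform comparably to models trained on clean-labeled data, and self-refining with noisy data alone outperforms refinement with a classifier trained on clean labels.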
Contribution
The paper introduces an automated self-refining technique for partially mislabeled datasets in music source separation, reducing the impact of label noise on the models trained from them.
Findings
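With self-refining on a noisy-labeled dataset, multi-label instrument recognition accuracy drops by only 1% relative to a classifier trained on a clean-labeled dataset.
MSS models trained on the refined noisy dataset perform comparably to models trained on a clean-labeled dataset.
MSS models trained on self-refined noisy data outperform models trained on data refined with a clean-label classifier.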
Abstract
Music source separation (MSS) faces challenges due to the limited availability of correctly labeled individual instrument tracks. With the push to acquire larger datasets to improve MSS performance, the inevitability of encountering mislabeled individual instrument tracks becomes a significant challenge to address. This paper introduces an automated technique for refining the labels in a partially mislabeled dataset. Our proposed self-refining technique, employed with a noisy-labeled dataset, results in only a 1% accuracy degradation in multi-label instrument recognition compared to a classifier trained on a clean-labeled dataset. The study demonstrates the importance of refining noisy-labeled data in MSS model training and shows that utilizing the refined dataset leads to results comparable to those derived from a clean-labeled dataset. Notably, upon only access to a noisy dataset, MSS models…
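The label-refinement idea in the abstract can be illustrated with a toy sketch. The version below is not the paper's procedure: it assumes single-label synthetic features rather than multi-label audio, uses a nearest-centroid classifier as a stand-in for the instrument recognizer, and all names (`fit_centroids`, `predict`, `self_refine`) are hypothetical. It only shows the general loop of training on noisy labels and then replacing those labels with the classifier's own predictions.

```python
import numpy as np


def fit_centroids(features, labels, num_classes):
    """Mean feature vector per class, computed from the (possibly noisy) labels."""
    return np.stack(
        [features[labels == c].mean(axis=0) for c in range(num_classes)]
    )


def predict(features, centroids):
    """Assign each sample to the class whose centroid is nearest."""
    dists = np.linalg.norm(
        features[:, None, :] - centroids[None, :, :], axis=2
    )
    return dists.argmin(axis=1)


def self_refine(features, noisy_labels, num_classes, rounds=3):
    """Repeatedly train on the current labels, then relabel with the predictions."""
    labels = noisy_labels.copy()
    for _ in range(rounds):
        centroids = fit_centroids(features, labels, num_classes)
        labels = predict(features, centroids)
    return labels


# Synthetic stand-in for per-track embeddings (assumption: 3 well-separated classes).
rng = np.random.default_rng(0)
num_classes, per_class, dim = 3, 200, 8
true_centers = 6.0 * np.eye(num_classes, dim)  # one distinct axis per class
true_labels = np.repeat(np.arange(num_classes), per_class)
features = true_centers[true_labels] + rng.normal(
    size=(num_classes * per_class, dim)
)

# Corrupt roughly 30% of the labels by redrawing them uniformly at random.
noisy_labels = true_labels.copy()
flip = rng.random(true_labels.size) < 0.3
noisy_labels[flip] = rng.integers(0, num_classes, size=int(flip.sum()))

refined = self_refine(features, noisy_labels, num_classes)

noisy_acc = float((noisy_labels == true_labels).mean())
refined_acc = float((refined == true_labels).mean())
print(f"label agreement before refining: {noisy_acc:.3f}")
print(f"label agreement after refining:  {refined_acc:.3f}")
```

On clusters this well separated, the relabeled set agrees with the true labels far more often than the corrupted set does. Real instrument stems are much harder to tell apart, so this should be read only as an illustration of the loop, not as evidence for the paper's reported numbers.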
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis
