Remixing-based Unsupervised Source Separation from Scratch
Kohei Saijo, Tetsuji Ogawa

TL;DR
This paper introduces an unsupervised method for training source separation models from scratch by shuffling and remixing separated signals. In the authors' experiments, it outperforms mixture invariant training.
Contribution
The paper shows that RemixIT and Self-Remixing, originally proposed for refining pre-trained models, can train separation models from scratch without a pre-trained teacher, and it adds a simple remixing method to stabilize training.
Findings
Outperforms mixture invariant training in experiments
Enables training separation models from scratch
Introduces a simple remixing method for stable training
Abstract
We propose an unsupervised approach for training separation models from scratch using RemixIT and Self-Remixing, which are recently proposed self-supervised learning methods for refining pre-trained models. They first separate mixtures with a teacher model and create pseudo-mixtures by shuffling and remixing the separated signals. A student model is then trained to separate the pseudo-mixtures using either the teacher's outputs or the initial mixtures as supervision. To refine the teacher's outputs, the teacher's weights are updated with the student's weights. While these methods originally assumed that the teacher is pre-trained, we show that they are capable of training models from scratch. We also introduce a simple remixing method to stabilize training. Experimental results demonstrate that the proposed approach outperforms mixture invariant training, which is currently the only…
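The remixing loop described in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names `remix` and `ema_update`, the nested-list data layout, and the use of an exponential moving average for the teacher update are all assumptions made for the example. The abstract only says that the teacher's weights are updated with the student's weights.

```python
import random


def remix(separated, rng):
    """Shuffle each source slot across the batch, then sum to form pseudo-mixtures.

    separated: list over batch items; each item is a list of sources;
               each source is a list of samples (hypothetical layout).
    Returns (pseudo_mixtures, shuffled_sources).
    """
    batch = len(separated)
    n_src = len(separated[0])
    # One independent batch permutation per source slot.
    perms = [rng.sample(range(batch), batch) for _ in range(n_src)]
    shuffled = [
        [separated[perms[s][b]][s] for s in range(n_src)]
        for b in range(batch)
    ]
    # A pseudo-mixture is the sample-wise sum of the shuffled sources.
    pseudo_mix = [[sum(samples) for samples in zip(*srcs)] for srcs in shuffled]
    return pseudo_mix, shuffled


def ema_update(teacher, student, alpha=0.99):
    """Move teacher weights toward student weights.

    The moving-average form and alpha are assumptions for illustration.
    """
    return {k: alpha * teacher[k] + (1.0 - alpha) * student[k] for k in teacher}


# Toy usage: three "teacher outputs", two sources each, two samples long.
rng = random.Random(0)
teacher_out = [
    [[1.0, 2.0], [10.0, 20.0]],
    [[3.0, 4.0], [30.0, 40.0]],
    [[5.0, 6.0], [50.0, 60.0]],
]
pseudo_mix, targets = remix(teacher_out, rng)
```

In a RemixIT-style setup the student would be trained to recover `targets` from `pseudo_mix`. In a Self-Remixing-style setup the student's outputs would instead be remixed back and compared with the initial mixtures. Both training losses are omitted here.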
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Geophysical Methods and Applications
