RemixIT: Continual self-training of speech enhancement models via bootstrapped remixing

Efthymios Tzinis; Yossi Adi; Vamsi Krishna Ithapu; Buye Xu; Paris Smaragdis; Anurag Kumar

arXiv:2202.08862·cs.SD·August 30, 2022

TL;DR

RemixIT is a self-supervised speech enhancement method that uses bootstrapped remixing and iterative self-training to improve performance without relying on clean in-domain signals, effectively handling domain mismatch.

Contribution

It introduces a novel self-training scheme with remixing for speech enhancement that does not require clean target signals and can adapt across domains.

Findings

01. Outperforms prior speech enhancement methods on various datasets

02. Compatible with any separation model and with domain adaptation tasks

03. Student models improve despite degraded pseudo-targets

Abstract

We present RemixIT, a simple yet effective self-supervised method for training speech enhancement models without the need for a single isolated in-domain speech or noise waveform. Our approach overcomes a limitation of previous methods, which depend on clean in-domain target signals and are therefore sensitive to any domain mismatch between train and test samples. RemixIT is based on a continuous self-training scheme in which a teacher model pre-trained on out-of-domain data infers estimated pseudo-target signals for in-domain mixtures. Then, by permuting the estimated clean and noise signals and remixing them together, we generate a new set of bootstrapped mixtures and corresponding pseudo-targets, which are used to train the student network. Conversely, the teacher periodically refines its estimates using the updated parameters of the latest student models. Experimental results on…
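The remixing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: waveforms are plain Python lists, `toy_teacher` is a hypothetical stand-in for a pre-trained enhancement model, and `update_teacher` with its `gamma` parameter is only one assumed way to let the teacher pick up the student's parameters.

```python
import random
from typing import Callable, List, Sequence, Tuple

Waveform = List[float]


def toy_teacher(mixture: Waveform) -> Tuple[Waveform, Waveform]:
    """Hypothetical teacher: splits a mixture into (speech, noise) estimates."""
    speech = [0.75 * x for x in mixture]
    noise = [0.25 * x for x in mixture]
    return speech, noise


def bootstrapped_remix(
    mixtures: Sequence[Waveform],
    teacher: Callable[[Waveform], Tuple[Waveform, Waveform]],
    rng: random.Random,
) -> Tuple[List[Waveform], List[Waveform], List[Waveform]]:
    """Build bootstrapped mixtures and pseudo-targets for one batch."""
    # 1. The teacher infers speech and noise estimates for in-domain mixtures.
    estimates = [teacher(m) for m in mixtures]
    speech = [s for s, _ in estimates]
    noise = [n for _, n in estimates]

    # 2. Permute the noise estimates across the batch.
    order = list(range(len(noise)))
    rng.shuffle(order)
    permuted_noise = [noise[i] for i in order]

    # 3. Remix: each estimated speech signal is paired with a permuted noise
    #    estimate, and their sum becomes a new input for the student.
    new_mixtures = [
        [a + b for a, b in zip(s, n)] for s, n in zip(speech, permuted_noise)
    ]

    # The student would be trained to map new_mixtures back to the
    # pseudo-targets (speech, permuted_noise).
    return new_mixtures, speech, permuted_noise


def update_teacher(
    teacher_params: List[float], student_params: List[float], gamma: float
) -> List[float]:
    """Assumed update rule: gamma=0 copies the student, 0<gamma<1 averages."""
    return [gamma * t + (1.0 - gamma) * s
            for t, s in zip(teacher_params, student_params)]


if __name__ == "__main__":
    batch = [[1.0, 2.0], [4.0, -8.0], [16.0, 0.0]]
    remixed, speech_targets, noise_targets = bootstrapped_remix(
        batch, toy_teacher, random.Random(0)
    )
    print(remixed)
```

In an actual training loop, the student's loss would be computed between its outputs on the bootstrapped mixtures and the pseudo-targets, and the teacher update would be called periodically rather than after every batch.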
