Self-Remixing: Unsupervised Speech Separation via Separation and Remixing

Kohei Saijo; Tetsuji Ogawa

arXiv:2211.10194 · eess.AS · September 4, 2023 · 1 citation

TL;DR

Self-Remixing introduces a self-supervised speech separation approach that refines pre-trained models through iterative separation and remixing, improving performance in unsupervised and semi-supervised settings.

Contribution

It proposes a novel self-supervised method with shuffler and solver modules that grow together, enhancing speech separation without labeled data.

Findings

01 Outperforms existing remixing-based self-supervised methods

02 Achieves better results at the same or lower training cost

03 Effective in semi-supervised domain adaptation

Abstract

We present Self-Remixing, a novel self-supervised speech separation method, which refines a pre-trained separation model in an unsupervised manner. The proposed method consists of a shuffler module and a solver module, and they grow together through separation and remixing processes. Specifically, the shuffler first separates observed mixtures and makes pseudo-mixtures by shuffling and remixing the separated signals. The solver then separates the pseudo-mixtures and remixes the separated signals back to the observed mixtures. The solver is trained using the observed mixtures as supervision, while the shuffler's weights are updated by taking the moving average with the solver's, generating the pseudo-mixtures with fewer distortions. Our experiments demonstrate that Self-Remixing gives better performance over existing remixing-based self-supervised methods with the same or less training…
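The separate, shuffle, remix, and moving-average steps described in the abstract can be sketched roughly as follows. This is a minimal NumPy illustration and not the authors' implementation: the function names, the array layout (batch, sources, time), and the decay value `alpha` are all assumptions made for the example. It also ignores the unknown ordering of the solver's outputs, which a real system would need to handle, and it does not specify the training loss.

```python
import numpy as np


def shuffle_and_remix(sources, rng):
    """Shuffler side: build pseudo-mixtures from separated signals.

    sources: array of shape (batch, n_src, time), the shuffler's outputs.
    Each source slot is permuted independently across the batch, then the
    slots are summed. Returns the pseudo-mixtures and the permutations.
    """
    batch, n_src, _ = sources.shape
    # perms[b, n] = index of the original mixture whose n-th source
    # ends up in pseudo-mixture b (hypothetical bookkeeping convention).
    perms = np.stack([rng.permutation(batch) for _ in range(n_src)], axis=1)
    shuffled = np.stack(
        [sources[perms[:, n], n] for n in range(n_src)], axis=1
    )
    return shuffled.sum(axis=1), perms


def unshuffle_and_remix(estimates, perms):
    """Solver side: send each separated signal back to its original
    mixture and sum, giving a reconstruction of the observed mixtures.

    estimates: (batch, n_src, time), the solver's outputs on pseudo-mixtures.
    """
    _, n_src, _ = estimates.shape
    restored = np.empty_like(estimates)
    for n in range(n_src):
        restored[perms[:, n], n] = estimates[:, n]
    return restored.sum(axis=1)


def ema_update(shuffler_weights, solver_weights, alpha=0.999):
    """Move the shuffler's weights toward the solver's by a moving average.

    Both arguments are dicts of arrays with matching keys; alpha is an
    assumed decay factor, not a value taken from the paper.
    """
    return {
        name: alpha * shuffler_weights[name] + (1.0 - alpha) * solver_weights[name]
        for name in shuffler_weights
    }
```

In a training loop under these assumptions, the output of `unshuffle_and_remix` would be compared against the observed mixtures to update the solver, after which `ema_update` would refresh the shuffler.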

Taxonomy

Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Phonetics and Phonology Research