Self-Supervised Learning based Monaural Speech Enhancement with Complex-Cycle-Consistent
Yi Li, Yang Sun, Syed Mohsen Naqvi

TL;DR
This paper introduces a phase-aware self-supervised learning method for monaural speech enhancement that leverages complex-cycle-consistent mechanisms and multi-resolution spectral features to improve performance over existing approaches.
Contribution
It proposes a phase-aware SSL framework with a complex-cycle-consistent loss and multi-resolution spectral features, improving monaural speech enhancement without requiring paired clean and noisy data.
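To make "phase-aware" and "multi-resolution spectral features" concrete, the following is a minimal NumPy sketch, not the authors' implementation. It splits a complex spectrogram into amplitude and phase at several window lengths. The helper names (`stft`, `multi_resolution_features`), the window lengths (256, 512, 1024), the 50% hop, and the toy signal are all assumptions chosen for illustration; the summary does not state the paper's actual settings.

```python
import numpy as np


def stft(signal, frame_len, hop):
    """Naive STFT: Hann-windowed frames -> one-sided FFT, shape (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


def multi_resolution_features(signal, frame_lens=(256, 512, 1024)):
    """Return {frame_len: (amplitude, phase)} for each assumed resolution."""
    feats = {}
    for n in frame_lens:
        spec = stft(signal, n, n // 2)  # 50% overlap is an assumption
        feats[n] = (np.abs(spec), np.angle(spec))
    return feats


# Toy "mixture": a 440 Hz tone plus white noise, 1 second at 16 kHz (hypothetical data).
sr = 16000
t = np.arange(sr) / sr
rng = np.random.default_rng(0)
mixture = np.sin(2 * np.pi * 440 * t) + 0.1 * rng.standard_normal(sr)

feats = multi_resolution_features(mixture)
for n, (amp, phase) in feats.items():
    # Amplitude and phase together recover the complex spectrum exactly;
    # amplitude alone would discard the phase information.
    rebuilt = amp * np.exp(1j * phase)
    error = np.max(np.abs(rebuilt - stft(mixture, n, n // 2)))
    print(f"frame_len={n}: shape={amp.shape}, max reconstruction error={error:.2e}")
```

Shorter windows give finer time resolution and longer windows give finer frequency resolution, which is the usual motivation for combining several of them. A model that predicts only the amplitude must borrow the mixture's noisy phase at reconstruction time, which is the limitation a phase-aware method is meant to address.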
Findings
Outperforms state-of-the-art methods on NOISEX and DAPS datasets.
Effectively utilizes unpaired clean speech and mixture data.
Demonstrates significant improvements in speech quality metrics.
Abstract
Recently, self-supervised learning (SSL) techniques have been introduced to solve the monaural speech enhancement problem. Because most SSL methods do not use clean phase information, their enhancement performance is limited. Therefore, in this paper, we propose a phase-aware self-supervised learning based monaural speech enhancement method. The latent representations of amplitude and phase are learned independently in two decoders of the foundation autoencoder (FAE), using only a limited set of clean speech signals. Then, the downstream autoencoder (DAE) learns a shared latent space between the clean speech and mixture representations from a large number of unseen mixtures. A complex-cycle-consistent (CCC) mechanism is proposed to minimize the reconstruction loss between the amplitude and phase domains. Besides, it is noticed that if the speech features are extracted as the…
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Infant Health and Development
