Diffusion-based speech enhancement with a weighted generative-supervised learning loss
Jean-Eudes Ayilo (MULTISPEECH), Mostafa Sadeghi (MULTISPEECH), Romain Serizel (MULTISPEECH)

TL;DR
This paper introduces a diffusion-based speech enhancement method that combines a generative model with a weighted supervised loss to improve the quality of enhanced speech, showing promising experimental results.
Contribution
The paper proposes augmenting diffusion-based speech enhancement with a weighted MSE loss to better incorporate ground-truth speech information during training.
Findings
Improved speech quality over baseline diffusion models
Adding the supervised MSE loss to the diffusion objective improves enhancement performance
Experimental results validate the proposed method's effectiveness
Abstract
Diffusion-based generative models have recently gained attention in speech enhancement (SE), providing an alternative to conventional supervised methods. These models transform clean speech training samples into Gaussian noise centered at noisy speech, and subsequently learn a parameterized model to reverse this process, conditionally on noisy speech. Unlike supervised methods, generative-based SE approaches usually rely solely on an unsupervised loss, which may result in less efficient incorporation of conditioned noisy speech. To address this issue, we propose augmenting the original diffusion training objective with a mean squared error (MSE) loss, measuring the discrepancy between estimated enhanced speech and ground-truth clean speech at each reverse process iteration. Experimental results demonstrate the effectiveness of our proposed methodology.
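The combined objective described in the abstract can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the perturbation schedule (exponential drift of the mean from clean toward noisy speech, with a square-root noise scale), the use of Tweedie's formula to turn a score estimate into a clean-speech estimate, the hypothetical names `mean_and_std` and `weighted_loss`, and the default values of `gamma`, `sigma_max`, and `weight`.

```python
import numpy as np


def mean_and_std(x0, y, t, gamma=1.5, sigma_max=0.5):
    """Assumed perturbation kernel: the mean drifts from clean x0 toward noisy y."""
    decay = np.exp(-gamma * t)
    mean = decay * x0 + (1.0 - decay) * y
    sigma = sigma_max * np.sqrt(t)  # illustrative noise schedule, requires t > 0
    return mean, sigma, decay


def weighted_loss(score_fn, x0, y, t, rng, weight=0.5):
    """Generative (denoising score matching) loss plus a weighted supervised MSE.

    score_fn(x_t, y, t) is a hypothetical stand-in for the learned score model.
    Returns (total, generative_term, supervised_term).
    """
    mean, sigma, decay = mean_and_std(x0, y, t)
    z = rng.standard_normal(x0.shape)
    x_t = mean + sigma * z  # diffused sample centered near the noisy speech

    score = score_fn(x_t, y, t)

    # Generative term: the score should point back along the injected noise.
    generative = np.mean((sigma * score + z) ** 2)

    # Supervised term: estimate the kernel mean with Tweedie's formula,
    # invert the assumed drift to get a clean-speech estimate, and
    # compare that estimate with the ground-truth clean speech.
    mean_hat = x_t + sigma**2 * score
    x0_hat = (mean_hat - (1.0 - decay) * y) / decay
    supervised = np.mean((x0_hat - x0) ** 2)

    return generative + weight * supervised, generative, supervised
```

In this sketch, a score function that matches the true conditional score drives both terms to zero, while `weight` controls how strongly the ground-truth clean speech influences training relative to the unsupervised term.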
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Infant Health and Development
Methods: Diffusion
