NADiffuSE: Noise-aware Diffusion-based Model for Speech Enhancement

Wen Wang; Dongchao Yang; Qichen Ye; Bowen Cao; Yuexian Zou

arXiv:2309.01212·cs.SD·September 6, 2023

TL;DR

NADiffuSE introduces a noise-aware diffusion model for speech enhancement that effectively handles non-Gaussian noise and reduces residual noise, outperforming previous diffusion-based methods.

Contribution

The paper proposes NADiffuSE, a novel noise-aware diffusion model that incorporates noise representation as global conditional information and employs an anchor-based inference algorithm.

Findings

01. Outperforms other diffusion-based speech enhancement models.

02. Effectively estimates non-Gaussian noise components.

03. Reduces residual noise and speech distortion.

Abstract

The goal of speech enhancement (SE) is to eliminate background interference from a noisy speech signal. Generative models such as diffusion models (DM) have been applied to SE because they generalize better to unseen noisy scenes. Technical routes for DM-based SE methods can be summarized into three types: task-adapted diffusion process formulations, generator-plus-conditioner (GPC) structures, and multi-stage frameworks. We focus on the first two approaches, which are constructed under the GPC architecture and use a task-adapted diffusion process to better deal with real noise. However, the performance of these SE models is limited by the following issues: (a) Non-Gaussian noise estimation in the task-adapted diffusion process. (b) Conditional domain bias caused by the weak conditioner design in the GPC structure. (c) Large amount of residual noise…

Taxonomy

Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Phonetics and Phonology Research