Diffusion Gaussian Mixture Audio Denoise

Pu Wang; Junhui Li; Jialu Li; Liangdong Guo; Youshan Zhang

arXiv:2406.09154·cs.SD·June 14, 2024

TL;DR

This paper introduces DiffGMM, a novel audio denoising model that combines diffusion processes with Gaussian mixture models to better handle complex, real-world noise distributions, achieving state-of-the-art results.

Contribution

The paper proposes a diffusion-based denoising model using Gaussian mixture models to accurately estimate complex noise distributions in audio signals.
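For reference, a Gaussian mixture over a noise value can be written in the standard form below. The symbols ($n$ for a noise sample, $K$ for the number of components, $\pi_k$, $\mu_k$, $\sigma_k$ for the weight, mean, and standard deviation of component $k$) are notation assumed here for illustration and are not taken from the paper.

```latex
p(n) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}\!\left(n \mid \mu_k, \sigma_k^2\right),
\qquad \pi_k \ge 0, \qquad \sum_{k=1}^{K} \pi_k = 1
```

With $K = 1$ this reduces to the single Gaussian that standard diffusion models assume, which is the case the paper argues is too restrictive for real-world noise.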

Findings

01 Achieves state-of-the-art denoising performance

02 Effectively models real-world noise distributions

03 Utilizes a 1D-U-Net for feature extraction

Abstract

Recent diffusion models have achieved promising performance in audio denoising tasks, since the reverse process can recover clean signals. However, the distribution of real-world noise does not follow a single Gaussian distribution and may even be unknown, so sampling from Gaussian noise limits the application scenarios of these models. To overcome these challenges, we propose DiffGMM, a denoising model based on diffusion and Gaussian mixture models. We employ the reverse process to estimate the parameters of the Gaussian mixture model. Given a noisy audio signal, we first apply a 1D-U-Net to extract features, then train linear layers to estimate the parameters of the Gaussian mixture model, which approximates the real noise distribution. The estimated noise is continuously subtracted from the noisy signal to output clean audio signals. Extensive experimental…
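The pipeline described in the abstract (features, then linear layers producing mixture parameters, then repeated noise subtraction) might be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: `toy_features` is a hypothetical stand-in for the 1D-U-Net, the weight matrices are random rather than trained, and the function names, the number of components, and the number of steps are all chosen here for illustration.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def gmm_heads(features, w_pi, w_mu, w_logvar):
    """Linear heads mapping features (T, D) to K-component GMM parameters per sample."""
    pi = softmax(features @ w_pi)                 # mixture weights, rows sum to 1
    mu = features @ w_mu                          # component means
    sigma = np.exp(0.5 * (features @ w_logvar))   # component std devs, always > 0
    return pi, mu, sigma

def sample_gmm(pi, mu, sigma, rng):
    """Draw one noise value per time step from the per-step mixture."""
    t, k = pi.shape
    cdf = np.cumsum(pi, axis=1)
    u = rng.random((t, 1))
    comp = np.minimum((u > cdf).sum(axis=1), k - 1)  # chosen component index
    idx = np.arange(t)
    return mu[idx, comp] + sigma[idx, comp] * rng.standard_normal(t)

def toy_features(x):
    # ASSUMPTION: placeholder for the paper's 1D-U-Net feature extractor.
    return np.stack([np.tanh(x), np.ones_like(x)], axis=1)

def denoise(noisy, heads, steps, rng):
    """Repeatedly estimate noise from the current signal and subtract it."""
    x = noisy.copy()
    for _ in range(steps):
        pi, mu, sigma = gmm_heads(toy_features(x), *heads)
        x = x - sample_gmm(pi, mu, sigma, rng)
    return x

rng = np.random.default_rng(0)
num_features, num_components = 2, 3
# ASSUMPTION: untrained random weights, used only to exercise the shapes.
heads = tuple(0.1 * rng.standard_normal((num_features, num_components)) for _ in range(3))
noisy_signal = rng.standard_normal(16)
clean_estimate = denoise(noisy_signal, heads, steps=3, rng=rng)
```

The log-variance parameterization keeps every standard deviation positive without a constraint, and the softmax keeps the mixture weights on the probability simplex; both are common choices for mixture-density heads rather than details confirmed by the abstract.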

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing

Methods: Diffusion