Speech Denoising by Accumulating Per-Frequency Modeling Fluctuations

Michael Michelashvili; Lior Wolf

arXiv:1904.07612 · cs.SD · June 11, 2020 · 6 cites

TL;DR

This paper introduces an unsupervised audio denoising method that combines time and time-frequency domain processing, leveraging neural network fitting scores to effectively separate clean speech from noise in a single clip.

Contribution

The approach combines time-domain and time-frequency-domain processing with per-bin neural network fitting scores, enabling unsupervised denoising trained only on the clip being denoised.

Findings

01 Compares favorably with methods from the literature in the reported experiments

02 Fully unsupervised: trains only on the specific audio clip being denoised

03 Code and samples are publicly available at github.com/mosheman5/DNP

Abstract

We present a method for audio denoising that combines processing done in both the time domain and the time-frequency domain. Given a noisy audio clip, the method trains a deep neural network to fit this signal. Since the fitting is only partly successful and is able to better capture the underlying clean signal than the noise, the output of the network helps to disentangle the clean audio from the rest of the signal. This is done by accumulating a fitting score per time-frequency bin and applying the time-frequency domain filtering based on the obtained scores. The method is completely unsupervised and only trains on the specific audio clip that is being denoised. Our experiments demonstrate favorable performance in comparison to the literature methods. Our code and samples are available at github.com/mosheman5/DNP and as supplementary.

Index Terms: Audio denoising; Unsupervised learning
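The accumulation step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation (see github.com/mosheman5/DNP for that). It assumes that `snapshots` holds the network's output waveform saved at successive training iterations, and that the "fitting score" is the accumulated absolute change in STFT magnitude between consecutive snapshots. The function names `stft_mag` and `fluctuation_mask`, the min-max normalization, and the `1 - score` soft mask are all hypothetical choices made for illustration; the network training itself is omitted and replaced by synthetic snapshots.

```python
import numpy as np

def stft_mag(x, n_fft=512, hop=128):
    """Magnitude of a Hann-windowed STFT, shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def fluctuation_mask(snapshots, n_fft=512, hop=128, eps=1e-8):
    """Accumulate per-bin magnitude changes across snapshots; return a soft mask.

    Assumption: bins whose value keeps changing between snapshots are treated
    as noise (mask near 0), stable bins as clean signal (mask near 1).
    """
    mags = [stft_mag(s, n_fft, hop) for s in snapshots]
    acc = np.zeros_like(mags[0])
    for prev, cur in zip(mags[:-1], mags[1:]):
        acc += np.abs(cur - prev)          # fluctuation between iterations
    acc = (acc - acc.min()) / (acc.max() - acc.min() + eps)  # scale to [0, 1)
    return 1.0 - acc

# Synthetic stand-in for network outputs: a stable 500 Hz tone plus a
# 2000 Hz component whose amplitude changes from snapshot to snapshot.
sr, n = 8000, 4096
t = np.arange(n) / sr
clean = np.sin(2 * np.pi * 500 * t)
unstable = np.sin(2 * np.pi * 2000 * t)
snapshots = [clean + a * unstable for a in (0.0, 0.5, 0.1, 0.8)]

mask = fluctuation_mask(snapshots)

# Time-frequency filtering: attenuate the noisy spectrogram bin by bin.
noisy = clean + 0.8 * unstable
noisy_mag = stft_mag(noisy)
denoised_mag = mask * noisy_mag
```

With `sr = 8000` and `n_fft = 512`, the 500 Hz tone falls on bin 32 and the 2000 Hz component on bin 128, so the mask stays close to 1 at the stable bin and drops toward 0 at the fluctuating one. Inverting the filtered spectrogram back to a waveform (reusing the noisy phase, for example) is left out of this sketch.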

Code & Models

Repositories

mosheman5/DNP (PyTorch, official)

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis