DeepFilterNet: A Low Complexity Speech Enhancement Framework for Full-Band Audio based on Deep Filtering

Hendrik Schröter; Alberto N. Escalante-B.; Tobias Rosenkranz; Andreas Maier

arXiv:2110.05588·eess.AS·February 2, 2022·1 cites

TL;DR

DeepFilterNet is a low-complexity, two-stage speech enhancement framework that uses deep filtering and perceptually motivated spectral modeling to outperform complex mask methods and state-of-the-art models.

Contribution

The paper introduces DeepFilterNet, a low-complexity, two-stage speech enhancement framework: the first stage enhances the spectral envelope with ERB-scaled gains, and the second stage applies deep filtering to enhance the periodic components of speech.

Findings

01. Outperforms complex mask-based methods across various frequency resolutions.

02. Demonstrates superior speech enhancement performance compared to existing state-of-the-art models.

03. Enforces network sparsity for low computational complexity.

Abstract

Complex-valued processing has brought deep learning-based speech enhancement and signal extraction to a new level. Typically, the process is based on a time-frequency (TF) mask that is applied to a noisy spectrogram, and complex masks (CM) are usually preferred over real-valued masks because they can modify the phase. Recent work proposed using a complex filter instead of a point-wise multiplication with a mask. This makes it possible to incorporate information from previous and future time steps, exploiting local correlations within each frequency band. In this work, we propose DeepFilterNet, a two-stage speech enhancement framework utilizing deep filtering. First, we enhance the spectral envelope using ERB-scaled gains that model human frequency perception. The second stage employs deep filtering to enhance the periodic components of speech. Additionally to taking advantage of…
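
To make the difference between a complex mask and a complex filter concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes one common formulation of deep filtering, Y[k, f] = sum_i C[k, i, f] * X[k - i + lookahead, f], with zero padding at the signal edges. The function names, the array layout `(T, N, F)`, and the `lookahead` parameter are illustrative choices and are not taken from the rikorose/deepfilternet code. In the framework described above the coefficients would be predicted by a network; here they are simply passed in.

```python
import numpy as np


def apply_complex_mask(X, M):
    """Point-wise complex mask: Y[k, f] = M[k, f] * X[k, f]."""
    return M * X


def apply_deep_filter(X, C, lookahead=0):
    """Multi-frame complex filter applied independently per frequency bin.

    X: noisy spectrogram, complex, shape (T, F)
    C: filter coefficients, complex, shape (T, N, F) with N taps
    lookahead: number of future frames the filter may use (assumption)

    Computes Y[k, f] = sum_i C[k, i, f] * X[k - i + lookahead, f],
    treating frames outside [0, T) as zero.
    """
    T, _ = X.shape
    N = C.shape[1]
    # Zero-pad N-1 frames in the past and `lookahead` frames in the future.
    Xp = np.pad(X, ((N - 1, lookahead), (0, 0)))
    Y = np.zeros(X.shape, dtype=complex)
    for i in range(N):
        # Frame (k - i + lookahead) of X sits at this offset in the padded array.
        start = lookahead + (N - 1) - i
        Y += C[:, i, :] * Xp[start:start + T, :]
    return Y


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    T, F, N = 8, 5, 3
    X = rng.standard_normal((T, F)) + 1j * rng.standard_normal((T, F))
    C = rng.standard_normal((T, N, F)) + 1j * rng.standard_normal((T, N, F))
    Y = apply_deep_filter(X, C, lookahead=1)
    print(Y.shape)
```

With a single tap and no lookahead, the filter reduces to the point-wise complex mask, which is one way to see a mask as a special case of the filter. Additional taps let each output bin combine neighboring frames of the same frequency band.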

Code & Models

Repositories

rikorose/deepfilternet (PyTorch, official)

Taxonomy

Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Acoustic Wave Phenomena Research