A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement

Jean-Marc Valin

arXiv:1709.08243·cs.SD·June 4, 2018

TL;DR

This paper presents a hybrid DSP and deep learning method for real-time full-band speech enhancement that improves noise suppression quality while maintaining low computational complexity suitable for low-power devices.

Contribution

It introduces a novel hybrid approach combining neural network-based gain estimation with traditional pitch filtering for effective real-time speech enhancement.

Findings

01 Achieves significantly higher quality than a traditional minimum mean squared error spectral estimator.

02 Operates in real time at 48 kHz on a low-power processor.

03 Keeps complexity low enough for practical real-time use.

Abstract

Despite noise suppression being a mature area in signal processing, it remains highly dependent on fine tuning of estimator algorithms and parameters. In this paper, we demonstrate a hybrid DSP/deep learning approach to noise suppression. A deep neural network with four hidden layers is used to estimate ideal critical band gains, while a more traditional pitch filter attenuates noise between pitch harmonics. The approach achieves significantly higher quality than a traditional minimum mean squared error spectral estimator, while keeping the complexity low enough for real-time operation at 48 kHz on a low-power processor.
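To make the idea of per-band gains concrete, here is a minimal sketch of how an ideal band gain could be computed as a training target and then applied to a noisy spectrum. It rests on assumptions not stated in the abstract: the gain is taken as the square root of the clean-to-noisy energy ratio in each band, the bands are plain rectangular bin ranges rather than true critical bands, and the function names (`band_energies`, `ideal_band_gains`, `apply_band_gains`) are hypothetical. It omits the neural network and the pitch filter entirely.

```python
import math


def band_energies(power, band_edges):
    """Sum per-bin power inside each band [edges[i], edges[i+1])."""
    return [sum(power[lo:hi]) for lo, hi in zip(band_edges[:-1], band_edges[1:])]


def ideal_band_gains(clean_power, noisy_power, band_edges, eps=1e-12):
    """Assumed target definition: g_b = sqrt(E_clean(b) / E_noisy(b)), capped at 1."""
    clean_e = band_energies(clean_power, band_edges)
    noisy_e = band_energies(noisy_power, band_edges)
    return [min(1.0, math.sqrt(c / (n + eps))) for c, n in zip(clean_e, noisy_e)]


def apply_band_gains(noisy_mag, gains, band_edges):
    """Scale every bin of the noisy magnitude spectrum by the gain of its band."""
    out = list(noisy_mag)
    for g, lo, hi in zip(gains, band_edges[:-1], band_edges[1:]):
        for k in range(lo, hi):
            out[k] = g * noisy_mag[k]
    return out


# Toy example: 4 frequency bins grouped into 2 rectangular bands (assumed layout).
edges = [0, 2, 4]
clean_power = [1.0, 1.0, 0.0, 0.0]   # speech only in the first band
noisy_power = [4.0, 4.0, 1.0, 1.0]
noisy_mag = [2.0, 2.0, 1.0, 1.0]

gains = ideal_band_gains(clean_power, noisy_power, edges)
enhanced = apply_band_gains(noisy_mag, gains, edges)
```

In this toy case the first band keeps half its magnitude and the second band, which holds only noise, is suppressed to zero. Working with a small number of band gains instead of one value per frequency bin is what keeps the estimation problem small; in a system like the one described, a network would predict the gains from features of the noisy signal, since the clean signal is not available at run time.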
