Deep Convolutional Neural Network-based Inverse Filtering Approach for Speech De-reverberation

Hanwook Chung; Vikrant Singh Tomar; Benoit Champagne

arXiv:2010.07895·cs.SD·October 16, 2020

TL;DR

This paper presents a deep CNN-based inverse filtering method for single-channel speech de-reverberation. It targets realistic conditions in which the room impulse response (RIR) is longer than the STFT analysis window, and it reports better performance than benchmark algorithms under various reverberation conditions.

Contribution

The paper introduces a spectral-domain inverse filtering approach in which a U-net CNN is trained to directly estimate the inverse filter of the convolutive transfer function (CTF) model of reverberant speech.

Findings

01

Outperforms benchmark algorithms in various reverberation conditions

02

Handles room impulse responses (RIRs) that are longer than the STFT analysis window

03

Uses a U-net architecture for inverse filter estimation

Abstract

In this paper, we introduce a spectral-domain inverse filtering approach for single-channel speech de-reverberation using a deep convolutional neural network (CNN). The main goal is to better handle realistic reverberant conditions where the room impulse response (RIR) filter is longer than the short-time Fourier transform (STFT) analysis window. To this end, we consider the convolutive transfer function (CTF) model for the reverberant speech signal. In the proposed framework, the CNN architecture is trained to directly estimate the inverse filter of the CTF model. Among various choices for the CNN structure, we consider the U-net, which consists of a fully-convolutional auto-encoder network with skip-connections. Experimental results show that the proposed method provides better de-reverberation performance than the prevalent benchmark algorithms under various reverberation conditions.
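To make the CTF idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the commonly used form of the CTF approximation, in which each STFT frequency band of the reverberant signal is the clean band convolved along the frame axis with a short per-band filter. The function names `ctf_filter` and `causal_inverse`, the array sizes, and the synthetic filter are all hypothetical. In the paper the inverse filter is estimated by the U-net; here it is computed analytically from a known, well-behaved CTF purely as a stand-in, to show what applying an inverse filter in the spectral domain does.

```python
import numpy as np

def ctf_filter(spec, taps):
    """Convolve every frequency band of `spec` along the frame axis.

    spec: complex array (frames, bins); taps: complex array (n_taps, bins).
    Output frame n is sum_l taps[l] * spec[n - l] (causal, per band).
    """
    n_frames = spec.shape[0]
    out = np.zeros_like(spec)
    for l in range(min(taps.shape[0], n_frames)):
        out[l:] += taps[l] * spec[:n_frames - l]
    return out

def causal_inverse(taps, length):
    """First `length` taps of the causal per-band inverse of `taps`.

    Solves sum_l taps[l] * inv[n - l] = delta[n] recursively. This is an
    analytic stand-in for the filter a trained network would estimate.
    """
    inv = np.zeros((length, taps.shape[1]), dtype=complex)
    inv[0] = 1.0 / taps[0]
    for n in range(1, length):
        acc = np.zeros(taps.shape[1], dtype=complex)
        for l in range(1, min(n, taps.shape[0] - 1) + 1):
            acc += taps[l] * inv[n - l]
        inv[n] = -acc / taps[0]
    return inv

# Synthetic example (all sizes are arbitrary assumptions).
rng = np.random.default_rng(0)
n_frames, n_bins, n_taps = 32, 5, 4
clean = (rng.standard_normal((n_frames, n_bins))
         + 1j * rng.standard_normal((n_frames, n_bins)))

# Decaying taps with random phase; the decay keeps the inverse stable.
phases = rng.uniform(0.0, 2.0 * np.pi, (n_taps, n_bins))
ctf = (0.4 ** np.arange(n_taps))[:, None] * np.exp(1j * phases)

reverberant = ctf_filter(clean, ctf)       # forward CTF model
inverse = causal_inverse(ctf, n_frames)    # stand-in for the CNN output
recovered = ctf_filter(reverberant, inverse)
```

Because the filter spans several frames, the reverberation in one frame depends on earlier frames, which is the situation the abstract describes when the RIR is longer than the analysis window. In practice the CTF is unknown and may not have a stable causal inverse, which is one reason to estimate the inverse filter from data rather than compute it as above.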
