Dereverberation of Autoregressive Envelopes for Far-field Speech Recognition

Anurenjan Purushothaman; Anirudh Sreeram; Rohit Kumar; Sriram Ganapathy

arXiv:2108.05520·eess.AS·August 21, 2021

TL;DR

This paper introduces a neural dereverberation model that improves far-field speech recognition by estimating an envelope gain that suppresses late-reflection components in the sub-band envelopes, giving relative improvements of 10-24% over baseline ASR systems.

Contribution

It proposes a novel neural model for envelope dereverberation using FDLP-derived sub-band envelopes, integrated into a joint ASR training pipeline for improved performance.

Findings

01

Significant relative improvements of 10-24% over baseline ASR systems.

02

Effective joint learning of dereverberation and acoustic modeling.

03

Detailed analysis of hyper-parameters and cost functions for envelope dereverberation.

Abstract

The task of speech recognition in far-field environments is adversely affected by reverberant artifacts that manifest as temporal smearing of the sub-band envelopes. In this paper, we develop a neural model for speech dereverberation using the long-term sub-band envelopes of speech. The sub-band envelopes are derived using frequency domain linear prediction (FDLP), which performs an autoregressive estimation of the Hilbert envelopes. The neural dereverberation model estimates the envelope gain, which, when applied to reverberant signals, suppresses the late reflection components in the far-field signal. The dereverberated envelopes are used for feature extraction in speech recognition. Further, the sequence of steps involved in envelope dereverberation, feature extraction and acoustic modeling for ASR can be implemented as a single neural processing pipeline which allows the joint…
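
As a rough illustration of the FDLP step mentioned in the abstract, the sketch below fits an autoregressive model to the DCT of a signal and reads the model's power response as a smooth temporal envelope. It is not the authors' implementation: the function name `fdlp_envelope`, the default model order, the `band` argument (a range of DCT indices standing in for a sub-band window) and the small regularisation term are all assumptions made for illustration.

```python
import numpy as np
from scipy.fft import dct
from scipy.linalg import solve_toeplitz
from scipy.signal import freqz


def fdlp_envelope(x, order=20, band=None):
    """Sketch of an FDLP-style AR envelope (hypothetical helper, not from the paper).

    x     : 1-D signal segment
    order : AR model order (assumed value, not taken from the paper)
    band  : optional (lo, hi) range of DCT indices used as a crude sub-band
    """
    x = np.asarray(x, dtype=float)
    n = len(x)

    # The DCT sequence takes the place that the time signal has in ordinary LP.
    c = dct(x, type=2, norm="ortho")
    if band is not None:
        lo, hi = band
        c = c[lo:hi]

    # Autocorrelation of the DCT sequence for lags 0..order.
    r = np.array([np.dot(c[: len(c) - k], c[k:]) for k in range(order + 1)])
    r[0] *= 1.0 + 1e-9  # tiny diagonal loading for numerical stability (assumption)

    # Yule-Walker equations: symmetric Toeplitz system R a = r[1:].
    a = solve_toeplitz(r[:order], r[1 : order + 1])
    gain = r[0] - np.dot(a, r[1 : order + 1])  # prediction error power

    # Power response of 1 / A(z) on n points in [0, pi); here the frequency axis
    # of the AR model plays the role of time within the segment.
    _, h = freqz([1.0], np.concatenate(([1.0], -a)), worN=n)
    return gain * np.abs(h) ** 2


# Usage: a noise burst centred at sample 600 of a 2000-sample segment.
rng = np.random.default_rng(0)
t = np.arange(2000)
x = rng.standard_normal(2000) * np.exp(-0.5 * ((t - 600) / 50.0) ** 2)
env = fdlp_envelope(x, order=20)
```

The returned array has one value per input sample, and its energy should concentrate around the burst. A dereverberation gain of the kind the abstract describes would then be applied to envelopes like this one; that step depends on the trained neural model and is not sketched here.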
