TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation

Helin Wang; Bo Wu; Lianwu Chen; Meng Yu; Jianwei Yu; Yong Xu; Shi-Xiong Zhang; Chao Weng; Dan Su; Dong Yu

arXiv:2103.16849·eess.AS·August 27, 2021·1 cites

Open Access

TL;DR

This paper introduces a novel environment-aware speech dereverberation method using a temporal-contextual attention network that adaptively models correlations in fullband and subband information, improving performance in real-world reverberant environments.

Contribution

It proposes a temporal attention mechanism for DNN-based dereverberation, trains it jointly with RT60 estimation, and reports better results than previous methods.

Findings

01  Outperforms previous reverberation-time-aware DNNs.

02  Attention weights are physically consistent.

03  Shows promising dereverberation and recognition results on real data.

Abstract

In this paper, we explore an effective way to leverage contextual information to improve speech dereverberation performance in real-world reverberant environments. We propose a temporal-contextual attention approach for deep neural network (DNN) based, environment-aware speech dereverberation, which can adaptively attend to the contextual information. More specifically, a FullBand based Temporal Attention approach (FTA) is proposed, which models the correlations between the fullband information of the context frames. In addition, because high frequency bands attenuate faster than low frequency bands in the room impulse response (RIR), we also propose a SubBand based Temporal Attention approach (STA). In order to guide the network to be more aware of the reverberant environments, we jointly…
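The abstract excerpt does not give the exact attention formulation, so the following is only a minimal sketch of the general idea, assuming scaled dot-product scoring between a center frame and its context frames. The function names (`fullband_temporal_attention`, `subband_temporal_attention`), the scoring rule, and the equal-width band split are illustrative assumptions, not the authors' method.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fullband_temporal_attention(frames, center):
    """Attend over context frames using all frequency bins at once.

    frames: list of T frames, each a list of F spectral values.
    center: index of the frame used as the query.
    Assumption: scaled dot-product scoring (hypothetical choice).
    Returns (weights over the T frames, attended frame of length F).
    """
    query = frames[center]
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, frame)) / scale
              for frame in frames]
    weights = softmax(scores)
    attended = [sum(w * frame[f] for w, frame in zip(weights, frames))
                for f in range(len(query))]
    return weights, attended


def subband_temporal_attention(frames, center, num_bands):
    """Apply the same attention independently inside each frequency band.

    Assumption: bands are contiguous and of (nearly) equal width.
    Each band gets its own weights, so bands that decay at different
    rates can attend to different amounts of temporal context.
    """
    num_bins = len(frames[center])
    if not 1 <= num_bands <= num_bins:
        raise ValueError("num_bands must be between 1 and the number of bins")
    edges = [b * num_bins // num_bands for b in range(num_bands + 1)]
    per_band_weights, attended = [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_frames = [frame[lo:hi] for frame in frames]
        weights, band_out = fullband_temporal_attention(band_frames, center)
        per_band_weights.append(weights)
        attended.extend(band_out)
    return per_band_weights, attended


if __name__ == "__main__":
    # Three toy context frames with four frequency bins each.
    context = [[1.0, 0.0, 0.0, 2.0],
               [0.5, 0.5, 1.0, 1.0],
               [0.0, 1.0, 2.0, 0.0]]
    fta_weights, fta_out = fullband_temporal_attention(context, center=1)
    sta_weights, sta_out = subband_temporal_attention(context, center=1, num_bands=2)
    print("fullband weights:", fta_weights)
    print("per-band weights:", sta_weights)
```

In this sketch the fullband variant produces a single set of weights shared by every frequency bin, while the subband variant produces one set of weights per band. A trained model would learn the projections that produce the scores; none are learned here.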

Taxonomy

Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Music and Audio Processing

Methods: Attentive Walk-Aggregating Graph Neural Network