TeCANet: Temporal-Contextual Attention Network for Environment-Aware Speech Dereverberation
Helin Wang, Bo Wu, Lianwu Chen, Meng Yu, Jianwei Yu, Yong Xu, Shi-Xiong Zhang, Chao Weng, Dan Su, Dong Yu

TL;DR
This paper introduces a novel environment-aware speech dereverberation method using a temporal-contextual attention network that adaptively models correlations in fullband and subband information, improving performance in real-world reverberant environments.
Contribution
It proposes a temporal attention mechanism for DNN-based dereverberation, optimizes the dereverberation network jointly with RT60 (reverberation time) estimation, and reports better results than previous methods.
Findings
Outperforms previous reverberation-time-aware DNNs.
Attention weights are physically consistent.
Shows promising dereverberation and recognition results on real data.
Abstract
In this paper, we explore an effective way to leverage contextual information to improve speech dereverberation performance in real-world reverberant environments. We propose a temporal-contextual attention approach for deep neural network (DNN) based, environment-aware speech dereverberation, which can adaptively attend to the contextual information. More specifically, a FullBand based Temporal Attention approach (FTA) is proposed, which models the correlations between the fullband information of the context frames. In addition, because high frequency bands attenuate faster than low frequency bands in the room impulse response (RIR), we also propose a SubBand based Temporal Attention approach (STA). In order to guide the network to be more aware of the reverberant environments, we jointly…
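The two attention variants described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not give the exact formulation, so the choice of the center frame as the query, the scaled dot-product scoring, the equal-width band split, and the function names below are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - np.max(x, axis=axis, keepdims=True)
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)

def fullband_temporal_attention(context):
    """Attend over T context frames using all F frequency bins at once.

    context: array of shape (T, F).
    Returns (attended, weights) with shapes (F,) and (T,).
    """
    T, F = context.shape
    query = context[T // 2]                  # assumption: center frame is the query
    scores = context @ query / np.sqrt(F)    # assumption: scaled dot-product scoring
    weights = softmax(scores)                # one weight per context frame
    attended = weights @ context             # weighted sum of the context frames
    return attended, weights

def subband_temporal_attention(context, num_bands):
    """Apply the same attention independently within each frequency band.

    Each band gets its own weights over time, so a band that decays
    quickly can attend to a shorter context than one that decays slowly.
    Returns (attended, weights) with shapes (F,) and (num_bands, T).
    """
    T, F = context.shape
    bands = np.array_split(np.arange(F), num_bands)  # assumption: near-equal-width bands
    attended = np.empty(F)
    weights = np.empty((num_bands, T))
    for b, idx in enumerate(bands):
        attended[idx], weights[b] = fullband_temporal_attention(context[:, idx])
    return attended, weights
```

With a single band, `subband_temporal_attention` reduces to `fullband_temporal_attention`. In a trained network the query and score functions would be learned rather than fixed as they are here.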
Taxonomy
Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Music and Audio Processing
Methods: Attentive Walk-Aggregating Graph Neural Network
