A Recursive Network with Dynamic Attention for Monaural Speech Enhancement
Andong Li, Chengshi Zheng, Cunhang Fan, Renhua Peng, Xiaodong Li

TL;DR
This paper introduces a recursive neural network with dynamic attention for monaural speech enhancement, improving noise reduction by adaptively focusing on speech features and reusing the network across stages.
Contribution
It combines dynamic attention with recursive learning to enhance speech quality while reducing the number of trainable parameters.
Findings
Outperforms recent state-of-the-art models in PESQ scores.
Achieves better STOI scores, indicating improved intelligibility.
Demonstrates effective noise reduction on the TIMIT corpus.
Abstract
A person tends to generate dynamic attention towards speech under complicated environments. Based on this phenomenon, we propose a framework combining dynamic attention and recursive learning together for monaural speech enhancement. Apart from a major noise reduction network, we design a separated sub-network, which adaptively generates the attention distribution to control the information flow throughout the major network. To effectively decrease the number of trainable parameters, recursive learning is introduced, which means that the network is reused for multiple stages, where the intermediate output in each stage is correlated with a memory mechanism. As a result, a more flexible and better estimation can be obtained. We conduct experiments on TIMIT corpus. Experimental results show that the proposed architecture obtains consistently better performance than recent state-of-the-art…
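The two ideas in the abstract, an attention sub-network that gates the major network and a single set of weights reused over several stages with a memory carried between them, can be illustrated with a toy sketch. This is not the authors' architecture: the class name ToyRecursiveEnhancer, the dense layers, the moving-average memory, and the sigmoid mask are all assumptions chosen to keep the example short and dependency-free.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(w, v):
    """Multiply matrix w (a list of rows) by vector v."""
    return [sum(wi * vi for wi, vi in zip(row, v)) for row in w]


class ToyRecursiveEnhancer:
    """Hypothetical toy model: one weight set shared by every stage."""

    def __init__(self, dim, seed=0):
        rng = random.Random(seed)

        def init(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)]
                    for _ in range(rows)]

        self.dim = dim
        # Attention sub-network (assumed dense): [noisy, estimate] -> gates.
        self.w_att = init(dim, 2 * dim)
        # Major network (assumed dense): [noisy, memory] -> features.
        self.w_main = init(dim, 2 * dim)

    def num_parameters(self):
        # Independent of the number of stages, since weights are reused.
        return sum(len(row) for row in self.w_att + self.w_main)

    def stage(self, noisy, estimate, memory):
        # Attention distribution in (0, 1), one value per feature.
        att = [sigmoid(a) for a in matvec(self.w_att, noisy + estimate)]
        # Major-network features, gated element-wise by the attention.
        feats = [math.tanh(h) for h in matvec(self.w_main, noisy + memory)]
        gated = [a * f for a, f in zip(att, feats)]
        # Assumed memory mechanism: a simple moving average across stages.
        memory = [0.5 * m + 0.5 * g for m, g in zip(memory, gated)]
        # Assumed output: a sigmoid mask applied to the noisy input.
        estimate = [sigmoid(m) * n for m, n in zip(memory, noisy)]
        return estimate, memory

    def enhance(self, noisy, num_stages=3):
        estimate = list(noisy)        # stage 0 starts from the noisy input
        memory = [0.0] * self.dim
        outputs = []
        for _ in range(num_stages):   # the same weights serve every stage
            estimate, memory = self.stage(noisy, estimate, memory)
            outputs.append(estimate)
        return outputs
```

Calling ToyRecursiveEnhancer(dim=4).enhance([0.5, 1.0, 0.2, 0.8], num_stages=3) returns one intermediate estimate per stage, while num_parameters() stays the same however many stages are run, which is the parameter-saving property the abstract attributes to recursive learning.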
Taxonomy
Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Hearing Loss and Rehabilitation
