A Recursive Network with Dynamic Attention for Monaural Speech Enhancement

Andong Li; Chengshi Zheng; Cunhang Fan; Renhua Peng; Xiaodong Li

arXiv:2003.12973 · cs.SD · April 2, 2020 · 6 cites

TL;DR

This paper introduces a recursive neural network with dynamic attention for monaural speech enhancement, improving noise reduction by adaptively focusing on speech features and reusing the network across stages.

Contribution

It combines dynamic attention with recursive learning to enhance speech quality while reducing the number of trainable parameters.

Findings

01  Achieves higher PESQ scores than recent state-of-the-art models.

02  Achieves higher STOI scores, indicating improved intelligibility.

03  Demonstrates effective noise reduction on the TIMIT corpus.

Abstract

A listener tends to direct attention dynamically towards speech in complicated environments. Based on this phenomenon, we propose a framework that combines dynamic attention and recursive learning for monaural speech enhancement. Apart from a major noise reduction network, we design a separate sub-network that adaptively generates the attention distribution to control the information flow throughout the major network. To reduce the number of trainable parameters, recursive learning is introduced: the network is reused for multiple stages, and the intermediate output of each stage is correlated through a memory mechanism. As a result, a more flexible and better estimation can be obtained. We conduct experiments on the TIMIT corpus. Experimental results show that the proposed architecture obtains consistently better performance than recent state-of-the-art…
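
The control flow described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the class name ToyRecursiveEnhancer, the scalar weights, the sigmoid gate, and the tanh memory update are all assumptions chosen to keep the example small. It only shows the two ideas named above: a separate attention path that gates the output of the major network, and a single set of weights reused across stages with a memory state carried between them.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ToyRecursiveEnhancer:
    """Hypothetical toy model: one weight set shared by every stage."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        # Scalar weights stand in for whole sub-networks (illustration only).
        self.params = {
            "att": rng.uniform(-1.0, 1.0),    # attention sub-network
            "mem_h": rng.uniform(-1.0, 1.0),  # memory: previous state
            "mem_x": rng.uniform(-1.0, 1.0),  # memory: new stage input
            "out": rng.uniform(-1.0, 1.0),    # major (noise reduction) network
        }

    def num_params(self):
        return len(self.params)

    def stage(self, noisy, estimate, memory):
        p = self.params
        # Stage input couples the noisy features with the previous estimate.
        x = [n + e for n, e in zip(noisy, estimate)]
        # Attention path: one gate value in (0, 1) per feature.
        gate = [sigmoid(p["att"] * v) for v in x]
        # Memory update carries information from one stage to the next.
        memory = [math.tanh(p["mem_h"] * m + p["mem_x"] * v)
                  for m, v in zip(memory, x)]
        # Major network output, modulated by the attention gate.
        estimate = [p["out"] * m * g for m, g in zip(memory, gate)]
        return estimate, memory, gate

    def enhance(self, noisy, num_stages=3):
        estimate = list(noisy)       # assumed initial estimate: the noisy input
        memory = [0.0] * len(noisy)  # assumed zero initial memory
        outputs = []
        for _ in range(num_stages):
            # The same parameters are reused at every stage.
            estimate, memory, _ = self.stage(noisy, estimate, memory)
            outputs.append(estimate)
        return outputs


model = ToyRecursiveEnhancer(seed=0)
noisy = [0.5, -0.2, 0.9, 0.1]
outputs = model.enhance(noisy, num_stages=3)
print(len(outputs), model.num_params())
```

In this sketch, increasing num_stages adds computation but no parameters, which is the parameter-saving property the abstract attributes to recursive learning.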

Code & Models

Repositories

Andong-Li-speech/DARCN
PyTorch · Official

Taxonomy

Topics: Speech and Audio Processing · Advanced Adaptive Filtering Techniques · Hearing Loss and Rehabilitation