Speech Enhancement with Fullband-Subband Cross-Attention Network
Jun Chen, Wei Rao, Zilin Wang, Zhiyong Wu, Yannan Wang, Tao Yu, Shidong Shang, Helen Meng

TL;DR
This paper introduces FS-CANet, a speech enhancement model that uses fullband-subband cross-attention to fuse global and local information, and reports better results than existing methods on the DNS Challenge - Interspeech 2021 dataset.
Contribution
The paper proposes a fullband-subband cross-attention (FSCA) module and integrates it into FullSubNet to strengthen the interaction between fullband and subband features. It also uses TCN blocks in the fullband extractor to reduce model size.
Findings
FS-CANet is reported to outperform state-of-the-art speech enhancement models on the DNS Challenge - Interspeech 2021 dataset.
The cross-attention mechanism effectively fuses global and local information.
Model size is reduced while maintaining high performance.
Abstract
FullSubNet has shown promising performance on speech enhancement by utilizing both fullband and subband information. However, the relationship between fullband and subband in FullSubNet is achieved by simply concatenating the output of the fullband model and subband units. This only supplements the subband units with a small quantity of global information and does not consider the interaction between fullband and subband. This paper proposes a fullband-subband cross-attention (FSCA) module to interactively fuse the global and local information and applies it to FullSubNet. The new framework is called FS-CANet. Moreover, unlike FullSubNet, the proposed FS-CANet optimizes the fullband extractor with temporal convolutional network (TCN) blocks to further reduce the model size. Experimental results on the DNS Challenge - Interspeech 2021 dataset show that the proposed FS-CANet…
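To make the idea of interactive fusion concrete, the following is a minimal sketch of generic scaled dot-product cross-attention between two feature streams. It is not the paper's FSCA module: the feature shapes, the choice to attend across frequency bins within one frame, the shared projection matrices, and all names (`cross_attention`, `fullband`, `subband`, `w_q`, `w_k`, `w_v`) are assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_feats, context_feats, w_q, w_k, w_v):
    """Generic cross-attention: queries come from one stream,
    keys and values come from the other stream."""
    q = query_feats @ w_q                      # (num_freqs, dim)
    k = context_feats @ w_k                    # (num_freqs, dim)
    v = context_feats @ w_v                    # (num_freqs, dim)
    scores = q @ k.T / np.sqrt(q.shape[-1])    # (num_freqs, num_freqs)
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
num_freqs, dim = 8, 16                         # hypothetical sizes
fullband = rng.standard_normal((num_freqs, dim))   # stand-in global features
subband = rng.standard_normal((num_freqs, dim))    # stand-in local features

# Hypothetical projections; a trained model would learn these.
w_q, w_k, w_v = (rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(3))

# Subband features query the fullband features, and vice versa.
fused_subband, attn = cross_attention(subband, fullband, w_q, w_k, w_v)
fused_fullband, _ = cross_attention(fullband, subband, w_q, w_k, w_v)
```

Compared with plain concatenation, each output row here is a weighted mix of the other stream's features, with weights that depend on both streams. That is one way to read "interaction" between fullband and subband, though the paper's actual module may differ in structure and in the axis it attends over.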
Taxonomy
Topics: Speech and Audio Processing · Hand Gesture Recognition Systems · Indoor and Outdoor Localization Technologies
