Inter-channel Conv-TasNet for multichannel speech enhancement
Dongheon Lee, Seongrae Kim, and Jung-Woo Choi

TL;DR
This paper introduces a multichannel speech enhancement network based on Conv-TasNet that exploits inter-channel relationships and spatial information to improve speech quality and noise suppression.
Contribution
It extends Conv-TasNet into a multichannel framework that fully utilizes inter-channel relationships and spatial information, outperforming existing models with fewer parameters.
Findings
Outperforms state-of-the-art multichannel neural networks
Uses fewer parameters while achieving better enhancement
Significant improvements in SDR, PESQ, and STOI on CHiME-3
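As a reminder of what the SDR figure measures, the sketch below computes a plain energy-ratio signal-to-distortion ratio in decibels, SDR = 10 log10(||s||^2 / ||s - s_hat||^2). This is an illustration only: the function name `sdr_db` and the `eps` guard are assumptions made here, and the exact SDR variant used in the paper's CHiME-3 evaluation is not stated in this summary and may differ (for example, a BSS-eval style SDR).

```python
import numpy as np

def sdr_db(reference, estimate, eps=1e-12):
    """Plain energy-ratio SDR in dB (hypothetical helper, not the paper's code)."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    # Distortion is everything in the estimate that deviates from the clean reference.
    distortion = reference - estimate
    # eps guards against log(0) and division by zero for silent or perfect signals.
    signal_energy = np.sum(reference ** 2) + eps
    distortion_energy = np.sum(distortion ** 2) + eps
    return 10.0 * np.log10(signal_energy / distortion_energy)
```

A higher value means the enhanced signal is closer to the clean reference; an estimate whose error has one hundredth of the reference energy scores about 20 dB.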
Abstract
Speech enhancement in multichannel settings has been realized by utilizing the spatial information embedded in multiple microphone signals. Moreover, deep neural networks (DNNs) have recently advanced in this field; however, studies on efficient multichannel network structures that fully exploit spatial information and inter-channel relationships are still in their early stages. In this study, we propose an end-to-end time-domain speech enhancement network that can facilitate the use of inter-channel relationships at individual layers of a DNN. The proposed technique is based on a fully convolutional time-domain audio separation network (Conv-TasNet), originally developed for speech separation tasks. We extend Conv-TasNet into several forms that can handle multichannel input signals and learn inter-channel relationships. To this end, we modify the encoder-mask-decoder structures of…
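The encoder-mask-decoder idea mentioned in the abstract can be illustrated with a toy NumPy sketch for a multichannel input. Everything here is an assumption made for illustration: the function names, the random (untrained) weights, and in particular the channel-mixing sigmoid that stands in for the mask estimator. Conv-TasNet itself estimates masks with a temporal convolutional network, and the paper's actual inter-channel modifications are not described in the truncated abstract, so this is not the authors' architecture.

```python
import numpy as np

def frame(x, win, hop):
    """Slice a (C, T) multichannel signal into overlapping frames: (C, K, win)."""
    n_frames = 1 + (x.shape[1] - win) // hop
    idx = np.arange(win)[None, :] + hop * np.arange(n_frames)[:, None]  # (K, win)
    return x[:, idx]

def encode(x, enc_basis, hop):
    """Strided 1-D convolution per channel followed by ReLU: (C, N, K)."""
    frames = frame(x, enc_basis.shape[1], hop)
    return np.maximum(np.einsum("ckw,nw->cnk", frames, enc_basis), 0.0)

def inter_channel_mask(feats, mix):
    """Placeholder mask estimator: mix features across channels, then sigmoid.

    `mix` is a hypothetical (C, C) weight matrix; a trained network would
    replace this step.
    """
    mixed = np.einsum("dc,cnk->dnk", mix, feats)
    return 1.0 / (1.0 + np.exp(-mixed))

def decode(masked, dec_basis, hop):
    """Sum masked features over channels and overlap-add to one waveform."""
    n_frames = masked.shape[2]
    win = dec_basis.shape[1]
    frames = np.einsum("cnk,nw->kw", masked, dec_basis)  # (K, win)
    out = np.zeros(hop * (n_frames - 1) + win)
    for k in range(n_frames):
        out[k * hop:k * hop + win] += frames[k]
    return out

def enhance(x, enc_basis, dec_basis, mix, hop):
    """Encoder -> mask -> decoder pipeline producing a single-channel output."""
    feats = encode(x, enc_basis, hop)
    mask = inter_channel_mask(feats, mix)
    return decode(feats * mask, dec_basis, hop)
```

With a 4-channel input of 160 samples, a window of 16, and a hop of 8, `enhance` returns a single-channel waveform of 160 samples. The weights are untrained, so the output only demonstrates the data flow and tensor shapes, not actual enhancement.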
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis
Methods: Convolution
