Embedding Recurrent Layers with Dual-Path Strategy in a Variant of Convolutional Network for Speaker-Independent Speech Separation
Xue Yang, Changchun Bao

TL;DR
This paper introduces a novel neural network architecture combining RNNs and a variant of convolutional networks with a dual-path strategy, achieving effective speaker-independent speech separation while balancing performance and computational efficiency.
Contribution
It proposes embedding RNNs into a convolutional network variant using a dual-path strategy, enabling better local and global feature learning for speech separation.
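As a rough illustration of the dual-path idea as it is commonly described (split a long sequence into overlapping chunks, then alternate a recurrent pass inside each chunk with a recurrent pass across chunks), the sketch below uses plain Python. It is not the authors' architecture: the names `segment`, `toy_rnn`, and `dual_path` are hypothetical, the chunk and hop sizes are arbitrary, and the exponential running average only stands in for a trained RNN.

```python
from typing import Callable, List

Seq = List[float]


def segment(x: Seq, chunk: int, hop: int) -> List[Seq]:
    """Split x into overlapping chunks of length `chunk`, zero-padding the tail."""
    if chunk <= 0 or hop <= 0:
        raise ValueError("chunk and hop must be positive")
    # Ceiling division gives the number of hops needed to cover the sequence.
    n_chunks = -(-max(len(x) - chunk, 0) // hop) + 1
    padded = x + [0.0] * ((n_chunks - 1) * hop + chunk - len(x))
    return [padded[i * hop: i * hop + chunk] for i in range(n_chunks)]


def toy_rnn(seq: Seq, alpha: float = 0.5) -> Seq:
    """Stand-in for an RNN (assumption): h_t = alpha * h_{t-1} + (1 - alpha) * x_t."""
    h, out = 0.0, []
    for v in seq:
        h = alpha * h + (1 - alpha) * v
        out.append(h)
    return out


def dual_path(chunks: List[Seq], rnn: Callable[[Seq], Seq] = toy_rnn) -> List[Seq]:
    """Apply an intra-chunk pass, then an inter-chunk pass."""
    # Intra-chunk pass: recur along time inside each chunk (local context).
    intra = [rnn(c) for c in chunks]
    # Inter-chunk pass: recur across chunks at each within-chunk position
    # (global context), i.e. over the columns of the chunk matrix.
    columns = [rnn(list(col)) for col in zip(*intra)]
    # Transpose back to the (n_chunks, chunk) layout.
    return [list(row) for row in zip(*columns)]


signal = [float(i) for i in range(10)]
chunks = segment(signal, chunk=4, hop=2)
mixed = dual_path(chunks)
```

Here each recurrent pass only sees a sequence of length `chunk` or `n_chunks` rather than the full signal length, which is the usual motivation for the dual-path layout on long inputs.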
Findings
Effective separation performance on various datasets
Achieves a good balance between accuracy and computational efficiency
Gradual separation at multiple scales improves results
Abstract
Speaker-independent speech separation has achieved remarkable performance in recent years with the development of deep neural networks (DNNs). Various network architectures, from traditional convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to advanced transformers, have been carefully designed to improve separation performance. However, state-of-the-art models usually suffer from several computational drawbacks, such as large model size, high memory consumption, and high computational complexity. To balance performance against computational efficiency and to further explore the modeling ability of traditional network structures, we combine an RNN with a newly proposed variant of convolutional network to address the speech separation problem. By embedding two RNNs into the basic block of this variant with the help of a dual-path strategy, the proposed…
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing
