TL;DR
This paper introduces the GC3 design, combining group communication and context codec modules, to reduce model size and complexity in neural speech separation models while maintaining or improving performance.
Contribution
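The paper proposes two simple modules, group communication and context codec (together, GC3), that can be applied to a wide range of speech separation architectures to jointly reduce model size and complexity without sacrificing separation performance.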
Findings
Achieves comparable or better performance with as little as 2.5% of the original model size.
Reduces model complexity to as little as 17.6% of the original across several architectures.
Demonstrates effectiveness on multiple speech separation benchmarks.
Abstract
Despite the recent progress on neural network architectures for speech separation, the balance between the model size, model complexity and model performance is still an important and challenging problem for the deployment of such models to low-resource platforms. In this paper, we propose two simple modules, group communication and context codec, that can be easily applied to a wide range of architectures to jointly decrease the model size and complexity without sacrificing the performance. A group communication module splits a high-dimensional feature into groups of low-dimensional features and captures the inter-group dependency. A separation module with a significantly smaller model size can then be shared by all the groups. A context codec module, containing a context encoder and a context decoder, is designed as a learnable downsampling and upsampling module to decrease the length…
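The group-splitting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the mean-based inter-group exchange, and the single square linear layer used for the parameter count are all assumptions chosen to keep the example small. It shows only the general pattern of splitting a feature into groups, letting the groups exchange information, and reusing one small module for every group.

```python
def split_into_groups(feature, num_groups):
    """Split a flat feature vector into equal-sized low-dimensional groups."""
    if len(feature) % num_groups != 0:
        raise ValueError("feature dimension must be divisible by num_groups")
    size = len(feature) // num_groups
    return [feature[i * size:(i + 1) * size] for i in range(num_groups)]


def communicate(groups):
    """Hypothetical stand-in for inter-group dependency modeling.

    Adds the across-group mean to every group. The paper's module is
    learned; this fixed rule is only an assumption for illustration.
    """
    num_groups = len(groups)
    size = len(groups[0])
    mean = [sum(g[d] for g in groups) / num_groups for d in range(size)]
    return [[g[d] + mean[d] for d in range(size)] for g in groups]


def shared_linear(group, weight):
    """Apply one small weight matrix to a group (same weights for all groups)."""
    return [sum(w * x for w, x in zip(row, group)) for row in weight]


def dense_params(dim):
    """Parameter count of a square linear layer without bias."""
    return dim * dim


# Illustrative sizes only; these are not values taken from the paper.
feature_dim, num_groups = 128, 16
group_dim = feature_dim // num_groups

full_params = dense_params(feature_dim)    # one layer over the full feature
shared_params = dense_params(group_dim)    # one small layer shared by all groups
ratio = shared_params / full_params        # equals 1 / num_groups**2 for this layer

groups = communicate(split_into_groups(list(range(8)), 4))
identity = [[1, 0], [0, 1]]
outputs = [shared_linear(g, identity) for g in groups]
```

For a square linear layer, sharing one module across K groups shrinks the parameter count by a factor of K squared, which gives some intuition for how a large size reduction is possible. The ratio computed here describes this toy layer only and should not be read as the paper's reported figures.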
