Dynamic Slimmable Networks for Efficient Speech Separation
Mohamed Elminshawi, Srikanth Raj Chetupalli, Emanuël A. P. Habets

TL;DR
This paper introduces a dynamic slimmable network for speech separation that adaptively adjusts its computational complexity based on input characteristics, improving efficiency without sacrificing performance.
Contribution
It presents a dynamic slimmable network (DSN) that combines a slimmable network with a gating module and a signal-dependent loss for adaptive speech separation.
Findings
Achieves a better performance-efficiency trade-off than static networks.
Effective on both clean and noisy speech datasets.
Reduces unnecessary computation for simpler segments.
Abstract
Recent progress in speech separation has been largely driven by advances in deep neural networks, yet their high computational and memory requirements hinder deployment on resource-constrained devices. A significant inefficiency in conventional systems arises from using static network architectures that maintain constant computational complexity across all input segments, regardless of their characteristics. This approach is sub-optimal for simpler segments that do not require intensive processing, such as silence or non-overlapping speech. To address this limitation, we propose a dynamic slimmable network (DSN) for speech separation that adaptively adjusts its computational complexity based on the input signal. The DSN combines a slimmable network, which can operate at different network widths, with a lightweight gating module that dynamically determines the required width by analyzing…
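The idea of a slimmable network paired with a gate can be illustrated with a small sketch. This is not the authors' implementation: the width ratios, the layer sizes, the class and function names, and in particular the energy-based gate are all hypothetical stand-ins chosen for illustration, since the abstract is truncated before it states what the gating module analyzes. The sketch only shows the general mechanism, in which one set of weights is sliced to run at several widths and a cheap per-segment decision picks the width.

```python
import numpy as np

# Hypothetical width ratios; the paper's actual choices are not given here.
WIDTHS = (0.25, 0.5, 1.0)


class SlimmableLinear:
    """Linear layer whose leading slice of weights can be used on its own."""

    def __init__(self, in_features, out_features, rng):
        self.weight = rng.standard_normal((out_features, in_features)) * 0.1
        self.bias = np.zeros(out_features)

    def forward(self, x, width):
        # Keep only the first `width` fraction of the output units, and only
        # as many input columns as the (possibly slimmed) input provides.
        n_out = max(1, int(round(width * self.weight.shape[0])))
        n_in = x.shape[-1]
        return x @ self.weight[:n_out, :n_in].T + self.bias[:n_out]


def toy_gate(segment, thresholds=(1e-4, 1e-2)):
    """Hypothetical gate: maps mean segment energy to a width index.

    A real gating module would be a small learned network; energy is used
    here purely as an assumed placeholder criterion.
    """
    energy = float(np.mean(segment ** 2))
    return int(np.searchsorted(thresholds, energy))


def process(segment, layers, gate=toy_gate):
    """Run one segment through the stack at the width chosen by the gate."""
    width = WIDTHS[gate(segment)]
    h = segment
    for layer in layers[:-1]:
        h = np.maximum(layer.forward(h, width), 0.0)  # ReLU on slimmed units
    # The last layer keeps its full output size so the result shape is fixed.
    return layers[-1].forward(h, 1.0), width


rng = np.random.default_rng(0)
layers = [
    SlimmableLinear(16, 32, rng),
    SlimmableLinear(32, 32, rng),
    SlimmableLinear(32, 16, rng),
]
_, width_silence = process(np.zeros(16), layers)  # all-zero "silent" segment
_, width_loud = process(np.ones(16), layers)      # high-energy segment
print(width_silence, width_loud)
```

In this toy setup a silent segment is routed through the narrowest slice and a high-energy segment through the full width, which mirrors the motivation in the abstract that simpler segments should cost less computation.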
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Voice and Speech Disorders
