Scalable Speech Enhancement with Dynamic Channel Pruning
Riccardo Miccini, Clement Laroche, Tobias Piechowiak, Luca Pezzarossa

TL;DR
This paper introduces Dynamic Channel Pruning for speech enhancement, which lets a neural network reduce its computation adaptively at runtime. When trained to use only 25% of channels, the model saves 29.6% of MACs at the cost of a 0.75% drop in PESQ, which makes deployment on embedded devices more practical.
Contribution
The paper presents the first application of Dynamic Channel Pruning in the audio domain, specifically for speech enhancement, so that the amount of computation adapts to the input at runtime.
Findings
29.6% reduction in MACs when the model is trained to use only 25% of channels
Only a 0.75% drop in PESQ in that configuration
Enables deployment of larger models on resource-constrained devices
Abstract
Speech Enhancement (SE) is essential for improving productivity in remote collaborative environments. Although deep learning models are highly effective at SE, their computational demands make them impractical for embedded systems. Furthermore, acoustic conditions can change significantly in terms of difficulty, whereas neural networks are usually static with regard to the amount of computation performed. To this end, we introduce Dynamic Channel Pruning (DynCP) to the audio domain for the first time and apply it to a custom convolutional architecture for SE. Our approach works by identifying unnecessary convolutional channels at runtime and saving computational resources by neither computing the activations for these channels nor retrieving their filters. When trained to only use 25% of channels, we save 29.6% of MACs while only causing a 0.75% drop in PESQ. Thus, DynCP offers a promising path…
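The runtime channel selection described in the abstract can be illustrated with a small sketch. This is not the authors' architecture or gating mechanism: the function name `gated_pointwise_conv`, the gate matrix `gate_w`, the time-averaged saliency predictor, and the top-k selection rule are all assumptions chosen for illustration, and a pointwise (1x1) convolution stands in for the paper's custom convolutional layers.

```python
import numpy as np

def gated_pointwise_conv(x, weight, gate_w, keep_ratio=0.25):
    """Hypothetical dynamically pruned 1x1 convolution.

    x:      input features, shape (C_in, T)
    weight: convolution filters, shape (C_out, C_in)
    gate_w: weights of an assumed linear saliency predictor, shape (C_out, C_in)
    """
    c_out, c_in = weight.shape
    n_frames = x.shape[1]
    k = max(1, int(round(keep_ratio * c_out)))

    # Cheap per-channel saliency score from the time-averaged input
    # (an assumption; a streaming system would need a causal summary).
    saliency = gate_w @ x.mean(axis=1)

    # Keep the k highest-scoring output channels for this input.
    keep = np.sort(np.argsort(saliency)[-k:])

    # Only the selected filters are read and multiplied; pruned
    # channels are left at zero without being computed.
    y = np.zeros((c_out, n_frames), dtype=x.dtype)
    y[keep] = weight[keep] @ x

    # Rough MAC counts: dense layer vs. pruned layer plus gate overhead.
    macs_dense = c_out * c_in * n_frames
    macs_used = k * c_in * n_frames + gate_w.size
    return y, keep, macs_used, macs_dense

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 16))
weight = rng.standard_normal((16, 8))
gate_w = rng.standard_normal((16, 8))
y, keep, macs_used, macs_dense = gated_pointwise_conv(x, weight, gate_w)
print(f"kept channels: {keep.tolist()}, MACs: {macs_used} of {macs_dense}")
```

The gate itself costs computation, so the MAC count of the pruned layer includes the predictor's overhead on top of the channels that are actually computed. The MAC figures printed here come from this toy layer only and say nothing about the numbers reported in the paper.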
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Infant Health and Development
Methods: Pruning
