Convexity-based Pruning of Speech Representation Models
Teresa Dorszewski, Lenka Tětková, Lars Kai Hansen

TL;DR
This paper introduces a convexity-based layer pruning method for speech transformer models, significantly reducing computational costs without performance loss, thus enhancing efficiency for real-world applications.
Contribution
It proposes a novel convexity criterion for layer pruning in speech models, demonstrating effective reduction in model complexity while maintaining or improving performance.
Findings
Layer pruning achieves a large reduction in computational effort.
Pruned models show no loss in performance, and in some cases improve.
The convexity criterion effectively guides pruning decisions.
Abstract
Speech representation models based on the transformer architecture and trained by self-supervised learning have shown great promise for solving tasks such as speech and speaker recognition, keyword spotting, emotion detection, and more. Typically, it is found that larger models lead to better performance. However, the significant computational effort involved in such large transformer systems is a challenge for embedded and real-world applications. Recent work has shown that there is significant redundancy in the transformer models for NLP and massive layer pruning is feasible (Sajjad et al., 2023). Here, we investigate layer pruning in audio models. We base the pruning decision on a convexity criterion. Convexity of classification regions has recently been proposed as an indicator of subsequent fine-tuning performance in a range of application domains, including NLP and audio. In…
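As a rough illustration of the idea in the abstract, the sketch below scores each layer's representations with a simplified convexity proxy and picks a pruning depth from those scores. Everything here is an assumption made for illustration, not the authors' implementation: the function names `convexity_score` and `select_pruning_depth` are hypothetical, the proxy uses straight-line (Euclidean) interpolation between same-class points rather than whatever convexity measure the paper uses, and the rule "keep layers up to the most convex one" is one plausible reading of a convexity-guided pruning decision.

```python
import numpy as np

def convexity_score(feats, labels, n_pairs=200, n_steps=5, seed=0):
    """Simplified Euclidean convexity proxy (an assumption, not the paper's measure).

    Samples same-class pairs, interpolates points on the segment between them,
    and returns the fraction of interpolated points whose nearest neighbour
    (excluding the two endpoints) carries the same class label.
    """
    rng = np.random.default_rng(seed)
    hits, total = 0, 0
    for _ in range(n_pairs):
        i = int(rng.integers(len(feats)))
        j = int(rng.choice(np.flatnonzero(labels == labels[i])))
        if i == j:
            continue  # need two distinct points of the same class
        # Interior points of the segment between feats[i] and feats[j].
        for t in np.linspace(0.0, 1.0, n_steps + 2)[1:-1]:
            p = (1.0 - t) * feats[i] + t * feats[j]
            dists = np.linalg.norm(feats - p, axis=1)
            dists[[i, j]] = np.inf  # do not count the endpoints themselves
            hits += int(labels[int(np.argmin(dists))] == labels[i])
            total += 1
    return hits / total if total else 0.0

def select_pruning_depth(layer_feats, labels):
    """Hypothetical rule: keep layers up to and including the most convex one."""
    scores = [convexity_score(f, labels) for f in layer_feats]
    keep = int(np.argmax(scores)) + 1  # number of layers to retain
    return keep, scores
```

Here `layer_feats` would be a list with one `(n_samples, dim)` array of pooled hidden states per transformer layer, and `labels` the class label of each sample. The returned `keep` is the number of leading layers to retain; the layers after it would be the candidates for removal before fine-tuning.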
Taxonomy
Methods: Pruning · Balanced Selection
