AdaPTwin: Low-Cost Adaptive Compression of Product Twins in Transformers
Emil Biju, Anirudh Sriram, Mert Pilanci

TL;DR
AdaPTwin is a low-cost, adaptive low-rank compression method for transformer speech recognition models. It reduces model size while largely preserving accuracy, needs little data and time for fine-tuning, and is suited to resource-constrained settings.
Contribution
The paper introduces AdaPTwin, a low-rank adaptive compression technique that jointly compresses product-dependent pairs of weight matrices in the transformer attention layer. It needs little data and time for fine-tuning, which makes speech recognition models cheaper to deploy.
Findings
Compresses the Whisper and Distil-Whisper models by up to 45% with less than a 2% increase in word error rate (WER).
Requires only 8 hours of speech data for fine-tuning.
Fine-tuning takes under 20 minutes.
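For readers unfamiliar with the metric in the first finding, the following is a minimal sketch of how word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions and are not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one reference word is missing from the hypothesis.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(f"WER = {wer:.3f}")
```

In this example one of the six reference words is dropped, so the WER is 1/6.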
Abstract
While large transformer-based models have exhibited remarkable performance in speaker-independent speech recognition, their large size and computational requirements make them expensive or impractical to use in resource-constrained settings. In this work, we propose a low-rank adaptive compression technique called AdaPTwin that jointly compresses product-dependent pairs of weight matrices in the transformer attention layer. Our approach can prioritize the compressed model's performance on a specific speaker while maintaining generalizability to new speakers and acoustic conditions. Notably, our technique requires only 8 hours of speech data for fine-tuning, which can be accomplished in under 20 minutes, making it highly cost-effective compared to other compression methods. We demonstrate the efficacy of our approach by compressing the Whisper and Distil-Whisper models by up to 45% while…
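To make the idea of jointly compressing a product-dependent pair of matrices concrete, here is a minimal NumPy sketch. It assumes the pair enters the model only through a product such as W_Q W_K^T and uses a plain truncated SVD of that product. This is an illustration of the general low-rank idea, not the authors' exact procedure, and it omits the fine-tuning step. The function name `compress_product_pair`, the matrix shapes, and the chosen rank are all assumptions made for the example.

```python
import numpy as np


def compress_product_pair(w_a, w_b, rank):
    """Return (a_r, b_r) such that a_r @ b_r.T is the best rank-`rank`
    approximation (in Frobenius norm) of the product w_a @ w_b.T."""
    product = w_a @ w_b.T
    u, s, vt = np.linalg.svd(product, full_matrices=False)
    root = np.sqrt(s[:rank])       # split singular values evenly between factors
    a_r = u[:, :rank] * root       # shape (rows of w_a, rank)
    b_r = vt[:rank, :].T * root    # shape (rows of w_b, rank)
    return a_r, b_r


# Hypothetical sizes: model width 64, head width 16, target rank 8.
rng = np.random.default_rng(0)
d_model, d_head, rank = 64, 16, 8
w_q = rng.standard_normal((d_model, d_head))
w_k = rng.standard_normal((d_model, d_head))

a_r, b_r = compress_product_pair(w_q, w_k, rank)

exact = w_q @ w_k.T
rel_err = np.linalg.norm(exact - a_r @ b_r.T) / np.linalg.norm(exact)
params_before = w_q.size + w_k.size
params_after = a_r.size + b_r.size
print(f"parameters: {params_before} -> {params_after}, relative error {rel_err:.3f}")
```

Because only the product matters, truncating the SVD of the product gives the smallest possible Frobenius error for a given rank, which compressing each matrix separately does not guarantee. Random matrices like these have no low-rank structure, so the error here is large; the sketch shows the mechanics only and says nothing about the accuracy reported in the paper.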
Taxonomy
Topics: Business Process Modeling and Analysis · Product Development and Customization · Web Data Mining and Analysis
