AdaPTwin: Low-Cost Adaptive Compression of Product Twins in Transformers

Emil Biju; Anirudh Sriram; Mert Pilanci

arXiv:2406.08904·cs.LG·June 14, 2024

Open Access

TL;DR

AdaPTwin is a low-cost, adaptive low-rank compression method for transformer-based speech recognition models. It reduces model size by up to 45% with less than a 2% increase in word error rate (WER), and its fine-tuning step needs only 8 hours of speech data and under 20 minutes, which makes it a fit for resource-constrained settings.

Contribution

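The paper introduces AdaPTwin, a low-rank adaptive compression technique that jointly compresses product-dependent pairs of weight matrices in the transformer attention layer. The compressed model can be tuned to prioritize a specific speaker while remaining generalizable to new speakers and acoustic conditions, and the fine-tuning step requires little data and time.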

Findings

01. Compresses the Whisper and Distil-Whisper models by up to 45% with less than a 2% increase in word error rate (WER).

02. Requires only 8 hours of speech data for fine-tuning.

03. Completes fine-tuning in under 20 minutes.
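
The first finding is stated in terms of word error rate. As a reminder of what that metric measures, here is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` is hypothetical and is not taken from the paper. Real evaluations typically also normalize casing and punctuation first, which this sketch does not do.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper only; the name and signature are assumptions.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

Comparing this value for a model before and after compression gives the kind of WER change the finding refers to.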

Abstract

While large transformer-based models have exhibited remarkable performance in speaker-independent speech recognition, their large size and computational requirements make them expensive or impractical to use in resource-constrained settings. In this work, we propose a low-rank adaptive compression technique called AdaPTwin that jointly compresses product-dependent pairs of weight matrices in the transformer attention layer. Our approach can prioritize the compressed model's performance on a specific speaker while maintaining generalizability to new speakers and acoustic conditions. Notably, our technique requires only 8 hours of speech data for fine-tuning, which can be accomplished in under 20 minutes, making it highly cost-effective compared to other compression methods. We demonstrate the efficacy of our approach by compressing the Whisper and Distil-Whisper models by up to 45% while…
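
The abstract does not spell out the algorithm, so the following is only a generic sketch of what jointly compressing a product-dependent pair could look like, assuming a plain truncated SVD of the product of two weight matrices. It is not the authors' implementation, and it omits the adaptive fine-tuning step entirely. The function name `compress_product_pair`, the matrix names, and the chosen dimensions are all hypothetical.

```python
import numpy as np


def compress_product_pair(w_a, w_b, rank):
    """Factor the product w_a @ w_b.T into two thinner matrices.

    Assumption: a truncated SVD of the product, which gives the best
    rank-`rank` approximation in the Frobenius norm (Eckart-Young).
    Returns (a_r, b_r) with a_r @ b_r.T approximating w_a @ w_b.T.
    """
    product = w_a @ w_b.T
    u, s, vt = np.linalg.svd(product, full_matrices=False)
    # Split each kept singular value evenly between the two factors.
    sqrt_s = np.sqrt(s[:rank])
    a_r = u[:, :rank] * sqrt_s
    b_r = vt[:rank, :].T * sqrt_s
    return a_r, b_r


# Hypothetical sizes: two (d_model x d_head) matrices whose product matters.
rng = np.random.default_rng(0)
d_model, d_head, rank = 64, 16, 8
w_q = rng.standard_normal((d_model, d_head))
w_k = rng.standard_normal((d_model, d_head))

a_r, b_r = compress_product_pair(w_q, w_k, rank)

original_params = w_q.size + w_k.size
compressed_params = a_r.size + b_r.size
target = w_q @ w_k.T
relative_error = np.linalg.norm(target - a_r @ b_r.T) / np.linalg.norm(target)
```

Because only the product enters the computation, approximating the product directly can never give a larger Frobenius error than truncating each matrix to the same rank separately and then multiplying. With random weights the error at half rank is substantial. How much error trained weights incur, and how much fine-tuning recovers, is what the paper evaluates.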

Taxonomy

Topics: Business Process Modeling and Analysis · Product Development and Customization · Web Data Mining and Analysis