On-demand compute reduction with stochastic wav2vec 2.0
Apoorv Vyas, Wei-Ning Hsu, Michael Auli, Alexei Baevski

TL;DR
This paper introduces stochastic compression techniques for wav2vec 2.0 models, enabling on-demand compute reduction during inference with minimal impact on accuracy, and demonstrates effective fine-tuning for specific configurations.
Contribution
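The paper proposes stochastic compression for wav2vec 2.0: instead of a fixed squeeze factor, the squeeze factor is sampled uniformly during training, and query and key-value pooling can be applied to each transformer layer for further compression. A single model can then trade accuracy against compute at inference time.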
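The idea of sampling the squeeze factor per training step can be sketched as follows. This is a minimal illustration, not the authors' implementation: the candidate factors `(1, 2, 3, 4)`, the helper names, and the use of mean pooling as the downsampling operator are all assumptions made for the example.

```python
import random


def mean_pool(frames, factor):
    """Average-pool a list of feature vectors along time by `factor`.

    The last window may be shorter than `factor`; it is averaged over
    the frames it actually contains.
    """
    pooled = []
    for start in range(0, len(frames), factor):
        window = frames[start:start + factor]
        dim = len(window[0])
        pooled.append([sum(f[d] for f in window) / len(window) for d in range(dim)])
    return pooled


def stochastic_squeeze(frames, choices=(1, 2, 3, 4), rng=random):
    """Sample a squeeze factor uniformly from `choices` and apply it.

    `choices` is a hypothetical candidate set chosen for illustration.
    Returns the sampled factor and the shortened sequence that would be
    fed to the transformer encoder.
    """
    factor = rng.choice(list(choices))
    return factor, mean_pool(frames, factor)


# Eight 2-dimensional frames; each training step may see a different length.
frames = [[float(t), float(2 * t)] for t in range(8)]
factor, squeezed = stochastic_squeeze(frames, rng=random.Random(0))
print(factor, len(squeezed))
```

At inference time the factor would be fixed by the caller instead of sampled, which is what makes the compute reduction "on demand" for one trained model.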
Findings
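A single stochastic model yields a smooth trade-off between word error rate (WER) and inference time.
WER degradation is only marginal compared to W2V2 and SEW models trained for a specific setting.
Fine-tuning the stochastic model for a specific configuration recovers WER while keeping significant computational savings.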
Abstract
Squeeze and Efficient Wav2vec (SEW) is a recently proposed architecture that squeezes the input to the transformer encoder for compute efficient pre-training and inference with wav2vec 2.0 (W2V2) models. In this work, we propose stochastic compression for on-demand compute reduction for W2V2 models. As opposed to using a fixed squeeze factor, we sample it uniformly during training. We further introduce query and key-value pooling mechanisms that can be applied to each transformer layer for further compression. Our results for models pre-trained on 960h Librispeech dataset and fine-tuned on 10h of transcribed data show that using the same stochastic model, we get a smooth trade-off between word error rate (WER) and inference time with only marginal WER degradation compared to the W2V2 and SEW models trained for a specific setting. We further show that we can fine-tune the same…
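To see why pooling queries and key-values reduces compute, it helps to count the entries of the attention score matrix, which has one entry per (query, key) pair. The sketch below does only that arithmetic. It is an illustration under stated assumptions, not the paper's cost model: the function names and the pooling factors are hypothetical, and projection and feed-forward costs are ignored.

```python
import math


def attention_score_count(seq_len, query_pool=1, kv_pool=1):
    """Number of query-key scores in one attention layer.

    Assumes pooling by a factor p shortens a length-n sequence to
    ceil(n / p) positions. `query_pool` and `kv_pool` are illustrative
    parameter names, not taken from the paper.
    """
    n_q = math.ceil(seq_len / query_pool)
    n_kv = math.ceil(seq_len / kv_pool)
    return n_q * n_kv


def relative_attention_cost(seq_len, query_pool=1, kv_pool=1):
    """Score count relative to the unpooled layer (1.0 means no saving)."""
    return attention_score_count(seq_len, query_pool, kv_pool) / attention_score_count(seq_len)


for q_pool, kv_pool in [(1, 1), (1, 2), (2, 2)]:
    print(q_pool, kv_pool, relative_attention_cost(100, q_pool, kv_pool))
```

Under these assumptions, pooling only keys and values by 2 halves the score matrix, while pooling both queries and key-values by 2 quarters it, which gives a rough sense of how the two mechanisms compound.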
Taxonomy
Topics: Advanced Neural Network Applications · Topic Modeling · Ferroelectric and Negative Capacitance Devices
