Efficient Sample-Specific Encoder Perturbations
Yassir Fathullah, Mark J. F. Gales

TL;DR
This paper introduces a lightweight, sample-specific perturbation method for encoder-decoder models that improves performance on tasks such as machine translation and speech recognition without retraining the foundation model.
Contribution
A novel inference-efficient approach in which a small proxy network perturbs the encoder output of a frozen foundation model, sample by sample, so that the decoder generates improved decodings.
Findings
Consistent improvement in COMET scores for Flan-T5 on machine translation.
Reduced WER for Whisper foundation models on speech recognition.
Proxies are robust across different data domains.
Abstract
Encoder-decoder foundation models have displayed state-of-the-art performance on a range of autoregressive sequence tasks. This paper proposes a simple, lightweight, and inference-efficient modification to such systems that controls their behaviour according to a specific attribute of interest. Specifically, we show that a small proxy network can be used to find a sample-by-sample perturbation of the encoder output of a frozen foundation model to trigger the decoder to generate improved decodings. This work explores a specific realization of this framework focused on improving the COMET performance of Flan-T5 on Machine Translation and the WER of Whisper foundation models on Speech Recognition. Results display consistent improvements in performance…
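The abstract does not spell out how the perturbation is computed, so the following is only a minimal sketch of one plausible reading: a small proxy scores the encoder output, and the encoder output is nudged a fixed-norm step along the gradient of that score before it is handed to the frozen decoder. The proxy architecture (mean pooling plus one tanh layer), the function names `proxy_score`, `proxy_grad`, and `perturb`, the `step` parameter, and all weights and toy values are assumptions made for illustration, not details taken from the paper.

```python
import math

def proxy_score(H, W1, b1, w2):
    """Hypothetical proxy: mean-pool encoder states, one tanh layer, scalar score."""
    T, d = len(H), len(H[0])
    pooled = [sum(h[j] for h in H) / T for j in range(d)]
    hidden = [math.tanh(sum(W1[i][j] * pooled[j] for j in range(d)) + b1[i])
              for i in range(len(W1))]
    return sum(w2[i] * hidden[i] for i in range(len(w2))), hidden

def proxy_grad(H, W1, b1, w2):
    """Analytic gradient of the proxy score w.r.t. every encoder state."""
    T, d = len(H), len(H[0])
    _, hidden = proxy_score(H, W1, b1, w2)
    # d score / d pooled = W1^T (w2 * (1 - hidden^2))
    delta = [w2[i] * (1.0 - hidden[i] ** 2) for i in range(len(w2))]
    g_pooled = [sum(W1[i][j] * delta[i] for i in range(len(W1))) for j in range(d)]
    # Mean pooling spreads the gradient equally over the T positions.
    return [[g / T for g in g_pooled] for _ in range(T)]

def perturb(H, W1, b1, w2, step=0.1):
    """Assumed update rule: move H a step of norm `step` along the proxy gradient."""
    G = proxy_grad(H, W1, b1, w2)
    norm = math.sqrt(sum(g * g for row in G for g in row))
    if norm == 0.0:
        return [row[:] for row in H]
    return [[h + step * g / norm for h, g in zip(hr, gr)] for hr, gr in zip(H, G)]

# Toy stand-in for a frozen encoder's output: 3 positions, 2 dimensions.
H = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]
W1 = [[0.7, -0.3], [0.2, 0.9]]  # illustrative proxy weights
b1 = [0.0, 0.1]
w2 = [1.0, -0.5]

H_new = perturb(H, W1, b1, w2)
before, _ = proxy_score(H, W1, b1, w2)
after, _ = proxy_score(H_new, W1, b1, w2)
print(after > before)
```

In this reading the foundation model's weights are never touched: only the small proxy is evaluated and differentiated per sample, which is where the inference efficiency would come from. In a real system `H_new` would replace the encoder output passed to the decoder's cross-attention.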
Taxonomy
Topics: Advanced Data Compression Techniques
Methods: Flan-T5
