Addressing Distribution Shift at Test Time in Pre-trained Language Models
Ayush Singh, John E. Ortega

TL;DR
This paper introduces MEMO-CL, a test-time adaptation method for pre-trained language models that uses unsupervised data augmentation to improve performance under distribution shift without requiring labels or additional data.
Contribution
The paper proposes MEMO-CL, a novel unsupervised, domain-agnostic test-time adaptation technique leveraging data augmentation to enhance PLM robustness against distribution shifts.
Findings
Achieves a 3% performance improvement over existing baselines.
Operates efficiently on a batch of augmented samples from a single test observation.
Requires no additional data or labels for adaptation.
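The findings above describe adapting on a batch of augmented copies of one test observation, but this summary does not state the objective being optimized. A minimal sketch follows, under the assumption (suggested by the name, not confirmed here) that MEMO-CL builds on MEMO-style marginal entropy minimization: average the predicted class distributions over the augmented copies and use the entropy of that average as a label-free loss. The function names and the example logits are hypothetical.

```python
import math

def softmax(logits):
    """Convert one row of logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def marginal_entropy(batch_logits):
    """Entropy of the mean predictive distribution over augmented copies.

    batch_logits: one row of class logits per augmented copy of a single
    test input (assumed to come from some classifier; not specified here).
    """
    probs = [softmax(row) for row in batch_logits]
    n, k = len(probs), len(probs[0])
    mean = [sum(p[j] for p in probs) / n for j in range(k)]
    return -sum(q * math.log(q) for q in mean if q > 0.0)

# Hypothetical logits: copies that agree vs. copies that disagree.
agree = marginal_entropy([[5.0, 0.0], [5.0, 0.0]])
disagree = marginal_entropy([[5.0, 0.0], [0.0, 5.0]])
```

In this sketch, a test-time adaptation step would update the model parameters to lower this value. That needs no labels, because the loss depends only on the model's own predictions across the augmented copies.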
Abstract
State-of-the-art pre-trained language models (PLMs) outperform other models on the majority of language processing tasks. However, PLMs have been found to degrade in performance under distribution shift, a phenomenon that occurs when data at test time does not come from the same distribution as the source training set. Equally challenging is the task of obtaining labels in real time, owing to issues such as long labeling feedback loops. The lack of adequate methods for these challenges creates a need for approaches that continuously adapt the PLM to a distinct distribution. Unsupervised domain adaptation adapts a source model to an unseen, unlabeled target domain. While some techniques such as data augmentation can adapt models in several scenarios, they have only been sparsely studied for addressing the distribution shift problem. In this…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Test
