Addressing Distribution Shift at Test Time in Pre-trained Language Models

Ayush Singh; John E. Ortega

arXiv:2212.02384·cs.CL·December 6, 2022

TL;DR

This paper introduces MEMO-CL, a test-time adaptation method for pre-trained language models that uses unsupervised data augmentation to improve performance under distribution shift without requiring labels or additional data.

Contribution

The paper proposes MEMO-CL, a novel unsupervised, domain-agnostic test-time adaptation technique leveraging data augmentation to enhance PLM robustness against distribution shifts.

Findings

01

Achieves a 3% performance improvement over existing baselines.

02

Operates efficiently on a batch of augmented samples from a single test observation.

03

Requires no additional data or labels for adaptation.
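
The findings above describe adapting on a batch of augmented copies of a single, unlabeled test observation. This page does not state the adaptation objective, so the sketch below rests on an assumption: it uses a MEMO-style marginal entropy over the augmented predictions, as the method's name suggests. The helpers `word_dropout`, `augment_batch`, and `marginal_entropy` are hypothetical names chosen for illustration, and word dropout is a stand-in augmentation, not necessarily the one used in the paper.

```python
import math
import random


def word_dropout(text, p=0.2, rng=None):
    """Hypothetical augmentation: drop each token with probability p."""
    rng = rng or random.Random(0)
    tokens = text.split()
    kept = [t for t in tokens if rng.random() >= p]
    # Fall back to the original text if every token was dropped.
    return " ".join(kept) if kept else text


def augment_batch(text, n=8, p=0.2, seed=0):
    """Build a batch of n augmented copies from one test observation."""
    rng = random.Random(seed)
    return [word_dropout(text, p, rng) for _ in range(n)]


def marginal_entropy(prob_rows):
    """Entropy of the class distribution averaged over augmented copies.

    prob_rows holds one probability vector per augmented copy, as produced
    by the model being adapted. Assumed objective (MEMO-style): a lower
    value means the model predicts consistently and confidently across
    the augmentations.
    """
    n = len(prob_rows)
    num_classes = len(prob_rows[0])
    marginal = [sum(row[c] for row in prob_rows) / n for c in range(num_classes)]
    return -sum(p * math.log(p) for p in marginal if p > 0.0)
```

In a full adaptation loop, the probability vectors would come from the PLM applied to `augment_batch(x)`, and a gradient step on `marginal_entropy` would update the model before it predicts on `x`. No labels or extra data enter the computation, which matches the third finding.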

Abstract

State-of-the-art pre-trained language models (PLMs) outperform other models on the majority of language processing tasks. However, PLMs have been found to degrade in performance under distribution shift, a phenomenon that occurs when test-time data does not come from the same distribution as the source training set. Equally challenging is obtaining labels in real time, owing to issues such as long labeling feedback loops. The lack of adequate methods for these challenges creates a need for approaches that continuously adapt the PLM to a distinct distribution. Unsupervised domain adaptation adapts a source model to an unseen and unlabeled target domain. While some techniques such as data augmentation can adapt models in several scenarios, they have only been sparsely studied for addressing the distribution shift problem. In this…

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Test