Learning Neural Models for Natural Language Processing in the Face of Distributional Shift
Paul Michel

TL;DR
This thesis investigates the impact of distributional shift on NLP models, proposes benchmarks and metrics to evaluate robustness, and develops methods for improving model adaptation and robustness in changing data environments.
Contribution
It characterizes types of distributional shift in NLP, introduces benchmarks and metrics, and proposes robust training and adaptation methods inspired by distributionally robust optimization and information geometry.
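As a rough illustration of the idea behind distributionally robust optimization mentioned above, the sketch below contrasts the usual average loss with a worst-case loss taken over groups of examples. This is a generic textbook-style example, not the formulation used in the thesis: the function names, the group labels ("news", "forum") and the loss values are all hypothetical.

```python
from collections import defaultdict


def average_loss(losses):
    """Standard objective: mean loss over all examples."""
    return sum(losses) / len(losses)


def worst_group_loss(losses, groups):
    """Robust objective (illustrative): mean loss of the worst-off group.

    `groups` assigns each example to a hypothetical sub-population,
    e.g. a text domain. This is an assumption made for the sketch.
    """
    per_group = defaultdict(list)
    for loss, group in zip(losses, groups):
        per_group[group].append(loss)
    return max(sum(v) / len(v) for v in per_group.values())


# Made-up per-example losses and domain labels.
losses = [0.25, 0.75, 1.0, 2.0]
groups = ["news", "news", "forum", "forum"]

print(average_loss(losses))              # mean over all four examples
print(worst_group_loss(losses, groups))  # mean over the "forum" group only
```

A model trained to minimize the second quantity is pushed to do well on its hardest group, rather than on the average example, which is one common way of hedging against a shift in the mix of groups at test time.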
Findings
Robust models outperform standard models under distributional shift
New benchmarks effectively measure shift impact on NLP tasks
Gradient update rule reduces catastrophic forgetting during adaptation
Abstract
The dominating NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction based question answering or machine translation). However, it builds upon the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution both at training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. Moreover, it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing, and propose benchmarks and…
Taxonomy
Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms
