Learning Neural Models for Natural Language Processing in the Face of Distributional Shift

Paul Michel

arXiv:2109.01558·cs.CL·September 6, 2021·1 cites

TL;DR

This paper investigates the impact of distributional shift on NLP models, proposes benchmarks and metrics to evaluate robustness, and develops methods for improving model adaptation and robustness in changing data environments.

Contribution

It characterizes types of distributional shift in NLP, introduces benchmarks and metrics, and proposes robust training and adaptation methods inspired by distributionally robust optimization and information geometry.

Findings

01. Robust models outperform standard models under distributional shift

02. New benchmarks effectively measure shift impact on NLP tasks

03. Gradient update rule reduces catastrophic forgetting during adaptation

Abstract

The dominant NLP paradigm of training a strong neural predictor to perform one task on a specific dataset has led to state-of-the-art performance in a variety of applications (e.g., sentiment classification, span-prediction-based question answering, or machine translation). However, it builds on the assumption that the data distribution is stationary, i.e., that the data is sampled from a fixed distribution at both training and test time. This way of training is inconsistent with how we as humans are able to learn from and operate within a constantly changing stream of information. Moreover, it is ill-adapted to real-world use cases where the data distribution is expected to shift over the course of a model's lifetime. The first goal of this thesis is to characterize the different forms this shift can take in the context of natural language processing, and propose benchmarks and…

Code & Models

Repositories

pmichel31415/P-DRO
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Machine Learning and Algorithms