Are Sample-Efficient NLP Models More Robust?

Nelson F. Liu; Ananya Kumar; Percy Liang; Robin Jia

arXiv:2210.06456 · cs.CL · June 1, 2023

TL;DR

This study investigates whether sample-efficient NLP models are also more robust out-of-distribution (OOD). It finds that the relationship varies across tasks, modeling interventions, and datasets, so methods that improve sample efficiency are unlikely to improve robustness universally.

Contribution

A large empirical study across three NLP tasks, three modeling interventions, and 14 datasets, showing that the relationship between sample efficiency and robustness depends on the task and dataset.

Findings

01. Higher sample efficiency correlates with better average OOD robustness only for some modeling interventions and tasks.

02. On individual datasets, models with lower sample efficiency can be more robust.

03. General methods for improving sample efficiency are unlikely to improve robustness universally.

Abstract

Recent results in image classification and extractive question answering have observed that pre-trained models trained on less in-distribution data have better out-of-distribution performance. However, it is unclear how broadly these trends hold. We conduct a large empirical study across three tasks, three broadly-applicable modeling interventions (increasing model size, using a different adaptation method, and pre-training on more data), and 14 diverse datasets to investigate the relationship between sample efficiency (amount of data needed to reach a given ID accuracy) and robustness (how models fare on OOD evaluation). We find that higher sample efficiency is only correlated with better average OOD robustness on some modeling interventions and tasks, but not others. On individual datasets, models with lower sample efficiency can even be more robust. These results suggest that…
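The abstract defines sample efficiency as the amount of data needed to reach a given ID accuracy. As a rough illustration of that definition, the sketch below reads a training-set size off a learning curve by interpolating in log data size, and reports the OOD accuracy at the same point. The function name `data_needed`, the log-linear interpolation, the comparison at matched ID accuracy, and all curve values are assumptions made up for this example; they are not taken from the paper.

```python
import math


def data_needed(curve, target_id_acc):
    """Estimate the training-set size needed to reach target_id_acc.

    curve: list of (n_train, id_acc, ood_acc) tuples sorted by n_train,
    with id_acc assumed to increase with n_train.
    Interpolates linearly in log(n_train) between neighbouring points.
    Returns (n_needed, ood_acc_at_that_point), or None if never reached.
    """
    prev = None
    for n, id_acc, ood_acc in curve:
        if id_acc >= target_id_acc:
            if prev is None:
                # The smallest measured size already reaches the target.
                return float(n), ood_acc
            n0, id0, ood0 = prev
            t = (target_id_acc - id0) / (id_acc - id0)
            log_n = math.log(n0) + t * (math.log(n) - math.log(n0))
            return math.exp(log_n), ood0 + t * (ood_acc - ood0)
        prev = (n, id_acc, ood_acc)
    return None


# Hypothetical learning curves (made-up numbers, not results from the paper).
small_model = [(100, 0.60, 0.50), (1000, 0.70, 0.58), (10000, 0.80, 0.62)]
large_model = [(100, 0.70, 0.55), (1000, 0.80, 0.60), (10000, 0.90, 0.70)]

small_n, small_ood = data_needed(small_model, 0.75)
large_n, large_ood = data_needed(large_model, 0.75)

print(f"small: needs ~{small_n:.0f} examples, OOD accuracy {small_ood:.3f}")
print(f"large: needs ~{large_n:.0f} examples, OOD accuracy {large_ood:.3f}")
```

In this made-up case the larger model needs fewer examples to reach the target ID accuracy, yet its OOD accuracy at that point is lower, which is the kind of mismatch the abstract describes for individual datasets.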

Taxonomy

Topics: Topic Modeling · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications

Methods: Attention Is All You Need · Linear Layer · Cosine Annealing · Multi-Head Attention · Softmax · Linear Warmup With Cosine Annealing · Attention Dropout