Exploring The Landscape of Distributional Robustness for Question Answering Models

Anas Awadalla; Mitchell Wortsman; Gabriel Ilharco; Sewon Min; Ian Magnusson; Hannaneh Hajishirzi; Ludwig Schmidt

arXiv:2210.12517·cs.CL·October 25, 2022

TL;DR

This paper empirically evaluates the distributional robustness of over 350 question answering models on 16 datasets and finds that, in many cases, in-distribution performance alone determines out-of-distribution performance.

Contribution

The paper offers the first large-scale evaluation of over 350 models across 16 question answering datasets, highlighting which factors influence robustness and providing publicly available evaluation resources.

Findings

1. Zero-shot and in-context learning methods are more robust to distribution shifts than fully fine-tuned models.

2. Few-shot prompt fine-tuned models are more robust than few-shot fine-tuned span prediction models.

3. Parameter-efficient and robustness-enhancing training methods provide no significant robustness improvements.

Abstract

We conduct a large empirical evaluation to investigate the landscape of distributional robustness in question answering. Our investigation spans over 350 models and 16 question answering datasets, including a diverse set of architectures, model sizes, and adaptation methods (e.g., fine-tuning, adapter tuning, in-context learning, etc.). We find that, in many cases, model variations do not affect robustness and in-distribution performance alone determines out-of-distribution performance. Moreover, our findings indicate that i) zero-shot and in-context learning methods are more robust to distribution shifts than fully fine-tuned models; ii) few-shot prompt fine-tuned models exhibit better robustness than few-shot fine-tuned span prediction models; iii) parameter-efficient and robustness enhancing training methods provide no significant robustness improvements. In addition, we publicly…
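The claim that in-distribution performance alone determines out-of-distribution performance can be made concrete with a small sketch. The idea illustrated here is one common way to quantify robustness: fit a linear trend of out-of-distribution score against in-distribution score over a set of baseline models, then measure how far a new model sits above or below that trend. This is an illustration under stated assumptions, not the authors' evaluation code: the function names `fit_line` and `effective_robustness` are hypothetical, the scores are made-up numbers, and the paper may fit the trend on a transformed scale rather than on raw scores.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def effective_robustness(id_score, ood_score, slope, intercept):
    """Gap between a model's OOD score and the baseline trend's prediction."""
    return ood_score - (slope * id_score + intercept)


# Hypothetical (in-distribution, out-of-distribution) scores for baseline
# models. These numbers are invented for illustration only.
baseline_id = [60.0, 70.0, 80.0, 90.0]
baseline_ood = [30.0, 40.0, 50.0, 60.0]

slope, intercept = fit_line(baseline_id, baseline_ood)

# A hypothetical new model: a value near zero means its OOD score is
# explained by its ID score alone; a positive value means it is more
# robust than the baseline trend predicts.
gap = effective_robustness(70.0, 48.0, slope, intercept)
print(f"slope={slope:.2f} intercept={intercept:.2f} gap={gap:.2f}")
```

Under this reading, a model variation "does not affect robustness" when its gap to the baseline trend stays close to zero, and a method is "more robust" when the gap is consistently positive.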

Taxonomy

Topics: Topic Modeling · Expert finding and Q&A systems · Domain Adaptation and Few-Shot Learning

Methods: Adapter