To Adapt or to Annotate: Challenges and Interventions for Domain Adaptation in Open-Domain Question Answering

Dheeru Dua; Emma Strubell; Sameer Singh; Pat Verga

arXiv:2212.10381 · cs.CL · December 21, 2022

Open Access

TL;DR

This paper investigates the robustness of open-domain question answering models under realistic domain shifts, revealing their limitations and proposing intervention techniques that significantly improve end-to-end performance.

Contribution

It introduces a challenging domain shift evaluation setting for ODQA, categorizes shift types, and proposes intervention methods to enhance model robustness and accuracy.

Findings

1. Models fail to generalize under realistic domain shifts.
2. High retrieval scores do not guarantee accurate answers.
3. Intervention techniques can improve answer F1 score by up to 24 points.
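
For readers unfamiliar with the metric behind the third finding, the following is a minimal sketch of a token-level answer F1, assuming the SQuAD-style definition commonly used in question answering. The summary above does not state the paper's exact scoring script, so the function name `answer_f1` and the simple lowercase-and-split normalization are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter


def answer_f1(prediction: str, gold: str) -> float:
    """Token-level F1 between a predicted and a gold answer string.

    Assumption: tokens are lowercased and split on whitespace; real
    evaluation scripts often also strip punctuation and articles.
    """
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()

    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)

    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(answer_f1("the Eiffel Tower", "Eiffel Tower"))
    print(answer_f1("Paris", "London"))
```

Scores of this kind are usually averaged over a dataset and reported on a 0 to 100 scale, so under this assumed definition a gain of "24 points" would correspond to a 0.24 increase in the mean of this function's output.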

Abstract

Recent advances in open-domain question answering (ODQA) have demonstrated impressive accuracy on standard Wikipedia-style benchmarks. However, it is less clear how robust these models are and how well they perform when applied to real-world applications in drastically different domains. While there has been some work investigating how well ODQA models perform when tested for out-of-domain (OOD) generalization, these studies have been conducted only under conservative shifts in data distribution and typically focus on a single component (i.e., retrieval) rather than an end-to-end system. In response, we propose a more realistic and challenging domain shift evaluation setting and, through extensive experiments, study end-to-end model performance. We find that not only do models fail to generalize, but high retrieval scores often still yield poor answer prediction accuracy. We then…

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques

Methods: fail