Tradeoffs in Sentence Selection Techniques for Open-Domain Question Answering

Shih-Ting Lin; Greg Durrett

arXiv:2009.09120 · cs.CL · September 22, 2020 · 1 citation

Open Access

TL;DR

This paper investigates sentence selection techniques in open-domain QA, comparing QA-based and retrieval-based models, and introduces a hybrid ensemble to optimize speed and accuracy across datasets.

Contribution

The paper systematically analyzes the trade-offs between QA-based and retrieval-based sentence selection methods and proposes a hybrid ensemble approach for improved performance and efficiency.

Findings

01

Retrieval-based models are faster than QA-based models.

02

Lightweight QA models perform well in sentence selection.

03

Ensemble methods generalize effectively across domains.

Abstract

Current methods in open-domain question answering (QA) usually employ a pipeline of first retrieving relevant documents, then applying strong reading comprehension (RC) models to that retrieved text. However, modern RC models are complex and expensive to run, so techniques to prune the space of retrieved text are critical to allow this approach to scale. In this paper, we focus on approaches which apply an intermediate sentence selection step to address this issue, and investigate the best practices for this approach. We describe two groups of models for sentence selection: QA-based approaches, which run a full-fledged QA system to identify answer candidates, and retrieval-based models, which find parts of each passage specifically related to each question. We examine trade-offs between processing speed and task performance in these two approaches, and demonstrate an ensemble module…
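To make the intermediate sentence selection step concrete, here is a minimal sketch of a retrieval-style selector that prunes retrieved passages before they would be handed to an expensive RC model. It is an illustration only, not the models studied in the paper: the token-overlap scorer, the naive sentence splitter, and the names `tokenize`, `split_sentences`, `overlap_score`, `select_sentences`, and `top_k` are all assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep alphanumeric runs (an assumed, very simple tokenizer).
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(passage):
    # Naive splitter on sentence-final punctuation; a real system would use
    # a proper sentence segmenter.
    parts = re.split(r"(?<=[.!?])\s+", passage)
    return [p.strip() for p in parts if p.strip()]


def overlap_score(question, sentence):
    # Fraction of question tokens that also occur in the sentence.
    # This stands in for a learned relevance scorer (hypothetical choice).
    q_counts = Counter(tokenize(question))
    s_counts = Counter(tokenize(sentence))
    if not q_counts:
        return 0.0
    shared = sum((q_counts & s_counts).values())
    return shared / sum(q_counts.values())


def select_sentences(question, passages, top_k=2):
    # Score every sentence in every retrieved passage and keep the top_k,
    # so a downstream reader only sees the pruned text.
    scored = []
    for passage in passages:
        for sentence in split_sentences(passage):
            scored.append((overlap_score(question, sentence), sentence))
    # sorted() is stable, so ties keep their original passage order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in ranked[:top_k]]


if __name__ == "__main__":
    passages = [
        "Hamlet is a tragedy. William Shakespeare wrote Hamlet around 1600.",
        "Paris is the capital of France.",
    ]
    print(select_sentences("Who wrote Hamlet?", passages, top_k=1))
```

A smaller `top_k` means less text for the reader to process, at the risk of discarding the sentence that contains the answer; this is the kind of speed versus accuracy trade-off the abstract refers to.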

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications