End-to-End Training of Neural Retrievers for Open-Domain Question Answering

Devendra Singh Sachan; Mostofa Patwary; Mohammad Shoeybi; Neel Kant; Wei Ping; William L Hamilton; Bryan Catanzaro

arXiv:2101.00408 · cs.CL · June 3, 2021

TL;DR

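This paper introduces an unsupervised pre-training method for neural retrievers in open-domain QA, based on the Inverse Cloze Task and masked salient spans, and combines it with end-to-end supervised training. The approach improves on the previous best top-20 retrieval accuracy on Natural Questions and TriviaQA.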

Contribution

It presents a novel combination of unsupervised pre-training and end-to-end supervised training for neural retrievers in open-domain QA, improving retrieval accuracy and answer extraction performance.

Findings

01

Achieved 84% top-20 retrieval accuracy on Natural Questions.

02

Outperformed recent models like DPR, REALM, and RAG in answer extraction.

03

Demonstrated scalability with larger models leading to consistent performance gains.
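The top-20 retrieval accuracy cited in the findings is commonly computed as the fraction of questions for which at least one of the top-k retrieved passages contains an answer. The sketch below illustrates that definition under this assumption; the function name `top_k_accuracy` and the toy data are hypothetical and do not come from the paper's code.

```python
from typing import Sequence


def top_k_accuracy(ranked_hits: Sequence[Sequence[bool]], k: int = 20) -> float:
    """Fraction of questions with an answer-bearing passage in the top k.

    ranked_hits[i][j] is True when the j-th ranked passage retrieved for
    question i contains an answer string (an assumed, precomputed flag).
    """
    if not ranked_hits:
        raise ValueError("need at least one question")
    answered = sum(1 for flags in ranked_hits if any(flags[:k]))
    return answered / len(ranked_hits)


# Toy data (hypothetical): four questions, three ranked passages each.
toy_hits = [
    [False, True, False],
    [False, False, False],
    [True, False, False],
    [False, False, True],
]
acc_at_1 = top_k_accuracy(toy_hits, k=1)
acc_at_3 = top_k_accuracy(toy_hits, k=3)
```

On this toy input only one question is answered by its first-ranked passage, while three are answered within the top three, so accuracy rises with k.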

Abstract

Recent work on training neural retrievers for open-domain question answering (OpenQA) has employed both supervised and unsupervised approaches. However, it remains unclear how unsupervised and supervised methods can be used most effectively for neural retrievers. In this work, we systematically study retriever pre-training. We first propose an approach of unsupervised pre-training with the Inverse Cloze Task and masked salient spans, followed by supervised finetuning using question-context pairs. This approach leads to absolute gains of 2+ points over the previous best result in the top-20 retrieval accuracy on Natural Questions and TriviaQA datasets. We also explore two approaches for end-to-end supervised training of the reader and retriever components in OpenQA models. In the first approach, the reader considers each retrieved document separately while in the second approach, the…
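The Inverse Cloze Task named in the abstract builds pseudo question-context pairs from unlabeled text: one sentence of a passage serves as the query and the surrounding sentences serve as its context. The sketch below shows one way to construct such a pair. The helper name `make_ict_example` and the `remove_prob` parameter are illustrative assumptions, not the authors' implementation; the parameter controls how often the query sentence is dropped from its context.

```python
import random
from typing import List, Tuple


def make_ict_example(
    sentences: List[str],
    rng: random.Random,
    remove_prob: float = 0.9,  # assumed default, not taken from this paper
) -> Tuple[str, str]:
    """Return a (pseudo_query, context) pair built from one passage."""
    if len(sentences) < 2:
        raise ValueError("need at least two sentences per passage")
    idx = rng.randrange(len(sentences))
    query = sentences[idx]
    if rng.random() < remove_prob:
        # Drop the query sentence so the retriever cannot rely on exact overlap.
        context_sentences = sentences[:idx] + sentences[idx + 1:]
    else:
        # Occasionally keep it so lexical matching is still learned.
        context_sentences = list(sentences)
    return query, " ".join(context_sentences)


passage = [
    "Zebras have black and white stripes.",
    "They live in Africa.",
    "Each pattern is unique.",
]
query, context = make_ict_example(passage, random.Random(0), remove_prob=1.0)
```

A retriever would then be trained to score each pseudo-query against its own context higher than against the contexts of other passages in the batch.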

Taxonomy

Methods

Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Adam · Residual Connection · Dense Connections · Linear Warmup With Linear Decay · Weight Decay