TL;DR
This paper introduces an effective unsupervised pre-training method for neural retrievers in open-domain QA, combined with end-to-end supervised training, achieving state-of-the-art results on major datasets.
Contribution
It presents a novel combination of unsupervised pre-training and end-to-end supervised training for neural retrievers in open-domain QA, improving retrieval accuracy and answer extraction performance.
Findings
Achieved 84% top-20 retrieval accuracy on Natural Questions.
Outperformed recent models like DPR, REALM, and RAG in answer extraction.
Scaled end-to-end training to larger models, which gave consistent performance gains over smaller ones.
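The top-20 retrieval accuracy cited in the findings is usually read as the fraction of questions for which at least one of the 20 highest-ranked passages contains the answer. The sketch below is a minimal illustration of that metric under this reading, not the paper's evaluation code. The function name top_k_accuracy, the boolean-flag input format, and the toy data are all assumptions made up for the example.

```python
from typing import Sequence


def top_k_accuracy(ranked_hits: Sequence[Sequence[bool]], k: int) -> float:
    """Fraction of questions with an answer-bearing passage among the top k.

    ranked_hits[i][j] is True when the passage at rank j + 1 for question i
    contains the answer (hypothetical input format, assumed for this sketch).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not ranked_hits:
        raise ValueError("need at least one question")
    # A question counts as a hit if any of its first k passages has the answer.
    hits = sum(1 for flags in ranked_hits if any(flags[:k]))
    return hits / len(ranked_hits)


# Toy data, invented for illustration: three questions, passages in rank order.
toy_hits = [
    [False, True, False],   # answer first appears at rank 2
    [False, False, False],  # answer never retrieved
    [True, False, False],   # answer appears at rank 1
]

for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(toy_hits, k):.3f}")
```

With real data, each row would hold 20 or more flags produced by checking the retrieved passages against the gold answers, and k would be set to 20.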
Abstract
Recent work on training neural retrievers for open-domain question answering (OpenQA) has employed both supervised and unsupervised approaches. However, it remains unclear how unsupervised and supervised methods can be used most effectively for neural retrievers. In this work, we systematically study retriever pre-training. We first propose an approach of unsupervised pre-training with the Inverse Cloze Task and masked salient spans, followed by supervised finetuning using question-context pairs. This approach leads to absolute gains of 2+ points over the previous best result in the top-20 retrieval accuracy on Natural Questions and TriviaQA datasets. We also explore two approaches for end-to-end supervised training of the reader and retriever components in OpenQA models. In the first approach, the reader considers each retrieved document separately while in the second approach, the…
Taxonomy
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Byte Pair Encoding · Adam · Residual Connection · Dense Connections · Linear Warmup With Linear Decay · Weight Decay
