A Replication Study of Dense Passage Retriever
Xueguang Ma, Kai Sun, Ronak Pradeep, and Jimmy Lin

TL;DR
This study replicates the dense passage retriever (DPR) approach for open-domain question answering and largely verifies the original claims. It also finds that the original work appears to under-report the effectiveness of the BM25 baseline, that hybrid retrieval outperforms BM25 alone, and that better evidence integration improves question answering results.
Contribution
The paper provides an independent validation of DPR, shows that the effectiveness of the BM25 baseline was under-reported, and improves QA effectiveness through better evidence integration.
Findings
BM25 baseline effectiveness was under-reported in the original study.
Hybrid retrieval methods outperform BM25 alone.
Incorporating evidence from the retriever and improving answer scoring enhances QA results.
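As a rough illustration of what dense-sparse hybrid retrieval means, the sketch below fuses a dense score and a BM25 score per passage with a weighted sum. It is a minimal sketch under stated assumptions, not the paper's implementation: the function name `hybrid_fusion`, the weight `alpha`, the minimum-score fallback for passages missing from one list, and the toy scores are all hypothetical.

```python
from typing import Dict, List, Tuple


def hybrid_fusion(
    dense: Dict[str, float],
    sparse: Dict[str, float],
    alpha: float = 1.0,
    k: int = 3,
) -> List[Tuple[str, float]]:
    """Fuse dense and sparse (BM25) scores as dense + alpha * sparse.

    Assumption: the weighted sum and the fallback rule below are
    illustrative choices, not taken from the paper.
    """
    # A passage missing from one result list gets that list's minimum
    # score, so it is neither dropped nor unfairly favored.
    dense_floor = min(dense.values()) if dense else 0.0
    sparse_floor = min(sparse.values()) if sparse else 0.0

    fused = {}
    for pid in set(dense) | set(sparse):
        d = dense.get(pid, dense_floor)
        s = sparse.get(pid, sparse_floor)
        fused[pid] = d + alpha * s

    # Sort by fused score (descending), then by id for deterministic ties.
    return sorted(fused.items(), key=lambda x: (-x[1], x[0]))[:k]


# Toy scores (hypothetical): passage id -> retrieval score.
dense_hits = {"p1": 80.0, "p2": 78.5, "p3": 75.0}
sparse_hits = {"p2": 12.0, "p4": 15.0, "p1": 9.0}

ranked = hybrid_fusion(dense_hits, sparse_hits, alpha=1.0, k=3)
print(ranked)
```

In this toy example, passage `p4` appears only in the sparse list yet still reaches the fused top three, which is the kind of complementary signal a hybrid ranker is meant to exploit. In practice `alpha` would be tuned on held-out data.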
Abstract
Text retrieval using learned dense representations has recently emerged as a promising alternative to "traditional" text retrieval using sparse bag-of-words representations. One recent work that has garnered much attention is the dense passage retriever (DPR) technique proposed by Karpukhin et al. (2020) for end-to-end open-domain question answering. We present a replication study of this work, starting with model checkpoints provided by the authors, but otherwise from an independent implementation in our group's Pyserini IR toolkit and PyGaggle neural text ranking library. Although our experimental results largely verify the claims of the original paper, we arrived at two important additional findings that contribute to a better understanding of DPR: First, it appears that the original authors under-report the effectiveness of the BM25 baseline and hence also dense-sparse hybrid…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Algorithms and Data Compression
