RocketQAv2: A Joint Training Method for Dense Passage Retrieval and Passage Re-ranking
Ruiyang Ren, Yingqi Qu, Jing Liu, Wayne Xin Zhao, Qiaoqiao She, Hua Wu, Haifeng Wang, Ji-Rong Wen

TL;DR
RocketQAv2 introduces a joint training method for dense passage retrieval and re-ranking, utilizing dynamic listwise distillation and hybrid data augmentation to improve performance on key NLP retrieval tasks.
Contribution
It presents a novel unified listwise training approach for both retrieval and re-ranking models, enabling mutual enhancement during training.
Findings
The method yields significant performance improvements on the MSMARCO and Natural Questions datasets.
Effective dynamic distillation enhances both retriever and re-ranker.
Hybrid data augmentation diversifies training instances for better generalization.
Abstract
In various natural language processing tasks, passage retrieval and passage re-ranking are two key procedures in finding and ranking relevant information. Since both procedures contribute to the final performance, it is important to jointly optimize them in order to achieve mutual improvement. In this paper, we propose a novel joint training approach for dense passage retrieval and passage re-ranking. A major contribution is that we introduce dynamic listwise distillation, where we design a unified listwise training approach for both the retriever and the re-ranker. During the dynamic distillation, the retriever and the re-ranker can be adaptively improved according to each other's relevance information. We also propose a hybrid data augmentation strategy to construct diverse training instances for the listwise training approach. Extensive experiments show the effectiveness of…
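The listwise distillation idea in the abstract can be made concrete with a small sketch. It assumes that, for one query, the retriever and the re-ranker each score the same candidate list, that both score lists are normalized with a softmax, and that the two distributions are pulled together with a KL divergence while the re-ranker also receives a supervised signal from the labeled positive. The function names, the direction of the KL term, and the way the two terms are summed are illustrative assumptions, not details confirmed by this page.

```python
import math


def softmax(scores):
    """Normalize raw relevance scores into a distribution over the candidate list."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def listwise_distillation_loss(retriever_scores, reranker_scores, positive_index):
    """Hypothetical per-query loss: a KL term plus a supervised listwise term.

    retriever_scores / reranker_scores: raw scores for the SAME candidate list.
    positive_index: position of the labeled relevant passage in that list.
    """
    p_retriever = softmax(retriever_scores)
    p_reranker = softmax(reranker_scores)
    # Assumed direction: retriever distribution measured against the re-ranker's.
    kl_term = kl_divergence(p_retriever, p_reranker)
    # Supervised listwise cross-entropy on the re-ranker's distribution.
    supervised_term = -math.log(p_reranker[positive_index])
    return kl_term + supervised_term, kl_term, supervised_term


if __name__ == "__main__":
    # One query with four candidates; the passage at index 0 is the positive.
    total, kl, sup = listwise_distillation_loss(
        retriever_scores=[1.2, 0.9, 0.3, -0.5],
        reranker_scores=[3.0, 0.1, -1.0, -2.0],
        positive_index=0,
    )
    print(f"total={total:.4f} kl={kl:.4f} supervised={sup:.4f}")
```

In a real training loop both score lists would come from differentiable models, so gradients from the KL term could update the retriever and the re-ranker together, which is what makes the distillation "dynamic" rather than relying on a frozen teacher.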
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies
