Weakly Supervised Pre-Training for Multi-Hop Retriever
Yeon Seonwoo, Sang-Woo Lee, Ji-Hoon Kim, Jung-Woo Ha, Alice Oh

TL;DR
This paper introduces a weakly supervised pre-training method for multi-hop retrievers that reduces reliance on costly annotated datasets, improving performance and robustness in complex question answering tasks.
Contribution
The paper presents a novel weakly supervised pre-training approach for multi-hop retrievers, including a new pre-training task, scalable data generation, and a dense encoder model structure.
Findings
The pre-trained retriever outperforms several state-of-the-art models.
It remains effective and robust when data and computational resources are limited.
It improves end-to-end multi-hop QA performance.
Abstract
In multi-hop QA, answering complex questions entails iterative document retrieval for finding the missing entity of the question. The main steps of this process are sub-question detection, document retrieval for the sub-question, and generation of a new query for the final document retrieval. However, building a dataset that contains complex questions with sub-questions and their corresponding documents requires costly human annotation. To address the issue, we propose a new method for weakly supervised multi-hop retriever pre-training without human efforts. Our method includes 1) a pre-training task for generating vector representations of complex questions, 2) a scalable data generation method that produces the nested structure of question and sub-question as weak supervision for pre-training, and 3) a pre-training model structure based on dense encoders. We conduct experiments to…
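The iterative retrieval loop described in the abstract (retrieve a document for the sub-question, form a new query, retrieve the final document) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `encode` is a bag-of-words stand-in rather than a trained dense encoder, the new query is formed by simply concatenating the question with the first-hop document, and the names `encode`, `score`, `retrieve`, `two_hop_retrieve`, `DOCS`, and `QUESTION` are hypothetical. It is not the paper's model or pre-training procedure.

```python
import math
from collections import Counter

# Toy keyword-style corpus and question (illustrative data, not from the paper).
DOCS = [
    "inception film director christopher nolan",
    "christopher nolan birthplace london",
    "paris capital france",
]
QUESTION = "inception director birthplace"


def encode(text):
    """Stand-in for a dense encoder: an L2-normalised bag-of-words vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()}


def score(q_vec, d_vec):
    """Inner product between a query vector and a document vector."""
    return sum(w * d_vec.get(tok, 0.0) for tok, w in q_vec.items())


def retrieve(query, docs, exclude=()):
    """Return the index of the highest-scoring document not in `exclude`."""
    q_vec = encode(query)
    candidates = [i for i in range(len(docs)) if i not in exclude]
    return max(candidates, key=lambda i: score(q_vec, encode(docs[i])))


def two_hop_retrieve(question, docs):
    """Two retrieval hops: question -> first doc, expanded query -> second doc."""
    first = retrieve(question, docs)
    # Assumption: the new query is the question concatenated with the first-hop text.
    expanded_query = question + " " + docs[first]
    second = retrieve(expanded_query, docs, exclude={first})
    return first, second


if __name__ == "__main__":
    print(two_hop_retrieve(QUESTION, DOCS))
```

In this toy corpus the question alone matches the film document best, and only the expanded query shares enough terms with the birthplace document to reach it, which is the missing-entity behaviour the abstract refers to.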
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
