Generator-Retriever-Generator Approach for Open-Domain Question Answering
Abdelrahman Abdallah, Adam Jatowt

TL;DR
The paper introduces a Generator-Retriever-Generator (GRG) method that combines document generation and retrieval with large language models to improve open-domain question answering accuracy.
Contribution
It presents a novel hybrid approach that integrates document generation and retrieval, outperforming existing pipelines on multiple QA datasets.
Findings
Outperforms state-of-the-art methods on TriviaQA, NQ, and WebQ datasets.
Improves on those baselines by margins ranging from +1.6 to +5.2 points, depending on the dataset.
Demonstrates effectiveness of combining generated and retrieved documents.
Abstract
Open-domain question answering (QA) tasks usually require the retrieval of relevant information from a large corpus to generate accurate answers. We propose a novel approach called Generator-Retriever-Generator (GRG) that combines document retrieval techniques with a large language model (LLM), by first prompting the model to generate contextual documents based on a given question. In parallel, a dual-encoder network retrieves documents that are relevant to the question from an external corpus. The generated and retrieved documents are then passed to the second LLM, which generates the final answer. By combining document retrieval and LLM generation, our approach addresses the challenges of open-domain QA, such as generating informative and contextually relevant answers. GRG outperforms the state-of-the-art generate-then-read and retrieve-then-read pipelines (GENREAD and RFiD) improving…
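The three stages described in the abstract (generate documents, retrieve documents, read both) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: `encode_question`, `encode_passage`, `retrieve`, and `grg_answer` are hypothetical names, the bag-of-words encoders stand in for a trained dual-encoder network, and the generator and reader are passed in as plain callables where the paper uses LLMs.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def _bag_of_words(text: str) -> Counter:
    # Toy encoder: lowercase word counts. A real dual encoder would return
    # dense vectors from trained neural networks (assumption for illustration).
    return Counter(re.findall(r"\w+", text.lower()))


def encode_question(question: str) -> Counter:
    # Hypothetical question-side encoder of the dual-encoder pair.
    return _bag_of_words(question)


def encode_passage(passage: str) -> Counter:
    # Hypothetical passage-side encoder of the dual-encoder pair.
    return _bag_of_words(passage)


def similarity(q: Counter, p: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(q[t] * p[t] for t in q)
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    return dot / (norm_q * norm_p) if norm_q and norm_p else 0.0


def retrieve(question: str, corpus: List[str], k: int = 2) -> List[str]:
    # Rank corpus passages by similarity to the question and keep the top k.
    q_vec = encode_question(question)
    ranked = sorted(
        corpus,
        key=lambda passage: similarity(q_vec, encode_passage(passage)),
        reverse=True,
    )
    return ranked[:k]


def grg_answer(
    question: str,
    generator: Callable[[str], List[str]],
    reader: Callable[[str, List[str]], str],
    corpus: List[str],
    k: int = 2,
) -> str:
    # Stage 1: a first model generates contextual documents for the question.
    generated = generator(question)
    # Stage 2: a retriever fetches relevant documents from the external corpus.
    # The abstract runs stages 1 and 2 in parallel; this sketch runs them in sequence.
    retrieved = retrieve(question, corpus, k)
    # Stage 3: a second model reads both document sets and produces the answer.
    return reader(question, generated + retrieved)
```

In use, `generator` and `reader` would wrap calls to whichever language models are available; the sketch only fixes the data flow, in which the reader sees the generated and retrieved documents together.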
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications
