RetGen: A Joint framework for Retrieval and Grounded Text Generation Modeling
Yizhe Zhang, Siqi Sun, Xiang Gao, Yuwei Fang, Chris Brockett, Michel Galley, Jianfeng Gao, Bill Dolan

TL;DR
RetGen is a joint training framework for retrieval and grounded text generation: a document retriever and a generator are trained together, without parallel data, to improve the factual accuracy and relevance of generated text.
Contribution
The paper presents a joint training approach for retrieval and grounded text generation that alleviates the reliance on rarely available parallel data (text paired with its relevant documents) and improves the quality of generated content.
Findings
Joint training improves relevance and informativeness of generated text.
The model effectively combines retrieved documents using a Mixture-of-Experts ensemble.
Both retriever and generator benefit from joint training, leading to better performance.
Abstract
Recent advances in large-scale pre-training such as GPT-3 allow seemingly high quality text to be generated from a given prompt. However, such generation systems often suffer from problems of hallucinated facts, and are not inherently designed to incorporate useful external information. Grounded generation models appear to offer remedies, but their training typically relies on rarely-available parallel data where information-relevant documents are provided for context. We propose a framework that alleviates this data constraint by jointly training a grounded generator and document retriever on the language model signal. The model learns to reward retrieval of the documents with the highest utility in generation, and attentively combines them using a Mixture-of-Experts (MoE) ensemble to generate follow-on text. We demonstrate that both generator and retriever can take advantage of this…
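The Mixture-of-Experts combination mentioned in the abstract can be illustrated with a small sketch. It assumes the ensemble is a weighted average of per-document next-token distributions, with weights given by a softmax over retriever scores; the function names and toy numbers are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Turn raw retriever scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def moe_next_token(retrieval_scores, per_doc_token_probs):
    """Mix per-document next-token distributions (assumed form).

    retrieval_scores: one raw score per retrieved document.
    per_doc_token_probs: one next-token distribution per document,
        each produced by the generator conditioned on that document.
    """
    weights = softmax(retrieval_scores)
    vocab_size = len(per_doc_token_probs[0])
    mixed = [0.0] * vocab_size
    for w, probs in zip(weights, per_doc_token_probs):
        for i, p in enumerate(probs):
            mixed[i] += w * p
    return mixed


# Toy example: two retrieved documents, vocabulary of two tokens.
scores = [2.0, 0.0]
token_probs = [[0.9, 0.1], [0.2, 0.8]]
mixed = moe_next_token(scores, token_probs)
```

In this toy setup the higher-scoring document dominates the mixture, so its preferred token receives most of the probability mass. Under joint training, a document that makes the follow-on text more likely would be expected to receive a higher retrieval weight.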
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Machine Learning in Healthcare
Methods: Linear Layer · Cosine Annealing · Dense Connections · Adam · Layer Normalization · Linear Warmup With Cosine Annealing · Softmax · Multi-Head Attention
