Stochastic RAG: End-to-End Retrieval-Augmented Generation through Expected Utility Maximization
Hamed Zamani, Michael Bendersky

TL;DR
This paper presents Stochastic RAG, an end-to-end training method for retrieval-augmented generation models that improves performance by modeling retrieval as a stochastic sampling process and optimizing it directly.
Contribution
It introduces a stochastic sampling formulation of retrieval for RAG that enables end-to-end training without assuming marginalization or document independence, and it advances state-of-the-art results on six of the seven evaluated datasets.
Findings
Outperforms previous models on six of seven datasets
Effective end-to-end optimization of RAG models
Applies across open-domain question answering, fact verification, slot-filling for relation extraction, and dialogue systems
Abstract
This paper introduces Stochastic RAG--a novel approach for end-to-end optimization of retrieval-augmented generation (RAG) models that relaxes the simplifying assumptions of marginalization and document independence, made in most prior work. Stochastic RAG casts the retrieval process in RAG as a stochastic sampling without replacement process. Through this formulation, we employ straight-through Gumbel-top-k that provides a differentiable approximation for sampling without replacement and enables effective end-to-end optimization for RAG. We conduct extensive experiments on seven diverse datasets on a wide range of tasks, from open-domain question answering to fact verification to slot-filling for relation extraction and to dialogue systems. By applying this optimization method to a recent and effective RAG model, we advance state-of-the-art results on six out of seven datasets.
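The sampling step named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `gumbel_top_k`, the `scores` argument (assumed here to be unnormalized retrieval log-scores, one per candidate document), and the `rng` parameter are all hypothetical. It covers only the forward Gumbel-top-k sampling; the straight-through gradient estimator used for end-to-end training requires an automatic differentiation framework and is not shown.

```python
import math
import random


def gumbel_top_k(scores, k, rng=None):
    """Sample k distinct indices without replacement via Gumbel-top-k.

    Adding independent Gumbel(0, 1) noise to each log-score and keeping the
    k largest perturbed values yields a sample without replacement from the
    distribution softmax(scores). The returned list is ordered from the
    highest perturbed score to the lowest.
    """
    rng = rng or random.Random()
    perturbed = []
    for index, score in enumerate(scores):
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        perturbed.append((score + gumbel_noise, index))
    # Sort by perturbed score, highest first, and keep the first k indices.
    perturbed.sort(reverse=True)
    return [index for _, index in perturbed[:k]]


# Hypothetical retrieval scores for four candidate documents.
doc_scores = [0.0, 0.0, 50.0, 0.0]
selected = gumbel_top_k(doc_scores, k=2, rng=random.Random(0))
print(selected)
```

Because the noise is redrawn on every call, repeated calls return different document sets, with higher-scoring documents selected more often. In a training setup, the selected documents would be passed to the generator and the utility of its output would drive updates to the retrieval scores.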
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms
Methods: Attention Is All You Need · Weight Decay · Attention Dropout · Dropout · Residual Connection · Softmax · WordPiece · Linear Layer · Byte Pair Encoding
