InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales
Zhepei Wei, Wei-Lin Chen, Yu Meng

TL;DR
InstructRAG enhances retrieval-augmented generation by explicitly teaching language models to explain how answers are derived from retrieved documents using self-synthesized rationales, improving accuracy without extra supervision.
Contribution
InstructRAG introduces a method in which LMs learn explicit denoising through self-generated rationales, eliminating the need for costly supervision and improving generation quality.
Findings
Outperforms existing RAG methods, with an average relative improvement of 8.3% over the best baseline.
Scales well as the number of retrieved documents increases.
Shows robust denoising on out-of-domain datasets.
Abstract
Retrieval-augmented generation (RAG) has shown promising potential to enhance the accuracy and factuality of language models (LMs). However, imperfect retrievers or noisy corpora can introduce misleading or even erroneous information into the retrieved content, posing a significant challenge to generation quality. Existing RAG methods typically address this challenge by directly predicting final answers despite potentially noisy inputs, resulting in an implicit denoising process that is difficult to interpret and verify. On the other hand, acquiring explicit denoising supervision is often costly, involving significant human effort. In this work, we propose InstructRAG, where LMs explicitly learn the denoising process through self-synthesized rationales. First, we instruct the LM to explain how the ground-truth answer is derived from retrieved documents. Then, these…
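The first step described in the abstract, instructing the LM to explain how the ground-truth answer follows from the retrieved documents, can be pictured as a prompt-construction routine. The sketch below is only an illustration under stated assumptions: the `Example` class, the `build_rationale_prompt` function, and the exact instruction wording are hypothetical and are not taken from the paper or its released code. It covers only the rationale-synthesis prompt, not what is done with the rationales afterwards.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """One training instance (hypothetical container, not the authors' format)."""
    question: str
    documents: List[str]  # retrieved passages, possibly noisy
    answer: str           # ground-truth answer


def build_rationale_prompt(example: Example) -> str:
    """Format a prompt asking an LM to explain how the answer follows from the documents.

    The instruction text is an assumed paraphrase of the idea in the abstract.
    """
    doc_block = "\n".join(
        f"Document [{i}]: {doc}"
        for i, doc in enumerate(example.documents, start=1)
    )
    return (
        f"{doc_block}\n\n"
        f"Question: {example.question}\n"
        f"Ground-truth answer: {example.answer}\n\n"
        "Explain step by step how the ground-truth answer can be derived from "
        "the documents above, noting which documents are relevant and which are not."
    )


example = Example(
    question="Who wrote 'Pride and Prejudice'?",
    documents=[
        "Pride and Prejudice is an 1813 novel by Jane Austen.",
        "Wuthering Heights was written by Emily Bronte.",
    ],
    answer="Jane Austen",
)
prompt = build_rationale_prompt(example)
print(prompt)
```

The resulting string would be sent to whichever LM is being instructed; the second document in the toy example plays the role of a noisy retrieval that the rationale is expected to set aside.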
Taxonomy
Topics: Image Retrieval and Classification Techniques · Video Analysis and Summarization · Advanced Image and Video Retrieval Techniques
Methods: Attention Is All You Need · WordPiece · Residual Connection · Weight Decay · Softmax · Layer Normalization · Byte Pair Encoding · Attention Dropout · Linear Warmup With Linear Decay
