Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation
Shicheng Xu, Liang Pang, Mo Yu, Fandong Meng, Huawei Shen, Xueqi Cheng, Jie Zhou

TL;DR
This paper introduces InFO-RAG, an unsupervised training method that enhances large language models' ability to refine and utilize retrieved information effectively across diverse tasks, improving performance and robustness.
Contribution
It proposes a novel unsupervised training approach, InFO-RAG, that treats LLMs as information refiners to better integrate retrieved texts regardless of their quality.
Findings
InFO-RAG improves LLaMA2's performance by 9.39% on average across 11 datasets.
InFO-RAG enhances in-context learning capabilities.
InFO-RAG increases robustness of retrieval-augmented generation.
Abstract
Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating additional information from retrieval. However, studies have shown that LLMs still face challenges in effectively using the retrieved information, even ignoring it or being misled by it. The key reason is that the training of LLMs does not clearly make LLMs learn how to utilize input retrieved texts with varied quality. In this paper, we propose a novel perspective that considers the role of LLMs in RAG as "Information Refiner", which means that regardless of correctness, completeness, or usefulness of retrieved texts, LLMs can consistently integrate knowledge within the retrieved texts and model parameters to generate the texts that are more concise, accurate, and complete than the retrieved texts. To this end, we propose an information refinement training method named InFO-RAG that optimizes…
Taxonomy
Topics: Topic Modeling
Methods: Attention Dropout · Linear Warmup With Linear Decay · WordPiece · Residual Connection · Linear Layer · Weight Decay · BERT · Dropout · Layer Normalization
