Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation

Shicheng Xu; Liang Pang; Mo Yu; Fandong Meng; Huawei Shen; Xueqi Cheng; Jie Zhou

arXiv:2402.18150 · cs.CL · June 13, 2024 · 1 citation

TL;DR

This paper introduces InFO-RAG, an unsupervised training method that enhances large language models' ability to refine and utilize retrieved information effectively across diverse tasks, improving performance and robustness.

Contribution

It proposes a novel unsupervised training approach, InFO-RAG, that treats LLMs as information refiners to better integrate retrieved texts regardless of their quality.

Findings

1. InFO-RAG improves LLaMA2's performance by 9.39% on average across 11 datasets.

2. InFO-RAG enhances in-context learning capabilities.

3. InFO-RAG increases the robustness of retrieval-augmented generation.

Abstract

Retrieval-augmented generation (RAG) enhances large language models (LLMs) by incorporating additional information from retrieval. However, studies have shown that LLMs still face challenges in effectively using the retrieved information, even ignoring it or being misled by it. The key reason is that the training of LLMs does not clearly make LLMs learn how to utilize input retrieved texts with varied quality. In this paper, we propose a novel perspective that considers the role of LLMs in RAG as "Information Refiner", which means that regardless of correctness, completeness, or usefulness of retrieved texts, LLMs can consistently integrate knowledge within the retrieved texts and model parameters to generate the texts that are more concise, accurate, and complete than the retrieved texts. To this end, we propose an information refinement training method named InFO-RAG that optimizes…
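
To make the RAG setting described in the abstract concrete, the following is a minimal sketch of how retrieved texts of mixed quality might be placed in front of a question before it is passed to an LLM. It is not the InFO-RAG training method and is not taken from the paper or its repository. The word-overlap retriever, the prompt template, and the names `tokenize`, `retrieve`, and `build_prompt` are all assumptions made for illustration.

```python
import re

def tokenize(text):
    """Lowercase a string and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, corpus, k=2):
    """Toy retriever (assumption): rank passages by word overlap with the query."""
    q = tokenize(query)
    ranked = sorted(corpus, key=lambda p: len(q & tokenize(p)), reverse=True)
    return ranked[:k]

def build_prompt(query, passages):
    """Hypothetical prompt template: numbered retrieved texts, then the question."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"Retrieved texts:\n{context}\n\nQuestion: {query}\nAnswer:"

corpus = [
    "The Eiffel Tower is located in Paris.",
    "The Eiffel Tower is located in Rome.",  # deliberately incorrect passage
    "Bananas are rich in potassium.",
]
query = "Where is the Eiffel Tower located?"

passages = retrieve(query, corpus, k=2)
prompt = build_prompt(query, passages)
print(prompt)
```

In this toy example the retriever returns one correct and one incorrect passage, which is the kind of varied-quality input the abstract says an LLM should refine rather than ignore or be misled by.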

Code & Models

Repositories

xsc1234/info-rag · PyTorch · Official

Videos

Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation · Underline

Taxonomy

Topics: Topic Modeling

Methods: Attention Dropout · Linear Warmup With Linear Decay · WordPiece · Residual Connection · Linear Layer · Weight Decay · BERT · Dropout · Layer Normalization