GainRAG: Preference Alignment in Retrieval-Augmented Generation through Gain Signal Synthesis

Yi Jiang; Sendong Zhao; Jianbo Li; Haochun Wang; Bing Qin

arXiv:2505.18710 · cs.IR · May 27, 2025

TL;DR

GainRAG introduces a novel preference alignment method for retrieval-augmented generation, improving LLM performance by synthesizing gain signals to better match retriever outputs with LLM needs.

Contribution

The paper proposes GainRAG, a new approach that estimates gain signals to align retriever and LLM preferences using limited data, enhancing RAG system effectiveness.

Findings

01 Improved performance on 6 datasets.

02 Effective preference alignment between retriever and LLM.

03 Mitigation of degradation with pseudo-passage strategy.

Abstract

The Retrieval-Augmented Generation (RAG) framework introduces a retrieval module to dynamically inject retrieved information into the input context of large language models (LLMs), and has demonstrated significant success in various NLP tasks. However, the current study points out that there is a preference gap between retrievers and LLMs in the RAG framework, which limits the further improvement of system performance. Some highly relevant passages may interfere with LLM reasoning because they contain complex or contradictory information, while some indirectly related or even inaccurate content may help the LLM generate more accurate answers by providing suggestive information or logical clues. To solve this, we propose GainRAG, a novel approach that aligns the retriever's and LLM's preferences by defining a new metric, "gain", which measures how well an input passage contributes to correct…
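The abstract is cut off before it defines "gain", so the exact formula is not shown here. As a rough illustration only, one possible reading is that a passage's gain is the change in the model's probability of the correct answer when that passage is added to the context. The sketch below follows that assumption. The names `AnswerScorer`, `gain`, `rank_by_gain`, and `toy_scorer` are hypothetical, and the toy probabilities are made up purely to exercise the code. None of this is taken from the paper or its repository.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical scorer type: returns the model's probability of the correct
# answer for a question, given an optional passage (None = no retrieved context).
AnswerScorer = Callable[[str, Optional[str]], float]


def gain(scorer: AnswerScorer, question: str, passage: str) -> float:
    """Assumed gain: correct-answer probability with the passage minus without it."""
    return scorer(question, passage) - scorer(question, None)


def rank_by_gain(
    scorer: AnswerScorer, question: str, passages: Sequence[str]
) -> List[Tuple[str, float]]:
    """Order passages from most helpful to most harmful under the assumed gain."""
    scored = [(p, gain(scorer, question, p)) for p in passages]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def toy_scorer(question: str, passage: Optional[str]) -> float:
    """Stand-in for an LLM call; the values are invented for illustration."""
    table = {
        None: 0.40,                          # no passage
        "relevant but contradictory": 0.25,  # on-topic passage that hurts
        "indirect clue": 0.70,               # loosely related passage that helps
    }
    return table[passage]


ranked = rank_by_gain(
    toy_scorer,
    "example question",
    ["relevant but contradictory", "indirect clue"],
)
for passage, g in ranked:
    print(f"{passage}: gain={g:+.2f}")
```

In this toy setup the loosely related passage ranks first and the on-topic but contradictory passage receives a negative gain. This matches the preference gap the abstract describes, where retriever relevance and usefulness to the LLM can diverge.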

Code & Models

Repositories

liunian-jay/gainrag
PyTorch · Official

Videos

GainRAG: Preference Alignment in Retrieval-Augmented Generation through Gain Signal Synthesis · Underline

Taxonomy

Topics

Advanced Memory and Neural Computing · Neural Networks and Reservoir Computing

Methods

Attention Is All You Need · Linear Layer · Attention Dropout · Softmax · WordPiece · Weight Decay · Multi-Head Attention · Layer Normalization · Byte Pair Encoding