RPO: Retrieval Preference Optimization for Robust Retrieval-Augmented Generation

Shi-Qi Yan; Quan Liu; Zhen-Hua Ling

arXiv:2501.13726 · cs.CL · October 13, 2025

TL;DR

This paper introduces Retrieval Preference Optimization (RPO), a novel alignment method for retrieval-augmented generation that improves response accuracy by adaptively leveraging multi-source knowledge based on retrieval relevance.

Contribution

RPO is the first RAG-specific alignment approach that explicitly quantifies retrieval relevance during training, improving response accuracy without a separate component for assessing retrieval quality.

Findings

01. RPO outperforms existing RAG methods by 4-10% in accuracy.

02. RPO demonstrates robust generalization across four datasets.

03. RPO integrates retrieval evaluation into response generation within a single model.

Abstract

While Retrieval-Augmented Generation (RAG) has shown promise in utilizing external knowledge, its generation process depends heavily on the quality and accuracy of the retrieved context. Large language models (LLMs) struggle to evaluate the correctness of externally retrieved non-parametric knowledge when it differs from their internal memorization, which leads to knowledge conflicts during response generation. To this end, we introduce Retrieval Preference Optimization (RPO), a lightweight and effective alignment method that adaptively leverages multi-source knowledge based on retrieval relevance. An implicit representation of retrieval relevance is derived and incorporated into the reward model, integrating retrieval evaluation and response generation into a single model and removing the additional procedure that previous methods require to assess retrieval quality.…

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Machine Learning and ELM · Domain Adaptation and Few-Shot Learning

Methods: Attention Is All You Need · Layer Normalization · Dense Connections · Adam · Softmax · Linear Warmup With Linear Decay · Residual Connection · Dropout · Byte Pair Encoding