Optimizing Retrieval for RAG via Reinforcement Learning

Jiawei Zhou; Lei Chen

arXiv:2510.24652 · cs.CL · January 5, 2026

TL;DR

This paper introduces R3, a retrieval framework for RAG systems trained with reinforcement learning. The retriever explores and self-improves within a given RAG environment with minimal manual tuning, improving RAG performance by 5.2% over the original retrievers and surpassing state-of-the-art retrievers by 4.9%.

Contribution

The paper presents R3, an RL-based retriever that self-improves within RAG environments, addressing the static notion of relevance learned by retrievers trained with supervised fine-tuning (SFT).

Findings

1. R3 improves RAG performance by 5.2% over original retrievers.

2. R3 surpasses state-of-the-art retrievers by 4.9%.

3. Training is efficient, completed within a day on 4 GPUs.

Abstract

As retrieval-augmented generation (RAG) becomes more widespread, the role of retrieval is shifting from retrieving information for human browsing to retrieving context for AI reasoning. This shift creates more complex search environments, where relevance is difficult to pre-define. Existing retrievers rely on supervised fine-tuning (SFT) with human labels or synthetic data, resulting in static relevance that struggles to adapt to diverse RAG environments. To address this challenge, we propose R3, a Retrieval framework optimized for RAG through Reinforcement learning (RL). Specifically, we adopt an RL training paradigm that enables the retriever to explore and self-improve within given RAG environments, automating the learning process with minimal manual experimentation or tuning effort. Extensive experiments across diverse tasks demonstrate that R3 improves RAG performance by 5.2% over…
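
The abstract does not spell out R3's training algorithm, reward, or retriever architecture, so the following is only a minimal sketch of the general idea of a retriever that explores and improves from environment feedback. It uses a generic REINFORCE-style policy-gradient update on a softmax policy over a handful of documents for a single query. Everything in it is an assumption made for illustration: the names `toy_rag_reward` and `train_retriever`, the per-document logits, the moving-average baseline, and the reward (1.0 when the sampled document is the one that would let a downstream generator answer correctly). It is not the paper's method.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_rag_reward(doc_index):
    # Hypothetical stand-in for the RAG environment: in a real system this
    # would run the generator on the retrieved context and score its answer.
    # Here document 2 is assumed to be the only helpful one.
    return 1.0 if doc_index == 2 else 0.0


def train_retriever(num_docs=5, steps=2000, lr=0.5, seed=0):
    """REINFORCE on a softmax retrieval policy for a single toy query."""
    rng = random.Random(seed)
    logits = [0.0] * num_docs  # assumed learnable retrieval scores
    baseline = 0.0             # moving average of rewards, reduces variance
    for _ in range(steps):
        probs = softmax(logits)
        # Explore: sample a document from the current retrieval policy.
        action = rng.choices(range(num_docs), weights=probs)[0]
        reward = toy_rag_reward(action)
        advantage = reward - baseline
        # Gradient of log pi(action) w.r.t. logit j is 1[j == action] - pi_j.
        for j in range(num_docs):
            grad = (1.0 if j == action else 0.0) - probs[j]
            logits[j] += lr * advantage * grad
        baseline += 0.05 * (reward - baseline)
    return softmax(logits)


if __name__ == "__main__":
    final_probs = train_retriever()
    print([round(p, 3) for p in final_probs])
```

In this toy setting the policy concentrates its probability mass on the document that earns reward. A real retriever would replace the per-document logits with query and document encoders and the stub reward with feedback from the actual RAG pipeline.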