Optimizing Retrieval for RAG via Reinforcement Learning
Jiawei Zhou, Lei Chen

TL;DR
This paper introduces R3, a reinforcement-learning-based retrieval framework for RAG systems. The retriever adapts its notion of relevance to the RAG environment it is trained in and outperforms existing methods with minimal manual tuning.
Contribution
The paper presents R3, a novel RL-based retriever that self-improves in RAG settings, addressing limitations of static relevance in traditional supervised retrievers.
Findings
R3 improves RAG performance by 5.2% over original retrievers.
R3 surpasses state-of-the-art retrievers by 4.9%.
Training is efficient, completing within a day on 4 GPUs.
Abstract
As retrieval-augmented generation (RAG) becomes more widespread, the role of retrieval is shifting from retrieving information for human browsing to retrieving context for AI reasoning. This shift creates more complex search environments, where relevance is difficult to pre-define. Existing retrievers rely on supervised fine-tuning (SFT) with human labels or synthetic data, resulting in static relevance that struggles to adapt to diverse RAG environments. To address this challenge, we propose R3, a Retrieval framework optimized for RAG through Reinforcement learning (RL). Specifically, we adopt an RL training paradigm that enables the retriever to explore and self-improve within given RAG environments, automating the learning process with minimal manual experimentation or tuning effort. Extensive experiments across diverse tasks demonstrate that R3 improves RAG performance by 5.2% over…
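The abstract does not spell out the training algorithm, so the following is only a toy illustration of the general idea of a retriever that explores and improves from environment feedback. It is not the authors' method. It assumes a plain REINFORCE update with a moving-average baseline, one learnable score per document, and a hypothetical reward function `toy_rag_reward` that stands in for "the downstream RAG system answered correctly". All names and hyperparameters here are made up for illustration.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def train_retriever(reward_fn, num_docs, steps=2000, lr=0.1, seed=0):
    """Toy REINFORCE loop: the 'retriever' is one learnable score per document.

    reward_fn(doc_index) -> float is a stand-in for feedback from a RAG
    environment (an assumption for this sketch, not the paper's reward).
    """
    rng = random.Random(seed)
    scores = [0.0] * num_docs
    baseline = 0.0  # moving average of rewards, used to reduce variance
    for _ in range(steps):
        probs = softmax(scores)
        # Explore: sample a document from the current retrieval policy.
        doc = rng.choices(range(num_docs), weights=probs)[0]
        reward = reward_fn(doc)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # Gradient of log pi(doc) w.r.t. scores[i] is 1[i == doc] - probs[i].
        for i in range(num_docs):
            grad = (1.0 if i == doc else 0.0) - probs[i]
            scores[i] += lr * advantage * grad
    return softmax(scores)


def toy_rag_reward(doc):
    """Hypothetical environment: only document 2 leads to a correct answer."""
    return 1.0 if doc == 2 else 0.0


probs = train_retriever(toy_rag_reward, num_docs=5)
print([round(p, 3) for p in probs])
```

In this sketch no relevance label is ever given to the retriever. It only sees a scalar reward after each retrieval, which mirrors the contrast the abstract draws with supervised fine-tuning on pre-defined relevance labels.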
