DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation

Jiashuo Sun; Xianrui Zhong; Sizhe Zhou; Jiawei Han

arXiv:2505.07233·cs.CL·May 19, 2025

TL;DR

DynamicRAG introduces a reinforcement learning-based reranker that adaptively selects and orders retrieved documents for RAG systems, significantly improving performance on knowledge-intensive tasks by leveraging LLM feedback.

Contribution

The paper presents a dynamic reranking framework that adjusts both the order and the number of retrieved documents for each query. The reranker is trained with reinforcement learning, using LLM response quality as the supervisory signal.

Findings

1. Achieves state-of-the-art results on seven datasets
2. Demonstrates the effectiveness of adaptive reranking
3. Outperforms models with similar parameter sizes

Abstract

Retrieval-augmented generation (RAG) systems combine large language models (LLMs) with external knowledge retrieval, making them highly effective for knowledge-intensive tasks. A crucial but often under-explored component of these systems is the reranker. Since irrelevant documents in RAG systems can mislead the generator, the reranker plays a vital role in refining retrieved documents to enhance generation quality and explainability. However, it is challenging to determine the appropriate number of documents ($k$) that the reranker should select: too few may result in missing critical information, while too many introduce noise and inefficiencies. Although recent studies have explored LLM-based rerankers, they primarily leverage internal model knowledge and overlook the rich supervisory signals that LLMs can provide, such as using response quality as feedback for optimizing reranking…
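
The two ideas in the abstract, a per-query document budget $k$ and response quality used as feedback, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names `select_documents` and `token_f1`, the relative-score cutoff rule, the `ratio` and `max_k` parameters, and the choice of token-overlap F1 as the quality signal. The paper's actual reranker is learned with reinforcement learning, which this sketch does not attempt to reproduce.

```python
from collections import Counter


def select_documents(scores: list[float], ratio: float = 0.5, max_k: int = 10) -> list[int]:
    """Return indices of the documents to keep, best first.

    Hypothetical rule: keep every document whose score is at least
    `ratio` times the top score, so k varies from query to query.
    Assumes non-negative relevance scores.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if not order:
        return []
    cutoff = scores[order[0]] * ratio
    kept = [i for i in order if scores[i] >= cutoff]
    return kept[:max_k]


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a generated answer and a reference answer.

    Used here as a stand-in for a response-quality signal; the reward
    actually used in the paper is not specified in this excerpt.
    """
    pred = prediction.lower().split()
    ref = reference.lower().split()
    if not pred or not ref:
        return 0.0
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# One clearly dominant pair of documents: only two are kept.
print(select_documents([0.9, 0.1, 0.8, 0.3]))
# Scores are close together: all four are kept.
print(select_documents([0.5, 0.45, 0.4, 0.3]))
```

In a training loop, a score such as `token_f1(generated_answer, gold_answer)` could serve as the feedback that tells the reranker whether a given selection helped the generator, which is the kind of supervisory signal the abstract describes.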

Peer Reviews

Decision·NeurIPS 2025 poster

Reviewer 01 · Rating 4 · Confidence 3

Strengths

**Strengths:** DynamicRAG adapts the passage budget *k* to the difficulty of each query, trimming irrelevant documents and spotlighting the most salient evidence. Its reinforcement-learning training optimises directly for answer quality, making the system more robust than fixed-threshold or static-score methods and less sensitive to hyperparameter tuning.

**Weaknesses:** DynamicRAG’s reward is computed directly from the generator’s output, so the reward distribution itself shifts whenever the g

Reviewer 02 · Rating 4 · Confidence 3

Strengths

Strength:

1. Clear motivation: The authors highlight that relying on a fixed number of retrieved documents (k) inherently fails to balance the trade-off between information loss (when k is too small) and noise introduction (when k is too large). This is a valuable and often overlooked insight in existing research.
2. Targeted method design: The paper proposes a dynamic reranking mechanism built on a reinforcement learning framework, allowing the reranker to adaptively adjust both the number and

Reviewer 03 · Rating 4 · Confidence 4

Strengths

### Strengths

- The two-stage training idea is a straightforward way to improve RAG systems affected by irrelevant, noisy documents. The ablation studies verify the effectiveness of each component in DynamicRAG.
- Leveraging LLM feedback as supervision for reranking is interesting, and the use of DPO makes sense.
- The empirical results look good, with many popular datasets included.

### Weaknesses

- Relying on relatively small LLMs (LLaMA2-7B, 13B, LLaMA3-8B) for evaluation limits generality to larger, more

Code & Models

Repositories

gasolsun36/dynamicrag
PyTorch · Official

Models

Datasets

gasolsun/DynamicRAG-Eval-Data
dataset · 16 dl

Taxonomy

Topics: Topic Modeling · Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education

Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Dropout · Layer Normalization · Byte Pair Encoding · Attention Dropout · Softmax · Residual Connection · WordPiece