Modeling Uncertainty Trends for Timely Retrieval in Dynamic RAG
Bo Li, Tian Tian, Zhenghua Xu, Hao Cheng, Shikun Zhang, Wei Ye

TL;DR
This paper introduces ETC, a training-free, trend-aware uncertainty modeling method that improves the timing of retrieval in dynamic RAG, leading to earlier, more accurate knowledge fetching and better performance across benchmarks.
Contribution
ETC is a training-free method that models trends in token-level entropy, using first- and second-order differences of the entropy sequence, to decide when a dynamic RAG system should retrieve, with the aim of earlier and more precise retrieval.
Findings
ETC consistently outperforms strong baselines on six QA benchmarks across three LLM backbones.
ETC reduces retrieval frequency while maintaining or improving accuracy.
ETC generalizes well across domain-specific scenarios.
Abstract
Dynamic retrieval-augmented generation (RAG) allows large language models (LLMs) to fetch external knowledge on demand, offering greater adaptability than static RAG. A central challenge in this setting lies in determining the optimal timing for retrieval. Existing methods often trigger retrieval based on low token-level confidence, which may lead to delayed intervention after errors have already propagated. We introduce Entropy-Trend Constraint (ETC), a training-free method that determines optimal retrieval timing by modeling the dynamics of token-level uncertainty. Specifically, ETC utilizes first- and second-order differences of the entropy sequence to detect emerging uncertainty trends, enabling earlier and more precise retrieval. Experiments on six QA benchmarks with three LLM backbones demonstrate that ETC consistently outperforms strong baselines while reducing retrieval…
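The trend signal described in the abstract can be illustrated with a minimal sketch. The abstract states only that first- and second-order differences of the token-level entropy sequence are used. The decision rule below (trigger when the second-order difference exceeds a fixed threshold), the threshold values, and all function names are assumptions made for illustration and are not taken from the paper.

```python
import math
from typing import List, Optional, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def differences(values: Sequence[float]) -> List[float]:
    """Discrete differences between consecutive values: values[i+1] - values[i]."""
    return [b - a for a, b in zip(values, values[1:])]


def first_trigger_step(entropies: Sequence[float], threshold: float) -> Optional[int]:
    """Return the index of the first generation step whose second-order
    entropy difference exceeds `threshold`, or None if no step qualifies.

    Hypothetical rule for illustration only; the paper's actual criterion
    may differ.
    """
    first = differences(entropies)   # first[i]  = H[i+1] - H[i]
    second = differences(first)      # second[i] = H[i+2] - 2*H[i+1] + H[i]
    for i, d2 in enumerate(second):
        if d2 > threshold:
            return i + 2             # second[i] is first observable at step i+2
    return None


# Toy entropy sequence: flat at first, then rising at an accelerating rate.
entropies = [0.1, 0.1, 0.1, 0.5, 1.5]
step = first_trigger_step(entropies, threshold=0.5)
```

In this toy sequence the entropy starts rising at step 3 and accelerates at step 4. A rule based on the second-order difference reacts to that acceleration, whereas a rule based only on a high absolute entropy value would have to wait until the value itself crosses its threshold.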
Taxonomy
Topics: Topic Modeling · Information Retrieval and Search Behavior · Natural Language Processing Techniques
