Knowing You Don't Know: Learning When to Continue Search in Multi-round RAG through Self-Practicing
Diji Yang, Linda Zeng, Jinmeng Rao, Yi Zhang

TL;DR
This paper introduces SIM-RAG, a framework that enhances multi-round retrieval in RAG systems by training a lightweight Critic to assess information sufficiency, leading to more accurate and efficient retrieval decisions without extensive supervision.
Contribution
The paper proposes a self-practicing training method for RAG: the system generates its own synthetic training data, which is used to train a lightweight Critic that improves multi-round retrieval and self-awareness in RAG systems.
Findings
SIM-RAG improves retrieval accuracy across benchmarks.
It reduces unnecessary retrieval rounds, increasing efficiency.
The framework does not require additional large-scale supervision.
Abstract
Retrieval Augmented Generation (RAG) has shown strong capability in enhancing language models' knowledge and reducing AI generative hallucinations, driving its widespread use. However, complex tasks requiring multi-round retrieval remain challenging, and early attempts tend to be overly optimistic without a good sense of self-skepticism. Current multi-round RAG systems may continue searching even when enough information has already been retrieved, or they may provide incorrect answers without having sufficient information or knowledge. Existing solutions either require large amounts of expensive human-labeled process supervision data or lead to subpar performance. This paper aims to address these limitations by introducing a new framework, SIM-RAG, to explicitly enhance RAG systems' self-awareness and multi-round retrieval capabilities. To train SIM-RAG, we first let a RAG system…
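The stopping decision described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`multi_round_rag`, `toy_retrieve`, `toy_reason`, `toy_critic`), the toy passages, the string-matching Critic, and the choice to abstain with `None` once the round budget is spent are all assumptions made for illustration. In the paper the Critic is a trained model; here it is a hand-written check so the control flow can be run end to end.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical interfaces, not taken from the paper.
Retriever = Callable[[str], List[str]]
Reasoner = Callable[[str, List[str]], Tuple[str, str]]  # -> (answer, follow-up query)
Critic = Callable[[str, List[str], str], bool]          # True = information is sufficient


def multi_round_rag(
    question: str,
    retrieve: Retriever,
    reason: Reasoner,
    critic: Critic,
    max_rounds: int = 3,
) -> Tuple[Optional[str], int]:
    """Retrieve round by round until the critic accepts the draft answer.

    Returns (answer, rounds_used). The answer is None when the round budget
    runs out without the critic accepting (an assumed abstention policy).
    """
    context: List[str] = []
    query = question
    for round_no in range(1, max_rounds + 1):
        # Accumulate newly retrieved passages, skipping duplicates.
        for passage in retrieve(query):
            if passage not in context:
                context.append(passage)
        answer, follow_up = reason(question, context)
        if critic(question, context, answer):
            return answer, round_no  # stop searching: enough information
        query = follow_up  # otherwise search again with a refined query
    return None, max_rounds


# Toy stand-ins so the loop is runnable; a real system would use a search
# index, an LLM reasoner, and a trained critic model.
PASSAGES = {
    "capital": "Paris is the capital of France.",
    "population": "Paris has about 2.1 million residents.",
}


def toy_retrieve(query: str) -> List[str]:
    q = query.lower()
    return [text for key, text in PASSAGES.items() if key in q]


def toy_reason(question: str, context: List[str]) -> Tuple[str, str]:
    for passage in context:
        if "residents" in passage:
            return "about 2.1 million", ""
    return "unknown", "population of Paris"


def toy_critic(question: str, context: List[str], answer: str) -> bool:
    # Accept only answers that are literally supported by a retrieved passage.
    return any(answer in passage for passage in context)


question = "How many people live in the capital of France?"
answer, rounds = multi_round_rag(question, toy_retrieve, toy_reason, toy_critic)
print(answer, rounds)
```

In this toy run the first round retrieves only the passage about the capital, the Critic rejects the draft answer, and a second round with the follow-up query retrieves the missing passage. The two failure modes named in the abstract correspond to a Critic that never accepts (searching continues needlessly) and one that accepts too early (an unsupported answer is returned).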
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · AI-based Problem Solving and Planning
Methods: Attention Is All You Need · Linear Warmup With Linear Decay · Dropout · Layer Normalization · Byte Pair Encoding · Attention Dropout · Softmax · Residual Connection · WordPiece
