Unified Active Retrieval for Retrieval Augmented Generation
Qinyuan Cheng, Xiaonan Li, Shimin Li, Qin Zhu, Zhangyue Yin, Yunfan Shao, Linyang Li, Tianxiang Sun, Hang Yan, Xipeng Qiu

TL;DR
This paper introduces Unified Active Retrieval (UAR), which combines four criteria to decide when retrieval is worthwhile in Retrieval-Augmented Generation, improving retrieval-timing decisions while adding negligible inference cost.
Contribution
The paper proposes a unified, plug-and-play framework with four criteria for active retrieval, simplifying the process and enhancing effectiveness across diverse instruction types.
Findings
UAR outperforms existing methods in retrieval timing accuracy
UAR improves downstream task performance
UAR reduces inference latency
Abstract
In Retrieval-Augmented Generation (RAG), retrieval is not always helpful and applying it to every instruction is sub-optimal. Therefore, determining whether to retrieve is crucial for RAG, which is usually referred to as Active Retrieval. However, existing active retrieval methods face two challenges: 1. They usually rely on a single criterion, which struggles with handling various types of instructions. 2. They depend on specialized and highly differentiated procedures, and thus combining them makes the RAG system more complicated and leads to higher response latency. To address these challenges, we propose Unified Active Retrieval (UAR). UAR contains four orthogonal criteria and casts them into plug-and-play classification tasks, which achieves multifaceted retrieval timing judgements with negligible extra inference cost. We further introduce the Unified Active Retrieval Criteria…
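The idea of casting several retrieval criteria into plug-and-play classification tasks can be pictured with a minimal sketch. Everything here is an illustrative assumption rather than the paper's implementation: the `LinearHead` class, the placeholder names `criterion_a` through `criterion_d` (the abstract above does not name the four criteria), the 0.5 threshold, and the "retrieve if any criterion fires" rule are all hypothetical. The sketch only shows why the extra cost can be small: each criterion is one lightweight classifier reading a hidden state that the language model has already computed.

```python
import math
from typing import Dict, Sequence, Tuple


class LinearHead:
    """Hypothetical binary classifier over a precomputed hidden state."""

    def __init__(self, weights: Sequence[float], bias: float = 0.0) -> None:
        self.weights = list(weights)
        self.bias = bias

    def prob(self, hidden: Sequence[float]) -> float:
        """Return the sigmoid probability that this criterion fires."""
        if len(hidden) != len(self.weights):
            raise ValueError("hidden size does not match head weights")
        logit = sum(w * h for w, h in zip(self.weights, hidden)) + self.bias
        return 1.0 / (1.0 + math.exp(-logit))


def should_retrieve(
    hidden: Sequence[float],
    heads: Dict[str, LinearHead],
    threshold: float = 0.5,
) -> Tuple[bool, Dict[str, float]]:
    """Score every criterion on the same hidden state.

    Assumed combination rule (not taken from the paper): retrieve
    as soon as any single criterion reaches the threshold.
    """
    scores = {name: head.prob(hidden) for name, head in heads.items()}
    return any(p >= threshold for p in scores.values()), scores


# Toy usage with four placeholder criteria and a 2-dimensional hidden state.
heads = {
    "criterion_a": LinearHead([2.0, 0.0]),
    "criterion_b": LinearHead([-2.0, 0.0]),
    "criterion_c": LinearHead([0.0, 1.0], bias=-3.0),
    "criterion_d": LinearHead([0.0, 0.5]),
}
decision, scores = should_retrieve([1.0, -1.0], heads)
```

Because the heads share one hidden state, adding or removing a criterion changes only the `heads` dictionary, which is the sense in which such classifiers would be plug-and-play in this sketch.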
Taxonomy
Topics: Information Retrieval and Search Behavior
Methods: WordPiece · Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Attention Dropout · Linear Warmup With Linear Decay · Weight Decay · Dropout
