DRAGIN: Dynamic Retrieval Augmented Generation based on the Information Needs of Large Language Models
Weihang Su, Yichen Tang, Qingyao Ai, Zhijing Wu, Yiqun Liu

TL;DR
DRAGIN introduces a dynamic retrieval augmented generation framework that adaptively determines when and what information to retrieve during LLM text generation, improving performance on knowledge-intensive tasks.
Contribution
The paper proposes DRAGIN, a novel framework for real-time, adaptive retrieval decisions in LLMs, addressing limitations of static retrieval strategies.
Findings
DRAGIN outperforms existing methods on four knowledge-intensive datasets.
Dynamic retrieval decisions improve LLM generation quality.
Open-sourced code and models facilitate further research.
Abstract
Dynamic retrieval augmented generation (RAG) paradigm actively decides when and what to retrieve during the text generation process of Large Language Models (LLMs). There are two key elements of this paradigm: identifying the optimal moment to activate the retrieval module (deciding when to retrieve) and crafting the appropriate query once retrieval is triggered (determining what to retrieve). However, current dynamic RAG methods fall short in both aspects. Firstly, the strategies for deciding when to retrieve often rely on static rules. Moreover, the strategies for deciding what to retrieve typically limit themselves to the LLM's most recent sentence or the last few tokens, while the LLM's real-time information needs may span across the entire context. To overcome these limitations, we introduce a new framework, DRAGIN, i.e., Dynamic Retrieval Augmented Generation based on the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques
MethodsRefunds@Expedia|||How do I get a full refund from Expedia? · Attention Is All You Need · Linear Layer · Layer Normalization · Multi-Head Attention · Linear Warmup With Linear Decay · Dropout · Byte Pair Encoding · Dense Connections · Adam
