DRAGIN: Dynamic Retrieval Augmented Generation based on the Information Needs of Large Language Models

Weihang Su; Yichen Tang; Qingyao Ai; Zhijing Wu; Yiqun Liu

arXiv:2403.10081 · cs.CL · September 24, 2024 · 5 cites

TL;DR

DRAGIN introduces a dynamic retrieval augmented generation framework that adaptively determines when and what information to retrieve during LLM text generation, improving performance on knowledge-intensive tasks.

Contribution

The paper proposes DRAGIN, a novel framework for real-time, adaptive retrieval decisions in LLMs, addressing limitations of static retrieval strategies.

Findings

01 DRAGIN outperforms existing methods on four knowledge-intensive datasets.

02 Dynamic retrieval decisions improve LLM generation quality.

03 Open-sourced code and models facilitate further research.

Abstract

The dynamic retrieval augmented generation (RAG) paradigm actively decides when and what to retrieve during the text generation process of Large Language Models (LLMs). This paradigm has two key elements: identifying the optimal moment to activate the retrieval module (deciding when to retrieve) and crafting the appropriate query once retrieval is triggered (determining what to retrieve). However, current dynamic RAG methods fall short in both aspects. First, the strategies for deciding when to retrieve often rely on static rules. Second, the strategies for deciding what to retrieve typically limit themselves to the LLM's most recent sentence or the last few tokens, while the LLM's real-time information needs may span the entire context. To overcome these limitations, we introduce a new framework, DRAGIN, i.e., Dynamic Retrieval Augmented Generation based on the…
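
The two decisions named in the abstract (when to retrieve, what to retrieve) can be pictured as two pluggable functions inside a generation loop. The sketch below is a toy illustration of the baseline strategies the abstract criticizes, namely a static fixed-interval trigger and a query built from the last few tokens. It is not DRAGIN's method, which the truncated abstract does not describe here. All names (`dynamic_rag_loop`, `fixed_interval_trigger`, `last_k_tokens_query`) and the scripted token source are hypothetical stand-ins for a real LLM and retriever.

```python
from typing import Callable, List, Optional, Tuple


def fixed_interval_trigger(n: int) -> Callable[[List[str]], bool]:
    """Static 'when' rule (assumed baseline): fire after every n generated tokens."""
    def trigger(generated: List[str]) -> bool:
        return len(generated) > 0 and len(generated) % n == 0
    return trigger


def last_k_tokens_query(k: int) -> Callable[[List[str]], str]:
    """Local 'what' rule (assumed baseline): query is the last k generated tokens."""
    def make_query(generated: List[str]) -> str:
        return " ".join(generated[-k:])
    return make_query


def dynamic_rag_loop(
    next_token: Callable[[List[str], List[str]], Optional[str]],
    should_retrieve: Callable[[List[str]], bool],
    make_query: Callable[[List[str]], str],
    retrieve: Callable[[str], List[str]],
    max_tokens: int,
) -> Tuple[List[str], List[str]]:
    """Generate token by token, consulting the retriever whenever the trigger fires.

    Returns the generated tokens and the list of queries that were issued.
    """
    generated: List[str] = []
    evidence: List[str] = []
    queries: List[str] = []
    for _ in range(max_tokens):
        if should_retrieve(generated):       # decide WHEN to retrieve
            query = make_query(generated)    # decide WHAT to retrieve
            queries.append(query)
            evidence = retrieve(query)
        token = next_token(generated, evidence)
        if token is None:                    # token source signals end of text
            break
        generated.append(token)
    return generated, queries


if __name__ == "__main__":
    # Scripted stand-in for an LLM: emits a fixed sentence one token at a time.
    script = "Einstein was born in Ulm in 1879".split()

    def scripted_next_token(generated: List[str], evidence: List[str]) -> Optional[str]:
        return script[len(generated)] if len(generated) < len(script) else None

    tokens, queries = dynamic_rag_loop(
        scripted_next_token,
        fixed_interval_trigger(3),
        last_k_tokens_query(2),
        retrieve=lambda q: [],  # stub retriever that returns no passages
        max_tokens=10,
    )
    print(queries)
```

With these settings the trigger fires at token counts 3 and 6 regardless of content, and the queries cover only the two most recent tokens. This is the kind of rigidity the abstract points to when it says information needs may span the entire context.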

Code & Models

Repositories

oneal2000/dragin (PyTorch, official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Multi-Head Attention · Linear Warmup With Linear Decay · Dropout · Byte Pair Encoding · Dense Connections · Adam