When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively
Tiziano Labruna, Jon Ander Campos, Gorka Azkune

TL;DR
This paper introduces a training method that teaches LLMs to decide when to retrieve external information and when to rely on their parametric memory. The model signals that retrieval is needed by generating a special <RET> token, which makes question answering more efficient.
Contribution
The paper proposes a training approach that enables LLMs to learn when to call an IR system and when to rely on parametric memory to answer a question.
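One way such training data could be assembled is sketched below. This is an illustrative sketch, not the authors' pipeline: it assumes that a question counts as "known" when the base model's answer exactly matches the gold answer (case-insensitive), and the names build_training_examples, base_answer, and get_context are hypothetical helpers introduced only for this example.

```python
from typing import Callable, Dict, List, Tuple

RET_TOKEN = "<RET>"  # special token the model learns to emit when it needs retrieval


def build_training_examples(
    qa_pairs: List[Tuple[str, str]],
    base_answer: Callable[[str], str],   # hypothetical: base LLM answering without context
    get_context: Callable[[str], str],   # hypothetical: returns a supporting passage
) -> List[Dict[str, str]]:
    """Label each question with its answer or with <RET> (assumed labeling rule)."""
    examples: List[Dict[str, str]] = []
    for question, gold in qa_pairs:
        prediction = base_answer(question)
        # Assumption: exact match (ignoring case and outer whitespace) means "the model knows it".
        if prediction.strip().lower() == gold.strip().lower():
            examples.append({"prompt": question, "target": gold})
        else:
            # The model did not know the answer: teach it to ask for retrieval...
            examples.append({"prompt": question, "target": RET_TOKEN})
            # ...and to answer once a context passage is supplied.
            prompt_with_context = f"Context: {get_context(question)}\n{question}"
            examples.append({"prompt": prompt_with_context, "target": gold})
    return examples
```

Exact match is a deliberately simple stand-in for "the model knows the answer"; a softer correctness check could be swapped in without changing the structure.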
Findings
Adapt-LLM, the model trained with this approach, outperforms the baseline configurations.
The model effectively generates <RET> when retrieval is needed.
The model achieves high accuracy when it relies solely on its parametric memory.
Abstract
In this paper, we demonstrate how Large Language Models (LLMs) can effectively learn to use an off-the-shelf information retrieval (IR) system specifically when additional context is required to answer a given question. Given the performance of IR systems, the optimal strategy for question answering does not always entail external information retrieval; rather, it often involves leveraging the parametric memory of the LLM itself. Prior research has identified this phenomenon in the PopQA dataset, wherein the most popular questions are effectively addressed using the LLM's parametric memory, while less popular ones require IR system usage. Following this, we propose a tailored training approach for LLMs, leveraging existing open-domain question answering datasets. Here, LLMs are trained to generate a special token, <RET>, when they do not know the answer to a question. Our evaluation of…
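The inference-time behavior described in the abstract can be sketched as a two-pass loop: ask the model the question directly, and call the IR system only if the model emits <RET>. This is a minimal sketch under stated assumptions; the prompt templates, the function answer_with_adaptive_retrieval, and the toy generate and retrieve stand-ins are all hypothetical and not taken from the paper.

```python
from typing import Callable, List

RET_TOKEN = "<RET>"


def answer_with_adaptive_retrieval(
    question: str,
    generate: Callable[[str], str],        # hypothetical wrapper around the fine-tuned LLM
    retrieve: Callable[[str], List[str]],  # hypothetical wrapper around an off-the-shelf IR system
) -> str:
    """Answer from parametric memory unless the model asks for retrieval."""
    # First pass: the model sees only the question.
    first = generate(f"Question: {question}\nAnswer:").strip()
    if RET_TOKEN not in first:
        return first  # the model answered directly
    # The model emitted <RET>: fetch passages and ask again with context.
    context = "\n".join(retrieve(question))
    return generate(f"Context: {context}\nQuestion: {question}\nAnswer:").strip()


# Toy stand-ins so the sketch runs end to end (not a real model or retriever).
KNOWN = {"What is the capital of France?": "Paris"}
retrieval_calls: List[str] = []


def toy_generate(prompt: str) -> str:
    first_line = prompt.splitlines()[0]
    if first_line.startswith("Context: "):
        return "answer grounded in: " + first_line[len("Context: "):]
    return KNOWN.get(first_line[len("Question: "):], RET_TOKEN)


def toy_retrieve(question: str) -> List[str]:
    retrieval_calls.append(question)  # record when the IR system is actually used
    return ["Passage about " + question]
```

With these stand-ins, a question the toy model "knows" is answered without touching the retriever, while an unknown question triggers exactly one retrieval call followed by a second generation pass.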
Taxonomy
Topics: Library Science and Information Systems · Artificial Intelligence in Law · Natural Language Processing Techniques
