When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively

Tiziano Labruna; Jon Ander Campos; Gorka Azkune

arXiv:2404.19705 · cs.CL · May 8, 2024 · 1 citation

TL;DR

This paper introduces a training method that teaches LLMs to decide when to retrieve external information and when to rely on their parametric memory. The model learns to generate a <RET> token when retrieval is needed, which improves question answering efficiency.

Contribution

The paper proposes a training approach that enables LLMs to learn when to use an IR system and when to rely on parametric memory to answer a question.

Findings

01 Adapt-LLM outperforms baseline configurations.

02 The model effectively generates <RET> when retrieval is needed.

03 The model achieves high accuracy when it relies solely on parametric memory.

Abstract

In this paper, we demonstrate how Large Language Models (LLMs) can effectively learn to use an off-the-shelf information retrieval (IR) system specifically when additional context is required to answer a given question. Given the performance of IR systems, the optimal strategy for question answering does not always entail external information retrieval; rather, it often involves leveraging the parametric memory of the LLM itself. Prior research has identified this phenomenon in the PopQA dataset, wherein the most popular questions are effectively addressed using the LLM's parametric memory, while less popular ones require IR system usage. Following this, we propose a tailored training approach for LLMs, leveraging existing open-domain question answering datasets. Here, LLMs are trained to generate a special token, <RET>, when they do not know the answer to a question. Our evaluation of…
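The decision step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `answer_with_adaptive_retrieval`, `generate`, and `retrieve`, the prompt templates, and the toy stand-ins for the model and the IR system are all assumptions made for illustration. The only element taken from the paper is the <RET> token, which the model emits when it does not know the answer.

```python
from typing import Callable, Tuple

RET_TOKEN = "<RET>"  # special token from the paper; everything else here is hypothetical


def answer_with_adaptive_retrieval(
    question: str,
    generate: Callable[[str], str],
    retrieve: Callable[[str], str],
) -> Tuple[str, bool]:
    """Return (answer, used_retrieval) for a single question.

    `generate` stands in for the fine-tuned LLM and `retrieve` for an
    off-the-shelf IR system. Both are assumed interfaces, not a real API.
    """
    # First pass: ask the model using parametric memory only.
    first_pass = generate(f"Question: {question}\nAnswer:").strip()
    if RET_TOKEN not in first_pass:
        return first_pass, False

    # The model signalled that it needs context: retrieve and ask again.
    context = retrieve(question)
    second_pass = generate(
        f"Context: {context}\nQuestion: {question}\nAnswer:"
    ).strip()
    return second_pass, True


# Toy stand-ins so the sketch runs without a model or a search index.
def toy_generate(prompt: str) -> str:
    if prompt.startswith("Context:"):
        return "Canberra" if "Canberra" in prompt else "unknown"
    if "France" in prompt:
        return "Paris"  # pretend this fact is in parametric memory
    return RET_TOKEN  # pretend the model does not know the answer


def toy_retrieve(question: str) -> str:
    return "Canberra is the capital city of Australia."


if __name__ == "__main__":
    for q in ("What is the capital of France?",
              "What is the capital of Australia?"):
        print(q, "->", answer_with_adaptive_retrieval(q, toy_generate, toy_retrieve))
```

In this sketch the IR system is called only on the second branch, so questions the model can answer from memory incur no retrieval cost.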

Code & Models

Repositories

tLabruna/Adapt-LLM (PyTorch, official)

Taxonomy

Topics: Library Science and Information Systems · Artificial Intelligence in Law · Natural Language Processing Techniques