RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering

Zihan Zhang; Meng Fang; Ling Chen

arXiv:2402.16457·cs.CL·June 6, 2024·1 cites

TL;DR

This paper introduces RetrievalQA, a benchmark for evaluating adaptive retrieval-augmented generation in short-form open-domain question answering, and proposes a new method, TA-ARE, to improve retrieval necessity assessment without calibration.

Contribution

The paper provides a new benchmark dataset for evaluating adaptive retrieval-augmented generation (ARAG) methods and proposes TA-ARE, a calibration-free approach for deciding when an LLM should retrieve.

Findings

01

Calibration-based methods rely heavily on threshold tuning.

02

Vanilla prompting is insufficient for reliable retrieval decisions.

03

TA-ARE effectively assesses retrieval necessity without extra training.
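
To illustrate the first finding, here is a minimal sketch of a calibration-style retrieval decision. It is not the implementation of any method evaluated in the paper: the confidence score (mean token probability), the function names `should_retrieve` and `retrieval_rate`, and the toy log-probabilities are all assumptions made for illustration. The point is only that the share of questions sent to retrieval depends strongly on the chosen threshold.

```python
import math
from typing import List


def should_retrieve(token_logprobs: List[float], threshold: float) -> bool:
    """Hypothetical rule: retrieve when mean token probability is below threshold."""
    if not token_logprobs:
        # No confidence signal available, so fall back to retrieving.
        return True
    mean_prob = sum(math.exp(lp) for lp in token_logprobs) / len(token_logprobs)
    return mean_prob < threshold


def retrieval_rate(batch: List[List[float]], threshold: float) -> float:
    """Fraction of questions in the batch for which retrieval is triggered."""
    decisions = [should_retrieve(logprobs, threshold) for logprobs in batch]
    return sum(decisions) / len(decisions)


# Toy per-token log-probabilities for three draft answers (made-up values).
toy_batch = [
    [-0.05, -0.10],  # confident draft
    [-0.70, -1.20],  # moderately uncertain draft
    [-2.30, -1.60],  # very uncertain draft
]

for threshold in (0.3, 0.5, 0.95):
    print(threshold, retrieval_rate(toy_batch, threshold))
```

On this toy batch the retrieval rate moves from one third to all questions as the threshold rises, which is why such methods need a tuned threshold before their retrieval decisions can be trusted.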

Abstract

Adaptive retrieval-augmented generation (ARAG) aims to dynamically determine the necessity of retrieval for queries instead of retrieving indiscriminately to enhance the efficiency and relevance of the sourced information. However, previous works largely overlook the evaluation of ARAG approaches, leading to their effectiveness being understudied. This work presents a benchmark, RetrievalQA, comprising 1,271 short-form questions covering new world and long-tail knowledge. The knowledge necessary to answer the questions is absent from LLMs; therefore, external information must be retrieved to answer correctly. This makes RetrievalQA a suitable testbed to evaluate existing ARAG methods. We observe that calibration-based methods heavily rely on threshold tuning, while vanilla prompting is inadequate for guiding LLMs to make reliable retrieval decisions. Based on our findings, we propose…
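
The prompting-based decision the abstract refers to can be sketched roughly as follows. This is an assumption-laden illustration, not the paper's prompt or the TA-ARE method: the prompt wording, the `llm` and `retrieve` callables, and the function names are hypothetical stand-ins for whatever model and retriever are actually used.

```python
from typing import Callable, List

# Hypothetical prompt wording; the paper's actual prompts are not shown here.
DECISION_PROMPT = (
    "Question: {question}\n"
    "Do you need external information to answer this question? "
    "Reply with exactly 'yes' or 'no'."
)


def needs_retrieval(question: str, llm: Callable[[str], str]) -> bool:
    """Ask the model whether retrieval is needed and parse its reply."""
    reply = llm(DECISION_PROMPT.format(question=question)).strip().lower()
    # Treat anything other than a clear "no" as a request to retrieve.
    return not reply.startswith("no")


def adaptive_answer(
    question: str,
    llm: Callable[[str], str],
    retrieve: Callable[[str], List[str]],
) -> str:
    """Answer with retrieved context only when the model asks for it."""
    if needs_retrieval(question, llm):
        context = "\n".join(retrieve(question))
        return llm(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")
    return llm(f"Question: {question}\nAnswer:")
```

In this sketch the whole decision rests on a single yes/no reply from the model, so its reliability is only as good as the model's self-assessment.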

Code & Models

Repositories

hyintell/retrievalqa
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Expert finding and Q&A systems · Intelligent Tutoring Systems and Adaptive Learning