RealTime QA: What's the Answer Right Now?

Jungo Kasai; Keisuke Sakaguchi; Yoichi Takahashi; Ronan Le Bras; Akari Asai; Xinyan Yu; Dragomir Radev; Noah A. Smith; Yejin Choi; Kentaro Inui

arXiv:2207.13332 · cs.CL · February 29, 2024 · 26 cites

TL;DR

REALTIME QA is a dynamic, real-time question answering benchmark that evaluates systems on current events, emphasizing the importance of up-to-date information retrieval and updating capabilities.

Contribution

This paper introduces a novel real-time QA benchmark and evaluates large pretrained models, highlighting challenges in handling outdated information and the need for improved retrieval strategies.

Findings

01. GPT-3 often updates its answers based on newly retrieved information.

02. Retrieval quality significantly affects answer accuracy.

03. Systems need to detect unanswerable questions and outdated information.

Abstract

We introduce REALTIME QA, a dynamic question answering (QA) platform that announces questions and evaluates systems on a regular basis (weekly in this version). REALTIME QA inquires about the current world, and QA systems need to answer questions about novel events or information. It therefore challenges static, conventional assumptions in open-domain QA datasets and pursues instantaneous applications. We build strong baseline models upon large pretrained language models, including GPT-3 and T5. Our benchmark is an ongoing effort, and this paper presents real-time evaluation results over the past year. Our experimental results show that GPT-3 can often properly update its generation results, based on newly-retrieved documents, highlighting the importance of up-to-date information retrieval. Nonetheless, we find that GPT-3 tends to return outdated answers when retrieved documents do not…
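The retrieve-then-answer setup described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration, not the authors' implementation: the `Document` structure, the choice to place the most recent documents first in the prompt, and the normalized exact-match metric are all hypothetical, and the language model call itself is left out.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Document:
    """A retrieved news snippet (hypothetical structure, not from the paper)."""
    published: date
    text: str


def build_prompt(question: str, docs: list[Document], today: date, k: int = 3) -> str:
    """Place the k most recent documents before the question, newest first.

    Sorting by recency is an assumption made for this sketch.
    """
    recent = sorted(docs, key=lambda d: d.published, reverse=True)[:k]
    lines = [f"Today is {today.isoformat()}."]
    for doc in recent:
        lines.append(f"[{doc.published.isoformat()}] {doc.text}")
    lines.append(f"Question: {question}")
    lines.append("Answer:")
    return "\n".join(lines)


def _normalize(answer: str) -> str:
    """Lowercase and collapse whitespace before comparing answers."""
    return " ".join(answer.lower().split())


def exact_match(prediction: str, gold: str) -> bool:
    """Normalized string equality (an assumed metric for this sketch)."""
    return _normalize(prediction) == _normalize(gold)


def accuracy(predictions: list[str], golds: list[str]) -> float:
    """Fraction of questions answered correctly in one evaluation round."""
    if not golds:
        return 0.0
    correct = sum(exact_match(p, g) for p, g in zip(predictions, golds))
    return correct / len(golds)
```

In a weekly round, the prompt built by `build_prompt` would be sent to a language model and the returned answers scored with `accuracy`. Swapping in a different retriever changes only the `docs` argument, which makes it easy to see how retrieval quality affects the score.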

Code & Models

Repositories

realtimeqa/realtimeqa_public (official)

Datasets

monsoon-nlp/relive-qa (dataset, 219 downloads)

Videos

RealTime QA: What's the Answer Right Now? (SlidesLive)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research