Can't Remember Details in Long Documents? You Need Some R&R

Devanshu Agrawal; Shang Gao; Martin Gajek

arXiv:2403.05004·cs.CL·March 11, 2024·1 cites

TL;DR

This paper introduces R&R, a prompt-based method combining reprompting and in-context retrieval, to improve question-answering accuracy over long documents by maintaining relevant information closer to the instructions.

Contribution

The paper proposes R&R, a novel approach that enhances long-document QA performance by reducing information loss and enabling larger context chunks with fewer LLM calls.

Findings

1. R&R improves QA accuracy by 16 points on average.
2. R&R reduces the distance between relevant context and instructions.
3. R&R allows larger chunks with fewer LLM calls.

Abstract

Long-context large language models (LLMs) hold promise for tasks such as question-answering (QA) over long documents, but they tend to miss important information in the middle of context documents (arXiv:2307.03172v3). Here, we introduce R&R, a combination of two novel prompt-based methods called reprompting and in-context retrieval (ICR), to alleviate this effect in document-based QA. In reprompting, we repeat the prompt instructions periodically throughout the context document to remind the LLM of its original task. In ICR, rather than instructing the LLM to answer the question directly, we instruct it to retrieve the top $k$ passage numbers most relevant to the given question, which are then used as an abbreviated context in a second QA prompt. We test R&R with GPT-4 Turbo and Claude-2.1 on documents up to 80k tokens in length and observe a…
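The two steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the code from the official repository: the prompt wording, the `every_n` reminder interval, the choice to also place the instructions at the start and end of the prompt, and the `llm` callable (any function mapping a prompt string to a reply string) are all assumptions made for the example.

```python
import re
from typing import Callable, List


def build_reprompted_context(passages: List[str], instructions: str, every_n: int) -> str:
    """Number the passages and repeat the instructions after every `every_n` of them."""
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    parts = []
    for i, passage in enumerate(passages, start=1):
        parts.append(f"[{i}] {passage}")
        # Reprompting: remind the model of its task partway through the document.
        # No reminder after the last passage; the caller appends the instructions there.
        if i % every_n == 0 and i < len(passages):
            parts.append(f"(Reminder) {instructions}")
    return "\n".join(parts)


def build_icr_prompt(passages: List[str], question: str, k: int, every_n: int) -> str:
    """First prompt: ask for passage numbers instead of a direct answer (hypothetical wording)."""
    instructions = (
        f"List the numbers of the {k} passages most relevant to the question: {question}"
    )
    body = build_reprompted_context(passages, instructions, every_n)
    return f"{instructions}\n\n{body}\n\n{instructions}"


def parse_passage_numbers(reply: str, k: int, n_passages: int) -> List[int]:
    """Extract up to k distinct, in-range passage numbers from the model's reply."""
    picked: List[int] = []
    for token in re.findall(r"\d+", reply):
        n = int(token)
        if 1 <= n <= n_passages and n not in picked:
            picked.append(n)
    return picked[:k]


def r_and_r(passages: List[str], question: str, k: int, every_n: int,
            llm: Callable[[str], str]) -> str:
    """Two LLM calls: in-context retrieval, then QA over the abbreviated context."""
    reply = llm(build_icr_prompt(passages, question, k, every_n))
    picked = parse_passage_numbers(reply, k, len(passages))
    # Keep the retrieved passages in document order for the second prompt.
    short_context = "\n".join(f"[{n}] {passages[n - 1]}" for n in sorted(picked))
    qa_prompt = (
        "Answer the question using only these passages.\n"
        f"Question: {question}\n\n{short_context}"
    )
    return llm(qa_prompt)
```

To try it, pass any callable as `llm`, for example a thin wrapper around a chat API client. Parsing the reply with a regular expression is a simplification; a real system would likely constrain the output format in the first prompt.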

Code & Models

Repositories

casetext/r-and-r (official)

Taxonomy

Topics: Big Data and Business Intelligence

Methods: Attention Is All You Need · Linear Layer · Dropout · Multi-Head Attention · Position-Wise Feed-Forward Layer · Layer Normalization · Absolute Position Encodings · Softmax · Dense Connections · Label Smoothing