Attention Overflow: Language Model Input Blur during Long-Context Missing Items Recommendation

Damien Sileo

arXiv:2407.13481·cs.CL·July 19, 2024

TL;DR

This paper investigates attention overflow in large language models: when a prompt lists too many items (around 100 for mid-2024 flagship models), the models start suggesting items that are already in the list, which degrades list completion and recommendation.

Contribution

It identifies and analyzes the attention overflow phenomenon in LLMs on synthetic and movie recommendation tasks, highlights its effect on performance with long inputs, and discusses iterative loops as a partial mitigation.

Findings

01

Performance degrades at around 100 input items for mid-2024 flagship LLMs: models begin to suggest items already present in the list.

02

Iterative loops can mitigate repetition, but their cost increases with the repetition rate.

03

Attention overflow limits LLMs' effectiveness in long-list tasks.
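
To make the findings concrete, here is a minimal sketch of how a synthetic missing-number problem could be built and scored. It is an illustration under stated assumptions, not the paper's evaluation code: the function names, the prompt wording, and the two metrics (`repetition_rate`, `recall`) are all hypothetical choices made for this example.

```python
import random


def make_missing_number_task(n, k, seed=0):
    """Hold out k integers from 0..n-1 and shuffle the rest.

    Returns the shuffled list shown to the model and the set of held-out
    (missing) integers. Names and setup are assumptions for illustration.
    """
    rng = random.Random(seed)
    numbers = list(range(n))
    missing = set(rng.sample(numbers, k))
    shown = [x for x in numbers if x not in missing]
    rng.shuffle(shown)
    return shown, missing


def score_suggestions(shown, missing, suggestions):
    """Score model suggestions against the task.

    repetition_rate: share of suggestions that were already in the input.
    recall: share of missing items that were suggested at least once.
    """
    shown_set = set(shown)
    repeats = sum(1 for s in suggestions if s in shown_set)
    hits = sum(1 for s in set(suggestions) if s in missing)
    return {
        "repetition_rate": repeats / len(suggestions) if suggestions else 0.0,
        "recall": hits / len(missing) if missing else 0.0,
    }


shown, missing = make_missing_number_task(n=128, k=4, seed=0)
# Hypothetical prompt wording; the paper's actual prompt may differ.
prompt = "Which numbers between 0 and 127 are missing from this list? " + ", ".join(
    map(str, shown)
)
```

A repetition rate that rises as `n` grows past roughly 100 would correspond to the degradation described above; the suggestions themselves would come from whichever model is being evaluated.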

Abstract

Large language models (LLMs) can suggest missing elements from items listed in a prompt, which can be used for list completion or recommendations based on users' history. However, their performance degrades when presented with too many items, as they start to suggest items already included in the input list. This occurs at around 100 items for mid-2024 flagship LLMs. We evaluate this phenomenon on both synthetic problems (e.g., finding missing numbers in a given range of shuffled integers) and realistic movie recommendation scenarios. We refer to this issue as "attention overflow", as preventing repetition requires attending to all items simultaneously. Although iterative loops can mitigate this problem, their costs increase with the repetition rate, affecting the language models' ability to derive novelty from lengthy inputs.
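
The claim that iterative loops cost more as the repetition rate rises can be illustrated with a small simulation. This is a sketch, not the paper's method: `mock_model` is a hypothetical stand-in for an LLM call, and it assumes each retry repeats an input item independently with a fixed probability, which a real model need not satisfy. Under that assumption the expected number of calls per novel suggestion is 1 / (1 - r) for repetition rate r.

```python
import random


def mock_model(shown, novel_candidates, repetition_rate, rng):
    """Hypothetical stand-in for an LLM call (an assumption, not a real API).

    With probability repetition_rate it returns an item already shown;
    otherwise it returns an item that is absent from the input.
    """
    if rng.random() < repetition_rate:
        return rng.choice(shown)
    return rng.choice(novel_candidates)


def suggest_until_novel(shown, novel_candidates, repetition_rate, rng, max_calls=1000):
    """Retry loop: query until the suggestion is not in the input list.

    Returns the accepted suggestion (or None) and the number of calls used.
    """
    shown_set = set(shown)
    for calls in range(1, max_calls + 1):
        suggestion = mock_model(shown, novel_candidates, repetition_rate, rng)
        if suggestion not in shown_set:
            return suggestion, calls
    return None, max_calls


def average_calls(repetition_rate, trials=2000, seed=0):
    """Average number of model calls needed per novel suggestion."""
    rng = random.Random(seed)
    universe = list(range(128))
    shown = universe[:120]             # items listed in the prompt
    novel_candidates = universe[120:]  # items absent from the prompt
    total = 0
    for _ in range(trials):
        _, calls = suggest_until_novel(shown, novel_candidates, repetition_rate, rng)
        total += calls
    return total / trials


for rate in (0.0, 0.5, 0.9):
    print(f"repetition rate {rate:.1f}: {average_calls(rate):.2f} calls per novel item")
```

In this toy setting the cost stays at one call when the model never repeats and grows sharply as the repetition rate approaches 1, which is the trade-off the abstract describes in qualitative terms.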

Code & Models

Datasets

sileod/missing-item-prediction
dataset · 81 dl

Taxonomy

Topics: Topic Modeling · Machine Learning in Healthcare · Recommender Systems and Techniques