Attention Overflow: Language Model Input Blur during Long-Context Missing Items Recommendation
Damien Sileo

TL;DR
This paper investigates attention overflow in large language models: their ability to suggest missing list items degrades once the input list grows long, which hurts list completion and recommendation tasks.
Contribution
It identifies and analyzes the attention overflow phenomenon in LLMs, highlighting its effects on performance with long inputs and discussing mitigation strategies.
Findings
Performance degrades once input lists reach around 100 items (mid-2024 flagship LLMs): models start suggesting items that are already in the input.
Iterative loops can reduce repetition, but their computational cost grows with the repetition rate.
Attention overflow limits LLMs' effectiveness in long-list tasks.
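The iterative-loop mitigation mentioned above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `iterative_novel_suggestions`, the `suggest` callable (a stand-in for whatever LLM call is used), and the `max_rounds` cap are all assumptions made for the example. It re-queries until enough novel items are collected and counts the calls, which shows why cost rises when the model repeats input items often.

```python
from typing import Callable, List, Sequence, Tuple


def iterative_novel_suggestions(
    items: Sequence[str],
    suggest: Callable[[Sequence[str], Sequence[str]], List[str]],
    k: int,
    max_rounds: int = 10,
) -> Tuple[List[str], int]:
    """Collect k suggestions that are not already in `items`.

    `suggest` is a hypothetical callable wrapping a model call; it receives
    the input items and the novel items found so far, and returns candidates.
    Returns the novel items and the number of calls made (a proxy for cost).
    """
    seen = set(items)          # everything that counts as a repetition
    novel: List[str] = []
    calls = 0
    while len(novel) < k and calls < max_rounds:
        calls += 1
        for candidate in suggest(items, novel):
            if candidate in seen:
                continue       # repeated item: discard it
            seen.add(candidate)
            novel.append(candidate)
            if len(novel) == k:
                break
    return novel, calls
```

The more often the model repeats input items, the more rounds are needed to reach `k` novel suggestions, so the call count grows with the repetition rate.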
Abstract
Large language models (LLMs) can suggest missing elements from items listed in a prompt, which can be used for list completion or recommendations based on users' history. However, their performance degrades when presented with too many items, as they start to suggest items already included in the input list. This occurs at around 100 items for mid-2024 flagship LLMs. We evaluate this phenomenon on both synthetic problems (e.g., finding missing numbers in a given range of shuffled integers) and realistic movie recommendation scenarios. We refer to this issue as "attention overflow", as preventing repetition requires attending to all items simultaneously. Although iterative loops can mitigate this problem, their costs increase with the repetition rate, affecting the language models' ability to derive novelty from lengthy inputs.
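The synthetic task described in the abstract (finding missing numbers in a shuffled range) can be set up and scored roughly as below. This is a sketch under stated assumptions: the function names `make_task` and `score_response`, the use of a 0-based range, and the definition of the repetition rate as the fraction of suggested numbers that were already shown are illustrative choices, not details taken from the paper.

```python
import random
import re
from typing import Dict, List, Set, Tuple


def make_task(n_range: int, n_missing: int, seed: int = 0) -> Tuple[List[int], Set[int]]:
    """Hide n_missing integers from range(n_range) and shuffle the rest."""
    rng = random.Random(seed)
    missing = set(rng.sample(range(n_range), n_missing))
    shown = [x for x in range(n_range) if x not in missing]
    rng.shuffle(shown)
    return shown, missing


def score_response(response: str, shown: List[int], missing: Set[int]) -> Dict[str, float]:
    """Score a model's free-text answer (assumed scoring, for illustration).

    repetition_rate: share of suggested numbers that were already in the input.
    recall: share of the truly missing numbers that were suggested.
    """
    suggested = [int(tok) for tok in re.findall(r"\d+", response)]
    shown_set = set(shown)
    repeats = sum(1 for x in suggested if x in shown_set)
    hits = len(set(suggested) & missing)
    return {
        "repetition_rate": repeats / len(suggested) if suggested else 0.0,
        "recall": hits / len(missing) if missing else 0.0,
    }
```

In use, `shown` would be formatted into a prompt and the model's reply passed to `score_response`; sweeping `n_range` upward is one way to look for the point where the repetition rate starts to climb.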
Taxonomy
Topics: Topic Modeling · Machine Learning in Healthcare · Recommender Systems and Techniques
