Lost in the Middle: How Language Models Use Long Contexts
Nelson F. Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, Percy Liang

TL;DR
This paper investigates how current language models use long input contexts. Performance drops when the relevant information sits in the middle of a long input, which shows that these models do not use their full context robustly.
Contribution
The study provides a detailed analysis of language models' performance with long contexts, identifying positional biases and proposing new evaluation protocols for future models.
Findings
Performance peaks at beginning or end of context
Significant degradation when relevant info is in the middle
Current models do not robustly utilize long contexts
Abstract
While recent language models have the ability to take long contexts as input, relatively little is known about how well they use longer context. We analyze the performance of language models on two tasks that require identifying relevant information in their input contexts: multi-document question answering and key-value retrieval. We find that performance can degrade significantly when changing the position of relevant information, indicating that current language models do not robustly make use of information in long input contexts. In particular, we observe that performance is often highest when relevant information occurs at the beginning or end of the input context, and significantly degrades when models must access relevant information in the middle of long contexts, even for explicitly long-context models. Our analysis provides a better understanding of how language models use…
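The key-value retrieval task described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: the prompt wording, the use of UUID-style strings as keys and values, and the names `build_kv_prompt`, `accuracy_by_position`, and `oracle_model` are all assumptions made for the example. The idea is to place the queried pair at a chosen position in the context and measure accuracy as that position varies.

```python
import json
import random
import uuid


def build_kv_prompt(num_pairs, target_index, seed=0):
    """Build a retrieval prompt whose queried pair sits at target_index."""
    rng = random.Random(seed)
    # Random UUID-style strings stand in for keys and values (an assumption).
    pairs = [
        (str(uuid.UUID(int=rng.getrandbits(128))),
         str(uuid.UUID(int=rng.getrandbits(128))))
        for _ in range(num_pairs)
    ]
    key, value = pairs[target_index]
    # dict preserves insertion order, so the pair stays at target_index.
    context = json.dumps(dict(pairs), indent=1)
    prompt = (
        "Extract the value for the given key from the JSON object below.\n\n"
        f"{context}\n\nKey: {key}\nValue:"
    )
    return prompt, key, value


def accuracy_by_position(model_fn, num_pairs, positions, trials=5):
    """Return {position: fraction of trials where the output contains the value}."""
    results = {}
    for pos in positions:
        correct = 0
        for trial in range(trials):
            prompt, _, value = build_kv_prompt(num_pairs, pos, seed=trial)
            if value in model_fn(prompt):
                correct += 1
        results[pos] = correct / trials
    return results


def oracle_model(prompt):
    """Hypothetical stand-in for a real model call: parses the prompt directly."""
    context = prompt.split("\n\n")[1]
    key = prompt.rsplit("Key: ", 1)[1].split("\n")[0]
    return json.loads(context)[key]


if __name__ == "__main__":
    # The oracle always succeeds; swap in a real model call to probe position effects.
    print(accuracy_by_position(oracle_model, num_pairs=20, positions=[0, 10, 19]))
```

Replacing `oracle_model` with a function that calls an actual language model would give a per-position accuracy curve of the kind the paper analyzes. The oracle only confirms that the harness itself runs.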
Code & Models
- 🤗 Dongjin-kr/ko-reranker · model · 12k dl · ♡ 71
- 🤗 togethercomputer/Llama-2-7B-32K-Instruct · model · 2.3k dl · ♡ 160
- 🤗 TheBloke/Llama-2-7B-32K-Instruct-GPTQ · model · 31 dl · ♡ 27
- 🤗 TheBloke/Llama-2-7B-32K-Instruct-GGML · model · 4 dl · ♡ 8
- 🤗 TheBloke/Llama-2-7B-32K-Instruct-GGUF · model · 1.4k dl · ♡ 54
- 🤗 TheBloke/Llama-2-7B-32K-Instruct-AWQ · model · 28 dl · ♡ 2
- 🤗 kallebysantos/Llama-2-7B-32K-Instruct-GGUF · model · ♡ 1
- 🤗 lightonai/alfred-40b-1023 · model · 1.5k dl · ♡ 49
- 🤗 TheBloke/alfred-40B-1023-GPTQ · model · 15 dl · ♡ 3
- 🤗 TheBloke/alfred-40B-1023-GGUF · model · 522 dl · ♡ 5
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks
