Insights into LLM Long-Context Failures: When Transformers Know but Don't Tell

Taiming Lu; Muhan Gao; Kuai Yu; Adam Byerly; Daniel Khashabi

arXiv:2406.14673·cs.CL·October 8, 2024

TL;DR

This paper investigates why large language models struggle with long contexts by probing their hidden representations. It finds that the models encode the position of the target information but often fail to use it when generating a response, which points to a disconnect between retrieving information and using it.

Contribution

The study uncovers the 'know but don't tell' phenomenon in LLMs, providing new insights into their internal representations and reasoning failures with long contexts.

Findings

01

LLMs encode positional information of target data

02

LLMs often fail to leverage this positional information when generating responses

03

The relationship between extraction time and final accuracy is analyzed

Abstract

Large Language Models (LLMs) exhibit positional bias, struggling to utilize information from the middle or end of long contexts. Our study explores LLMs' long-context reasoning by probing their hidden representations. We find that while LLMs encode the position of target information, they often fail to leverage this in generating accurate responses. This reveals a disconnect between information retrieval and utilization, a "know but don't tell" phenomenon. We further analyze the relationship between extraction time and final accuracy, offering insights into the underlying mechanics of transformer models.
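To make the idea of probing hidden representations concrete, here is a minimal sketch of a linear probe that predicts the position of target information from a vector. It is not the authors' implementation: the "hidden states" are synthetic (one random direction per position plus Gaussian noise), and every name here (`make_synthetic_states`, `train_linear_probe`, `probe_accuracy`) is hypothetical. In a real experiment the vectors would come from a model's layer activations.

```python
import math
import random


def make_synthetic_states(n_per_pos, n_pos, dim, noise, rng):
    """Toy stand-in for hidden states: one random center per position plus noise."""
    centers = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_pos)]
    data = []
    for pos, center in enumerate(centers):
        for _ in range(n_per_pos):
            data.append(([c + rng.gauss(0, noise) for c in center], pos))
    rng.shuffle(data)
    return data


def logits_for(W, b, x):
    """Linear scores, one per candidate position."""
    return [sum(w * xi for w, xi in zip(W[k], x)) + b[k] for k in range(len(W))]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_linear_probe(data, n_pos, dim, epochs=30, lr=0.1):
    """Softmax regression trained with plain SGD on (vector, position) pairs."""
    W = [[0.0] * dim for _ in range(n_pos)]
    b = [0.0] * n_pos
    for _ in range(epochs):
        for x, y in data:
            probs = softmax(logits_for(W, b, x))
            for k in range(n_pos):
                grad = probs[k] - (1.0 if k == y else 0.0)
                for j in range(dim):
                    W[k][j] -= lr * grad * x[j]
                b[k] -= lr * grad
    return W, b


def probe_accuracy(W, b, data):
    """Fraction of vectors whose highest-scoring position matches the label."""
    correct = 0
    for x, y in data:
        scores = logits_for(W, b, x)
        pred = max(range(len(scores)), key=scores.__getitem__)
        correct += int(pred == y)
    return correct / len(data)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
states = make_synthetic_states(n_per_pos=50, n_pos=4, dim=8, noise=0.3, rng=rng)
train, held_out = states[:150], states[150:]
W, b = train_linear_probe(train, n_pos=4, dim=8)
acc = probe_accuracy(W, b, held_out)
print(f"held-out probe accuracy: {acc:.2f}")
```

A probe accuracy well above chance on held-out vectors would suggest the position is linearly decodable from the representation. Comparing that number with the accuracy of the model's generated answers is the kind of contrast the "know but don't tell" framing describes.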

Code & Models

Repositories

TaiMingLu/know-dont-tell
PyTorch · Official

Videos

Insights into LLM Long-Context Failures: When Transformers Know but Don’t Tell