How Focused Are LLMs? A Quantitative Study via Repetitive Deterministic Prediction Tasks
Wanda Hou, Leon Zhou, Hong-Ye Hu, Yubei Chen, Yi-Zhuang You, and Xiao-Liang Qi

TL;DR
This study examines how large language models perform on repetitive deterministic tasks, revealing a sharp accuracy decline beyond a certain sequence length and proposing a physics-inspired model to explain this failure.
Contribution
The paper introduces a statistical physics-inspired model that explains why accuracy on repetitive deterministic tasks falls off faster than the exponential decay expected from independent operations, producing the accuracy cliff observed in large language models.
Findings
Accuracy drops sharply beyond a characteristic length.
The models fail to execute each repeated operation independently.
The proposed model captures the transition and explains internal interference.
Abstract
We investigate the performance of large language models on repetitive deterministic prediction tasks and study how the sequence accuracy rate scales with output length. Each such task involves repeating the same operation n times. Examples include letter replacement in strings following a given rule, integer addition, and multiplication of string operators in many-body quantum mechanics. If the model performs the task through a simple repetition algorithm, the success rate should decay exponentially with sequence length. In contrast, our experiments on leading large language models reveal a sharp double-exponential drop beyond a characteristic length scale, forming an accuracy cliff that marks the transition from reliable to unstable generation. This indicates that the models fail to execute each operation independently. To explain this phenomenon, we propose a statistical physics…
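The exponential baseline mentioned in the abstract can be made concrete with a small sketch. It assumes a toy letter-replacement rule and an executor that makes an error on each character independently with probability `eps`; the function names, the rule, and the error model are illustrative assumptions, not the paper's code or its proposed model. Under that assumption the chance of getting all n characters right is (1 - eps)^n, which is the simple decay that the reported accuracy cliff departs from.

```python
import random

def apply_rule(s, rule):
    """Deterministic letter replacement: map each character through `rule`."""
    return "".join(rule.get(ch, ch) for ch in s)

def noisy_apply(s, rule, eps, rng):
    """Apply the rule, but fail on each character independently with
    probability eps (hypothetical error model, for illustration only)."""
    out = []
    for ch in s:
        if rng.random() < eps:
            out.append("?")  # "?" is never a valid output, so it marks an error
        else:
            out.append(rule.get(ch, ch))
    return "".join(out)

def independent_accuracy(eps, n):
    """Sequence accuracy if all n operations succeed independently."""
    return (1.0 - eps) ** n

def simulated_accuracy(eps, n, trials=2000, seed=0):
    """Monte Carlo estimate of the fraction of fully correct outputs."""
    rng = random.Random(seed)
    rule = {"a": "b", "b": "a"}  # toy rule: swap a and b
    hits = 0
    for _ in range(trials):
        s = "".join(rng.choice("ab") for _ in range(n))
        if noisy_apply(s, rule, eps, rng) == apply_rule(s, rule):
            hits += 1
    return hits / trials

if __name__ == "__main__":
    for n in (10, 50, 100, 200):
        print(n, round(independent_accuracy(0.01, n), 3),
              round(simulated_accuracy(0.01, n), 3))
```

Comparing a measured accuracy curve against `independent_accuracy` is one way to see whether a model behaves like a simple repetition algorithm; a drop that is much steeper than this baseline beyond some length is the behavior the abstract describes.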
Taxonomy
Topics: Text Readability and Simplification · Artificial Intelligence in Healthcare and Education · Topic Modeling
