States Hidden in Hidden States: LLMs Emerge Discrete State Representations Implicitly
Junhao Chen, Shengding Hu, Zhiyuan Liu, Maosong Sun

TL;DR
This paper shows that large language models internally develop implicit discrete state representations that let them carry out extended symbolic calculations, offering new insight into their internal mechanisms and emergent abilities.
Contribution
The study demonstrates that implicit discrete state representations exist in LLMs and examines how they form and what role they play in symbolic calculations, giving a new view of how the models work internally.
Findings
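LLMs can directly output the result of long addition chains without chain-of-thought; the most advanced models handle up to 15 two-digit addends.
Implicit Discrete State Representations (IDSRs) exist within the hidden states and are used for symbolic calculation.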
Current models' state representations are not fully lossless, causing inaccuracies.
Abstract
Large Language Models (LLMs) exhibit various emergent abilities. Among these abilities, some might reveal the internal working mechanisms of models. In this paper, we uncover a novel emergent capability in models: the intrinsic ability to perform extended sequences of calculations without relying on chain-of-thought step-by-step solutions. Remarkably, the most advanced models can directly output the results of two-digit number additions with lengths extending up to 15 addends. We hypothesize that the model emerges Implicit Discrete State Representations (IDSRs) within its hidden states and performs symbolic calculations internally. To test this hypothesis, we design a sequence of experiments that look into the hidden states. Specifically, we first confirm that IDSRs exist. Then, we provide interesting observations about the formation of IDSRs from layer, digit, and sequence…
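The task described in the abstract can be illustrated with a minimal sketch that builds a multi-addend addition prompt and checks a direct (no chain-of-thought) answer. The prompt format, the function names, and the exact-match scoring rule are assumptions made for illustration, not the authors' evaluation code. The partial sums are included only to show the intermediate values a step-by-step solution would write out, which a direct answer has to skip.

```python
import random

def make_addition_problem(num_addends, seed=None):
    """Build a prompt such as '12 + 47 + 83 =' from random two-digit addends.

    The prompt format is a hypothetical choice, not taken from the paper.
    """
    rng = random.Random(seed)
    addends = [rng.randint(10, 99) for _ in range(num_addends)]
    prompt = " + ".join(str(a) for a in addends) + " ="
    return prompt, addends, sum(addends)

def partial_sums(addends):
    """Return the running total after each addend (the steps a
    chain-of-thought solution would spell out explicitly)."""
    sums = []
    total = 0
    for a in addends:
        total += a
        sums.append(total)
    return sums

def is_correct(model_output, expected):
    """Exact-match scoring on the stripped output (an assumed metric)."""
    return model_output.strip() == str(expected)

if __name__ == "__main__":
    prompt, addends, expected = make_addition_problem(15, seed=0)
    print(prompt)
    print("partial sums:", partial_sums(addends))
    print("expected answer:", expected)
```

A model's raw completion for `prompt` would be passed to `is_correct` together with `expected`; varying `num_addends` gives accuracy as a function of sequence length.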
Taxonomy
Topics: Scientific Computing and Data Management
