TL;DR
This paper investigates how self-attention models exhibit spontaneous topic changes, comparing their dynamics to human spontaneous thought through theoretical analysis and empirical validation in modern large language models.
Contribution
It introduces a theoretical framework for understanding spontaneous topic shifts in self-attention models and empirically demonstrates these dynamics in state-of-the-art LLMs, highlighting differences from human cognition.
Findings
Self-attention models preserve the priority order of tokens related to the input topic.
A spontaneous topic change can occur only if lower-priority tokens outnumber higher-priority ones; outnumbering is a necessary condition, not a guarantee.
Longer context or a more ambiguous input topic reduces the likelihood of a spontaneous topic change.
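The counting condition in the second finding can be illustrated with a toy softmax calculation. This is a minimal sketch and not the paper's model: it assumes each token carries a single scalar attention logit (higher logit means higher priority), and that the prediction is the token type receiving the most total attention mass. The names `attention_mass`, `predicted_token`, and the logit values are hypothetical.

```python
import math
from collections import Counter

def attention_mass(sequence, logits):
    """Total softmax attention weight received by each distinct token type.

    Assumption: one scalar logit per token type, shared by all its copies.
    """
    counts = Counter(sequence)
    # Each copy of a token contributes exp(logit); copies of a type add up.
    unnorm = {tok: n * math.exp(logits[tok]) for tok, n in counts.items()}
    total = sum(unnorm.values())
    return {tok: w / total for tok, w in unnorm.items()}

def predicted_token(sequence, logits):
    """Token type with the largest total attention mass (toy prediction rule)."""
    mass = attention_mass(sequence, logits)
    return max(mass, key=mass.get)

# Hypothetical priorities: "high" outranks "low" by one logit unit.
logits = {"high": 1.0, "low": 0.0}

for n_low in (1, 2, 3):
    seq = ["high"] + ["low"] * n_low
    print(n_low, predicted_token(seq, logits))
```

In this toy setting the low-priority token can win only when n_low * exp(s_low) > n_high * exp(s_high), which forces n_low > n_high because exp(s_low) < exp(s_high). With the values above, two low-priority copies outnumber the single high-priority token yet still lose, so outnumbering alone does not guarantee a change.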
Abstract
Human cognition is punctuated by abrupt, spontaneous shifts between topics (driven by emotional, contextual, or associative cues), a phenomenon known as spontaneous thought in neuroscience. In contrast, self-attention-based models depend on structured patterns over their inputs to predict each next token, lacking spontaneity. Motivated by this distinction, we characterize spontaneous topic changes in self-attention architectures, revealing both their similarities to and their divergences from spontaneous human thought. First, we establish theoretical results under a simplified, single-layer self-attention model with suitable conditions by defining the topic as a set of Token Priority Graphs (TPGs). Specifically, we demonstrate that (1) the model maintains the priority order of tokens related to the input topic, (2) a spontaneous topic change can occur only if lower-priority tokens outnumber…
