Dynamics of Spontaneous Topic Changes in Next Token Prediction with Self-Attention

Mumin Jia; Jairo Diaz-Rodriguez

arXiv:2501.06382 · cs.CL · December 15, 2025

TL;DR

This paper investigates how self-attention models exhibit spontaneous topic changes, comparing their dynamics to human spontaneous thought through theoretical analysis and empirical validation in modern large language models.

Contribution

It introduces a theoretical framework for understanding spontaneous topic shifts in self-attention models and empirically demonstrates these dynamics in state-of-the-art LLMs, highlighting differences from human cognition.

Findings

01

Self-attention models preserve the priority order of tokens related to the input topic.

02

A spontaneous topic change can occur only if lower-priority tokens outnumber higher-priority ones.

03

Longer contexts or ambiguous topics reduce the likelihood of a spontaneous topic change.
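
The counting condition in finding 02 can be illustrated with a toy calculation. The sketch below is not the paper's model: it assumes a single softmax attention step in which each token type has a fixed, hypothetical score (a higher score meaning higher priority) and the predicted token is the type that collects the most attention mass. The names `attention_mass`, `predicted_token`, and the score values are illustrative assumptions.

```python
import math
from collections import Counter


def attention_mass(tokens, score):
    """Return the softmax attention mass aggregated per token type."""
    weights = [math.exp(score[t]) for t in tokens]
    total = sum(weights)
    mass = Counter()
    for token, weight in zip(tokens, weights):
        mass[token] += weight / total
    return dict(mass)


def predicted_token(tokens, score):
    """Pick the token type that receives the largest total attention mass."""
    mass = attention_mass(tokens, score)
    return max(mass, key=mass.get)


# Hypothetical scores: "high" has priority over "low".
score = {"high": 1.0, "low": 0.0}

# One high-priority token against two low-priority tokens:
# e^1 (about 2.72) still exceeds 2 * e^0 = 2, so priority is kept.
print(predicted_token(["high", "low", "low"], score))

# Against three low-priority tokens, 3 * e^0 = 3 exceeds e^1,
# so the prediction flips to the lower-priority token.
print(predicted_token(["high", "low", "low", "low"], score))
```

In this toy setting the lower-priority token wins only when its count times `exp(score["low"])` exceeds the higher-priority count times `exp(score["high"])`. Because the lower score gives a smaller per-token weight, outnumbering the higher-priority tokens is necessary but, as the first call shows, not sufficient.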

Abstract

Human cognition is punctuated by abrupt, spontaneous shifts between topics (driven by emotional, contextual, or associative cues), a phenomenon known as spontaneous thought in neuroscience. In contrast, self-attention-based models depend on structured patterns over their inputs to predict each next token, lacking spontaneity. Motivated by this distinction, we characterize spontaneous topic changes in self-attention architectures, revealing both their similarities and their divergences from spontaneous human thought. First, we establish theoretical results under a simplified, single-layer self-attention model with suitable conditions by defining the topic as a set of Token Priority Graphs (TPGs). Specifically, we demonstrate that (1) the model maintains the priority order of tokens related to the input topic, (2) a spontaneous topic change can occur only if lower-priority tokens outnumber…

Videos

Dynamics of Spontaneous Topic Changes in Next Token Prediction with Self-Attention · SlidesLive
