Critical Phase Transition in Large Language Models

Kai Nakaishi; Yoshihiko Nishikawa; Koji Hukushima

arXiv:2406.05335·cond-mat.dis-nn·October 23, 2024·2 cites

Open Access · 3 Reviews

TL;DR

This paper investigates whether large language models exhibit phase transitions, revealing that at certain temperature settings, their behavior changes qualitatively, akin to physical phase transitions, with implications for understanding their dynamics.

Contribution

The study provides the first analysis of phase transition phenomena in LLMs, identifying critical points and behaviors similar to natural phase transitions in physical systems.

Findings

01 Divergent statistical quantities at specific temperature points

02 Power-law decay of correlations near the transition

03 Slow convergence to stationary states in LLMs
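The second finding concerns how correlations decay with distance along the generated text. A minimal sketch of one way such a correlation could be estimated is given here, assuming each token has already been mapped to a 0/1 indicator (for example, 1 where a chosen part-of-speech tag occurs). The function name `autocorrelation` and the toy sequence are illustrative assumptions, not the authors' estimator or data.

```python
def autocorrelation(indicator, lag):
    """Connected two-point correlation C(lag) = <x_i x_{i+lag}> - <x_i><x_{i+lag}>.

    `indicator` is a list of 0/1 values, one per token position (assumed input).
    """
    n = len(indicator) - lag
    if lag < 0 or n <= 0:
        raise ValueError("lag must be non-negative and shorter than the sequence")
    head = indicator[:n]   # x_i
    tail = indicator[lag:] # x_{i+lag}, same length as head
    mean_head = sum(head) / n
    mean_tail = sum(tail) / n
    mean_prod = sum(a * b for a, b in zip(head, tail)) / n
    return mean_prod - mean_head * mean_tail


# Toy, strictly periodic sequence (hypothetical data): the tag occurs every 4th token.
toy_sequence = [1, 0, 0, 0] * 50
c_lag4 = autocorrelation(toy_sequence, 4)  # positive: matches the period
c_lag1 = autocorrelation(toy_sequence, 1)  # negative: the tag never repeats at lag 1
```

A power-law decay means C(d) is proportional to d raised to a negative exponent, so it would appear as a straight line when C(d) is plotted against d on log-log axes. The periodic toy sequence above does not decay at all and is only meant to show the mechanics of the estimate.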

Abstract

Large Language Models (LLMs) have demonstrated impressive performance. To understand their behaviors, we need to consider the fact that LLMs sometimes show qualitative changes. The natural world also presents such changes called phase transitions, which are defined by singular, divergent statistical quantities. Therefore, an intriguing question is whether qualitative changes in LLMs are phase transitions. In this work, we have conducted extensive analysis on texts generated by LLMs and suggested that a phase transition occurs in LLMs when varying the temperature parameter. Specifically, statistical quantities have divergent properties just at the point between the low-temperature regime, where LLMs generate sentences with clear repetitive structures, and the high-temperature regime, where generated sentences are often incomprehensible. In addition, critical behaviors near the phase…
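The temperature parameter varied in the abstract rescales a model's output logits before sampling. The following is a minimal sketch of standard temperature sampling, assuming a plain softmax over a list of logits. The function names and toy logits are illustrative assumptions, and nothing here reproduces the paper's experimental setup.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Return sampling probabilities proportional to exp(logit / temperature)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


toy_logits = [2.0, 1.0, 0.0]  # hypothetical logits for a 3-token vocabulary
low_t = softmax_with_temperature(toy_logits, 0.1)    # sharply peaked on token 0
high_t = softmax_with_temperature(toy_logits, 10.0)  # close to uniform
token = sample_token(toy_logits, 1.0, random.Random(0))
```

At low temperature almost all probability mass sits on the most likely token, which is consistent with the repetitive regime described in the abstract. At high temperature the distribution flattens toward uniform, consistent with the regime where generated text becomes incomprehensible.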

Peer Reviews

Decision·Submitted to ICLR 2025

Reviewer 01 · Rating 3 · Confidence 4

Strengths

* The motivation behind the paper is nice: the application of a well-studied concept from physics can perhaps allow us to use knowledge about/properties of that concept to better understand natural language and language models
* The work provides a comparison of quantitative aspects of human- and machine-generated language, an approach that is more objective than the qualitative comparisons that are often done and thus perhaps a better ground from which to draw conclusions
* The finding that th

Weaknesses

* The paper is generally difficult to follow:
  * There is confusing terminology that isn't defined or contextualized before it is used (e.g., “long-range correlation” in the introduction). This will likely confuse most readers (myself included).
  * The implications of the observed critical properties are incredibly unclear. See subsequent questions for the parts that felt particularly unclear to me, although this is not comprehensive. As such, it is difficult to draw meaningful conclusions from

Reviewer 02 · Rating 6 · Confidence 2

Strengths

This paper investigates whether LLMs go through a phase transition as the temperature parameter is varied. This (I believe) contrasts with much of the literature, which is about the possibility of a phase transition with the size of the model. The statistics that the authors introduce may have value for future work that aims to study structural features of LLM generations. The writing and organization are clear, and the figures are clearly explained.

Weaknesses

- There are obvious empirical concerns: namely, in the main paper, the authors only study GPT-2 small, and all of the analysis is specifically about the structure of where the proper noun PROPN tag occurs in generated text. This must be quite narrow, and it would be more convincing if the authors could summarize the results for other models and other POS tags in the main paper (they are referenced as being in the Appendix, which I did not read).
- This is very far from my area of expertise, but

Reviewer 03 · Rating 6 · Confidence 3

Strengths

1. The project fosters the understanding of LLMs from an interesting angle of phase transition.
2. The experiments are well designed and clearly documented.

Weaknesses

1. Although LLMs have developed rapidly over the past few years, this work tests on GPT-2 rather than newer generations of models.

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Language and cultural evolution

Methods: Attention Is All You Need · Cosine Annealing · Layer Normalization · Byte Pair Encoding · Adam · Attention Dropout · Weight Decay · Linear Warmup With Cosine Annealing · Linear Layer