Multiple Descents in Deep Learning as a Sequence of Order-Chaos Transitions

Wenbo Wei; Nicholas Chong Jia Le; Choy Heng Lai; Ling Feng

arXiv:2505.20030·cs.LG·May 27, 2025

TL;DR

This paper reports a 'multiple-descent' phenomenon in LSTM training: after the model is overtrained, the test loss rises and falls in long cycles. The cycles track phase transitions between order and chaos, and the locally optimal epochs fall at the critical transition points.

Contribution

The paper identifies multiple descent cycles in the test loss during training and, through asymptotic stability analysis, connects them to phase transitions between order and chaos in the network.

Findings

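01. After the model is overtrained, the test loss goes through multiple long cycles of rising and falling.

02. Locally optimal epochs consistently occur at the critical transition point between order and chaos.

03. The globally optimal epoch occurs at the first transition from order to chaos, where the 'edge of chaos' is widest.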

Abstract

We observe a novel 'multiple-descent' phenomenon during the training of LSTM networks, in which the test loss goes through repeated long cycles of rising and falling after the model is overtrained. By carrying out asymptotic stability analysis of the models, we find that the cycles in test loss are closely associated with the phase transition process between order and chaos, and that the locally optimal epochs are consistently at the critical transition point between the two phases. More importantly, the globally optimal epoch occurs at the first transition from order to chaos, where the 'edge of chaos' is at its widest, allowing the broadest exploration of better weight configurations for learning.
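To give a rough sense of what an asymptotic stability analysis measures, the sketch below estimates the largest Lyapunov exponent of a toy recurrent map, h <- tanh(gain * W h), by tracking how fast two nearby trajectories separate. A negative exponent indicates order and a positive one indicates chaos. This is only an illustration under stated assumptions: the random tanh network, the function name `largest_lyapunov`, and all of its parameters are hypothetical stand-ins, not the authors' LSTM models or their procedure.

```python
import math
import random


def largest_lyapunov(gain, n=40, steps=300, warmup=50, eps=1e-8, seed=0):
    """Estimate the largest Lyapunov exponent of the map h <- tanh(gain * W h).

    Assumed toy model (not an LSTM): W is an n x n Gaussian matrix whose
    entries have variance 1/n, and the map runs with no external input.
    """
    rng = random.Random(seed)
    w = [[rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)] for _ in range(n)]

    def step(h):
        return [math.tanh(gain * sum(wij * hj for wij, hj in zip(row, h))) for row in w]

    # Reference trajectory: discard the initial transient.
    h = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    for _ in range(warmup):
        h = step(h)

    # Perturbed copy placed at distance eps from the reference state.
    d = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in d))
    hp = [a + eps * x / norm for a, x in zip(h, d)]

    log_growth = 0.0
    for _ in range(steps):
        h, hp = step(h), step(hp)
        d = [b - a for a, b in zip(h, hp)]
        dist = math.sqrt(sum(x * x for x in d))
        log_growth += math.log(dist / eps)
        # Rescale the separation back to eps so it stays in the linear regime.
        hp = [a + eps * x / dist for a, x in zip(h, d)]

    # Average per-step log growth rate of the separation.
    return log_growth / steps


if __name__ == "__main__":
    for gain in (0.5, 3.0):
        print(gain, largest_lyapunov(gain))
```

In this toy setting, a small gain such as 0.5 should give a negative estimate (ordered phase) and a large gain such as 3.0 a positive one (chaotic phase); the sign change in between marks the transition. Applying the same idea to a trained LSTM would mean evaluating the exponent of the network's recurrent dynamics at each saved epoch, which is a design choice this sketch does not attempt.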

Taxonomy

Topics: Neural Networks and Applications

Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory