How Does Learning Rate Decay Help Modern Neural Networks?
Kaichao You, Mingsheng Long, Jianmin Wang, Michael I. Jordan

TL;DR
This paper proposes a new explanation for how learning rate decay improves training of modern neural networks, suggesting it suppresses memorization of noise initially and enhances learning of complex patterns later.
Contribution
It introduces a novel perspective on lrDecay's effectiveness, supported by experiments on synthetic and real datasets, challenging traditional optimization-based explanations.
Findings
Large initial learning rate prevents memorizing noise
Decaying learning rate enhances learning of complex patterns
Later-stage learned patterns are less transferable
Abstract
Learning rate decay (lrDecay) is a \emph{de facto} technique for training modern neural networks. It starts with a large learning rate and then decays it multiple times. It is empirically observed to help both optimization and generalization. Common beliefs in how lrDecay works come from the optimization analysis of (Stochastic) Gradient Descent: 1) an initially large learning rate accelerates training or helps the network escape spurious local minima; 2) decaying the learning rate helps the network converge to a local minimum and avoid oscillation. Despite the popularity of these common beliefs, experiments suggest that they are insufficient in explaining the general effectiveness of lrDecay in training modern neural networks that are deep, wide, and nonconvex. We provide another novel explanation: an initially large learning rate suppresses the network from memorizing noisy data while…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNeural Networks and Applications · Machine Learning and Data Classification · Stochastic Gradient Optimization Techniques
