Combining learning rate decay and weight decay with complexity gradient descent - Part I
Pierre H. Richemond, Yike Guo

TL;DR
This paper explores the interplay of learning rate decay and weight decay in deep neural networks, introducing the concept of complexity and proposing complexity gradient descent with novel annealing schemes for regularization.
Contribution
It introduces the concept of complexity to analyze regularization effects and proposes complexity gradient descent with new annealing schemes for $L^2$ regularization in deep learning.
Findings
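The paper argues that performing gradient descent on the complexity, a parameter combining the level of the loss and its remaining nonconvexity, is desirable during training.
This view yields novel annealing schemes for the strength of $L^2$ regularization under standard stochastic gradient descent.
The analysis draws on statistical physics and random field theory.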
Abstract
The role of regularization, in the specific case of deep neural networks rather than more traditional machine learning models, is still not fully elucidated. We hypothesize that this complex interplay is due to the combination of overparameterization and high dimensional phenomena that take place during training and make it unamenable to standard convex optimization methods. Using insights from statistical physics and random field theory, we introduce a parameter factoring in both the level of the loss function and its remaining nonconvexity: the \emph{complexity}. We then show that it is desirable to perform \emph{complexity gradient descent}. Finally, we show how to use this intuition to derive novel and efficient annealing schemes for the strength of regularization when performing standard stochastic gradient descent in deep neural networks.
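To make the setting concrete, the following is a minimal sketch of stochastic gradient descent in which both the learning rate and the strength of the $L^2$ penalty vary with the training step. It is illustrative only: the functions `lr_schedule` and `wd_schedule`, their constants, and the toy 1-D least-squares problem are assumptions introduced here. In particular, the exponential annealing of the penalty is a placeholder and is not the complexity-derived scheme of the paper, which this page does not specify.

```python
import math
import random


def lr_schedule(step, lr0=0.1, decay=0.01):
    """Learning rate decay: inverse-time schedule (an illustrative choice)."""
    return lr0 / (1.0 + decay * step)


def wd_schedule(step, wd0=0.1, rate=0.01):
    """Hypothetical annealing of the L2 strength: exponential decay toward 0.

    Placeholder only; NOT the annealing scheme derived in the paper.
    """
    return wd0 * math.exp(-rate * step)


def sgd_annealed_l2(data, steps=2000, seed=0):
    """SGD on 1-D least squares with a time-varying L2 penalty.

    Per-sample objective: 0.5 * (w * x - y)**2 + 0.5 * lambda_t * w**2,
    where lambda_t = wd_schedule(t).
    """
    rng = random.Random(seed)
    w = 0.0
    for t in range(steps):
        x, y = rng.choice(data)          # draw one sample (the "stochastic" part)
        grad_loss = (w * x - y) * x      # gradient of the data term
        grad_reg = wd_schedule(t) * w    # gradient of the annealed L2 term
        w -= lr_schedule(t) * (grad_loss + grad_reg)
    return w


# Noiseless toy data generated from y = 3x on a grid in [-1, 1].
data = [(k / 10.0, 3.0 * k / 10.0) for k in range(-10, 11)]
w_final = sgd_annealed_l2(data)
print(w_final)
```

Because the penalty is annealed toward zero, its pull toward the origin fades as training proceeds, so on this noiseless problem the final weight ends up close to the unregularized solution. How the two schedules should actually be coupled is the question the paper addresses.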
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Gaussian Processes and Bayesian Inference
