Adaptive scaling of the learning rate by second order automatic differentiation
Frédéric de Gournay (IMT, INSA Toulouse), Alban Gossard (IMT, UT3)

TL;DR
This paper introduces a new adaptive learning rate scaling method for deep neural network optimization using second order automatic differentiation to compute curvature, balancing exploration and convergence.
Contribution
It proposes a novel curvature-based rescaling technique that is computationally feasible and interpretable, enhancing optimization by adaptively balancing exploration and convergence.
Findings
The rescaling lets the practitioner choose between an exploration regime and a convergence regime.
Computational cost rises from (1C,1M) for the gradient method to either (1.5C,2M) or (2C,1M), in computation time and memory footprint respectively.
Numerical experiments highlight the different exploration/convergence regimes.
Abstract
In the context of the optimization of Deep Neural Networks, we propose to rescale the learning rate using a new technique of automatic differentiation. This technique relies on the computation of the curvature, a second order information whose computational complexity lies between that of the gradient and that of the Hessian-vector product. If (1C,1M) represents respectively the computational time and memory footprint of the gradient method, the new technique increases the overall cost to either (1.5C,2M) or (2C,1M). This rescaling has the appealing characteristic of having a natural interpretation: it allows the practitioner to choose between exploration of the parameter set and convergence of the algorithm. The rescaling is adaptive: it depends on the data and on the direction of descent. The numerical experiments highlight the different exploration/convergence…
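To make the notion of curvature along a descent direction concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm: the abstract does not give their differentiation scheme or their rescaling formula. The sketch assumes a toy quadratic loss, a hand-written forward-mode "jet" class (`Jet2`, a hypothetical name) that propagates a second-order Taylor expansion along a line, and the classical Newton-like step `-(g·d)/(dᵀHd)` as one possible use of the curvature.

```python
from dataclasses import dataclass


@dataclass
class Jet2:
    """Truncated Taylor expansion a + b*t + c*t**2 of a scalar along a line."""
    a: float  # value at t = 0
    b: float  # first derivative at t = 0
    c: float  # half of the second derivative at t = 0

    def __add__(self, other):
        other = _lift(other)
        return Jet2(self.a + other.a, self.b + other.b, self.c + other.c)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product of two polynomials in t, truncated after the t**2 term.
        return Jet2(
            self.a * other.a,
            self.a * other.b + self.b * other.a,
            self.a * other.c + self.b * other.b + self.c * other.a,
        )

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants (zero derivatives)."""
    return x if isinstance(x, Jet2) else Jet2(float(x), 0.0, 0.0)


def directional_derivatives(loss, theta, d):
    """Return (loss, g.d, d^T H d) at theta along direction d."""
    line = [Jet2(p, di, 0.0) for p, di in zip(theta, d)]  # theta + t * d
    out = loss(line)
    return out.a, out.b, 2.0 * out.c


def loss(theta):
    """Toy quadratic loss (an assumption made for illustration)."""
    x, y = theta
    return 3.0 * x * x + x * y + 2.0 * y * y


theta = (1.0, -1.0)
d = (-5.0, 3.0)  # negative gradient at theta, computed by hand for this toy loss
value, slope, curvature = directional_derivatives(loss, theta, d)

# Newton-like step along d: exact line minimizer when the loss is quadratic.
# Only meaningful when the curvature is positive.
step = -slope / curvature if curvature > 0 else None
```

The point of the sketch is that the scalar `dᵀHd` can be obtained without forming the Hessian or a full Hessian-vector product, since only a scalar expansion along the line `theta + t*d` is propagated. A small step relative to this ratio stays in the convergence regime on this quadratic, while a larger one moves further across the parameter set.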
Taxonomy
Topics: Model Reduction and Neural Networks · Medical Image Segmentation Techniques · Image and Signal Denoising Methods
