A Probabilistically Motivated Learning Rate Adaptation for Stochastic Optimization
Filip de Roos, Carl Jidling, Adrian Wills, Thomas Schön, Philipp Hennig

TL;DR
This paper introduces a probabilistic framework for automatically adapting learning rates in stochastic optimization, improving robustness and reducing manual tuning in deep learning training.
Contribution
It provides a Gaussian inference-based motivation for learning rate adaptation, leading to a meta-algorithm that adjusts learning rates automatically during training.
Findings
Robust adaptation of learning rates across various initial values
Effective in deep learning benchmark tasks
Relates learning rate to a dimensionless, controllable quantity
Abstract
Machine learning practitioners invest significant manual and computational resources in finding suitable learning rates for optimization algorithms. We provide a probabilistic motivation, in terms of Gaussian inference, for popular stochastic first-order methods. As an important special case, it recovers the Polyak step with a general metric. The inference allows us to relate the learning rate to a dimensionless quantity that can be automatically adapted during training by a control algorithm. The resulting meta-algorithm is shown to adapt learning rates in a robust manner across a large range of initial values when applied to deep learning benchmark problems.
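To make the Polyak step mentioned in the abstract concrete, here is a minimal sketch of the textbook update, x_new = x - (f(x) - f_min) / (g^T W g) * W g. It is not the paper's meta-algorithm, and the control algorithm that adapts the learning rate is not shown. The function name `polyak_step`, the restriction to a diagonal metric `metric_diag`, and the quadratic test objective are all illustrative assumptions.

```python
def polyak_step(x, grad, f_val, f_min=0.0, metric_diag=None):
    """One Polyak step with a diagonal metric W (identity if None).

    Assumed textbook form: x_new = x - (f(x) - f_min) / (g^T W g) * W g.
    f_min is the (assumed known) minimum value of the objective.
    """
    if metric_diag is None:
        metric_diag = [1.0] * len(x)
    # W g: the gradient rescaled by the metric
    wg = [w * g for w, g in zip(metric_diag, grad)]
    # g^T W g: squared gradient norm under the metric
    denom = sum(g * v for g, v in zip(grad, wg))
    if denom == 0.0:
        return list(x)  # zero gradient: stay in place
    scale = (f_val - f_min) / denom
    return [xi - scale * v for xi, v in zip(x, wg)]


# Hypothetical test objective: f(x) = 0.5 * ||x||^2, with minimum value 0.
def f(x):
    return 0.5 * sum(xi * xi for xi in x)


def grad_f(x):
    return list(x)


x = [3.0, -4.0]
for _ in range(20):
    x = polyak_step(x, grad_f(x), f(x))
```

On this quadratic the step size works out to exactly 0.5, so each iteration halves the iterate without any hand-tuned learning rate. In practice `f_min` is rarely known and gradients are noisy, which is the setting the abstract targets.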
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Machine Learning and Data Classification · Stochastic Gradient Optimization Techniques
