Revisiting Learning Rate Control
Micha Henheik, Theresa Eimer, Marius Lindauer

TL;DR
This paper compares various learning rate control methods in deep learning, highlighting their strengths and weaknesses, and emphasizes the need for algorithm selection and new approaches like meta-learning to improve reliability across diverse tasks.
Contribution
It provides a comprehensive comparison of existing learning rate control paradigms and identifies gaps, advocating for algorithm selection and innovative methods like meta-learning.
Findings
Methods from multi-fidelity hyperparameter optimization, fixed-hyperparameter schedules, and hyperparameter-free learning often perform very well on selected tasks but are not reliable across settings.
Hyperparameter optimization approaches become less effective as models and tasks increase in complexity.
Algorithm selection and new directions like finetuning and meta-learning are crucial for improving learning rate control.
Abstract
The learning rate is one of the most important hyperparameters in deep learning, and how to control it is an active area within both AutoML and deep learning research. Approaches for learning rate control span from classic optimization to online scheduling based on gradient statistics. This paper compares paradigms to assess the current state of learning rate control. We find that methods from multi-fidelity hyperparameter optimization, fixed-hyperparameter schedules, and hyperparameter-free learning often perform very well on selected deep learning tasks but are not reliable across settings. This highlights the need for algorithm selection methods in learning rate control, which have been neglected so far by both the AutoML and deep learning communities. We also observe a trend of hyperparameter optimization approaches becoming less effective as models and tasks grow in complexity,…
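As a concrete illustration of what a fixed-hyperparameter schedule can look like, here is a minimal sketch of linear warmup followed by cosine decay. It is not taken from the paper: the function name `cosine_schedule` and the parameters `base_lr`, `warmup_steps`, and `min_lr` are assumptions chosen for illustration, and the paper may evaluate different schedules or settings.

```python
import math


def cosine_schedule(step, total_steps, base_lr=1e-3, warmup_steps=0, min_lr=0.0):
    """Return the learning rate at `step` (0-indexed).

    Hypothetical helper for illustration only: linear warmup over
    `warmup_steps`, then cosine decay from `base_lr` down to `min_lr`.
    All hyperparameters are fixed before training starts.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Linear warmup: reaches base_lr on the last warmup step.
        return base_lr * (step + 1) / warmup_steps
    # Fraction of the decay phase completed, clipped to [0, 1].
    decay_steps = max(1, total_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    # Cosine decay from base_lr (progress = 0) to min_lr (progress = 1).
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the schedule at a few points of a 100-step run.
    for s in (0, 5, 10, 50, 100):
        print(s, cosine_schedule(s, total_steps=100, base_lr=0.1, warmup_steps=10))
```

The shape of such a schedule is fully determined by its hyperparameters and does not react to training signals, in contrast to the online scheduling based on gradient statistics that the abstract also mentions.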
Taxonomy
Topics: Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
