AutoDrop: Training Deep Learning Models with Automatic Learning Rate Drop
Yunfei Teng, Jing Wang, Anna Choromanska

TL;DR
AutoDrop introduces an automatic learning rate drop method based on the angular velocity of model parameters, improving training speed and generalization without extra hyperparameters.
Contribution
The paper proposes AutoDrop, a novel algorithm that automatically determines when to drop the learning rate based on angular velocity, eliminating manual scheduling and hyperparameter tuning.
Findings
Accelerates deep learning training processes.
Achieves better generalization performance.
Does not require additional hyperparameter tuning.
Abstract
Modern deep learning (DL) architectures are trained using variants of the SGD algorithm run with a pre-defined learning rate schedule, i.e., the learning rate is dropped at pre-defined epochs, typically when the training loss is expected to saturate. In this paper we develop an algorithm that realizes the learning rate drop automatically. The proposed method, which we refer to as AutoDrop, is motivated by the observation that the angular velocity of the model parameters, i.e., the velocity of the changes of the convergence direction, for a fixed learning rate initially increases rapidly and then progresses towards soft saturation. At saturation the optimizer slows down, thus the angular velocity saturation is a good indicator for dropping the learning rate. After the drop, the angular velocity "resets" and follows the previously described pattern -…
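The abstract's idea of watching the angular velocity for saturation can be illustrated with a small sketch. This is not the authors' algorithm: the excerpt above does not give the exact definition of angular velocity or the drop criterion, so the code assumes the angular velocity is the angle between two successive parameter displacement vectors, and it assumes saturation means the reading changes by less than a relative tolerance for several consecutive updates. The names `angle_between`, `AngularVelocityMonitor`, `tol`, and `patience` are hypothetical and exist only for this illustration.

```python
import numpy as np


def angle_between(u, v, eps=1e-12):
    """Angle in radians between two flattened displacement vectors."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v) + eps)
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))


class AngularVelocityMonitor:
    """Tracks the angle between successive parameter displacements.

    Assumption of this sketch (not taken from the paper): saturation is
    declared when the reading changes by less than `tol` (relative) for
    `patience` consecutive updates.
    """

    def __init__(self, tol=0.01, patience=3):
        self.tol = tol
        self.patience = patience
        self.prev_params = None
        self.prev_step = None
        self.history = []
        self._flat = 0  # consecutive near-constant readings

    def update(self, params):
        """Record new parameters; return True if a drop is suggested."""
        params = np.asarray(params, dtype=float).ravel()
        saturated = False
        if self.prev_params is not None:
            step = params - self.prev_params
            if self.prev_step is not None:
                omega = angle_between(self.prev_step, step)
                if self.history:
                    last = self.history[-1]
                    rel = abs(omega - last) / (abs(last) + 1e-12)
                    self._flat = self._flat + 1 if rel < self.tol else 0
                self.history.append(omega)
                saturated = self._flat >= self.patience
            self.prev_step = step
        self.prev_params = params.copy()
        return saturated

    def reset(self):
        """Forget past readings, e.g. right after a learning rate drop."""
        self.history.clear()
        self._flat = 0
        self.prev_step = None
```

In a training loop, one might call `update` with the flattened model parameters once per epoch, reduce the learning rate when it returns `True`, and then call `reset` so that the next phase starts from a fresh history, mirroring the "reset" behavior the abstract describes.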
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning
Methods: Stochastic Gradient Descent
