AutoDrop: Training Deep Learning Models with Automatic Learning Rate Drop

Yunfei Teng; Jing Wang; Anna Choromanska

arXiv:2111.15317 · cs.LG · December 14, 2021 · 1 citation

Open Access

TL;DR

AutoDrop introduces an automatic learning rate drop method based on the angular velocity of model parameters, improving training speed and generalization without extra hyperparameters.

Contribution

The paper proposes AutoDrop, a novel algorithm that automatically determines when to drop the learning rate based on angular velocity, eliminating manual scheduling and hyperparameter tuning.

Findings

01. Accelerates deep learning training processes.

02. Achieves better generalization performance.

03. Does not require additional hyperparameter tuning.

Abstract

Modern deep learning (DL) architectures are trained using variants of the SGD algorithm run with a manually defined learning rate schedule, i.e., the learning rate is dropped at pre-defined epochs, typically when the training loss is expected to saturate. In this paper we develop an algorithm that realizes the learning rate drop automatically. The proposed method, which we refer to as AutoDrop, is motivated by the observation that the angular velocity of the model parameters, i.e., the velocity of the changes of the convergence direction, for a fixed learning rate initially increases rapidly and then progresses towards soft saturation. At saturation the optimizer slows down, so the saturation of the angular velocity is a good indicator for dropping the learning rate. After the drop, the angular velocity "resets" and follows the previously described pattern -…
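The idea in the abstract can be illustrated with a rough sketch. It is not the authors' implementation: the abstract shown here is truncated and does not give the exact definition or drop rule. The sketch assumes that angular velocity is measured as the angle between consecutive displacement vectors of the parameter snapshots, and that saturation is detected with a simple tolerance test. The names `angle_between`, `angular_velocities`, `should_drop`, and the parameters `tol` and `patience` are all hypothetical and introduced only for illustration.

```python
import math


def angle_between(u, v):
    """Angle in radians between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # no movement, so no measurable change of direction
    cos = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard against rounding
    return math.acos(cos)


def angular_velocities(snapshots):
    """Assumed definition: angle between consecutive parameter displacements.

    `snapshots` is a list of flattened parameter vectors, e.g. one per epoch.
    """
    steps = [[b - a for a, b in zip(p, q)]
             for p, q in zip(snapshots, snapshots[1:])]
    return [angle_between(s, t) for s, t in zip(steps, steps[1:])]


def should_drop(velocities, tol=0.01, patience=2):
    """Hypothetical saturation test, not taken from the paper.

    Returns True when each of the last `patience` changes in angular
    velocity is smaller than `tol`, i.e. the curve has flattened out.
    """
    if len(velocities) < patience + 1:
        return False
    recent = velocities[-(patience + 1):]
    return all(b - a < tol for a, b in zip(recent, recent[1:]))
```

In a training loop, one might record a flattened copy of the parameters after each epoch, call `should_drop(angular_velocities(snapshots))`, and on `True` multiply the learning rate by some factor and clear the snapshot history, since the abstract notes that the angular velocity "resets" after a drop. The tolerance, patience, and drop factor in this sketch are illustrative choices only; the paper reports that AutoDrop itself does not require additional hyperparameter tuning.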

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Adversarial Robustness in Machine Learning

Methods: Stochastic Gradient Descent