Training Aware Sigmoidal Optimizer
David Macêdo, Pedro Dreyer, Teresa Ludermir, Cleber Zanchettin

TL;DR
The paper introduces TASO, a two-phase automated learning rate schedule for deep neural networks that outperforms common adaptive optimizers in various training scenarios.
Contribution
It proposes a novel two-phase learning rate schedule tailored to the landscape of neural network loss functions, improving training efficiency and performance.
Findings
TASO outperforms Adam, RMSProp, and Adagrad in experiments.
TASO is effective in both hyperparameter-tuned and default settings.
The two-phase approach accelerates training and improves convergence.
Abstract
Proper optimization of deep neural networks is an open research question since an optimal procedure to change the learning rate throughout training is still unknown. Manually defining a learning rate schedule involves troublesome, time-consuming trial-and-error procedures to determine hyperparameters such as learning rate decay epochs and learning rate decay rates. Although adaptive learning rate optimizers automate this process, recent studies suggest they may produce overfitting and reduce performance when compared to fine-tuned learning rate schedules. Considering that deep neural network loss functions present landscapes with many more saddle points than local minima, we propose the Training Aware Sigmoidal Optimizer (TASO), which consists of a two-phase automated learning rate schedule. The first phase uses a high learning rate to quickly traverse the numerous saddle points, while…
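The abstract is cut off before it describes the second phase or gives a formula, so the following is only a hypothetical sketch of what a sigmoid-shaped, two-phase schedule could look like: the rate stays near a high value early in training and then falls smoothly toward a low value. The function name `sigmoidal_lr` and the parameters `lr_high`, `lr_low`, `steepness`, and `midpoint`, along with their default values, are assumptions made for illustration and are not taken from the paper.

```python
import math


def sigmoidal_lr(epoch, total_epochs, lr_high=0.1, lr_low=0.001,
                 steepness=10.0, midpoint=0.5):
    """Hypothetical sigmoid-shaped learning rate schedule (not the paper's formula).

    epoch        -- current epoch, from 0 to total_epochs
    total_epochs -- total number of training epochs
    lr_high      -- assumed learning rate for the early, high-rate phase
    lr_low       -- assumed learning rate approached in the late phase
    steepness    -- assumed sharpness of the transition between phases
    midpoint     -- assumed fraction of training where the transition is centered
    """
    progress = epoch / total_epochs
    # Reversed logistic gate: close to 1 early in training, close to 0 late.
    gate = 1.0 / (1.0 + math.exp(steepness * (progress - midpoint)))
    return lr_low + (lr_high - lr_low) * gate


# One learning rate per epoch for a 100-epoch run.
schedule = [sigmoidal_lr(epoch, 100) for epoch in range(100)]
```

In this sketch a larger `steepness` makes the switch between the two phases more abrupt, and `midpoint` moves where in training the switch happens. The resulting values could be fed to any optimizer that accepts a per-epoch learning rate.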
Taxonomy
Topics: Advanced Neural Network Applications · Stochastic Gradient Optimization Techniques · Machine Learning and Data Classification
Methods: Attentive Walk-Aggregating Graph Neural Network · Adam · RMSProp
