Learning Rate Perturbation: A Generic Plugin of Learning Rate Schedule towards Flatter Local Minima
Hengyu Liu, Qiang Fu, Lun Du, Tiancheng Zhang, Ge Yu, Shi Han, and Dongmei Zhang

TL;DR
This paper introduces LEAP, a generic plugin that perturbs learning rates during training to promote convergence to flatter minima, thereby enhancing model generalization across various schedules and datasets.
Contribution
The paper proposes LEAP, a novel learning rate perturbation method that improves existing schedules by guiding training towards flatter minima with theoretical and empirical support.
Findings
LEAP consistently improves model performance across different datasets.
Training with LEAP favors convergence to flatter minima.
LEAP enhances generalization without modifying existing schedules.
Abstract
Learning rate is one of the most important hyper-parameters that has a significant influence on neural network training. Learning rate schedules are widely used in real practice to adjust the learning rate according to pre-defined schedules for fast convergence and good generalization. However, existing learning rate schedules are all heuristic algorithms and lack theoretical support. Therefore, people usually choose the learning rate schedules through multiple ad-hoc trials, and the obtained learning rate schedules are sub-optimal. To boost the performance of the obtained sub-optimal learning rate schedule, we propose a generic learning rate schedule plugin, called LEArning Rate Perturbation (LEAP), which can be applied to various learning rate schedules to improve the model training by introducing a certain perturbation to the learning rate. We found that, with such a simple yet…
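The plugin idea described above can be illustrated with a minimal sketch: wrap an existing schedule and perturb the value it returns at each step. The abstract as shown here does not state the form of the perturbation, so the multiplicative zero-mean Gaussian noise below is an assumption, as are the names `cosine_schedule`, `with_perturbation`, and the parameter `sigma`. This is not the authors' implementation.

```python
import math
import random


def cosine_schedule(step, total_steps, base_lr=0.1):
    """Plain cosine annealing, used only as an example base schedule."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * step / total_steps))


def with_perturbation(schedule, sigma, seed=0):
    """Wrap any schedule(step, total_steps) -> lr with random noise.

    Assumption (not taken from the paper): the perturbation is
    multiplicative Gaussian noise with mean 0 and standard deviation
    `sigma`, and the result is clipped at 0 so the rate never goes negative.
    """
    rng = random.Random(seed)  # private RNG so runs are reproducible

    def perturbed(step, total_steps):
        lr = schedule(step, total_steps)
        noise = rng.gauss(0.0, sigma)
        return max(lr * (1.0 + noise), 0.0)

    return perturbed


# Hypothetical usage: the base schedule itself is left unchanged.
noisy_cosine = with_perturbation(cosine_schedule, sigma=0.05, seed=42)
learning_rates = [noisy_cosine(step, 100) for step in range(100)]
```

Because the wrapper only needs a callable that returns a learning rate, the same function could be applied to a step, exponential, or warm-up schedule without changing that schedule's own code. Setting `sigma=0.0` recovers the base schedule exactly.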
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and ELM · Neural Networks and Applications
