SALR: Sharpness-aware Learning Rate Scheduler for Improved Generalization

Xubo Yue; Maher Nouiehed; Raed Al Kontar

arXiv:2011.05348 · cs.LG · July 4, 2023


TL;DR

SALR is a novel learning rate scheduler that adaptively adjusts the learning rate based on local sharpness, leading to better generalization and faster convergence in deep learning models.

Contribution

We introduce SALR, a sharpness-aware learning rate update method that automatically promotes flat minima, improving generalization and convergence across various models.

Findings

1. SALR enhances model generalization performance.

2. SALR accelerates convergence speed.

3. SALR finds flatter minima leading to better robustness.

Abstract

In an effort to improve generalization in deep learning and automate the process of learning rate scheduling, we propose SALR: a sharpness-aware learning rate update technique designed to recover flat minimizers. Our method dynamically updates the learning rate of gradient-based optimizers based on the local sharpness of the loss function. This allows optimizers to automatically increase learning rates at sharp valleys to increase the chance of escaping them. We demonstrate the effectiveness of SALR when adopted by various algorithms over a broad range of networks. Our experiments indicate that SALR improves generalization, converges faster, and drives solutions to significantly flatter regions.
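
The idea of raising the learning rate in sharp regions can be illustrated with a minimal sketch. The abstract does not specify how sharpness is measured or how it maps to a learning rate, so everything below is an assumption made for illustration: the function names `local_sharpness` and `sharpness_scaled_lr` are hypothetical, sharpness is approximated as the gap between the loss after a few normalized ascent steps and after a few normalized descent steps, and the base learning rate is scaled by the ratio of the current sharpness to the median of previously observed values. It is not the authors' implementation.

```python
import statistics


def local_sharpness(f, grad, x, radius=0.1, steps=5):
    """Rough 1-D sharpness estimate around x (illustrative assumption).

    Takes `steps` normalized gradient-ascent steps and `steps` normalized
    gradient-descent steps, each of length radius / steps, and returns
    the difference between the two resulting loss values.
    """
    step = radius / steps
    up, down = x, x
    for _ in range(steps):
        g_up = grad(up)
        if g_up != 0:
            up += step * g_up / abs(g_up)        # move uphill
        g_down = grad(down)
        if g_down != 0:
            down -= step * g_down / abs(g_down)  # move downhill
    return f(up) - f(down)


def sharpness_scaled_lr(base_lr, sharpness, history):
    """Scale base_lr by current sharpness relative to the median of history.

    Falls back to base_lr when there is no usable history. The median
    normalization is an assumption of this sketch.
    """
    if not history:
        return base_lr
    med = statistics.median(history)
    if med <= 0:
        return base_lr
    return base_lr * sharpness / med


# Two hypothetical quadratic valleys: one sharp (curvature 20), one flat (curvature 2).
sharp_f, sharp_grad = (lambda x: 10 * x * x), (lambda x: 20 * x)
flat_f, flat_grad = (lambda x: x * x), (lambda x: 2 * x)

s_sharp = local_sharpness(sharp_f, sharp_grad, 0.5)
s_flat = local_sharpness(flat_f, flat_grad, 0.5)

# With the flat valley as history, the sharp valley gets a larger learning rate.
lr_in_sharp = sharpness_scaled_lr(0.1, s_sharp, [s_flat])
lr_in_flat = sharpness_scaled_lr(0.1, s_flat, [s_flat])
```

In this toy setting the sharper valley yields a larger sharpness estimate and therefore a larger learning rate than the flat one, which matches the behavior the abstract describes: larger steps in sharp valleys make it more likely that the optimizer leaves them.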
