Enhancing Sharpness-Aware Minimization by Learning Perturbation Radius
Xuehao Wang, Weisen Jiang, Shuai Fu, Yu Zhang

TL;DR
This paper introduces LETS, a bilevel optimization framework that learns the perturbation radius for sharpness-aware minimization, with the goal of improving model generalization across tasks.
Contribution
It proposes a novel bilevel optimization approach to automatically learn the perturbation radius in SAM, adaptable to any SAM variant.
Findings
LETS improves generalization performance on multiple datasets.
The learned perturbation radius outperforms fixed radii in experiments.
LETS is effective across different architectures and tasks.
Abstract
Sharpness-aware minimization (SAM) aims to improve model generalization by searching for flat minima in the loss landscape. The SAM update consists of two steps: one computes the perturbation and the other computes the update gradient. In both steps, the choice of the perturbation radius is crucial to the performance of SAM, but finding an appropriate perturbation radius is challenging. In this paper, we propose a bilevel optimization framework called LEarning the perTurbation radiuS (LETS) to learn the perturbation radius for sharpness-aware minimization algorithms. Specifically, in the proposed LETS method, the upper-level problem seeks a good perturbation radius by minimizing the squared generalization gap between the training and validation losses, while the lower-level problem is the SAM optimization problem. Moreover, the LETS method can be combined with any…
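The two-step SAM update and the upper-level objective described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the toy linear-regression loss, the helper names (`loss`, `grad`, `sam_step`, `squared_gap`), the small constant added to the gradient norm, and the unscaled form of the squared gap are all assumptions made for illustration. The sketch keeps the radius `rho` fixed and only evaluates the quantity that LETS would minimize; it does not show how LETS updates the radius.

```python
import numpy as np

def loss(w, X, y):
    """Mean squared error of a linear model (toy stand-in for a network loss)."""
    r = X @ w - y
    return 0.5 * np.mean(r ** 2)

def grad(w, X, y):
    """Gradient of `loss` with respect to w."""
    return X.T @ (X @ w - y) / len(y)

def sam_step(w, X, y, rho, lr):
    """One SAM update with perturbation radius rho and step size lr."""
    g = grad(w, X, y)
    # Step 1: perturbation of norm rho along the normalized gradient.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # Step 2: descend using the gradient evaluated at the perturbed point.
    return w - lr * grad(w + eps, X, y)

def squared_gap(w, X_tr, y_tr, X_val, y_val):
    """Squared gap between validation and training losses.

    Assumed form of the upper-level objective; any constant factor is omitted.
    """
    return (loss(w, X_val, y_val) - loss(w, X_tr, y_tr)) ** 2

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w_true = np.array([1.0, -2.0, 0.5])
    X_tr, X_val = rng.normal(size=(40, 3)), rng.normal(size=(20, 3))
    y_tr = X_tr @ w_true + 0.1 * rng.normal(size=40)
    y_val = X_val @ w_true + 0.1 * rng.normal(size=20)

    w = np.zeros(3)
    for _ in range(200):
        w = sam_step(w, X_tr, y_tr, rho=0.05, lr=0.1)
    print("squared gap:", squared_gap(w, X_tr, y_tr, X_val, y_val))
```

With `rho=0` the update reduces to plain gradient descent, which is one way to see why the radius controls how strongly SAM departes from the standard update.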
Taxonomy
Topics: Industrial Vision Systems and Defect Detection · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques
Methods: Sharpness-Aware Minimization
