Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer
Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Tianshuo Xu, Xiaoshuai Sun, Tongliang Liu, Rongrong Ji, Dacheng Tao

TL;DR
This paper introduces Sparse SAM (SSAM), an efficient variant of Sharpness-Aware Minimization that applies sparse perturbations to improve training speed and maintain or enhance performance, supported by theoretical convergence guarantees.
Contribution
We propose SSAM, a sparse perturbation scheme for SAM using Fisher information and dynamic sparse training, with proven convergence and improved efficiency.
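For orientation, the idea can be written compactly. The first two expressions are the usual SAM min-max objective and its first-order perturbation; the third is a sketch of how a binary mask would sparsify that perturbation. The symbols L, w, ρ, d, and m, and the exact placement of the mask, are assumptions made here for illustration and may differ from the paper's notation.

```latex
\min_{w} \; \max_{\|\epsilon\|_2 \le \rho} L(w + \epsilon),
\qquad
\hat{\epsilon} = \rho \, \frac{\nabla_w L(w)}{\|\nabla_w L(w)\|_2},
\qquad
\hat{\epsilon}_{\mathrm{sparse}} = \hat{\epsilon} \odot m,
\quad m \in \{0, 1\}^{d}
```

Here m selects which of the d parameters are perturbed; with m set to all ones the expression reduces to the ordinary SAM perturbation.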
Findings
SSAM reduces computational overhead by 50% while maintaining or improving accuracy.
Theoretical proof shows SSAM converges at the same rate as SAM.
Experimental results on CIFAR and ImageNet-1K demonstrate superior efficiency and comparable or better performance.
Abstract
Deep neural networks often suffer from poor generalization due to complex and non-convex loss landscapes. Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss landscape by minimizing the maximized change of training loss when adding a perturbation to the weight. However, indiscriminate perturbation of SAM on all parameters is suboptimal and results in excessive computation, double the overhead of common optimizers like Stochastic Gradient Descent (SGD). In this paper, we propose Sparse SAM (SSAM), an efficient and effective training scheme that achieves sparse perturbation by a binary mask. To obtain the sparse mask, we provide two solutions based on Fisher information and dynamic sparse training, respectively. We investigate the impact of different masks, including unstructured, structured, and N:M structured patterns, as well as explicit and implicit…
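To make the masked-perturbation idea concrete, here is a minimal NumPy sketch of one sparse SAM-style update on a toy least-squares problem. It is not the authors' implementation: the function names (`loss_and_grad`, `topk_mask`, `sparse_sam_step`), the hyperparameter values, and the use of the squared gradient as a stand-in for a diagonal Fisher-information score are all assumptions chosen for illustration. The sketch shows only the update rule and makes no attempt to reproduce any speed-up.

```python
import numpy as np

def loss_and_grad(w, A, b):
    """Toy loss 0.5 * ||A w - b||^2 and its gradient (stand-in for a network loss)."""
    r = A @ w - b
    return 0.5 * float(r @ r), A.T @ r

def topk_mask(scores, sparsity):
    """Binary mask keeping the (1 - sparsity) fraction of entries with the largest scores."""
    k = max(1, int(round((1.0 - sparsity) * scores.size)))
    mask = np.zeros_like(scores)
    mask[np.argsort(scores)[-k:]] = 1.0
    return mask

def sparse_sam_step(w, A, b, lr=0.01, rho=0.05, sparsity=0.5):
    """One SAM-style step in which only the masked coordinates are perturbed."""
    _, g = loss_and_grad(w, A, b)
    # Assumption: squared gradient used as a cheap diagonal Fisher-style score.
    mask = topk_mask(g ** 2, sparsity)
    # Usual first-order SAM perturbation, zeroed outside the mask.
    eps = rho * g / (np.linalg.norm(g) + 1e-12) * mask
    # Gradient at the perturbed weights drives the actual update.
    _, g_perturbed = loss_and_grad(w + eps, A, b)
    return w - lr * g_perturbed, mask

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, b = rng.normal(size=(8, 4)), rng.normal(size=8)
    w = np.zeros(4)
    for _ in range(200):
        w, mask = sparse_sam_step(w, A, b)
    print("final loss:", loss_and_grad(w, A, b)[0], "mask:", mask)
```

In this sketch the mask is recomputed from the current gradient at every step for simplicity; a mask obtained by dynamic sparse training, or one with a structured or N:M pattern, would replace `topk_mask` while leaving the rest of the step unchanged.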
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and ELM · Domain Adaptation and Few-Shot Learning
Methods: Sharpness-Aware Minimization
