Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer

Peng Mi; Li Shen; Tianhe Ren; Yiyi Zhou; Tianshuo Xu; Xiaoshuai Sun; Tongliang Liu; Rongrong Ji; Dacheng Tao

arXiv:2306.17504 · cs.AI · July 3, 2023

TL;DR

This paper introduces Sparse SAM (SSAM), an efficient variant of Sharpness-Aware Minimization that applies sparse perturbations to improve training speed and maintain or enhance performance, supported by theoretical convergence guarantees.

Contribution

We propose SSAM, a sparse perturbation scheme for SAM using Fisher information and dynamic sparse training, with proven convergence and improved efficiency.
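As a rough illustration of the Fisher-information route to a mask, the sketch below scores each parameter by a diagonal empirical Fisher estimate and keeps the highest-scoring fraction. This page does not spell out the paper's exact procedure, so the estimator (mean of squared per-sample gradients), the top-k selection rule, and the name `fisher_mask` are all assumptions made for illustration.

```python
from typing import List


def fisher_mask(per_sample_grads: List[List[float]], sparsity: float) -> List[int]:
    """Hypothetical helper: binary mask over parameters.

    Keeps the (1 - sparsity) fraction of coordinates whose diagonal
    empirical Fisher estimate is largest. This is an assumed selection
    rule, not necessarily the one used in the paper.
    """
    n_samples = len(per_sample_grads)
    n_params = len(per_sample_grads[0])

    # Diagonal empirical Fisher: mean of squared per-sample gradients.
    fisher = [
        sum(g[i] ** 2 for g in per_sample_grads) / n_samples
        for i in range(n_params)
    ]

    # Number of coordinates that will still receive a perturbation.
    n_keep = round((1.0 - sparsity) * n_params)

    # Indices of the n_keep largest Fisher scores.
    top = sorted(range(n_params), key=lambda i: fisher[i], reverse=True)[:n_keep]

    mask = [0] * n_params
    for i in top:
        mask[i] = 1
    return mask


# Toy example: two samples, four parameters, 50% sparsity.
grads = [[0.1, -2.0, 0.5, 0.0], [0.3, 1.0, -0.5, 0.1]]
mask = fisher_mask(grads, sparsity=0.5)
```

In this toy case the two coordinates with the largest squared gradients are kept and the other two are masked out. A dynamic-sparse-training mask would instead update the kept set during training, which this sketch does not attempt.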

Findings

01

SSAM reduces computational overhead by 50% while maintaining or improving accuracy.

02

Theoretical proof shows SSAM converges at the same rate as SAM.

03

Experimental results on CIFAR and ImageNet-1K demonstrate superior efficiency and comparable or better performance.

Abstract

Deep neural networks often suffer from poor generalization due to complex and non-convex loss landscapes. Sharpness-Aware Minimization (SAM) is a popular solution that smooths the loss landscape by minimizing the maximized change of training loss when adding a perturbation to the weight. However, indiscriminate perturbation of SAM on all parameters is suboptimal and results in excessive computation, double the overhead of common optimizers like Stochastic Gradient Descent (SGD). In this paper, we propose Sparse SAM (SSAM), an efficient and effective training scheme that achieves sparse perturbation by a binary mask. To obtain the sparse mask, we provide two solutions based on Fisher information and dynamic sparse training, respectively. We investigate the impact of different masks, including unstructured, structured, and $N$:$M$ structured patterns, as well as explicit and implicit…
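The idea of perturbing only a masked subset of weights can be sketched on a toy problem. The code below is a minimal illustration, not the authors' implementation: it assumes the standard SAM perturbation (the gradient scaled to norm `rho`) multiplied elementwise by a binary mask, and the names `sparse_sam_step` and `quad_grad`, the toy quadratic loss, and the values of `rho` and `lr` are all hypothetical.

```python
import math
from typing import Callable, List

Vector = List[float]


def sparse_sam_step(
    w: Vector,
    grad_fn: Callable[[Vector], Vector],
    mask: List[int],
    rho: float = 0.05,
    lr: float = 0.1,
) -> Vector:
    """One SAM-style update in which only masked coordinates are perturbed."""
    # First gradient: direction of steepest ascent at the current weights.
    g = grad_fn(w)
    norm = math.sqrt(sum(x * x for x in g)) + 1e-12

    # SAM perturbation scaled to radius rho, then zeroed where mask == 0.
    eps = [rho * gi / norm * mi for gi, mi in zip(g, mask)]

    # Second gradient: evaluated at the (sparsely) perturbed weights.
    w_adv = [wi + ei for wi, ei in zip(w, eps)]
    g_adv = grad_fn(w_adv)

    # Descent step applied to the original weights.
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]


def quad_grad(w: Vector) -> Vector:
    """Gradient of the toy loss 0.5 * (1 * w0^2 + 10 * w1^2)."""
    return [1.0 * w[0], 10.0 * w[1]]


w = [1.0, 1.0]
sgd_like = sparse_sam_step(w, quad_grad, mask=[0, 0])  # no perturbation at all
sparse = sparse_sam_step(w, quad_grad, mask=[1, 0])    # perturb only w0
dense = sparse_sam_step(w, quad_grad, mask=[1, 1])     # ordinary SAM
```

With an all-zero mask the step reduces to plain gradient descent, and with an all-one mask it is ordinary SAM. This sketch still evaluates two full gradients per step, so it shows only where the mask enters the update, not how any speedup would be realized.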

Code & Models

Repositories

mi-peng/systematic-investigation-of-sparse-perturbed-sharpness-aware-minimization-optimizer
PyTorch · Official

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning and ELM · Domain Adaptation and Few-Shot Learning

Methods: Sharpness-Aware Minimization