Sharpness-Aware Minimization Revisited: Weighted Sharpness as a Regularization Term

Yun Yue; Jiadi Jiang; Zhiling Ye; Ning Gao; Yongchao Liu; Ke Zhang

arXiv:2305.15817 · cs.LG · December 6, 2024 · 1 citation

TL;DR

This paper introduces WSAM, a generalized sharpness-aware optimization method that incorporates sharpness as a regularization term, improving the generalization of deep neural networks over existing methods like SAM.

Contribution

The paper proposes WSAM, an extension of SAM that uses weighted sharpness as a regularization term, and proves a generalization bound for it.

Findings

01. WSAM outperforms the vanilla optimizer and SAM on multiple public datasets.

02. A generalization bound for WSAM is proved by combining PAC and Bayes-PAC techniques.

03. WSAM is at least highly competitive with SAM and its variants.

Abstract

The generalization of Deep Neural Networks (DNNs) is known to be closely related to the flatness of minima, which led to the development of Sharpness-Aware Minimization (SAM) for seeking flatter minima and better generalization. In this paper, we revisit the loss of SAM and propose a more general method, called WSAM, by incorporating sharpness as a regularization term. We prove its generalization bound through the combination of PAC and Bayes-PAC techniques, and evaluate its performance on various public datasets. The results demonstrate that WSAM achieves improved generalization, or is at least highly competitive, compared to the vanilla optimizer, SAM, and its variants. The code is available at https://github.com/intelligent-machine-learning/atorch/tree/main/atorch/optimizers.
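
The idea of treating sharpness as a separate, weighted regularization term can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper or the atorch repository: the toy quadratic loss, the function names, the weight `lam`, and the form `loss + lam * sharpness`, which may differ from the paper's own parameterization. Sharpness is approximated, as is standard for SAM, by a single normalized gradient-ascent step of radius `rho`.

```python
import math

# Toy quadratic loss 0.5 * sum(c_i * w_i^2); "curv" holds the curvatures c_i.
# Hypothetical stand-in for a real training loss.
def loss(w, curv):
    return 0.5 * sum(c * x * x for c, x in zip(curv, w))

def grad(w, curv):
    return [c * x for c, x in zip(curv, w)]

def perturbed_point(w, g, rho):
    # First-order approximation of the worst-case point inside an L2 ball
    # of radius rho: move along the normalized gradient.
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        return list(w)
    return [x + rho * gx / norm for x, gx in zip(w, g)]

def weighted_sharpness_objective(w, curv, rho, lam):
    # Assumed form: base loss plus lam times the (approximate) sharpness,
    # where sharpness = loss at the perturbed point minus the base loss.
    base = loss(w, curv)
    w_adv = perturbed_point(w, grad(w, curv), rho)
    sharpness = loss(w_adv, curv) - base
    return base + lam * sharpness, base, sharpness

if __name__ == "__main__":
    w, curv, rho = [1.0, 0.0], [2.0, 1.0], 0.1
    for lam in (0.0, 0.5, 1.0, 2.0):
        obj, base, sharp = weighted_sharpness_objective(w, curv, rho, lam)
        print(f"lam={lam}: objective={obj:.4f} (base={base:.4f}, sharpness={sharp:.4f})")
```

Under this assumed form, `lam = 0` gives back the plain loss and `lam = 1` gives the perturbed loss that SAM minimizes; other values weight sharpness less or more heavily than SAM does.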

Code & Models

Repositories

intelligent-machine-learning/atorch (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification

Methods: Sharpness-Aware Minimization