Sharpness-Aware Minimization with Adaptive Regularization for Training Deep Neural Networks

Jinping Zou, Xiaoge Deng, and Tao Sun

arXiv:2412.16854·cs.LG·December 24, 2024

TL;DR

This paper introduces SAMAR, an adaptive regularization method for sharpness-aware minimization that dynamically adjusts regularization based on model sharpness, improving generalization in deep neural network training.

Contribution

We propose SAMAR, a novel adaptive regularization technique for SAM, with theoretical convergence guarantees and demonstrated effectiveness on image recognition benchmarks.

Findings

01 SAMAR improves accuracy on CIFAR-10 and CIFAR-100

02 SAMAR enhances model generalization

03 Theoretical convergence of SAMAR is established

Abstract

Sharpness-Aware Minimization (SAM) has proven highly effective in improving model generalization in machine learning tasks. However, SAM uses a fixed regularization hyperparameter to characterize the sharpness of the model, and research on adaptive regularization methods based on SAM remains scarce. In this paper, we propose SAM with Adaptive Regularization (SAMAR), which introduces a flexible sharpness ratio rule to update the regularization parameter dynamically. We prove the convergence of SAMAR for functions satisfying Lipschitz continuity. Additionally, experiments on image recognition tasks using CIFAR-10 and CIFAR-100 demonstrate that SAMAR improves accuracy and model generalization.
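
To make the idea in the abstract concrete, the following is a minimal sketch of a SAM-style update whose regularization weight changes during training. The perturbation step (a move of length rho along the normalized gradient) is the standard SAM construction. Everything else is an assumption made for illustration: the toy quadratic loss, the names `lam`, `grow`, `shrink`, `lam_min`, and `lam_max`, and in particular the ratio rule itself. The abstract does not state the exact sharpness ratio rule used by SAMAR, so the rule below is a hypothetical stand-in, not the authors' method.

```python
import numpy as np

# Toy curvature matrix standing in for a real training loss (assumption).
A = np.diag([10.0, 1.0])


def loss(w):
    """Quadratic toy loss 0.5 * w^T A w."""
    return 0.5 * float(w @ A @ w)


def grad(w):
    """Gradient of the toy loss."""
    return A @ w


def sam_perturbation(w, rho):
    """Standard SAM ascent step: move a distance rho along the gradient."""
    g = grad(w)
    return rho * g / (np.linalg.norm(g) + 1e-12)


def train(steps=200, lr=0.05, rho=0.05, lam=1.0,
          grow=1.1, shrink=0.9, lam_min=0.1, lam_max=2.0):
    """Gradient descent on loss + lam * sharpness with an adaptive lam.

    The update of lam is a HYPOTHETICAL ratio rule, not the rule from
    the paper: lam grows when the measured sharpness increased relative
    to the previous step and shrinks otherwise.
    """
    w = np.array([1.0, 1.0])
    prev_sharp = None
    for _ in range(steps):
        eps = sam_perturbation(w, rho)
        # Sharpness estimate: loss increase at the perturbed point.
        sharp = loss(w + eps) - loss(w)
        if prev_sharp is not None and prev_sharp > 0:
            ratio = sharp / prev_sharp
            lam = lam * grow if ratio > 1.0 else lam * shrink
            lam = float(np.clip(lam, lam_min, lam_max))
        prev_sharp = sharp
        g = grad(w)
        g_adv = grad(w + eps)
        # lam = 1 recovers the usual SAM direction g_adv; lam = 0 gives plain SGD.
        w = w - lr * (g + lam * (g_adv - g))
    return w, lam


w_final, lam_final = train()
```

With `lam` held at 1 the update direction reduces to the gradient at the perturbed point, which is the usual SAM step, so the sketch only departs from fixed-parameter SAM through the hypothetical adjustment of `lam`.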

Taxonomy

Topics: Industrial Vision Systems and Defect Detection · Neural Networks and Applications · Face and Expression Recognition

Methods: Sharpness-Aware Minimization