Sharpness-Aware Minimization with Dynamic Reweighting
Wenxuan Zhou, Fangyu Liu, Huan Zhang, Muhao Chen

TL;DR
This paper introduces delta-SAM, a dynamic reweighting method that approximates per-instance adversarial perturbations in sharpness-aware minimization, leading to improved generalization in neural networks.
Contribution
Delta-SAM provides a theoretically motivated, efficient approach to approximate stronger per-instance adversarial perturbations in SAM, enhancing model generalization.
Findings
Delta-SAM improves generalization on NLP tasks.
Theoretical analysis supports reweighted batch perturbations as effective approximations.
Experiments show delta-SAM outperforms standard SAM in various settings.
Abstract
Deep neural networks are often overparameterized and may not easily achieve model generalization. Adversarial training has shown effectiveness in improving generalization by regularizing the change of loss on top of adversarially chosen perturbations. The recently proposed sharpness-aware minimization (SAM) algorithm conducts adversarial weight perturbation, encouraging the model to converge to a flat minimum. SAM finds a common adversarial weight perturbation per batch. Although per-instance adversarial weight perturbations are stronger adversaries and can potentially lead to better generalization performance, their computational cost is very high, so it is impossible to use per-instance perturbations efficiently in SAM. In this paper, we tackle this efficiency bottleneck and propose sharpness-aware minimization with dynamic reweighting (delta-SAM). Our theoretical analysis…
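To make the per-batch perturbation described in the abstract concrete, here is a minimal sketch of one standard SAM update on a toy linear regression problem. It illustrates plain SAM only, not delta-SAM, whose reweighting scheme is not spelled out in the text above. The function names (`loss_and_grad`, `sam_step`), the toy data, and the values of `lr` and `rho` are assumptions chosen for illustration.

```python
import numpy as np

def loss_and_grad(w, X, y):
    """Mean squared error of a linear model and its gradient w.r.t. w."""
    residual = X @ w - y
    loss = 0.5 * np.mean(residual ** 2)
    grad = X.T @ residual / len(y)
    return loss, grad

def sam_step(w, X, y, lr=0.1, rho=0.05):
    """One SAM update with a single perturbation shared by the whole batch.

    lr and rho are illustrative values, not settings from the paper.
    """
    # 1) Gradient of the batch loss at the current weights.
    _, g = loss_and_grad(w, X, y)
    # 2) First-order adversarial weight perturbation of norm rho.
    eps = rho * g / (np.linalg.norm(g) + 1e-12)
    # 3) Gradient at the perturbed weights, applied to the original weights.
    _, g_adv = loss_and_grad(w + eps, X, y)
    return w - lr * g_adv, eps

# Toy data (hypothetical): noiseless linear targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(32, 4))
w_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ w_true

w = np.zeros(4)
initial_loss, _ = loss_and_grad(w, X, y)
_, first_eps = sam_step(w, X, y)
for _ in range(200):
    w, eps = sam_step(w, X, y)
final_loss, _ = loss_and_grad(w, X, y)
```

Each step costs two gradient evaluations per batch. Computing a separate perturbation for every instance would instead require a gradient pass per example, which is the efficiency bottleneck the abstract refers to.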
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis
Methods: Sharpness-Aware Minimization
