Adaptive Adversarial Cross-Entropy Loss for Sharpness-Aware Minimization
Tanapat Ratchatorn, Masayuki Tanaka

TL;DR
This paper introduces the Adaptive Adversarial Cross-Entropy loss to improve Sharpness-Aware Minimization, addressing gradient diminishing issues and enhancing model generalization in image classification tasks.
Contribution
The paper proposes the AACE loss function and a new perturbation method to keep SAM effective near convergence; both are presented as a novel approach to loss design for sharpness-aware optimization.
Findings
AACE improves model generalization in image classification.
Enhanced perturbation consistency near convergence.
Empirical results show improved accuracy across datasets.
Abstract
Recent advancements in learning algorithms have demonstrated that the sharpness of the loss surface is an effective measure for improving the generalization gap. Building upon this concept, Sharpness-Aware Minimization (SAM) was proposed to enhance model generalization and achieved state-of-the-art performance. SAM consists of two main steps: the weight perturbation step and the weight updating step. However, the perturbation in SAM is determined only by the gradient of the training loss, that is, the cross-entropy loss. As the model approaches a stationary point, this gradient becomes small and oscillates, which leads to inconsistent perturbation directions and may also cause the gradient to diminish. Our research introduces an innovative approach to further enhancing model generalization. We propose the Adaptive Adversarial Cross-Entropy (AACE) loss function to replace standard cross-entropy…
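The two SAM steps described in the abstract can be sketched on a toy problem. The block below is a minimal illustration of standard SAM with the ordinary cross-entropy loss, not the paper's AACE loss, whose definition is not given in this excerpt. The linear softmax classifier, the synthetic data, and the names cross_entropy_and_grad, sam_step, rho, and lr are all assumptions made for illustration.

```python
import numpy as np

def cross_entropy_and_grad(w, x, y):
    """Mean softmax cross-entropy of a linear classifier and its gradient w.r.t. w."""
    logits = x @ w                                       # shape (n, classes)
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)
    n = x.shape[0]
    loss = -np.log(probs[np.arange(n), y]).mean()
    grad_logits = probs.copy()
    grad_logits[np.arange(n), y] -= 1.0                  # softmax minus one-hot
    return loss, x.T @ grad_logits / n

def sam_step(w, x, y, lr=0.1, rho=0.05, eps=1e-12):
    """One SAM iteration (hypothetical helper, standard cross-entropy only)."""
    # Step 1 (weight perturbation): move a distance rho along the
    # normalized gradient of the training loss.
    _, g = cross_entropy_and_grad(w, x, y)
    e = rho * g / (np.linalg.norm(g) + eps)
    # Step 2 (weight updating): apply the gradient evaluated at the
    # perturbed weights to the original weights.
    _, g_perturbed = cross_entropy_and_grad(w + e, x, y)
    return w - lr * g_perturbed, e

# Synthetic, illustrative data: 64 samples, 5 features, 3 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 5))
y = np.argmax(x @ rng.normal(size=(5, 3)), axis=1)

w = np.zeros((5, 3))
initial_loss, _ = cross_entropy_and_grad(w, x, y)
for _ in range(100):
    w, e = sam_step(w, x, y)
final_loss, _ = cross_entropy_and_grad(w, x, y)
```

Because the perturbation e is the gradient rescaled to length rho, its direction depends only on the direction of g. When g is very small and oscillates, as the abstract describes near a stationary point, that direction can change from step to step even though the length of e stays fixed.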
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
Methods: Sharpness-Aware Minimization
