Adaptive Adversarial Cross-Entropy Loss for Sharpness-Aware Minimization

Tanapat Ratchatorn; Masayuki Tanaka

arXiv:2406.14329·cs.LG·June 21, 2024

TL;DR

This paper introduces the Adaptive Adversarial Cross-Entropy (AACE) loss to improve Sharpness-Aware Minimization (SAM). It addresses the diminishing-gradient problem in SAM's perturbation step and improves model generalization in image classification tasks.

Contribution

The paper proposes the AACE loss function and a new perturbation method that together keep SAM effective near convergence, a novel approach to loss design for sharpness-aware optimization.

Findings

01 AACE improves model generalization in image classification.

02 Enhanced perturbation consistency near convergence.

03 Empirical results show improved accuracy across datasets.

Abstract

Recent advancements in learning algorithms have demonstrated that the sharpness of the loss surface is an effective measure for improving the generalization gap. Building upon this concept, Sharpness-Aware Minimization (SAM) was proposed to enhance model generalization and achieved state-of-the-art performance. SAM consists of two main steps: the weight perturbation step and the weight updating step. However, the perturbation in SAM is determined only by the gradient of the training loss, that is, the cross-entropy loss. As the model approaches a stationary point, this gradient becomes small and oscillates, which leads to inconsistent perturbation directions and may also cause the gradient to diminish. Our research introduces an innovative approach to further enhancing model generalization. We propose the Adaptive Adversarial Cross-Entropy (AACE) loss function to replace standard cross-entropy…
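The two SAM steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' code and it does not implement AACE; it shows only the standard SAM update on a toy quadratic loss. The functions `loss`, `grad`, `perturbation`, and `sam_step`, and the values of `lr`, `rho`, and `eps`, are assumptions chosen for illustration.

```python
import math

def loss(w):
    # Toy quadratic loss (assumption): stands in for the training loss.
    return 0.5 * (10.0 * w[0] ** 2 + w[1] ** 2)

def grad(w):
    # Analytic gradient of the toy loss above.
    return [10.0 * w[0], w[1]]

def perturbation(w, rho=0.05, eps=1e-12):
    # Step 1 (weight perturbation): move a distance rho along the
    # normalized gradient of the training loss. Near a stationary point
    # the gradient norm is tiny, so this direction becomes unstable.
    g = grad(w)
    norm = math.sqrt(sum(gi * gi for gi in g))
    return [rho * gi / (norm + eps) for gi in g]

def sam_step(w, lr=0.05, rho=0.05):
    # Step 2 (weight updating): descend using the gradient evaluated
    # at the perturbed weights, applied to the original weights.
    e = perturbation(w, rho)
    w_adv = [wi + ei for wi, ei in zip(w, e)]
    g_adv = grad(w_adv)
    return [wi - lr * gi for wi, gi in zip(w, g_adv)]

w = [1.0, 1.0]
for _ in range(200):
    w = sam_step(w)
print(loss(w))
```

In this sketch the perturbation direction depends entirely on the gradient of the training loss, which is the dependence the abstract identifies as the weakness near convergence.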

Code & Models

Repositories

T-Ratchatorn/AACE (PyTorch, official)

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning

Methods: Sharpness-Aware Minimization · Segment Anything Model